0001 EDIT HISTORY
0002 ** KPM 16/09/18: move id + label explanation to index-desc.txt
0003 ** KPM 06/08/18: moved preface to README **
0004 ** KPM 11/07/18: added a preface but still need to update this for newest methods/revisions... **
0005 ** KPM 25/02/18: still need to update methods for setting mc/cmssw track id **
0006 
0007 PREFACE: This file is a compendium on how the validation runs within mkFit, which makes use of the TTreeValidation class and other supporting macros.  
0008 
0009 ===================
0010  Table of Contents
0011 ===================
0012 
0013 A. Overview of code
0014 B. Outline of routine calls in mkFit
0015 C. Explanation of validation routines
0016   I. Tracks and Extras Prep
0017   II. Track Association Routines
0018   III. TTree Filling
0019 D. Definitions of efficiency, fake rate, and duplicate rate
0020 E. Overview of scripts
0021 F. Hit map/remapping logic
0022 G. Extra info on ID and mask assignments
0023 H. Special note about duplicate rate
0024 
0025 =====================
0026  A. Overview of code
0027 =====================
0028 
0029 TTreeValidation only compiles the necessary ROOT code when WITH_ROOT:=1 is enabled (either by manually editing Makefile.config, or at the command line). Always do a make clean before compiling with ROOT, as the code is ifdef'ed. To hide the heavy-duty functions from the main code, TTreeValidation inherits from the virtual class "Validation", and overrides the common functions. One TTreeValidation object is created per event in flight. The Event object holds a reference to the validation object (stored as the data member "validation_"), so it is up to the Event object to reset some of the data members of the TTreeValidation object on every event.
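
The inheritance pattern described above can be sketched as follows. This is a minimal illustration, not the actual class definitions: the *Sketch names are hypothetical, the no-op defaults in the base class are an assumption, and the overrides only record their names where the real code would do ROOT I/O.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the real Event class.
struct EventSketch {};

// Base class: every hook is a no-op (assumed), so callers need no #ifdef.
class ValidationSketch {
public:
  virtual ~ValidationSketch() = default;
  virtual void fillEfficiencyTree(const EventSketch&) {}
  virtual void fillFakeRateTree(const EventSketch&) {}
  virtual void saveTTrees() {}
};

// Derived class: in the real code the ROOT-dependent work would live here.
// Each override only records its name so the virtual dispatch can be observed.
class TTreeValidationSketch : public ValidationSketch {
public:
  void fillEfficiencyTree(const EventSketch&) override { calls.push_back("fillEfficiencyTree"); }
  void fillFakeRateTree(const EventSketch&) override { calls.push_back("fillFakeRateTree"); }
  void saveTTrees() override { calls.push_back("saveTTrees"); }

  std::vector<std::string> calls;  // call log, in place of TTree filling
};
```

Code holding only a ValidationSketch reference calls the same functions whether or not the derived class is compiled in.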
0030 
0031 Three types of validation exist within TTreeValidation:
0032 [1.] "Building validation", enabled with Config::sim_val, via the command line option: --sim-val [--read-sim-trackstates, for pulls]
0033 [2.] "CMSSW external tracks building validation", enabled with Config::cmssw_val, via a minimum of the command line options: --cmssw-val --read-cmssw-tracks --geom CMS-2017 --seed-input cmssw [and potentially a seed cleaning --seed-cleaning <str>, and also specifying which cmssw matching --cmssw-matching <str>]
0034 [3.] "Fit validation", enabled with Config::fit_val, via the command line option: --fit-val
0035 
0036 We will ignore fit validation for the moment. The main idea behind the other two is that the validation routines are called outside of the standard timed sections, and as such, we do not care too much about performance, as long as it takes a reasonable amount of time to complete. Of course, the full wall clock time matters when running multiple events in flight, and because there is a lot of I/O as well as moves and stores that would hurt the performance with the validation enabled, these routines are ignored if the command line option "--silent" is enabled.
0037 
0038 Validation types [1.] and [2.] each fill two trees per event, namely:
0039 [1.] 
0040   - efftree (filled once per sim track) 
0041   - frtree (filled once per seed track)
0042 [2.] 
0043   - cmsswefftree (filled once per cmssw external track)
0044   - cmsswfrtree (filled once per mkFit build track)
0045 
0046 [1.] validation exists in the following combinations of geometry and seed source:
0047   - ToyMC, with --seed-input ["sim", "find"]
0048   - CMSSW, with --seed-input ["sim", "cmssw"] (+ --seed-cleaning <str> [<str>: "n2", "badlabel", "pure", "none"], + --cmssw-matching <str> [<str>: "trkparam", "hit", "label"])
0049 
0050 Upon instantiation of the TTreeValidation object, the respective ROOT trees are defined and allocated on the heap, along with setting the addresses of all the branches. After the building is completed in mkFit, we have to have the tracks in their standard event containers, namely: seedTracks_, candidateTracks_, and fitTracks_. In the standard combinatorial or clone engine, we have to copy out the built tracks from event of combined candidates into candidateTracks_ via: builder.quality_store_tracks() in mkFit/buildtestMPlex.cc.  Since we do not yet have fitting after building, we just set the fitTracks_ equal to the candidateTracks_.  For ease, I will from now on refer to the candidateTracks_ as buildTracks_.
0051 
0052 As a reminder, the sim tracks are stored in the Event object as simTracks_, while the CMSSW reco tracks are stored as cmsswTracks_. Each track collection has an associated TrackExtra collection, which is stored as {trackname}Extra_ inside the Event object. It is indexed the same as the collection it references, i.e. track[0] has an associated extra[0]. The TrackExtra object contains the mcTrackID, seedID, and cmsswTrackID that each mkFit track is associated to. The validation also makes use of simHitsInfo_ (container for storing the mcTrackID of each hit), layerHits_, and simTrackStates_ (used for pulls). See Sections B and C for explanations of how the track matching is performed and how track information is saved. Essentially, we store two sets of maps: the first has a key that is the index of the reference track (MC or CMSSW) and a value that is a vector of indices of the tracks that match it (for seeds, build tracks, and fit tracks), and the second maps the seed track index to its corresponding build and fit tracks. The reason for having a sim match map for each of seeds, build tracks, and fit tracks is to keep track of how the efficiency/fake rate/duplicate rate improves or degrades with potential cuts between them. The seed to build and seed to fit maps exist for the same reason.
0053 
0054 Following each event, the track and extra collections are cleared. In addition, the association maps are cleared and reset. After the main loop over events finishes, the ROOT file is written out with the TTrees saved via: val.saveTTrees() in mkFit.cc. The destructor for the validation then deletes the trees. The output is "valtree.root", with the thread number appended if using multiple events in flight. From here, we then take advantage of the following files:
0055 
0056 - runValidation.C // macro used for turning TTrees into efficiency/fake rate/duplicate rate plots
0057 - PlotValidation.cpp/.hh // source code for doing calculations
0058 - makeValidation.C // plots on a single canvas results for Best Hit (BH), Standard Combinatorial (STD), and Clone Engine (CE)
0059 
0060 ======================================
0061  B. Outline of routine calls in mkFit
0062 ======================================
0063 
0064 The following routines are then called after the building (MkBuilder.cc, Event.cc, TTreeValidation.cc, Track.cc):
0065 
0066 [1.] builder.sim_val()
0067    : (actually runs clean_cms_simtracks() when using CMS geom and using sim tracks as reference set)
0068    : remap_seed_hits()
0069    : remap_cand_hits()
0070    : prep_recotracks()
0071      : prep_tracks(seedtracks,seedextras)
0072        : m_event->validation_.alignTracks(tracks,extras,false)
0073      : prep_tracks(buildtracks,buildextras)
0074        : m_event->validation_.alignTracks(tracks,extras,false)
0075      : prep_tracks(fittracks,fitextras)
0076        : m_event->validation_.alignTracks(tracks,extras,false)
0077    : if (cmssw-seeds) m_event->clean_cms_simtracks() // label which simtracks are not findable: already set if using sim seeds
0078    : m_event->Validate()
0079      : validation_.setTrackExtras(*this) 
0080        : if (sim seeds) extra.setMCTrackIDInfoByLabel() // Require 50% of found hits after seed to match label of seed/sim track
0081          : modifyRecTrackID()
0082        : if (--seed-input cmssw || find) extra.setMCTrackIDInfo() // Require 75% of found hits to match a single sim track
0083          : modifyRecTrackID()
0084      : validation_.makeSimTkToRecoTksMaps(*this)
0085        : mapRefTkToRecoTks(seedtracks,seedextras,simToSeedMap) // map key = mcTrackID, map value = vector of seed labels
0086        : mapRefTkToRecoTks(buildtracks,buildextras,simToBuildMap) // map key = mcTrackID, map value = vector of build labels
0087        : mapRefTkToRecoTks(fittracks,fitextras,simToFitMap) // map key = mcTrackID, map value = vector of fit labels
0088      : validation_.makeSeedTkToRecoTkMaps(*this)
0089        : mapSeedTkToRecoTk(buildtracks,buildextras,seedToBuildMap) // map key = seedID, map value = build track label
0090        : mapSeedTkToRecoTk(fittracks,fitextras,seedToFitMap) // map key = seedID, map value = fit track label
0091      : validation_.fillEfficiencyTree(*this)
0092      : validation_.fillFakeRateTree(*this)      
0093 
0094 [2.] builder.cmssw_val()
0095    : (actually runs m_event->validation.makeSeedTkToCMSSWTkMap() from MkBuilder::prepare_seeds())
0096    : (when using N^2 cleanings, Event::clean_cms_seedtracks(), or if not using N^2 cleaning, Event::use_seeds_from_cmsswtracks())
0097    : remap_cand_hits()
0098    : prep_recotracks()
0099      : prep_tracks(buildtracks,buildextras)
0100        : m_event->validation_.alignTracks(tracks,extras,false)
0101    : prep_cmsswtracks()
0102      : prep_tracks(cmsswtracks,cmsswextras)     
0103        : m_event->validation_.alignTracks(tracks,extras,false)  
0104    : m_event->Validate()
0105      : validation_.setTrackExtras(*this)
0106        : storeSeedAndMCID() 
0107        : if (--cmssw-matching trkparam) extra.setCMSSWTrackIDInfoByTrkParams() // Chi2 and dphi matching (also includes option for nHits matching)
0108          : modifyRecTrackID()
0109        : else if (--cmssw-matching hit) extra.setCMSSWTrackIDInfoByHits() // Chi2 and dphi matching (also includes option for nHits matching)
0110          : modifyRecTrackID()                 
0111        : else if (--cmssw-matching label) extra.setCMSSWTrackIDInfoByLabel() // 50% hit sharing after seed
0112          : modifyRecTrackID()                 
0113      : validation_.makeCMSSWTkToRecoTksMaps(*this)
0114        : mapRefTkToRecoTks(buildtracks,buildextras,cmsswToBuildMap)
0115      : validation_.fillCMSSWEfficiencyTree(*this)       
0116      : validation_.fillCMSSWFakeRateTree(*this) 
0117 
0118 =======================================
0119  C. Explanation of validation routines
0120 =======================================
0121 
0122 - map/remap hit functions: see notes in section F. Essentially, validation needs all hit indices inside tracks to match the hit indices inside ev.layerHits_.
0123 
0124 +++++++++++++++++++++++++++
0125  I. Tracks and Extras Prep
0126 +++++++++++++++++++++++++++
0127 
0128 - clean_cms_simtracks()
0129   : loop over sim tracks
0130     : mark sim track status not findable if (nLayers < [Config::cmsSelMinLayers == 8])
0131     : tracks are not removed from collection, just have this bit set. this way the mcTrackID == position in vector == label
0132 
0133 - clean_cms_seedtracks()
0134   : cmssw seed tracks are cleaned according to closeness in deta, dphi, dR to other cmssw seed tracks--> duplicate removal
0135   : loop over cleaned seed tracks, and if label_ == -1, then incrementally decrease label (so second -1 seed is -2, third is -3)
0136 
0137 - prep_tracks(tracks,extras) 
0138   : Loop over all track collections in consideration
0139     : sort hits inside track by layer : needed for counting unique layers and for association routines
0140     : emplace_back a track extra, initialized with the label of the track (which happens to be its seed ID) // if using sim seeds, we know that seed ID == sim ID
0141   : m_event->validation_.alignTracks(tracks,extras,alignExtra)   
0142 
0143 - alignTracks(tracks,extras,alignExtra)
0144   : if alignExtra == true // needed for when a reco track collection, which was previously labeled by its label() == seedID, created its track extra at the same time but the track collection has been moved or sorted
0145     : create temporary track extra collection, size of track collection
0146     : loop over tracks
0147       : set tmp extra to the old track extra collection matching the track label
0148     : set the old track extra to equal the new collection
0149   : loop over tracks
0150     : set the track label equal to the index inside the vector // needed for filling routines which rely on maps of indices between two track collections
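
The alignTracks() steps above can be sketched as below. This is an illustration only: TrackSketch and ExtraSketch are hypothetical stand-ins for the real Track and TrackExtra classes, and it assumes every track label is still a valid index into the old extras vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, minimal stand-ins for Track and TrackExtra.
struct TrackSketch { int label; };
struct ExtraSketch { int seedID; };

void alignTracksSketch(std::vector<TrackSketch>& tracks,
                       std::vector<ExtraSketch>& extras,
                       bool alignExtra) {
  if (alignExtra) {
    // The tracks were moved or sorted after their extras were created:
    // rebuild the extras so that extras[i] again belongs to tracks[i].
    std::vector<ExtraSketch> tmp(tracks.size());
    for (std::size_t i = 0; i < tracks.size(); ++i) {
      tmp[i] = extras[tracks[i].label];  // label == old position of the extra
    }
    extras = tmp;
  }
  // From here on, label == index inside the vector.
  for (std::size_t i = 0; i < tracks.size(); ++i) {
    tracks[i].label = static_cast<int>(i);
  }
}
```

After the call, the original label is only recoverable from the extra (its seedID), which is why the later routines read seedID from the track extra.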
0151 
0152 - prep_cmsswtracks()
0153   : Standard prep_tracks()
0154   : loop over cmssw tracks
0155     : Count unique layers = nLayers
0156     : set status of cmssw track to notFindable() if: (nUniqueLayers() < [Config::cmsSelMinLayers == 8]) // same criteria for "notFindable()" cmssw sim tracks used for seeds
0157 
0158 ++++++++++++++++++++++++++++++++
0159  II. Track Association Routines
0160 ++++++++++++++++++++++++++++++++
0161 
0162 - setTrackExtras(&Event)    
0163   : if [1.]
0164     : loop over seed tracks
0165       : setMCTrackIDInfo(true) : Require 75% of found hits to match a single sim track
0166     : loop over build tracks
0167        : if (sim seeds) setMCTrackIDInfoByLabel() : Require 50% of found hits after seed to match label of seed/sim track
0168        : if (cms seeds) setMCTrackIDInfo(false) : Require 75% of found hits to match a single sim track
0169     : loop over fit tracks 
0170       : same options as build tracks   
0171   : if [2.]
0172     : setupCMSSWMatching()
0173       : first loop over cmssw tracks
0174         : create a vector of "reduced tracks" that stores 1./pt, eta, and associated covariances in reduced track states
0175         : add cmssw label to a map of lyr, map of idx, vector of labels
0176         : also include track momentum phi, and a list of hits inside a map. map key = layer, map value = vector of hit indices
0177     : loop over build tracks
0178       : setCMSSWTrackIDInfo() : require matching by chi2 and dphi
0179     : storeMCandSeedID()
0180 
0181 - modifyRecTrackID() 
0182   // Config::nMinFoundHits = 7, Config::nlayers_per_seed = 4 or 3 
0183   // nCandHits = trk.nFoundHits() OR trk.nFoundHits()-Config::nlayers_per_seed (see calling function)
0184   // nMinHits = Config::nMinFoundHits OR Config::nMinFoundHits-Config::nlayers_per_seed (see calling function)
0185   : if track has been marked as a duplicate, mc/cmsswTrackID = -10
0186   : else if (mc/cmsswTrackID >= 0) (i.e. the track has successfully matched)
0187     : if mc/cmsswTrack is findable
0188       : if nCandHits < nMinHits, mc/cmsswTrackID = -2
0189     : else
0190       : if nCandHits < nMinHits, mc/cmsswTrackID = -3 
0191       : else mc/cmsswTrackID = -4 (track is long enough and matched, but the matched reference track is unfindable)
0192   : else if (mc/cmsswTrackID == -1)
0193     : if matching by label, and ref track exists
0194       : if ref track is findable
0195         : if nCandHits < nMinHits, ID = -5
0196       : else 
0197         : if nCandHits < nMinHits, ID = -6
0198         : else, ID = -7
0199     : else (not matching by label, or ref track does not exist)
0200       : if nCandHits < nMinHits, ID = -8
0201       : else, ID = -9
0202   -->return potentially new ID assignment
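
The decision tree above can be written as a single function. This is a sketch that follows the outline line by line, not the actual implementation: the function name and the flat argument list are hypothetical, and IDs that are neither >= 0 nor -1 are assumed to pass through unchanged.

```cpp
// Returns the (possibly reassigned) mc/cmsswTrackID for one reco track.
int modifyRecTrackIDSketch(int refTrackID,     // current mc/cmsswTrackID
                           int nCandHits,      // see calling function
                           int nMinHits,       // see calling function
                           bool isDuplicate,   // track marked as duplicate
                           bool refFindable,   // reference track is findable
                           bool matchByLabel,  // caller is a *ByLabel() routine
                           bool refExists) {   // reference track exists
  if (isDuplicate) return -10;
  const bool isShort = nCandHits < nMinHits;

  if (refTrackID >= 0) {  // matched
    if (refFindable) return isShort ? -2 : refTrackID;
    return isShort ? -3 : -4;
  }
  if (refTrackID == -1) {  // unmatched
    if (matchByLabel && refExists) {
      if (refFindable) return isShort ? -5 : -1;
      return isShort ? -6 : -7;
    }
    return isShort ? -8 : -9;
  }
  return refTrackID;  // assumption: any other ID is left untouched
}
```

Only a long track matched to a findable reference keeps an ID >= 0, and only long unmatched tracks end up as -1 or -9, which are the IDs the standard fake rate numerator in section D uses.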
0203 
0204 - setMCTrackIDInfoByLabel()
0205   : Loop over found hits on build track after seed
0206     : count the hits who have a mcTrackID == seedID_ (i.e. seedID == simTrack label == mcTrackID)
0207   : if hits are found after seed
0208     : if 50% are matched, mcTrackID == seedID_
0209     : else, mcTrackID == -1
0210   : mcTrackID = modifyRecTrackID() // nCandHits = nFoundHits-nlayers_per_seed, nMinHits = Config::nMinFoundHits - Config::nlayers_per_seed                    
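
A minimal sketch of the 50% label matching, before the call to modifyRecTrackID(). The names are hypothetical, and several details are assumptions: the input holds one mcTrackID per found hit with the seed hits first, "50%" is taken as >= 50%, and a track with no hits after the seed gets -1.

```cpp
#include <vector>

// hitMcTrackIDs: mcTrackID of each found hit, seed hits first (assumed order).
// nSeedHits:     number of leading seed hits to skip.
// seedID:        label of the seed == label of the sim track.
int matchByLabelSketch(const std::vector<int>& hitMcTrackIDs,
                       int nSeedHits, int seedID) {
  const int nHits = static_cast<int>(hitMcTrackIDs.size());
  const int nAfterSeed = nHits - nSeedHits;
  if (nAfterSeed <= 0) return -1;  // never made it past the seed (assumed -1)

  int nMatched = 0;
  for (int i = nSeedHits; i < nHits; ++i) {
    if (hitMcTrackIDs[i] == seedID) ++nMatched;
  }
  // Integer form of nMatched / nAfterSeed >= 0.5.
  return (2 * nMatched >= nAfterSeed) ? seedID : -1;
}
```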
0211     
0212 - setMCTrackIDInfo(isSeedTrack)
0213   : Loop over all found hits on build track (includes seed hits)
0214     : count the mcTrackID that appears most from the hits
0215   : if 75% of hits on reco track match a single sim track, mcTrackID == mcTrackID of single sim track
0216   : else, mcTrackID == -1
0217   : if (!isSeedTrack)
0218     : modifyRecTrackID() // nCandHits = nFoundHits, nMinHits = Config::nMinFoundHits
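
The 75% majority matching can be sketched in the same way. Again the names are hypothetical; the sketch assumes hits without a sim track carry a negative mcTrackID and are skipped when counting, and that "75%" means >= 75% of all found hits.

```cpp
#include <unordered_map>
#include <vector>

// hitMcTrackIDs: mcTrackID of every found hit on the track (seed hits included).
int matchByMajoritySketch(const std::vector<int>& hitMcTrackIDs) {
  const int nHits = static_cast<int>(hitMcTrackIDs.size());
  std::unordered_map<int, int> counts;  // mcTrackID -> number of hits
  int bestID = -1;
  int bestCount = 0;
  for (int id : hitMcTrackIDs) {
    if (id < 0) continue;  // assumption: negative ID == hit has no sim track
    const int c = ++counts[id];
    if (c > bestCount) {
      bestCount = c;
      bestID = id;
    }
  }
  // Integer form of bestCount / nHits >= 0.75.
  return (bestCount > 0 && 4 * bestCount >= 3 * nHits) ? bestID : -1;
}
```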
0219 
0220 - setCMSSWTrackIDInfo()
0221   : Loop over all cmssw "reduced" tracks
0222     : if helix chi2 < [Config::minCMSSWMatchChi2 == 50]
0223       : append label of cmssw track to a vector, along with chi2
0224   : sort vector by chi2
0225   : loop over label vector
0226     : swim cmssw track momentum phi from phi0 to mkFit reco track
0227     : if abs(wrapphi(dphi)) < [Config::minCMSSWMatchdPhi == 0.03]
0228       : see if dphi < currently best stored mindphi, and if yes, then set this as the new mindphi + label as matched cmsswTrackID
0229       : if using nHits matching, check for nHits matched --> currently not used nor tuned
0230   : if no label is found, cmsswTrackID == -1
0231   : modifyRecTrackID() // nCandHits and nMinHits same as setMCTrackIDInfo()
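
The selection logic (chi2 cut, sort, then smallest dphi) can be sketched as below. This is not the real routine: the helix chi2 and the swum dphi are taken as precomputed inputs, the struct and function names are hypothetical, and the defaults simply copy the quoted values of Config::minCMSSWMatchChi2 and Config::minCMSSWMatchdPhi. The optional nHits matching is left out.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// One cmssw "reduced" track as seen from a given mkFit reco track.
struct RefCandidateSketch {
  int label;    // cmssw track label
  double chi2;  // helix chi2 w.r.t. the reco track (precomputed here)
  double dphi;  // phi difference after swimming (precomputed here)
};

// Wrap an angle difference into [-pi, pi].
double wrapPhiSketch(double dphi) {
  const double kTwoPi = 6.283185307179586;
  return std::remainder(dphi, kTwoPi);
}

int matchByChi2AndDPhiSketch(std::vector<RefCandidateSketch> cands,
                             double maxChi2 = 50.0, double maxDPhi = 0.03) {
  // Keep only candidates passing the chi2 cut, then order them by chi2.
  cands.erase(std::remove_if(cands.begin(), cands.end(),
                             [maxChi2](const RefCandidateSketch& c) {
                               return c.chi2 >= maxChi2;
                             }),
              cands.end());
  std::sort(cands.begin(), cands.end(),
            [](const RefCandidateSketch& a, const RefCandidateSketch& b) {
              return a.chi2 < b.chi2;
            });

  int best = -1;             // -1 == no match found
  double minDPhi = maxDPhi;  // a match must beat the dphi cut
  for (const RefCandidateSketch& c : cands) {
    const double d = std::abs(wrapPhiSketch(c.dphi));
    if (d < minDPhi) {
      minDPhi = d;
      best = c.label;
    }
  }
  return best;
}
```

In this form the chi2 ordering only breaks ties between candidates with equal dphi; the winner is otherwise decided by dphi alone.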
0232 
0233 - setCMSSWTrackIDInfoByLabel()
0234   : want to match the hits on the reco track to those on the CMSSW track
0235   : loop over hits on reco track after seed
0236     : get hit idx and lyr
0237       : if the cmssw track has this lyr, loop over hit indices on cmssw track with this layer
0238         : if cmssw hit idx matches reco idx, increment nHitsMatched_
0239   : follow same logic as setMCTrackIDInfoByLabel() for setting cmsswTrackID
0240   : modifyRecTrackID() // nCandHits and nMinHits same as setMCTrackIDInfoByLabel()
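
The hit counting loop can be sketched as follows, assuming the cmssw hits are held in the layer -> hit indices map described under setupCMSSWMatching(). The type and function names are hypothetical; the returned count plays the role of nHitsMatched_, which then feeds the same 50% criterion as setMCTrackIDInfoByLabel().

```cpp
#include <map>
#include <vector>

struct HitOnTrackSketch {
  int index;  // hit index within its layer
  int layer;  // layer the hit sits on
};

// map key = layer, map value = hit indices of the cmssw track on that layer
using LayerHitMapSketch = std::map<int, std::vector<int>>;

int countSharedHitsSketch(const std::vector<HitOnTrackSketch>& recoHitsAfterSeed,
                          const LayerHitMapSketch& cmsswHits) {
  int nHitsMatched = 0;
  for (const HitOnTrackSketch& hit : recoHitsAfterSeed) {
    const auto it = cmsswHits.find(hit.layer);
    if (it == cmsswHits.end()) continue;  // cmssw track has no hit on this layer
    for (int idx : it->second) {
      if (idx == hit.index) {
        ++nHitsMatched;
        break;  // count each reco hit at most once
      }
    }
  }
  return nHitsMatched;
}
```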
0241   
0242 - mapRefTkToRecoTks(tracks,extras,map)
0243   : Loop over reco tracks
0244     : get track extra for track
0245     : if [1.], map[extra.mcTrackID()].push_back(track.label()) // reminder, label() now equals index inside track vector!
0246     : if [2.], map[extra.cmsswTrackID()].push_back(track.label()) // reminder, label() now equals index inside track vector!
0247   : Loop over pairs in map
0248     : if vector of labels size == 1, get track extra for label, and set duplicate index == 0
0249     : else
0250       : make temp track vector from track labels, sort track vector by nHits (and sum hit chi2 if tracks have same nHits)
0251       : set vector of labels to sorted tracks
0252       : loop over vector labels
0253         : get track extra for label, and set duplicate index++ 
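
A sketch of the map building and duplicate ordering described above. The names are hypothetical, and three details are assumptions: the track and its extra are merged into one struct, only tracks with a reference ID >= 0 enter the map, and the duplicate index starts at 0 for the best track.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

struct RecoTrackSketch {
  int label;           // == index inside the track vector (after alignTracks)
  int refTrackID;      // mcTrackID or cmsswTrackID from the track extra
  int nHits;
  double chi2;         // summed hit chi2
  int duplicateIndex;  // filled below; left untouched for unmatched tracks
};

// map key = reference track ID, map value = labels of matching reco tracks
using RefToRecoMapSketch = std::unordered_map<int, std::vector<int>>;

RefToRecoMapSketch mapRefTkToRecoTksSketch(std::vector<RecoTrackSketch>& tracks) {
  RefToRecoMapSketch refToReco;
  for (const RecoTrackSketch& trk : tracks) {
    if (trk.refTrackID >= 0) refToReco[trk.refTrackID].push_back(trk.label);
  }
  for (auto& entry : refToReco) {
    std::vector<int>& labels = entry.second;
    // Best track first: most hits, then lowest summed chi2.
    std::sort(labels.begin(), labels.end(), [&tracks](int a, int b) {
      if (tracks[a].nHits != tracks[b].nHits) return tracks[a].nHits > tracks[b].nHits;
      return tracks[a].chi2 < tracks[b].chi2;
    });
    int dup = 0;
    for (int label : labels) tracks[label].duplicateIndex = dup++;
  }
  return refToReco;
}
```

The size of each mapped vector is what the filling routines later store as nTrks_matched.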
0254 
0255 - mapSeedTkToRecoTk(tracks,extras,map)
0256   : loop over reco tracks
0257     : map[extra.seedID()] = track.label()
0258 
0259 - makeSeedTkToCMSSWTkMap(event)
0260   : this is run BEFORE seed cleaning AND BEFORE the seeds are sorted in eta in prepare_seeds()
0261   : if seed track index in vector == cmssw track label(), store map key = seed track label(), map value = cmssw track label() in seedToCmsswMap (seedID of cmssw track)
0262 
0263 - storeMCandSeedID()
0264   : reminder: both the candidate tracks and the cmssw tracks have had their labels reassigned, but their original labels were stored in their track extra seedIDs.  reminder, seedID of candidate track points to the label of the seed track.  label on seed track == sim track reference, if it exists!
0265   : loop over candidate tracks
0266     : set mcTrackID == seedID_ of track
0267     : if seedToCmsswMap[cand.label()] exists, then set the seedID equal to the mapped value (i.e. the seedID of the cmssw track!)
0268     : else, set seedID == -1
0269   : After this is run, to get the matching CMSSW track, we then need to loop over the CMSSW track extras with an index based loop, breaking out when the cmsswextra[i].seedID() == buildextra[j].seedID()
0270 
0271 ++++++++++++++++++++
0272  III. TTree Filling
0273 ++++++++++++++++++++
0274 
0275 - fillEfficiencyTree()
0276   : loop over simtracks
0277     : get mcTrackID (i.e. simTrack.label())
0278     : store sim track gen info
0279     : if simToSeedMap[mcTrackID] has value
0280       : mcmask == 1
0281       : get first seed track matched (i.e. the one with the highest nHits --> or lowest sum hit chi2 as provided by sort from above)
0282       : store seed track parameters
0283       : store nHits, nlayers, last layer, chi2
0284       : store duplicate info: nTrks_matched from size() of mapped vector of labels, and duplicateMask == seedtrack.isDuplicate()
0285       : get last found hit index
0286         : store hit parameters
0287         : if mcTrackID of hit == mcTrackID of sim track // ONLY for when simtrackstates are stored, i.e. in ToyMC only at the moment
0288           : store sim track state momentum info from this layer (from simTrackStates[mcHitID])
0289         : else get sim track state of mcTrackID, then store momentum info
0290     : else
0291       : mcmask == 0, or == -1 if simtrack.isNotFindable()
0292     : if simToBuildMap[mcTrackID] has value
0293       : repeat as above
0294     : if simToFitMap[mcTrackID] has value
0295       : repeat as above
0296     : fill efftree
0297 
0298 - fillFakeRateTree()
0299   : loop over seed tracks
0300     : get seedID of seed track from track extra
0301     : fill seed track parameters + last hit info, nhits, etc
0302     : assign mcmask info based on mcTrackID from track extra (see section D and G for explanation of mask assignments)
0303     : if mcmask == 1
0304       : store gen sim momentum parameters
0305       : store nhits info, last layer
0306       : store duplicate info: iTh track matched from seedtrack extra, duplicateMask == seedtrack.isDuplicate()
0307       : if last hit found has a valid mcHitID
0308         : store sim track state momentum info from simTrackStates[mcHitID]
0309     : if seedToBuildMap[seedID] has value
0310       : fill build track parameters + last hit info, nhits, etc
0311       : assign mcmask info based on mcTrackID from track extra (see section D and G for explanation of mask assignments)
0312       : if mcmask == 1
0313         : store gen sim momentum parameters
0314         : store nhits info, last layer, duplicate info as above
0315         : if last hit found has a valid mcHitID
0316           : store sim track state momentum info from simTrackStates[mcHitID]
0317     : if seedToFitMap[seedID] has value
0318       : same as above
0319     : fill frtree
0320 
0321 - fillCMSSWEfficiencyTree()
0322   : loop over cmsswtracks
0323     : get label of cmsswtrack, seedID
0324     : store cmssw track PCA parameters + nhits, nlayers, last layer
0325     : if cmsswToBuildMap[cmsswtrack.label()] has value
0326       : get first build track matched (i.e. the one with the highest nHits --> or lowest sum hit chi2 as provided by sort from above)
0327       : store build track parameters + errors
0328       : store nHits, nlayers, last layer, last hit parameters, hit and helix chi2, duplicate info, seedID
0329       : swim cmssw phi to mkFit track, store it
0330     : fill cmsswefftree
0331 
0332 - fillCMSSWFakeRateTree()
0333   : loop over build tracks
0334     : store build track parameters + errors
0335     : store nHits, nlayers, last layer, last hit parameters, hit and helix chi2, duplicate info, seedID
0336     : get cmsswTrackID, assign cmsswmask according to section D and G
0337     : if cmsswmask == 1 
0338       : store cmssw track PCA parameters + nhits, nlayers, last layer, seedID
0339       : swim cmssw phi to mkFit track, store it
0340     : fill cmsswfrtree
0341 
0342 =============================================================
0343  D. Definitions of efficiency, fake rate, and duplicate rate
0344 =============================================================
0345 
0346 Use runValidation.C to create efficiency, fake rate, and duplicate rate vs. pT, phi, eta. This macro compiles PlotValidation.cpp/.hh. Efficiency uses sim track momentum info. Fake rate uses the reco track momentum. For [1.], plots are made for seed, build, and fit tracks. For [2.], the plots are only against the build tracks. See Section G for more details on ID assignments.
0347 
0348 root -l -b -q runValidation.C\([arguments]\)
0349 
0350 Argument list: 
0351 First is additional input name of root file [def = ""]
0352 Second argument is boolean to compute momentum pulls: currently implemented only when sim track states are available (ToyMC validation only)! [def = false]
0353 Third argument is boolean to do special CMSSW validation [def = false]
0354 Fourth argument == true to move input root file to output directory, false to keep input file where it is. [def = true]
0355 Fifth argument is a bool to save the image files [def = false]
0356 Last argument is output type of plots [def = "pdf"]
0357 
0358 Efficiency [PlotValidation::PlotEfficiency()]
0359   numerator:   sim tracks with at least one reco track with mcTrackID >= 0 (mcmask_[reco] == 1)
0360   denominator: all findable sim tracks (mcmask_[reco] == 0 || == 1)
0361   mcmask_[reco] == -1 excluded from both numerator and denominator because this sim track was not findable!
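
As a sketch of this definition, the efficiency over a set of sim tracks can be computed from their mcmask_[reco] values alone. The function name is hypothetical and the result is a single integrated number; the actual plots are binned in pT, phi, and eta by PlotValidation.

```cpp
#include <vector>

// mcmask: one mcmask_[reco] value per sim track (1, 0, or -1).
double efficiencySketch(const std::vector<int>& mcmask) {
  int numer = 0;
  int denom = 0;
  for (int m : mcmask) {
    if (m == 1) {         // findable and matched by at least one reco track
      ++numer;
      ++denom;
    } else if (m == 0) {  // findable but unmatched
      ++denom;
    }                     // m == -1: not findable, excluded from both
  }
  return denom > 0 ? static_cast<double>(numer) / denom : 0.0;
}
```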
0362 
0363 Fake Rate (with only long reco tracks: Config::inclusiveShorts == false) [PlotValidation::PlotFakeRate()]
0364   numerator:   reco tracks with mcTrackID == -1 || == -9
0365   denominator: reco tracks with mcTrackID >=  0 || == -1 || == -9
0366   mcTrackID | mcmask_[reco] 
0367      >= 0   |     1
0368     -1,-9   |     0
0369      -10    |    -2
0370      else   |    -1 // OR the seed track does not produce a build/fit track as determined by the seedToBuild/FitMap
0371 
0372 N.B. In the MTV-Like SimVal: the requirement on minHits is removed, so all reco tracks are considered.
0373  - For the efficiency: only simtracks from the hard scatter (with some quality cuts on d0, dz, and eta) are considered for the denominator and numerator. If a simtrack from the hard-scatter is unmatched, it will not enter the numerator.
0374  - For the FR: all reco tracks (regardless of nHits) are in the denominator, and only those that are unmatched to any simtrack are in the numerator. Compared to the standard FR definition, we now allow reco tracks that are matched to any simtrack (regardless of quality of the simtrack, if it is from PU, etc.) to enter the denominator. 
0375 - This means that tracks with mcTrackID == -4 will now have a mcmask_[reco] == 2 for MTV-Like simtrack validation. 
0376 
0377 Fake Rate (with all reco tracks: Config::inclusiveShorts == true, enabled with command line option: --inc-shorts) [PlotValidation::PlotFakeRate()]
0378   numerator:   reco tracks with mcTrackID == -1 || == -5 || ==  -8 || ==  -9
0379   denominator: reco tracks with mcTrackID >=  0 || == -2 || == -1 || == -5 || == -8 || == -9
0380   mcTrackID  | mcmask_[reco] 
0381     >= 0     |     1
0382  -1,-5,-8,-9 |     0
0383     -10      |    -2
0384      -2      |     2   
0385     else     |    -1 // OR the seed track does not produce a build/fit track as determined by the seedToBuild/FitMap
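
The two mask tables (short tracks excluded or included) can be folded into one lookup, with the fake rate then following from the masks. This is a sketch with hypothetical names: it covers only the standard definitions, not the MTV-Like variant, and it ignores the case where a seed produces no build/fit track.

```cpp
#include <vector>

// mcTrackID -> mcmask_[reco], following the two tables above.
int fakeRateMaskSketch(int mcTrackID, bool inclusiveShorts) {
  if (mcTrackID >= 0) return 1;                         // matched
  if (mcTrackID == -10) return -2;                      // duplicate
  if (mcTrackID == -1 || mcTrackID == -9) return 0;     // long, unmatched
  if (inclusiveShorts) {
    if (mcTrackID == -5 || mcTrackID == -8) return 0;   // short, unmatched
    if (mcTrackID == -2) return 2;                      // short, matched
  }
  return -1;                                            // excluded
}

// Integrated fake rate: mask 0 enters numerator and denominator,
// masks 1 and 2 enter the denominator only, negative masks are excluded.
double fakeRateSketch(const std::vector<int>& mcTrackIDs, bool inclusiveShorts) {
  int numer = 0;
  int denom = 0;
  for (int id : mcTrackIDs) {
    const int mask = fakeRateMaskSketch(id, inclusiveShorts);
    if (mask < 0) continue;
    ++denom;
    if (mask == 0) ++numer;
  }
  return denom > 0 ? static_cast<double>(numer) / denom : 0.0;
}
```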
0386 
0387 Duplicate Rate [PlotValidation::PlotDuplicateRate()], see special note in section H
0388   numerator:   sim tracks with more than one reco track match (duplmask_[reco] == 1), or another way is nTrks_matched_[reco] > 1
0389   denominator: sim tracks with at least one reco track with mcTrackID >= 0 (duplmask_[reco] != -1), or mcmask_[reco] == 1
0390 
0391 ========================
0392  E. Overview of scripts
0393 ========================
0394 
0395 I. ./validation-snb-toymc-fulldet-build.sh
0396 Runs ToyMC full detector tracking for BH, STD, CE, for 400 events with nTracks/event = 2500. Sim seeds only.
0397 
0398 To move the images + text files and clean up directory:
0399 ./web/move-toymcval.sh ${outdir name}
0400 
0401 II. ./validation-snb-cmssw-10mu-fulldet-build.sh
0402 Runs CMSSW full detector tracking for BH, STD, CE, for ~1000 events with 10 muons/event, with sim and cmssw seeds, using N^2 cleaning for cmssw seeds.
0403 Samples are split by eta region. Building is run for each region:
0404 - ECN2: -2.4 < eta < -1.7
0405 - ECN1: -1.75 < eta < -0.55
0406 - BRL: |eta| < 0.6
0407 - ECP1: 0.55 < eta < 1.75
0408 - ECP2: 1.7 < eta < 2.4
0409 
0410 Validation plots are produced for each sample (region), seeding source, and building routine. At the very end, validation trees are hadd'ed for each region in a given seed source + building routine. Plots are produced again to yield "full-detector" tracking.
0411 
0412 To move the images + text files and clean up directory:
0413 ./web/move-cmsswval-10mu.sh ${outdir name}
0414 
0415 III. ./validation-snb-cmssw-10mu-fulldet-extrectracks.sh
0416 Same as II., but now only run with cmssw seeds (as we are comparing directly to cmssw output as the reference).
0417 
0418 To move the images + text files and clean up directory:
0419 ./web/move-cmsswval-10mu-extrectracks.sh ${outdir name}
0420 
0421 IV. ./validation-snb-cmssw-ttbar-fulldet.sh
0422 Runs CMSSW full detector tracking for BH, STD, CE, for three different ttbar samples with 100 events each, with sim and cmssw seeds, using N^2 cleaning for cmssw seeds.
0423 TTbar samples:
0424 - No PU
0425 - PU 35
0426 - PU 70
0427 
0428 To move the images + text files and clean up directory:
0429 ./web/move-cmsswval-ttbar.sh ${outdir name}
0430 
0431 V. ./validation-snb-cmssw-ttbar-fulldet.sh
0432 Same as IV., but now only run with cmssw seeds, using cmssw rec tracks as the reference set of tracks.
0433 
0434 To move the images + text files and clean up directory:
0435 ./web/move-cmsswval-ttbar-extrectracks.sh ${outdir name}
0436 
0437 ============================
0438  F. Hit map/remapping logic
0439 ============================
0440 
0441 *** Originally from mkFit/MkBuilder.cc ***
0442 
0443 All built candidate tracks have all hit indices pointing to m_event_of_hits.m_layers_of_hits[layer].m_hits (LOH)
0444 MC seeds (both CMSSW and toyMC), as well as CMSSW seeds, have seed hit indices pointing to global HitVec m_event->layerHits_[layer] (GLH)
0445 Found seeds from our code have all seed hit indices pointing to LOH.
0446 So, to have a universal seed fitting function --> have seed hits point to LOH no matter their origin.
0447 This means that all MC and CMSSW seeds must be "mapped" from GLH to LOH: map_seed_hits().
0448 Now InputTracksAndHits() for seed fit will use LOH instead of GLH.
0449 The output tracks of the seed fitting are now stored in m_event->seedTracks_.
0450 
0451 Then building proceeds as normal, using m_event->seedTracks_ as input no matter the choice of seeds. 
0452 
0453 For the validation, we can reuse the TrackExtra setMCTrackIDInfo() with a few tricks.
0454 Since setMCTrackIDInfo by necessity uses GLH, we then need ALL track collections (seed, candidate, fit) to have their hits point back to GLH.
0455 There are also two validation options: w/ or w/o ROOT.
0456 
0457 W/ ROOT uses the TTreeValidation class which needs seedTracks_, candidateTracks_, and fitTracks_ all stored in m_event.
0458 The fitTracks_ collection for now is just a copy of candidateTracks_ (eventually may have cuts and things that affect which tracks to fit).
0459 So... need to "remap" seedTracks_ hits from LOH to GLH with remap_seed_hits().
0460 And also copy in tracks from EtaBin* to candidateTracks_, and then remap hits from LOH to GLH with quality_store_tracks() and remap_cand_hits().
0461 W/ ROOT uses sim_val()
0462 
0463 W/O ROOT is a bit simpler... as we only need to copy out tracks from EtaBin* and then remap just candidateTracks_.
0464 This uses quality_output()
0465 
0466 N.B.1 Since fittestMPlex at the moment is not "end-to-end" with candidate tracks, we can still use the GLH version of InputTracksAndHits()
0467 N.B.2 Since we inflate LOH by 2% more than GLH, hit indices in building only go to GLH, so all loops are sized to GLH.
0468 
0469 ==========================================
0470  G. Extra info on ID and mask assignments
0471 ==========================================
0472 
0473 *** Originally from Track.cc ***
0474 
0475 Three basic quantities determine the track ID: 
0476  1. matching criterion (50% after seed for *ByLabel(), 75% for other hit matching, or via chi2+dphi)
0477  2. nCandidateHits found compared to nMinHits
0478  3. findability of reference track (if applicable)
0479 
0480 Three outcomes exist for each quantity:
0481  1. matching criterion
0482     a. reco track passed the matching criterion in set*TrackIDInfo*(): M
0483     b. reco track failed the matching criterion in set*TrackIDInfo*(): N
0484     c. reco track never made it past its seed, so matching selection by hit matching via reference track label does not exist in set*TrackIDInfoByLabel(): N/A
0485  2. nCandHits compared to nMinHits
0486     a. reco track has greater than or equal to the min hits requirement (i.e. is long enough): L
0487     b. reco track has less than the min hits requirement (i.e. short): S
0488     c. reco track is a pure seed, and calling function is set*TrackIDInfoByLabel(): O, by definition then O also equals S
0489  3. findability of reference track
0490     a. reference track is findable (nUniqueLayers >= 8 && pT > 0.5): isF
0491     b. reference track is NOT findable (nUniqueLayers < 8 || pT <= 0.5): unF
0492     c. reference track does not exist in set*TrackIDInfoByLabel(), or we are using set*TrackIDInfo(): ?
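
The length and findability outcomes listed above can be sketched as small helper functions. This is an illustrative sketch only: the function names, the string return values, and the boolean flags pureSeedByLabel and hasRefTrack are assumptions made for this example and are not taken from Track.cc. The thresholds (8 unique layers, pT of 0.5) are the ones quoted in the list.

```cpp
#include <cassert>
#include <string>

// Findability as quoted in the text: at least 8 unique layers and pT above 0.5.
bool isFindable(int nUniqueLayers, float pT) {
  return nUniqueLayers >= 8 && pT > 0.5f;
}

// Length outcome: "L" (long enough), "S" (short), or "O" (pure seed in a *ByLabel() call).
std::string lengthOutcome(int nCandHits, int nMinHits, bool pureSeedByLabel) {
  if (pureSeedByLabel) return "O";  // by definition O is also short
  return nCandHits >= nMinHits ? "L" : "S";
}

// Findability outcome: "isF", "unF", or "?" when no reference track is available.
std::string findabilityOutcome(bool hasRefTrack, int nUniqueLayers, float pT) {
  if (!hasRefTrack) return "?";
  return isFindable(nUniqueLayers, pT) ? "isF" : "unF";
}
```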
0493 
0494 *** Originally from TTreeValidation.cc ***
0495 
0496 ** Mask assignments **
0497 
0498 _[reco] = {seed,build,fit}
0499 
0500 Logic is as follows: any negative integer means that track is excluded from both the numerator and denominator. A mask with a value greater than 1 means that the track is included in the denominator, but not the numerator.
0501 
0502 --> mcmask_[reco] == 1,"associated" reco to sim track [possible duplmask_[reco] == 1,0] {eff and FR}, enter numer and denom of eff, enter denom only of FR
0503 --> mcmask_[reco] == 0,"unassociated" reco to sim track. by definition no duplicates (no reco to associate to sim tracks!) [possible duplmask_[reco] == -1] {eff and FR}, enter denom only of eff, enter numer and denom of FR
0504 --> mcmask_[reco] == -1, sim or reco track excluded from denominator (and therefore numerator) [possible duplmask_[reco] == -1] {eff and FR}
0505 --> mcmask_[reco] == -2, reco track excluded from denominator because it does not exist (and therefore numerator) [possible duplmask_[reco] == -2] {FR}
0506 --> mcmask_[reco] == 2, reco track included in denominator of FR, but will not enter numerator: for short "matched" tracks {FR only}
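
The mcmask_ rules above translate into numerator/denominator bookkeeping roughly as follows. This is a sketch under stated assumptions rather than the TTreeValidation or PlotValidation code: the Counts struct and the fillEff/fillFR functions are hypothetical, and only the mask values listed above are handled.

```cpp
#include <cassert>

// Hypothetical counters for efficiency (eff) and fake rate (FR).
struct Counts {
  int effNumer = 0, effDenom = 0;
  int frNumer = 0, frDenom = 0;
};

// Efficiency: mcmask == 1 enters numerator and denominator, mcmask == 0 the
// denominator only. Every other value (negative, or 2 which is FR only) is skipped.
void fillEff(Counts& c, int mcmask) {
  if (mcmask != 0 && mcmask != 1) return;
  ++c.effDenom;
  if (mcmask == 1) ++c.effNumer;
}

// Fake rate: mcmask == 0 enters numerator and denominator, mcmask == 1 and
// mcmask == 2 the denominator only. Negative values are excluded entirely.
void fillFR(Counts& c, int mcmask) {
  if (mcmask < 0 || mcmask > 2) return;
  ++c.frDenom;
  if (mcmask == 0) ++c.frNumer;
}
```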
0507 
0508 --> nTkMatches_[reco] > 1,   n reco tracks associated to the same sim track ID {eff only}
0509 --> nTkMatches_[reco] == 1,  1 reco track associated to single sim track ID {eff only}
0510 --> nTkMatches_[reco] == -99, no reco to sim match {eff only}
0511 
0512 --> mcTSmask_[reco] == 1, reco track is associated to sim track, and sim track contains the same hit as the last hit on the reco track
0513 --> mcTSmask_[reco] == 0, reco track is associated to sim track, and either A) sim track does not contain the last hit found on the reco track or B) the sim trackstates were not read in (still save sim info from gen position via --try-to-save-sim-info)
0514 --> mcTSmask_[reco] == -1, reco track is unassociated to sim track
0515 --> mcTSmask_[reco] == -2, reco track is associated to sim track, but the conditions for neither == 1 nor == 0 are satisfied
0516 --> mcTSmask_[reco] == -3, reco track is unassociated to seed track {FR only}
0517 
0518 excluding position variables, as position could be -99!
0519 --> reco var == -99, "unassociated" reco to sim track [possible mcmask_[reco] == 0,-1,2; possible duplmask_[reco] == -1] {eff only}
0520 --> sim  var == -99, "unassociated" reco to sim track [possible mcmask_[reco] == 0,-1,2; possible duplmask_[reco] == -1] {FR only}
0521 --> reco/sim var == -100, "no matching seed to build/fit" track, fill all reco/sim variables -100 [possible mcmask_[reco] == -1, possible duplmask_[reco] == -1] {FR only}
0522 --> sim  var == -101, reco track is "associated" to sim track, however, sim track does not have a hit on the layer the reco track is on
0523 
0524 --> seedmask_[reco] == 1, matching seed to reco/fit track [possible mcmask_[reco] == 0,1,2; possible duplmask_[reco] == 0,1,-1] {FR only}
0525 --> seedmask_[reco] == 0, no matching seed to reco/fit track [possible mcmask_[reco] == -2; possible duplmask_[reco] == -2] {FR only}
0526 
0527 --> duplmask_[reco] == 0, only one "associated" reco to sim track [possible mcmask_[reco] == 1] {eff and FR}
0528 --> duplmask_[reco] == 1, more than one "associated" reco to sim track [possible mcmask_[reco] == 1] {eff and FR}
0529 --> duplmask_[reco] == -1, no "associated" reco to sim track [possible mcmask_[reco] == 0,-1,-2] {eff and FR}
0530 --> duplmask_[reco] == -2, no matching built/fit track for given seed [possible mcmask_[reco] == -2] {FR only}
0531 
0532 --> reco var == -10, variable not yet implemented for given track object
0533 
0534 position reco variables
0535 --> layers_[reco]    ==  -1, reco unassociated to sim tk {eff only}
0536 --> reco pos+err var == -2000, reco tk is unassociated to sim tk {eff only}
0537 --> reco pos+err var == -3000, reco tk is unassociated to seed tk {FR only}
0538 
0539 ======================================
0540  H. Special note about duplicate rate
0541 ======================================
0542 
0543 *** Originally from PlotValidation.cpp ***
0544 
0545 Currently, TEfficiency does not allow you to fill a weighted number in the numerator and NOT the denominator.
0546 In other words, we cannot fill the numerator n-1 times for a sim track matched to n reco tracks while filling the denominator just once.
0547 As a result, DR only records whether a sim track is duplicated at least once, not how many times it is duplicated.
0548 
0549 We could revert to the n-1 filling of the numerator, which weights by the number of times a sim track is duplicated. However, since the DR in any bin could then exceed 1, this would mean going back to TH1Fs and using binomial errors (or computing the CP errors by hand or something). This would break the flow of the printouts as well as the stacking macro, but could be done with some mild pain.
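
The difference between the two definitions can be illustrated with plain counters, without ROOT. This is a hedged sketch: the function names are hypothetical, the input is assumed to be one nTkMatches_ value per sim track (with non-positive values such as -99 meaning no reco match), and the denominator is assumed to contain only sim tracks with at least one reco match.

```cpp
#include <cassert>
#include <vector>

// Binary DR: a matched sim track enters the numerator once if it has any duplicate.
double duplicateRateBinary(const std::vector<int>& nTkMatches) {
  int numer = 0, denom = 0;
  for (int n : nTkMatches) {
    if (n < 1) continue;  // e.g. -99: no reco match, not counted (assumption)
    ++denom;
    if (n > 1) ++numer;
  }
  return denom > 0 ? static_cast<double>(numer) / denom : 0.0;
}

// Weighted DR: a sim track matched n times adds n-1 to the numerator,
// while the denominator is still filled once, so the ratio can exceed 1.
double duplicateRateWeighted(const std::vector<int>& nTkMatches) {
  int numer = 0, denom = 0;
  for (int n : nTkMatches) {
    if (n < 1) continue;
    ++denom;
    numer += n - 1;
  }
  return denom > 0 ? static_cast<double>(numer) / denom : 0.0;
}
```

For example, a single sim track matched to four reco tracks gives a binary DR of 1 but a weighted DR of 3, which is the case a pass/total efficiency object cannot represent.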