Warning, /RecoTracker/MkFitCore/standalone/validation-desc.txt is written in an unsupported language. File is not indexed.

0001 EDIT HISTORY
0002 ** KPM 16/09/18: move id + label explanation to index-desc.txt
0003 ** KPM 06/08/18: moved preface to README **
0004 ** KPM 11/07/18: added a preface but still need to update this for newest methods/revisions... **
0005 ** KPM 25/02/18: still need to update methods for setting mc/cmssw track id **
0006 
0007 PREFACE: This file is a compendium on how the validation runs within mkFit, which makes use of the TTreeValidation class and other supporting macros.  
0008 
0009 2022-07-13 MT notes
0010 ===================
0011 
0012 I have run the MIMI forPR validation in standalone build for the first time
0013 since migration to CMSSW. I have not tried anything else.
0014 
0015 Since standalone build is now done out-of-source, I did the following
0016 in the top-level standalone build directory (where mkFit executable is):
0017 
0018 ln -s ../RecoTracker/MkFitCore/standalone/xeon_scripts .
0019 ln -s ../RecoTracker/MkFitCore/standalone/val_scripts .
0020 ln -s ../RecoTracker/MkFitCore/standalone/plotting .
0021 ln -s ../RecoTracker/MkFitCore/standalone/web .
0022 
0023 rm -rf valtree_*.root log_*.txt
0024 val_scripts/validation-cmssw-benchmarks-multiiter.sh forPR --mtv-like-val TTbar_phase2
0025 
0026 web/collectBenchmarks-multi.sh Phase2-0 forPR
0027 
0028 
0029 ===================
0030  Table of Contents
0031 ===================
0032 
0033 A. Overview of code
0034 B. Outline of routine calls in mkFit
0035 C. Explanation of validation routines
0036   I. Tracks and Extras Prep
0037   II. Track Association Routines
0038   III. TTree Filling
0039 D. Definitions of efficiency, fake rate, and duplicate rate
0040 E. Overview of scripts
0041 F. Hit map/remapping logic
0042 G. Extra info on ID and mask assignments
0043 H. Special note about duplicate rate
0044 
0045 =====================
0046  A. Overview of code
0047 =====================
0048 
0049 TTreeValidation will only compile the necessary ROOT code with WITH_ROOT:=1 enabled (either by manually editing Makefile.config, or at the command line). Always do a make clean before compiling with ROOT, as the code is ifdef'ed. To hide the heavy-duty functions from the main code, TTreeValidation inherits from the virtual class "Validation", and overrides the common functions.  The TTreeValidation object is created once per number of events in flight. The Event object obtains a reference to the validation object (stored as a data member "validation_"), so it is up to the Event object to reset some of the data members of the TTreeValidation object on every event.
0050 
0051 Three types of validation exist within TTreeValidation:
0052 [1.] "Building validation", enabled with Config::sim_val, via the command line option: --sim-val [--read-sim-trackstates, for pulls]
0053 [2.] "CMSSW external tracks building validation", enabled with Config::cmssw_val, via a minimum of the command line options: --cmssw-val --read-cmssw-tracks --geom CMS-2017 --seed-input cmssw [and potentially a seed cleaning --seed-cleaning <str>, and also specifying which cmssw matching --cmssw-matching <str>]
0054 [3.] "Fit validation", enabled with Config::fit_val, via the command line option: --fit-val
0055 
0056 We will ignore fit validation for the moment. The main idea behind the other two is that the validation routines are called outside of the standard timed sections, and as such, we do not care too much about performance, as long as it takes a reasonable amount of time to complete. Of course, the full wall clock time matters when running multiple events in flight, and because there is a lot of I/O as well as moves and stores that would hurt the performance with the validation enabled, these routines are ignored if the command line option "--silent" is enabled.
0057 
0058 Validation types [1.] and [2.] each fill two trees per event, with one entry per track, namely:
0059 [1.] 
0060   - efftree (filled once per sim track) 
0061   - frtree (filled once per seed track)
0062 [2.] 
0063   - cmsswefftree (filled once per cmssw external track)
0064   - cmsswfrtree (filled once per mkFit build track)
0065 
0066 [1.] validation exists in the following combinations of geometry and seed source:
0067   - ToyMC, with --seed-input ["sim", "find"]
0068   - CMSSW, with --seed-input ["sim", "cmssw"] (+ --seed-cleaning <str> [<str>: "n2", "badlabel", "pure", "none"], + --cmssw-matching <str> [<str>: "trkparam", "hit", "label"])
0069 
0070 Upon instantiation of the TTreeValidation object, the respective ROOT trees are defined and allocated on the heap, along with setting the addresses of all the branches. After the building is completed in mkFit, we have to have the tracks in their standard event containers, namely: seedTracks_, candidateTracks_, and fitTracks_. In the standard combinatorial or clone engine, we have to copy out the built tracks from event of combined candidates into candidateTracks_ via: builder.quality_store_tracks() in mkFit/buildtestMPlex.cc.  Since we do not yet have fitting after building, we just set the fitTracks_ equal to the candidateTracks_.  For ease, I will from now on refer to the candidateTracks_ as buildTracks_.
0071 
0072 As a reminder, the sim tracks are stored in the Event.cc as simTracks_, while the CMSSW reco tracks are stored as cmsswTracks_. Each track collection has an associated TrackExtra collection, which is stored as {trackname}Extra_ inside the Event object.  It is indexed the same as the collection it references, i.e. track[0] has an associated extra extra[0]. The TrackExtra object contains the mcTrackID, seedID, and cmsswTrackID each mkFit track is associated to. The validation also makes use of simHitsInfo_ (container for storing mcTrackID for each hit), layerHits_, and simTrackStates_ (used for pulls).  See Sections B and C for explanations on how the track matching is performed and track information is saved.  Essentially, we store two sets of maps: the first has a key that is an index to the reference track (MC or CMSSW) and a value that is a vector of indices of the tracks that match it (for seeds, build tracks, and fit tracks), and the second maps the seed track index to its corresponding build and fit tracks.  The reason for having a sim match map for seeds, build tracks, and fit tracks is to keep track of how the efficiency, fake rate, and duplicate rate improve or degrade with potential cuts between them. The same reasoning applies to the seed-to-build and seed-to-fit maps.
0073 
0074 Following each event, each of the track and extra collections is cleared. In addition, the association maps are cleared and reset. After the main loop over events finishes, the ROOT file is written out with the TTrees saved via: val.saveTTrees() in mkFit.cc. The destructor for the validation then deletes the trees. The output is "valtree.root", appended by the thread number if using multiple events in flight.  From here, we then take advantage of the following files:
0075 
0076 - runValidation.C // macro used for turning TTrees into efficiency/fake rate/duplicate rate plots
0077 - PlotValidation.cpp/.hh // source code for doing calculations
0078 - makeValidation.C // plots on a single canvas results for Best Hit (BH), Standard Combinatorial (STD), and Clone Engine (CE)
0079 
0080 ======================================
0081  B. Outline of routine calls in mkFit
0082 ======================================
0083 
0084 The following routines are then called after the building (MkBuilder.cc, Event.cc, TTreeValidation.cc, Track.cc):
0085 
0086 [1.] builder.sim_val()
0087    : (actually runs clean_cms_simtracks() when using CMS geom and using sim tracks as reference set)
0088    : remap_seed_hits()
0089    : remap_cand_hits()
0090    : prep_recotracks()
0091      : prep_tracks(seedtracks,seedextras)
0092        : m_event->validation_.alignTracks(tracks,extras,false)
0093      : prep_tracks(buildtracks,buildextras)
0094        : m_event->validation_.alignTracks(tracks,extras,false)
0095      : prep_tracks(fittracks,fitextras)
0096        : m_event->validation_.alignTracks(tracks,extras,false)
0097    : if (cmssw-seeds) m_event->clean_cms_simtracks() // label which simtracks are not findable: already set if using sim seeds
0098    : m_event->Validate()
0099      : validation_.setTrackExtras(*this) 
0100        : if (sim seeds) extra.setMCTrackIDInfoByLabel() // Require 50% of found hits after seed to match label of seed/sim track
0101          : modifyRecTrackID()
0102        : if (--seed-input cmssw || find) extra.setMCTrackIDInfo() // Require 75% of found hits to match a single sim track
0103          : modifyRecTrackID()
0104      : validation_.makeSimTkToRecoTksMaps(*this)
0105        : mapRefTkToRecoTks(seedtracks,seedextras,simToSeedMap) // map key = mcTrackID, map value = vector of seed labels
0106        : mapRefTkToRecoTks(buildtracks,buildextras,simToBuildMap) // map key = mcTrackID, map value = vector of build labels
0107        : mapRefTkToRecoTks(fittracks,fitextras,simToFitMap) // map key = mcTrackID, map value = vector of fit labels
0108      : validation_.makeSeedTkToRecoTkMaps(*this)
0109        : mapSeedTkToRecoTk(buildtracks,buildextras,seedToBuildMap) // map key = seedID, map value = build track label
0110        : mapSeedTkToRecoTk(fittracks,fitextras,seedToFitMap) // map key = seedID, map value = fit track label
0111      : validation_.fillEfficiencyTree(*this)
0112      : validation_.fillFakeRateTree(*this)      
0113 
0114 [2.] builder.cmssw_val()
0115    : (actually runs m_event->validation_.makeSeedTkToCMSSWTkMap() from MkBuilder::prepare_seeds())
0116    : (when using N^2 cleaning, Event::clean_cms_seedtracks(), or if not using N^2 cleaning, Event::use_seeds_from_cmsswtracks())
0117    : remap_cand_hits()
0118    : prep_recotracks()
0119      : prep_tracks(buildtracks,buildextras)
0120        : m_event->validation_.alignTracks(tracks,extras,false)
0121    : prep_cmsswtracks()
0122      : prep_tracks(cmsswtracks,cmsswextras)     
0123        : m_event->validation_.alignTracks(tracks,extras,false)  
0124    : m_event->Validate()
0125      : validation_.setTrackExtras(*this)
0126        : storeSeedAndMCID() 
0127        : if (--cmssw-matching trkparam) extra.setCMSSWTrackIDInfoByTrkParams() // Chi2 and dphi matching (also includes option for nHits matching)
0128          : modifyRecTrackID()                 
0129        : else if (--cmssw-matching hit) extra.setCMSSWTrackIDInfoByHits() // Chi2 and dphi matching (also includes option for nHits matching)
0130          : modifyRecTrackID()                 
0131        : else if (--cmssw-matching label) extra.setCMSSWTrackIDInfoByLabel() // 50% hit sharing after seed
0132          : modifyRecTrackID()                 
0133      : validation_.makeCMSSWTkToRecoTksMaps(*this)
0134        : mapRefTkToRecoTks(buildtracks,buildextras,cmsswToBuildMap)
0135      : validation_.fillCMSSWEfficiencyTree(*this)       
0136      : validation_.fillCMSSWFakeRateTree(*this) 
0137 
0138 =======================================
0139  C. Explanation of validation routines
0140 =======================================
0141 
0142 - map/remap hit functions: see notes in section F. Essentially, validation needs all hit indices inside tracks to match the hit indices inside ev.layerHits_.
0143 
0144 +++++++++++++++++++++++++++
0145  I. Tracks and Extras Prep
0146 +++++++++++++++++++++++++++
0147 
0148 - clean_cms_simtracks()
0149   : loop over sim tracks
0150     : mark sim track status not findable if (nLayers < [Config::cmsSelMinLayers == 8])
0151     : tracks are not removed from collection, just have this bit set. this way the mcTrackID == position in vector == label
0152 
0153 - clean_cms_seedtracks()
0154   : cmssw seed tracks are cleaned according to closeness in deta, dphi, dR to other cmssw seed tracks--> duplicate removal
0155   : loop over cleaned seed tracks, and if label_ == -1, then incrementally decrease label (so second -1 seed is -2, third is -3)
0156 
0157 - prep_tracks(tracks,extras) 
0158   : Loop over all track collections in consideration
0159     : sort hits inside track by layer : needed for counting unique layers and for association routines
0160     : emplace_back a track extra, initialized with the label of the track (which happens to be its seed ID) // if using sim seeds, we know that seed ID == sim ID
0161   : m_event->validation_.alignTracks(tracks,extras,alignExtra)   
0162 
0163 - alignTracks(tracks,extras,alignExtra)
0164   : if alignExtra == true // needed for when a reco track collection, which was previously labeled by its label() == seedID, created its track extra at the same time but the track collection has been moved or sorted
0165     : create temporary track extra collection, size of track collection
0166     : loop over tracks
0167       : set tmp extra to the old track extra collection matching the track label
0168     : set the old track extra to equal the new collection
0169   : loop over tracks
0170     : set the track label equal to the index inside the vector // needed for filling routines which rely on maps of indices between two track collections
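
The alignTracks() steps above can be sketched as follows. This is only an illustration under assumptions: SketchTrack, SketchExtra, and alignTracksSketch are hypothetical stand-ins that carry just a label and a seedID, not the mkFit Track and TrackExtra classes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the mkFit Track and TrackExtra classes.
struct SketchTrack { int label; };
struct SketchExtra { int seedID; };

// Optionally reorder the extras so that extras[i] belongs to tracks[i],
// then relabel every track with its index inside the vector.
inline void alignTracksSketch(std::vector<SketchTrack>& tracks,
                              std::vector<SketchExtra>& extras,
                              bool alignExtra) {
  if (alignExtra) {
    // Before realignment, track.label still indexes the old extras vector.
    std::vector<SketchExtra> tmp(tracks.size());
    for (std::size_t i = 0; i < tracks.size(); ++i) {
      assert(tracks[i].label >= 0 &&
             static_cast<std::size_t>(tracks[i].label) < extras.size());
      tmp[i] = extras[tracks[i].label];
    }
    extras = std::move(tmp);
  }
  // Filling routines rely on label == position in the vector.
  for (std::size_t i = 0; i < tracks.size(); ++i)
    tracks[i].label = static_cast<int>(i);
}
```

After the call, the label can be used directly as an index into both the track vector and the extras vector, which is what the association maps in section C.II assume.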
0171 
0172 - prep_cmsswtracks()
0173   : Standard prep_tracks()
0174   : loop over cmssw tracks
0175     : Count unique layers = nLayers
0176     : set status of cmssw track to notFindable() if: (nUniqueLayers() < [Config::cmsSelMinLayers == 8]) // same criteria for "notFindable()" cmssw sim tracks used for seeds
0177 
0178 ++++++++++++++++++++++++++++++++
0179  II. Track Association Routines
0180 ++++++++++++++++++++++++++++++++
0181 
0182 - setTrackExtras(&Event)    
0183   : if [1.]
0184     : loop over seed tracks
0185       : setMCTrackIDInfo(true) : Require 75% of found hits to match a single sim track
0186     : loop over build tracks
0187        : if (sim seeds) setMCTrackIDInfoByLabel() : Require 50% of found hits after seed to match label of seed/sim track
0188        : if (cms seeds) setMCTrackIDInfo(false) : Require 75% of found hits to match a single sim track
0189     : loop over fit tracks 
0190       : same options as build tracks   
0191   : if [2.]
0192     : setupCMSSWMatching()
0193       : first loop over cmssw tracks
0194         : create a vector of "reduced tracks" that stores 1./pt, eta, and associated covariances in reduced track states
0195         : add cmssw label to a map of lyr, map of idx, vector of labels
0196         : also include track momentum phi, and a list of hits inside a map. map key = layer, map value = vector of hit indices
0197     : loop over build tracks
0198       : setCMSSWTrackIDInfo() : require matching by chi2 and dphi
0199     : storeSeedAndMCID()
0200 
0201 - modifyRecTrackID() 
0202   // Config::nMinFoundHits = 7, Config::nlayers_per_seed = 4 or 3 
0203   // nCandHits = trk.nFoundHits() OR trk.nFoundHits()-Config::nlayers_per_seed (see calling function)
0204   // nMinHits = Config::nMinFoundHits OR Config::nMinFoundHits-Config::nlayers_per_seed (see calling function)
0205   : if track has been marked as a duplicate, mc/cmsswTrackID = -10
0206   : else if (mc/cmsswTrackID >= 0) (i.e. the track has successfully matched)
0207     : if mc/cmsswTrack is findable
0208       : if nCandHits < nMinHits, mc/cmsswTrackID = -2
0209     : else
0210       : if nCandHits < nMinHits, mc/cmsswTrackID = -3 
0211       : else mc/cmsswTrackID = -4 (track is long enough and matched, but the sim track is unfindable)
0212   : else if (mc/cmsswTrackID == -1)
0213     : if matching by label, and ref track exists
0214       : if ref track is findable
0215         : if nCandHits < nMinHits, ID = -5
0216       : else 
0217         : if nCandHits < nMinHits, ID = -6
0218         : else, ID = -7
0219     : else (not matching by label, or ref track does not exist)
0220       : if nCandHits < nMinHits, ID = -8
0221       : else, ID = -9
0222   -->return potentially new ID assignment
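
A compact sketch of the decision tree above, assuming the inputs are passed in explicitly. The function name and all parameter names are hypothetical; the real routine takes this information from the track, its extra, and the reference track.

```cpp
#include <cassert>

// Sketch of the ID reassignment described for modifyRecTrackID().
// refTrackID is the matched mc/cmsswTrackID, or -1 if the match failed.
inline int modifyRecTrackIDSketch(int refTrackID,
                                  bool isDuplicate,
                                  bool matchByLabel,
                                  bool refExists,    // reference track for this label exists
                                  bool refFindable,
                                  int nCandHits,
                                  int nMinHits) {
  assert(nMinHits >= 0);
  const bool isShort = nCandHits < nMinHits;
  if (isDuplicate) return -10;
  if (refTrackID >= 0) {  // matched
    if (refFindable) return isShort ? -2 : refTrackID;
    return isShort ? -3 : -4;
  }
  if (refTrackID == -1) {  // unmatched
    if (matchByLabel && refExists) {
      if (refFindable) return isShort ? -5 : -1;
      return isShort ? -6 : -7;
    }
    return isShort ? -8 : -9;
  }
  return refTrackID;  // any other value is left unchanged in this sketch
}
```

Only the long, matched, findable case keeps a non-negative ID; every other outcome is encoded as a distinct negative value so the mask assignment in section D can tell them apart.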
0223 
0224 - setMCTrackIDInfoByLabel()
0225   : Loop over found hits on build track after seed
0226     : count the hits that have a mcTrackID == seedID_ (i.e. seedID == simTrack label == mcTrackID)
0227   : if hits are found after seed
0228     : if 50% are matched, mcTrackID == seedID_
0229     : else, mcTrackID == -1
0230   : mcTrackID = modifyRecTrackID() // nCandhits = nFoundHits-nlayers_per_seed, nMinHits = Config::nMinFoundHits - Config::nlayers_per_seed                    
0231     
0232 - setMCTrackIDInfo(isSeedTrack)
0233   : Loop over all found hits on build track (includes seed hits)
0234     : count the mcTrackID that appears most from the hits
0235   : if 75% of hits on reco track match a single sim track, mcTrackID == mcTrackID of single sim track
0236   : else, mcTrackID == -1
0237   : if (!isSeedTrack)
0238     : modifyRecTrackID() // nCandHits = nFoundHits, nMinHits = Config::nMinFoundHits
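
The majority vote in setMCTrackIDInfo() can be illustrated with the sketch below. It makes several assumptions: the input is a plain vector with one mcTrackID per found hit (negative meaning the hit has no sim track), the fraction is taken over all found hits, and "75%" is read as "at least 75%". The name majorityMcTrackID is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Return the mcTrackID shared by at least minFraction of the hits, else -1.
inline int majorityMcTrackID(const std::vector<int>& hitMcTrackIDs,
                             double minFraction = 0.75) {
  assert(minFraction > 0.0 && minFraction <= 1.0);
  if (hitMcTrackIDs.empty()) return -1;
  std::map<int, int> counts;
  for (int id : hitMcTrackIDs)
    if (id >= 0) ++counts[id];  // hits without a sim track do not vote
  int bestID = -1;
  int bestCount = 0;
  for (const auto& kv : counts) {
    if (kv.second > bestCount) {
      bestID = kv.first;
      bestCount = kv.second;
    }
  }
  const double fraction =
      static_cast<double>(bestCount) / static_cast<double>(hitMcTrackIDs.size());
  return fraction >= minFraction ? bestID : -1;
}
```

The returned value would then be passed through modifyRecTrackID() for non-seed tracks, as listed above.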
0239 
0240 - setCMSSWTrackIDInfo()
0241   : Loop over all cmssw "reduced" tracks
0242     : if helix chi2 < [Config::minCMSSWMatchChi2 == 50]
0243       : append label of cmssw track to a vector, along with chi2
0244   : sort vector by chi2
0245   : loop over label vector
0246     : swim cmssw track momentum phi from phi0 to mkFit reco track
0247     : if abs(wrapphi(dphi)) < [Config::minCMSSWMatchdPhi == 0.03]
0248       : see if dphi < currently best stored mindphi, and if yes, then set this as the new mindphi + label as matched cmsswTrackID
0249       : if using nHits matching, check for nHits matched --> currently not used nor tuned
0250   : if no label is found, cmsswTrackID == -1
0251   : modifyRecTrackID() // nCandHits and nMinHits same as setMCTrackIDInfo()
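
The chi2 and dphi selection in setCMSSWTrackIDInfo() can be sketched as below. This is an illustration under assumptions: CmsswCandidate, wrapPhiSketch, and matchByChi2AndDphi are hypothetical names, the helix chi2 and the swum dphi are taken as precomputed inputs, and the default cuts mirror the Config::minCMSSWMatchChi2 and Config::minCMSSWMatchdPhi values quoted above. The optional nHits matching is not included.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-candidate summary: cmssw track label, helix chi2 with
// respect to the reco track, and dphi after swimming the cmssw momentum phi.
struct CmsswCandidate {
  int label;
  double helixChi2;
  double dphi;
};

// Wrap an angle difference into [-pi, pi].
inline double wrapPhiSketch(double dphi) {
  const double kTwoPi = 6.283185307179586;
  return std::remainder(dphi, kTwoPi);
}

// Return the label of the candidate passing the chi2 cut that has the
// smallest |dphi| below maxDphi, or -1 if no candidate qualifies.
inline int matchByChi2AndDphi(std::vector<CmsswCandidate> cands,
                              double maxChi2 = 50.0,
                              double maxDphi = 0.03) {
  assert(maxChi2 > 0.0 && maxDphi > 0.0);
  cands.erase(std::remove_if(cands.begin(), cands.end(),
                             [maxChi2](const CmsswCandidate& c) {
                               return !(c.helixChi2 < maxChi2);
                             }),
              cands.end());
  std::sort(cands.begin(), cands.end(),
            [](const CmsswCandidate& a, const CmsswCandidate& b) {
              return a.helixChi2 < b.helixChi2;
            });
  int bestLabel = -1;
  double minDphi = maxDphi;
  for (const auto& c : cands) {
    const double absDphi = std::fabs(wrapPhiSketch(c.dphi));
    if (absDphi < minDphi) {  // on ties the lower-chi2 candidate is kept
      minDphi = absDphi;
      bestLabel = c.label;
    }
  }
  return bestLabel;
}
```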
0252 
0253 - setCMSSWTrackIDInfoByLabel()
0254   : want to match the hits on the reco track to those on the CMSSW track
0255   : loop over hits on reco track after seed
0256     : get hit idx and lyr
0257       : if the cmssw track has this lyr, loop over hit indices on cmssw track with this layer
0258         : if cmssw hit idx matches reco idx, increment nHitsMatched_
0259   : follow same logic as setMCTrackIDInfoByLabel() for setting cmsswTrackID
0260   : modifyRecTrackID() // nCandHits and nMinHits same as setMCTrackIDInfoByLabel()
0261   
0262 - mapRefTkToRecoTks(tracks,extras,map)
0263   : Loop over reco tracks
0264     : get track extra for track
0265     : if [1.], map[extra.mcTrackID()].push_back(track.label()) // reminder, label() now equals index inside track vector!
0266     : if [2.], map[extra.cmsswTrackID()].push_back(track.label()) // reminder, label() now equals index inside track vector!
0267   : Loop over pairs in map
0268     : if vector of labels size == 1, get track extra for label, and set duplicate index == 0
0269     : else
0270       : make temp track vector from track labels, sort track vector by nHits (and sum hit chi2 if tracks have same nHits)
0271       : set vector of labels to sorted tracks
0272       : loop over vector labels
0273         : get track extra for label, and set duplicate index++ 
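
A minimal sketch of the map building and duplicate indexing described above. RecoTrack, RecoExtra, and mapRefTkToRecoTksSketch are hypothetical stand-ins that hold only the fields needed here. Skipping tracks with a negative reference ID is an assumption of this sketch, and labels are assumed to already equal the vector index (i.e. alignTracks() has been run).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-ins: only the fields this sketch needs.
struct RecoTrack { int label; int nHits; float chi2; };
struct RecoExtra { int refTrackID; int duplicateID; };
using RefToRecoMap = std::map<int, std::vector<int>>;

inline void mapRefTkToRecoTksSketch(const std::vector<RecoTrack>& tracks,
                                    std::vector<RecoExtra>& extras,
                                    RefToRecoMap& refToReco) {
  // Key = reference track ID, value = labels of the reco tracks matched to it.
  for (const auto& trk : tracks) {
    assert(trk.label >= 0 && static_cast<std::size_t>(trk.label) < extras.size());
    const int refID = extras[trk.label].refTrackID;
    if (refID >= 0) refToReco[refID].push_back(trk.label);
  }
  // Order each match list: most hits first, lower summed hit chi2 breaks ties.
  for (auto& entry : refToReco) {
    auto& labels = entry.second;
    std::sort(labels.begin(), labels.end(), [&tracks](int a, int b) {
      if (tracks[a].nHits != tracks[b].nHits) return tracks[a].nHits > tracks[b].nHits;
      return tracks[a].chi2 < tracks[b].chi2;
    });
    int duplicateID = 0;
    for (int label : labels) extras[label].duplicateID = duplicateID++;
  }
}
```

With this ordering, the first label in each vector is the "best" match, which is the one the efficiency tree filling picks up.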
0274 
0275 - mapSeedTkToRecoTk(tracks,extras,map)
0276   : loop over reco tracks
0277     : map[extra.seedID()] = track.label()
0278 
0279 - makeSeedTkToCMSSWTkMap(event)
0280   : this is run BEFORE seed cleaning AND BEFORE the seeds are sorted in eta in prepare_seeds()
0281   : if seed track index in vector == cmssw track label(), store map key = seed track label(), map value = cmssw track label() in seedToCmsswMap (seedID of cmssw track)
0282 
0283 - storeSeedAndMCID()
0284   : reminder: both the candidate tracks and the cmssw tracks have had their labels reassigned, but their original labels were stored in their track extra seedIDs.  reminder, seedID of candidate track points to the label of the seed track.  label on seed track == sim track reference, if it exists!
0285   : loop over candidate tracks
0286     : set mcTrackID == seedID_ of track
0287     : if seedToCmsswMap[cand.label()] exists, then set the seedID equal to the mapped value (i.e. the seedID of the cmssw track!)
0288     : else, set seedID == -1
0289   : After this is run, to get the matching CMSSW track, we then need to loop over the CMSSW track extras with an index based loop, popping out when the cmsswextra[i].seedID() == buildextra[j].seedID()
0290 
0291 ++++++++++++++++++++
0292  III. TTree Filling
0293 ++++++++++++++++++++
0294 
0295 - fillEfficiencyTree()
0296   : loop over simtracks
0297     : get mcTrackID (i.e. simTrack.label())
0298     : store sim track gen info
0299     : if simToSeedMap[mcTrackID] has value
0300       : mcmask == 1
0301       : get first seed track matched (i.e. the one with the highest nHits --> or lowest sum hit chi2 as provided by sort from above)
0302       : store seed track parameters
0303       : store nHits, nlayers, last layer, chi2
0304       : store duplicate info: nTrks_matched from size() of mapped vector of labels, and duplicateMask == seedtrack.isDuplicate()
0305       : get last found hit index
0306         : store hit parameters
0307         : if mcTrackID of hit == mcTrackID of sim track // ONLY for when simtrackstates are stored, i.e. in ToyMC only at the moment
0308           : store sim track state momentum info from this layer (from simTrackStates[mcHitID])
0309         : else get sim track state of mcTrackID, then store momentum info
0310     : else
0311       : mcmask == 0, or == -1 if simtrack.isNotFindable()
0312     : if simToBuildMap[mcTrackID] has value
0313       : repeat as above
0314     : if simToFitMap[mcTrackID] has value
0315       : repeat as above
0316     : fill efftree
0317 
0318 - fillFakeRateTree()
0319   : loop over seed tracks
0320     : get seedID of seed track from track extra
0321     : fill seed track parameters + last hit info, nhits, etc
0322     : assign mcmask info based on mcTrackID from track extra (see section D and G for explanation of mask assignments)
0323     : if mcmask == 1
0324       : store gen sim momentum parameters
0325       : store nhits info, last layer
0326       : store duplicate info: iTh track matched from seedtrack extra, duplicateMask == seedtrack.isDuplicate()
0327       : if last hit found has a valid mcHitID
0328         : store sim track state momentum info from simTrackStates[mcHitID]
0329     : if seedToBuildMap[seedID] has value
0330       : fill build track parameters + last hit info, nhits, etc
0331       : assign mcmask info based on mcTrackID from track extra (see section D and G for explanation of mask assignments)
0332       : if mcmask == 1
0333         : store gen sim momentum parameters
0334         : store nhits info, last layer, duplicate info as above
0335         : if last hit found has a valid mcHitID
0336           : store sim track state momentum info from simTrackStates[mcHitID]
0337     : if seedToFitMap[seedID] has value
0338       : same as above
0339     : fill frtree
0340 
0341 - fillCMSSWEfficiencyTree()
0342   : loop over cmsswtracks
0343     : get label of cmsswtrack, seedID
0344     : store cmssw track PCA parameters + nhits, nlayers, last layer
0345     : if cmsswToBuildMap[cmsswtrack.label()] has value
0346       : get first build track matched (i.e. the one with the highest nHits --> or lowest sum hit chi2 as provided by sort from above)
0347       : store build track parameters + errors
0348       : store nHits, nlayers, last layer, last hit parameters, hit and helix chi2, duplicate info, seedID
0349       : swim cmssw phi to mkFit track, store it
0350     : fill cmsswefftree
0351 
0352 - fillCMSSWFakeRateTree()
0353   : loop over build tracks
0354     : store build track parameters + errors
0355     : store nHits, nlayers, last layer, last hit parameters, hit and helix chi2, duplicate info, seedID
0356     : get cmsswTrackID, assign cmsswmask according to section D and G
0357     : if cmsswmask == 1 
0358       : store cmssw track PCA parameters + nhits, nlayers, last layer, seedID
0359       : swim cmssw phi to mkFit track, store it
0360     : fill cmsswfrtree
0361 
0362 =============================================================
0363  D. Definitions of efficiency, fake rate, and duplicate rate
0364 =============================================================
0365 
0366 Use runValidation.C to create efficiency, fake rate, and duplicate rate vs. pT, phi, eta. This macro compiles PlotValidation.cpp/.hh. Efficiency uses sim track momentum info. Fake rate uses the reco track momentum. For [1.], plots are made for seed, build, and fit tracks. For [2.], the plots are only against the build tracks. See G. for more details on ID assignments.
0367 
0368 root -l -b -q runValidation.C\([arguments]\)
0369 
0370 Argument list: 
0371 First is additional input name of root file [def = ""]
0372 Second argument is boolean to compute momentum pulls: currently implemented only when sim track states are available (ToyMC validation only)! [def = false]
0373 Third argument is boolean to do special CMSSW validation [def = false]
0374 Fourth argument == true to move input root file to output directory, false to keep input file where it is. [def = true]
0375 Fifth argument is a bool to save the image files [def = false]
0376 Last argument is output type of plots [def = "pdf"]
0377 
0378 Efficiency [PlotValidation::PlotEfficiency()]
0379   numerator:   sim tracks with at least one reco track with mcTrackID >= 0 (mcmask_[reco] == 1)
0380   denominator: all findable sim tracks (mcmask_[reco] == 0 || == 1)
0381   mcmask_[reco] == -1 excluded from both numerator and denominator because this sim track was not findable!
0382 
0383 Fake Rate (with only long reco tracks: Config::inclusiveShorts == false) [PlotValidation::PlotFakeRate()]
0384   numerator:   reco tracks with mcTrackID == -1 || == -9
0385   denominator: reco tracks with mcTrackID >=  0 || == -1 || == -9
0386   mcTrackID | mcmask_[reco] 
0387      >= 0   |     1
0388     -1,-9   |     0
0389      -10    |    -2
0390      else   |    -1 // OR the seed track does not produce a build/fit track as determined by the seedToBuild/FitMap
0391 
0392 N.B. In the MTV-Like SimVal: the requirement on minHits is removed, so all reco tracks are considered.
0393  - For the efficiency: only simtracks from the hard scatter (with some quality cuts on d0, dz, and eta) are considered for the denominator and numerator. If a simtrack from the hard scatter is unmatched, it will not enter the numerator.
0394  - For the FR: all reco tracks (regardless of nHits) are in the denominator, and only those that are unmatched to any simtrack are in the numerator. Compared to the standard FR definition, we now allow reco tracks that are matched to any simtrack (regardless of the quality of the simtrack, whether it is from PU, etc.) to enter the denominator.
0395  - This means that tracks with mcTrackID == -4 will now have a mcmask_[reco] == 2 for MTV-Like simtrack validation.
0396 
0397 Fake Rate (with all reco tracks: Config::inclusiveShorts == true, enabled with command line option: --inc-shorts) [PlotValidation::PlotFakeRate()]
0398   numerator:   reco tracks with mcTrackID == -1 || == -5 || ==  -8 || ==  -9
0399   denominator: reco tracks with mcTrackID >=  0 || == -2 || == -1 || == -5 || == -8 || == -9
0400   mcTrackID  | mcmask_[reco] 
0401     >= 0     |     1
0402  -1,-5,-8,-9 |     0
0403     -10      |    -2
0404      -2      |     2   
0405     else     |    -1 // OR the seed track does not produce a build/fit track as determined by the seedToBuild/FitMap
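
The two mcTrackID-to-mask tables above (standard validation, not the MTV-like variant) can be summarized in one function. This is a sketch: the name mcMaskFromTrackID is hypothetical, and the separate case where a seed has no entry in the seedToBuild/FitMap (mask -1) has to be handled by the caller.

```cpp
#include <cassert>

// Map a reco track's mcTrackID to mcmask_[reco], following the tables for
// Config::inclusiveShorts == false and == true.
inline int mcMaskFromTrackID(int mcTrackID, bool inclusiveShorts) {
  if (mcTrackID >= 0) return 1;                       // matched: not a fake
  if (mcTrackID == -10) return -2;                    // duplicate
  if (mcTrackID == -1 || mcTrackID == -9) return 0;   // long and unmatched: fake
  if (inclusiveShorts) {
    if (mcTrackID == -5 || mcTrackID == -8) return 0; // short and unmatched: fake
    if (mcTrackID == -2) return 2;                    // short but matched: denominator only
  }
  return -1;                                          // excluded from the fake rate
}
```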
0406 
0407 Duplicate Rate [PlotValidation::PlotDuplicateRate()], see special note in section H
0408   numerator:   sim tracks with more than one reco track match (duplmask_[reco] == 1), or another way is nTrks_matched_[reco] > 1
0409   denominator: sim tracks with at least one reco track with mcTrackID >= 0 (duplmask_[reco] != -1), or mcmask_[reco] == 1
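
As a worked illustration of the three definitions, the sketch below counts numerators and denominators directly from mask values. It is unbinned and uses hypothetical names (RateSketch, SimEntrySketch, and the three functions); it is not how PlotValidation stores or bins its results. The fake-rate denominator is taken as all masks >= 0, which covers masks 0 and 1 for long tracks and masks 0, 1, and 2 with inclusive shorts.

```cpp
#include <cassert>
#include <vector>

struct RateSketch {
  int num = 0;
  int den = 0;
  double value() const { return den > 0 ? static_cast<double>(num) / den : 0.0; }
};

// Efficiency: one mcmask_[reco] per sim track; -1 (not findable) is excluded.
inline RateSketch efficiencySketch(const std::vector<int>& simMasks) {
  RateSketch r;
  for (int m : simMasks) {
    if (m == 0 || m == 1) ++r.den;
    if (m == 1) ++r.num;
  }
  return r;
}

// Fake rate: one mcmask_[reco] per reco track; negative masks are excluded.
inline RateSketch fakeRateSketch(const std::vector<int>& recoMasks) {
  RateSketch r;
  for (int m : recoMasks) {
    if (m >= 0) ++r.den;
    if (m == 0) ++r.num;
  }
  return r;
}

// Duplicate rate: per sim track, its mask and the number of matched reco tracks.
struct SimEntrySketch { int mcmask; int nTrksMatched; };

inline RateSketch duplicateRateSketch(const std::vector<SimEntrySketch>& sims) {
  RateSketch r;
  for (const auto& s : sims) {
    if (s.mcmask != 1) continue;  // only matched sim tracks enter
    ++r.den;
    if (s.nTrksMatched > 1) ++r.num;
  }
  return r;
}
```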
0410 
0411 ========================
0412  E. Overview of scripts
0413 ========================
0414 
0415 I. ./validation-snb-toymc-fulldet-build.sh
0416 Runs ToyMC full detector tracking for BH, STD, CE, for 400 events with nTracks/event = 2500. Sim seeds only.
0417 
0418 To move the images + text files and clean up directory:
0419 ./web/move-toymcval.sh ${outdir name}
0420 
0421 II. ./validation-snb-cmssw-10mu-fulldet-build.sh
0422 Runs CMSSW full detector tracking for BH, STD, CE, for ~1000 events with 10 muons/event, with sim and cmssw seeds, using N^2 cleaning for cmssw seeds.
0423 Samples are split by eta region. Building is run for each region:
0424 - ECN2: -2.4 < eta < -1.7
0425 - ECN1: -1.75 < eta < -0.55
0426 - BRL: |eta| < 0.6
0427 - ECP1: 0.55 < eta < 1.75
0428 - ECP2: 1.7 < eta < 2.4
0429 
0430 Validation plots are produced for each sample (region), seeding source, and building routine. At the very end, validation trees are hadd'ed for each region in a given seed source + building routine. Plots are produced again to yield "full-detector" tracking.
0431 
0432 To move the images + text files and clean up directory:
0433 ./web/move-cmsswval-10mu.sh ${outdir name}
0434 
0435 III. ./validation-snb-cmssw-10mu-fulldet-extrectracks.sh
0436 Same as II., but now only run with cmssw seeds (as we are comparing directly to cmssw output as the reference).
0437 
0438 To move the images + text files and clean up directory:
0439 ./web/move-cmsswval-10mu-extrectracks.sh ${outdir name}
0440 
0441 IV. ./validation-snb-cmssw-ttbar-fulldet.sh
0442 Runs CMSSW full detector tracking for BH, STD, CE, for three different ttbar samples with 100 events each, with sim and cmssw seeds, using N^2 cleaning for cmssw seeds.
0443 TTbar samples:
0444 - No PU
0445 - PU 35
0446 - PU 70
0447 
0448 To move the images + text files and clean up directory:
0449 ./web/move-cmsswval-ttbar.sh ${outdir name}
0450 
0451 V. ./validation-snb-cmssw-ttbar-fulldet.sh
0452 Same as IV., but now only run with cmssw seeds, using cmssw rec tracks as the reference set of tracks.
0453 
0454 To move the images + text files and clean up directory:
0455 ./web/move-cmsswval-ttbar-extrectracks.sh ${outdir name}
0456 
0457 ============================
0458  F. Hit map/remapping logic
0459 ============================
0460 
0461 *** Originally from mkFit/MkBuilder.cc ***
0462 
0463 All built candidate tracks have all hit indices pointing to m_event_of_hits.m_layers_of_hits[layer].m_hits (LOH)
0464 MC seeds (both CMSSW and toyMC), as well as CMSSW seeds, have seed hit indices pointing to global HitVec m_event->layerHits_[layer] (GLH)
0465 Found seeds from our code have all seed hit indices pointing to LOH.
0466 So.. to have universal seed fitting function --> have seed hits point to LOH no matter their origin.
0467 This means that all MC and CMSSW seeds must be "mapped" from GLH to LOH: map_seed_hits().
0468 Now InputTracksAndHits() for seed fit will use LOH instead of GLH.
0469 The output tracks of the seed fitting are now stored in m_event->seedTracks_.
0470 
0471 Then building proceeds as normal, using m_event->seedTracks_ as input no matter the choice of seeds. 
0472 
0473 For the validation, we can reuse the TrackExtra setMCTrackIDInfo() with a few tricks.
0474 Since setMCTrackIDInfo by necessity uses GLH, we then need ALL track collections (seed, candidate, fit) to have their hits point back to GLH.
0475 There are also two validation options: w/ or w/o ROOT.
0476 
0477 W/ ROOT uses the TTreeValidation class which needs seedTracks_, candidateTracks_, and fitTracks_ all stored in m_event.
0478 The fitTracks_ collection for now is just a copy of candidateTracks_ (eventually may have cuts and things that affect which tracks to fit).
0479 So... need to "remap" seedTracks_ hits from LOH to GLH with remap_seed_hits().
0480 And also copy in tracks from EtaBin* to candidateTracks_, and then remap hits from LOH to GLH with quality_store_tracks() and remap_cand_hits().
0481 W/ ROOT uses sim_val()
0482 
0483 W/O ROOT is a bit simpler... as we only need to copy out the tracks from EtaBin* and then remap just candidateTracks_.
0484 This uses quality_output()
0485 
0486 N.B.1 Since fittestMPlex at the moment is not "end-to-end" with candidate tracks, we can still use the GLH version of InputTracksAndHits()
0487 N.B.2 Since we inflate LOH by 2% more than GLH, hit indices in building only go to GLH, so all loops are sized to GLH.
0488 
0489 ==========================================
0490  G. Extra info on ID and mask assignments
0491 ==========================================
0492 
0493 *** Originally from Track.cc ***
0494 
0495 Three basic quantities determine the track ID: 
0496  1. matching criterion (50% after seed for *ByLabel(), 75% for other hit matching, or via chi2+dphi)
0497  2. nCandidateHits found compared to nMinHits
0498  3. findability of reference track (if applicable)
0499 
0500 Three outcomes exist for each quantity:
0501  1. matching criterion
0502     a. reco track passed the matching criterion in set*TrackIDInfo*(): M
0503     b. reco track failed the matching criterion in set*TrackIDInfo*(): N
0504     c. reco track never made it past its seed, so matching selection by hit matching via reference track label does not exist in set*TrackIDInfoByLabel(): N/A
0505  2. nCandHits compared to nMinHits
0506     a. reco track has greater than or equal to the min hits requirement (i.e. is long enough): L
0507     b. reco track has less than the min hits requirement (i.e. short): S
0508     c. reco track is a pure seed, and calling function is set*TrackIDInfoByLabel(): O, by definition then O also equals S
0509  3. findability of reference track
0510     a. reference track is findable (nUniqueLayers >= 8 && pT > 0.5): isF
0511     b. reference track is NOT findable (nUniqueLayers < 8 || pT < 0.5): unF
0512     c. reference track does not exist in set*TrackIDInfoByLabel(), or we are using set*TrackIDInfo(): ?
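
The three quantities and their outcomes can be summarized in a small sketch. The function classify() and its arguments are hypothetical, chosen only to mirror the list above; it returns the three outcome labels and deliberately does not assign the numeric track IDs, which are described in index-desc.txt. For simplicity the pure_seed flag stands in for "pure seed and called from set*TrackIDInfoByLabel()".

```python
def classify(matched, n_cand_hits, n_min_hits,
             ref_layers=None, ref_pt=None, pure_seed=False):
    """Return (matching, length, findability) labels for one reco track."""
    # 1. matching criterion
    if pure_seed:
        matching = "N/A"   # never made it past its seed
    else:
        matching = "M" if matched else "N"

    # 2. nCandHits compared to nMinHits
    if pure_seed:
        length = "O"       # by definition O also counts as short
    else:
        length = "L" if n_cand_hits >= n_min_hits else "S"

    # 3. findability of the reference track
    if ref_layers is None or ref_pt is None:
        findability = "?"  # no reference track available
    elif ref_layers >= 8 and ref_pt > 0.5:
        findability = "isF"
    else:
        findability = "unF"

    return matching, length, findability
```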
0513 
0514 *** Originally from TTreeValidation.cc ***
0515 
0516 ** Mask assignments **
0517 
0518 _[reco] = {seed,build,fit}
0519 
0520 Logic is as follows: any negative integer means that track is excluded from both the numerator and denominator. A mask with a value greater than 1 means that the track is included in the denominator, but not the numerator.
0521 
0522 --> mcmask_[reco] == 1,"associated" reco to sim track [possible duplmask_[reco] == 1,0] {eff and FR}, enter numer and denom of eff, enter denom only of FR
0523 --> mcmask_[reco] == 0, "unassociated" reco to sim track. By definition no duplicates (no reco to associate to sim tracks!) [possible duplmask_[reco] == -1] {eff and FR}, enter denom only of eff, enter numer and denom of FR
0524 --> mcmask_[reco] == -1, sim or reco track excluded from denominator (and therefore numerator) [possible duplmask_[reco] == -1] {eff and FR}
0525 --> mcmask_[reco] == -2, reco track excluded from denominator because it does not exist (and therefore numerator) [possible duplmask_[reco] == -2] {FR}
0526 --> mcmask_[reco] == 2, reco track included in denominator of FR, but will not enter numerator: for short "matched" tracks {FR only}
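
The mcmask rules above can be read as simple counting rules. The sketch below is only an illustration of that reading: efficiency_counts() and fake_rate_counts() are hypothetical helpers operating on a plain list of mask values, not functions from TTreeValidation or the plotting macros.

```python
def efficiency_counts(mcmasks):
    """(numerator, denominator) for efficiency from per-sim-track masks."""
    numer = sum(1 for m in mcmasks if m == 1)        # associated
    denom = sum(1 for m in mcmasks if m in (0, 1))   # associated or unassociated
    return numer, denom


def fake_rate_counts(mcmasks):
    """(numerator, denominator) for fake rate from per-reco-track masks."""
    numer = sum(1 for m in mcmasks if m == 0)          # unassociated = fake
    denom = sum(1 for m in mcmasks if m in (0, 1, 2))  # 2 enters the denominator only
    return numer, denom
```

Negative masks (-1, -2) fall through both selections, so those tracks enter neither the numerator nor the denominator.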
0527 
0528 --> nTkMatches_[reco] > 1,   n reco tracks associated to the same sim track ID {eff only}
0529 --> nTkMatches_[reco] == 1,  1 reco track associated to single sim track ID {eff only}
0530 --> nTkMatches_[reco] == -99, no reco to sim match {eff only}
0531 
0532 --> mcTSmask_[reco] == 1, reco track is associated to sim track, and sim track contains the same hit as the last hit on the reco track
0533 --> mcTSmask_[reco] == 0, reco track is associated to sim track, and either A) sim track does not contain the last hit found on the reco track or B) the sim trackstates were not read in (still save sim info from gen position via --try-to-save-sim-info)
0534 --> mcTSmask_[reco] == -1, reco track is unassociated to sim track
0535 --> mcTSmask_[reco] == -2, reco track is associated to sim track, but the conditions for both == 1 and == 0 fail
0536 --> mcTSmask_[reco] == -3, reco track is unassociated to seed track {FR only}
0537 
0538 excluding position variables, as position could be -99!
0539 --> reco var == -99, "unassociated" reco to sim track [possible mcmask_[reco] == 0,-1,2; possible duplmask_[reco] == -1] {eff only}
0540 --> sim  var == -99, "unassociated" reco to sim track [possible mcmask_[reco] == 0,-1,2; possible duplmask_[reco] == -1] {FR only}
0541 --> reco/sim var == -100, "no matching seed to build/fit" track, fill all reco/sim variables -100 [possible mcmask_[reco] == -1, possible duplmask_[reco] == -1] {FR only}
0542 --> sim  var == -101, reco track is "associated" to sim track, however, sim track does not have a hit on the layer the reco track is on
0543 
0544 --> seedmask_[reco] == 1, matching seed to reco/fit track [possible mcmask_[reco] == 0,1,2; possible duplmask_[reco] == 0,1,-1] {FR only}
0545 --> seedmask_[reco] == 0, no matching seed to reco/fit track [possible mcmask_[reco] == -2; possible duplmask_[reco] == -2] {FR only}
0546 
0547 --> duplmask_[reco] == 0, only "associated" reco to sim track [possible mcmask_[reco] == 1] {eff and FR}
0548 --> duplmask_[reco] == 1, more than one "associated" reco to sim track [possible mcmask_[reco] == 1] {eff and FR}
0549 --> duplmask_[reco] == -1, no "associated" reco to sim track [possible mcmask_[reco] == 0,-1,-2] {eff and FR}
0550 --> duplmask_[reco] == -2, no matching built/fit track for given seed [possible mcmask_[reco] == -2] {FR only}
0551 
0552 --> reco var == -10, variable not yet implemented for given track object
0553 
0554 position reco variables
0555 --> layers_[reco]    ==  -1, reco unassociated to sim tk {eff only}
0556 --> reco pos+err var == -2000, reco tk is unassociated to sim tk {eff only}
0557 --> reco pos+err var == -3000, reco tk is unassociated to seed tk {FR only}
0558 
0559 ======================================
0560  H. Special note about duplicate rate
0561 ======================================
0562 
0563 *** Originally from PlotValidation.cpp ***
0564 
0565 Currently, TEfficiency does not allow you to fill a weighted number in the numerator and NOT the denominator.
0566 In other words, we cannot fill numerator n-1 times sim track is matched, while denominator is just filled once.
0567 As a result, DR only records whether a sim track is duplicated at least once, and not how many times it is duplicated.
0568 
0569 We can revert back to the n-1 filling for the numerator to weight by the number of times a sim track is duplicated, but this would mean going back to the TH1Fs and then using binomial errors (or computing the CP errors by hand or something) in case the DR in any bin is > 1. This would break the flow of the printouts as well as the stacking macro, but could be done with some mild pain.
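
To make the difference between the two fillings concrete, here is a minimal sketch in plain Python (no ROOT). It assumes the DR denominator is the set of sim tracks with at least one matched reco track, and takes nTkMatches-style values as input (-99 for no match). The function duplicate_rates() is hypothetical and is not part of PlotValidation.

```python
def duplicate_rates(n_tk_matches):
    """Return (binary DR, n-1 weighted DR) for a list of nTkMatches values."""
    matched = [n for n in n_tk_matches if n >= 1]   # drop unmatched (-99)
    if not matched:
        return None, None
    # Current behavior: a sim track counts once if it has any duplicate.
    binary = sum(1 for n in matched if n > 1) / len(matched)
    # n-1 filling: each extra reco match adds to the numerator.
    weighted = sum(n - 1 for n in matched) / len(matched)
    return binary, weighted
```

For two matched sim tracks with 4 and 1 reco matches, the binary DR is 0.5 while the n-1 weighted value is 1.5, which is the "DR > 1" case that a TEfficiency-style ratio cannot represent.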