INTRODUCTION
============

This small README file is here to guide you through the process of
running the Vertex Validation, slimmed version, on RelVal samples, in
case you want to perform tests on tracking and vertexing. The idea
here is not to give you pre-cooked python3 configuration files, but to
teach you how to use the most common tools available in CMS to achieve
the same. We will mainly use cmsDriver and its powerful options to
create the python3 cfg that we will run, and das_client to explore and
find suitable samples to run upon. At the end of this page you will
find a description of other standalone analyzers, configurations and
ROOT macros. Let's proceed in order.

PREREQUISITES
=============

We assume that from this point onward you have set up a proper CMSSW
area and that you have sourced its environment, since all the scripts
that we will be using are available only after you have performed
these actions.

FIND PROPER SAMPLES
===================

The first thing that we need to do is to find appropriate samples to
run upon. Our suggestion is to start from the RelVal samples that are
regularly produced for every release and pre-release, since this will
avoid all the burden of properly selecting the PU and generation
snippet. In case you want to use anything other than what is available
as RelVal, we assume you are familiar enough with the production
mechanism that you can take care of it alone: no instructions will be
given here.

FIND GEN-SIM-DIGI-RAW-HLTDEBUG SAMPLES FOR A SPECIFIC RELEASE
-------------------------------------------------------------

In order to check which samples are available for, e.g., the CMSSW_7_2
release cycle, issue the command

```
das_client.py --query='dataset=/RelValTTbar*/*7_2_0*/*GEN-SIM-DIGI-RAW-HLTDEBUG' --format=plain
```

and pick the proper dataset among the ones printed directly on the
screen. Here we picked
/RelValTTbar_13/CMSSW_7_2_0_pre1-PU25ns_POSTLS172_V1-v1/GEN-SIM-DIGI-RAW-HLTDEBUG.

FIND ALL FILES BELONGING TO A SPECIFIC DATASET
----------------------------------------------

In order to discover which files belong to the selected dataset, you
have to issue the following command (of course you have to change the
dataset name in the query, using the one you discovered in the
previous point...)

```
das_client.py --limit 0 --query='file dataset=/RelValTTbar_13/CMSSW_7_2_0_pre1-PU25ns_POSTLS172_V1-v1/GEN-SIM-DIGI-RAW-HLTDEBUG' --format=plain | sort -u > gen_sim_digi_raw_files.txt 2>&1
```

This will write the discovered files directly into the ASCII file
gen_sim_digi_raw_files.txt, which will be used as input to the
following cmsDriver commands.

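The `filelist:` option used by the cmsDriver commands below expects a
plain text list with one file name (LFN) per line, which is exactly
what the command above produces. As a quick sanity check you can count
the entries; a minimal sketch, using hypothetical placeholder file
names rather than real LFNs:

```shell
# Illustration only: the file names below are hypothetical
# placeholders, not real LFNs. A filelist holds one path per line.
printf '/store/relval/fileA.root\n/store/relval/fileB.root\n' > example_filelist.txt

# Count the entries in the list (here: 2).
wc -l < example_filelist.txt
```

On your real gen_sim_digi_raw_files.txt, an empty count means the DAS
query matched nothing and the dataset name should be double-checked.
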
RUN RECO AND VERTEX VALIDATION
==============================

In order to run the vertex validation starting from RAW files, you
need to create a proper python3 cfg. As said, instead of preparing a
pre-cooked one, we think it is more useful to give you the cmsDriver
command that will dynamically prepare it for you. To obtain such a cfg
file, issue the following command:

```
cmsDriver.py step3  --conditions auto:run2_mc -n 100 --eventcontent DQM -s RAW2DIGI,RECO,VALIDATION:vertexValidationStandalone --datatier DQMIO --filein filelist:gen_sim_digi_raw_files.txt --fileout step3_VertexValidation.root --customise SLHCUpgradeSimulations/Configuration/postLS1Customs.customisePostLS1 --magField 38T_PostLS1
```

This will create the python3 configuration file and will
automatically run cmsRun on it. If instead you want to just produce
the configuration, e.g. for inspection and further customization, you
can add the option:

```
--no_exec
```

to the previous command. The command will produce an output file named
step3_VertexValidation.root that will contain all the histograms
produced by the Vertex Validation package. The internal format of the
ROOT file follows the DQMIO rules, to achieve better performance while
running the harvesting.

RUN VERTEX VALIDATION WITHOUT RECO
----------------------------------

It is also possible to re-run only the validation without the
reconstruction (e.g. for developing the validation package itself).
For that you first need the list of GEN-SIM-RECO files, e.g.

```
das_client.py --limit 0 --query='file dataset=/RelValTTbar_13/CMSSW_7_2_0_pre1-PU25ns_POSTLS172_V1-v1/GEN-SIM-RECO' --format=plain | sort -u > gen_sim_reco_files.txt 2>&1
```

The configuration can then be generated with

```
cmsDriver.py step3  --conditions auto:run2_mc -n 100 --eventcontent DQM -s VALIDATION:vertexValidationStandalone --datatier DQMIO --filein filelist:gen_sim_reco_files.txt --secondfilein filelist:gen_sim_digi_raw_files.txt --fileout step3_VertexValidation.root --customise SLHCUpgradeSimulations/Configuration/postLS1Customs.customisePostLS1 --magField 38T_PostLS1 --no_exec
```

Note the `secondfilein` parameter for specifying the RAW files for the
"2-files solution".
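
Since this two-file setup reads both lists at once, it is worth
checking that both lists exist and are non-empty before generating the
configuration. A minimal sketch, assuming the two list files produced
by the das_client queries above:

```shell
# Warn about any missing or empty filelist before running cmsDriver.
for f in gen_sim_reco_files.txt gen_sim_digi_raw_files.txt; do
  test -s "$f" || echo "WARNING: $f is empty or missing"
done
```
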
RUN FINAL HARVESTING TO PRODUCE EFFICIENCY, FAKE, MERGE AND DUPLICATE RATE PLOTS
================================================================================

The outcome of the previous step is not yet suitable to be browsed
using plain ROOT. Moreover, all the important plots have not yet been
produced. You need to finalize the processing by running the
harvesting sequence. Again, we think it is better to provide you with
the cmsDriver command to do that:

```
cmsDriver.py step4  --scenario pp --filetype DQM --conditions auto:run2_mc --mc  -s HARVESTING:postProcessorVertexStandAlone -n -1 --filein file:step3_VertexValidation.root --no_exec
```

This command will create a final, plain ROOT file named
DQM_V0001_R000000001__Global__CMSSW_X_Y_Z__RECO.root that will contain
all the folders and plots produced by the Vertex Validation package.

FURTHER CUSTOMIZATION
=====================

If you want to customize the default vertex validation sequences, both
the one used in the first step and the one used in harvesting, you
need to manually edit the configuration files produced by the previous
cmsDriver commands. To ease this operation, you can point your browser
here:

https://github.com/cms-sw/cmssw/blob/CMSSW_7_2_X/Validation/RecoVertex/python/PrimaryVertexAnalyzer4PUSlimmed_cfi.py

for the first default, and here:

https://github.com/cms-sw/cmssw/blob/CMSSW_7_2_X/Validation/RecoVertex/python/PrimaryVertexAnalyzer4PUSlimmed_Client_cfi.py

for the default used in the harvesting step.

Enjoy.

DETAILED DESCRIPTION OF THE CODE
================================

## Plugins

### AnotherPrimaryVertexAnalyzer
It produces several histograms using a vertex collection as input: the vertex x, y and z positions; the number of vertices (also vs the instantaneous luminosity); the number of tracks per vertex and the sum of the squared pt of the tracks from a vertex (with or without a cut on the track weight); the number of degrees of freedom (also as a function of the number of tracks); the track weights and the average weight; and the average values of many of the observables above as a function of the vertex z position.
Distributions are also produced per run or per fill: the number of vertices and their position as a function of the orbit number and of the BX number. By configuration it is possible to choose between TProfile and full 2D plots.
All these histograms can be filled with a weight provided by an object defined in the configuration.
An example of configuration can be found in `python/anotherprimaryvertexanalyzer_cfi.py`.

### AnotherBeamSpotAnalyzer
`AnotherBeamSpotAnalyzer` is the plugin name which corresponds to the code in `src/BeamSpotAnalyzer.cc`. It produces several histograms to monitor the beam spot position; the name of a beam spot collection has to be provided as input. The histograms are the beam spot position and width and their dependence on the orbit number (one set of histograms per run).
An example of configuration can be found in `python/beamspotanalyzer_cfi.py`.

### BSvsPVAnalyzer
It produces distributions related to the relative position between the vertices and the beam spot. It requires a vertex collection and a beam spot collection as input. By configuration it is possible to control whether the comparison has to take into account the tilt of the beam spot. The distributions are the differences of the vertex and beam spot position coordinates, the average of these differences as a function of the vertex z position and, for each run, the dependence of these differences on the orbit number and on the BX number. Configuration parameters have to be used to activate or de-activate those histograms which are more memory demanding.
An example of configuration can be found in `python/bspvanalyzer_cfi.py`.

### MCVerticesAnalyzer
It produces distributions related to the multiplicity of (in-time and out-of-time) pileup vertices (or interactions), to the position of the main MC vertex and to the z position of the pileup vertices. It correlates the average number of pileup interactions with the actual number of pileup interactions. It can be configured to use weights.
An example of configuration can be found in `python/mcverticesanalyzer_cfi.py`.

### MCVerticesWeight
It is an `EDFilter` which computes an event weight based on the z position of the MC vertices to reproduce a different luminous region length. It can be configured to reject events, or the weight can be used to fill the histograms of `MCVerticesAnalyzer`.
An example of configuration can be found in `python/mcverticesweight_cfi.py`.

### MCvsRecoVerticesAnalyzer
It produces histograms to correlate the number of reconstructed vertices with the number of generated vertices or with the average pileup, to correlate the z position of the reconstructed vertices with that of the MC vertices and to check how often the closest reco vertex to the main MC vertex is the first one in the vertex collection. It can be configured to fill histograms with weights provided by `MCVerticesWeight`.
An example of configuration can be found in `python/mcvsrecoverticesanalyzer_cfi.py`.

## Configurations
* `test/allanalyzer_example_cfg.py` is a configuration which uses `AnotherPrimaryVertexAnalyzer`, `AnotherBeamSpotAnalyzer` and `BSvsPVAnalyzer` and can be used to analyze real data events. It uses VarParsing to pass the input parameters, such as the input files and the global tag.
* `test/mcverticesanalyzer_cfg.py` is an example of a configuration which uses the plugins to study the MC vertices.
* `test/mcverticessimpleanalyzer_cfg.py` is an example of a configuration which uses the plugins to study the MC vertices.
* `test/mcverticestriggerbiasanalyzer_cfg.py` is an example of a configuration which uses the plugins to study the MC vertices.