1) Manually running the BeamSpotWorkflow.py script

Just typing BeamSpotWorkflow.py -h will show the possible options of the script.

The 3 most common options are:
-z -> changes the sigmaZ from the calculated value to 10cm
-u -> uploads the values into the DB
-c -> allows you to specify a custom cfg file; otherwise the default BeamSpotWorkflow.cfg is used

Example:
./BeamSpotWorkflow.py -c BeamSpotWorkflow_run.cfg -z -u

2) Cfg file structure (extra lines can be commented out with a # at the beginning):

a) SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/361_patch4/express_T0_v11/
   Any directory (castor or hard disk) where you have the txt files produced by the CMSSW beamspot workflow.

b) ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
   Any directory where you want to store the beamspot files. The files from SOURCE_DIR will be copied to the ARCHIVE_DIR.

c) WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_tmp
   After the files are copied to the ARCHIVE_DIR, they will be copied to the WORKING_DIR. Every time you run the script, the
   WORKING_DIR will be WIPED OUT first. In case you are running MORE SCRIPTS AT THE SAME TIME you can keep the same ARCHIVE_DIR but
   you MUST use a different WORKING_DIR for each script to avoid conflicts.

d) DBTAG = BeamSpotObjects_2009_v14_offline
   Database tag you want to update. Currently we have BeamSpotObjects_2009_v14_offline, BeamSpotObjects_2009_SigmaZ_v14_offline,
   BeamSpotObjects_2009_lumi_v14_offline and BeamSpotObjects_2009_lumi_SigmaZ_v14_offline (I use BeamSpotObjects_2009_v13_offline for testing).

e) DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
   Dataset from which your txt files have been produced. You can specify multiple DATASETs, separating them with a comma (,) like:
   /StreamExpress/Commissioning10-StreamTkAlMinBias-v7/ALCARECO,
   /StreamExpress/Commissioning10-StreamTkAlMinBias-v8/ALCARECO,
   /StreamExpress/Commissioning10-StreamTkAlMinBias-v9/ALCARECO,
   /StreamExpress/Run2010A-StreamTkAlMinBias-v1/ALCARECO,
   /StreamExpress/Run2010A-TkAlMinBias-v2/ALCARECO,
   /StreamExpress/Run2010A-TkAlMinBias-v3/ALCARECO,
   /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
   This is a very nice feature when you are reprocessing the whole dataset (see the sketch below).

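The cfg files shown in section 5 use the [Common] section format that Python's ConfigParser understands.
A minimal sketch of how such a comma separated DATASET value can be read and split (NOT the script's
actual code; Python 3 configparser shown for clarity):
//--------------------------------------------------------------------------------------------------
import configparser

cfg = configparser.ConfigParser()
cfg.read("BeamSpotWorkflow.cfg")
common = cfg["Common"]

# a comma separated DATASET value becomes a list of dataset names
datasets = [d.strip() for d in common["DATASET"].split(",") if d.strip()]
for dataset in datasets:
    print("will look for files produced from", dataset)
//--------------------------------------------------------------------------------------------------
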
f) FILE_IOV_BASE = lumibase
   The IOV base of the txt files. Recently we have been producing files fitting every lumisection, so it has been lumibase for a long time (it can also be runbase).

g) DB_IOV_BASE = runnumber
   IOV base in the database for the tag you want to upload. Right now the official tag has runnumber IOVs. The other possibility is lumiid (see the sketch below).

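For reference, a lumiid IOV is usually encoded in CMS conditions by packing the run number and the
lumisection into a single 64-bit "since" value; a minimal sketch of that convention (an assumption
about the encoding, not code from this package):
//--------------------------------------------------------------------------------------------------
# run in the upper 32 bits, lumisection in the lower 32 bits
# (runnumber IOVs just use the run number itself)
def pack_lumiid(run, lumi):
    return (run << 32) | lumi

def unpack_lumiid(since):
    return since >> 32, since & 0xFFFFFFFF

since = pack_lumiid(149415, 25)
assert unpack_lumiid(since) == (149415, 25)
//--------------------------------------------------------------------------------------------------
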
h) DBS_TOLERANCE_PERCENT = 10
   Percentage of missing lumisections that can be tolerated between the lumisections processed and the ones that DBS says should have been processed.
   When querying DBS the script asks how many lumisections were present in the files that the workflow processed. The number of lumis processed and the
   number in DBS should always match, but unfortunately that is not the case. 10% should let you pass all the files that have been processed so far.

i) DBS_TOLERANCE = 20
   Number of missing lumisections that can be tolerated between the lumisections processed and the ones that DBS says should have been processed.
   Sometimes a run has few lumisections, so in case the workflow doesn't process a few of them, the percentage of unprocessed lumis doesn't pass the
   previous tolerance.

l) RR_TOLERANCE = 10
   Percentage of missing lumisections that can be tolerated between the lumisections processed and the ones that are considered good in the run registry.
   If there are too many unprocessed lumis when compared to DBS, the script checks whether the ones that have been processed at least cover the
   ones that are considered good in the run registry (see the sketch below).

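A hedged sketch of how these three tolerances might combine (the real script's logic may differ in
detail; the lumi sets here are hypothetical inputs):
//--------------------------------------------------------------------------------------------------
def lumis_ok(processed, dbs_expected, rr_good,
             dbs_tolerance_percent=10, dbs_tolerance=20, rr_tolerance=10):
    missing = len(dbs_expected - processed)
    if missing <= dbs_tolerance:
        return True      # few lumis missing in absolute terms
    if 100.0 * missing / len(dbs_expected) <= dbs_tolerance_percent:
        return True      # few lumis missing in relative terms
    # last chance: do the processed lumis at least cover the run registry?
    missing_good = len(rr_good - processed)
    return 100.0 * missing_good / len(rr_good) <= rr_tolerance

# example: 100 lumis expected by DBS, 75 processed, 78 good in the run registry
processed = set(range(1, 76))
print(lumis_ok(processed, set(range(1, 101)), set(range(1, 79))))   # True, via RR_TOLERANCE
//--------------------------------------------------------------------------------------------------
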
m) MISSING_FILES_TOLERANCE = 2
   Number of missing files that can be tolerated before the script can continue. It is important to keep this number low (2, max 3), especially
   when running it in a cron job. In fact, the script can be triggered while a few files are still being processed, and you don't want to continue
   if the number of missing files is still big.

n) MISSING_LUMIS_TIMEOUT = 14400
   There are a few timeouts in the script (for example if there are still many files missing); after that number of seconds (= MISSING_LUMIS_TIMEOUT)
   the script keeps running anyway. MISSING_LUMIS_TIMEOUT = 0 doesn't produce a timeout and the script just continues (see the sketch below)!

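A hedged sketch of how the missing-files tolerance and the timeout might interact (again, not the
actual code; first_seen_missing is a hypothetical timestamp of when files were first found missing):
//--------------------------------------------------------------------------------------------------
import time

def can_continue(n_missing_files, first_seen_missing,
                 tolerance=2, timeout=14400):
    if n_missing_files <= tolerance:
        return True             # few enough missing files: go ahead
    if timeout == 0:
        return True             # no timeout requested: continue anyway
    # too many files missing: keep waiting until the timeout expires
    return time.time() - first_seen_missing > timeout

# example: 5 files missing, first noticed 20000 seconds ago -> continue
print(can_continue(5, time.time() - 20000))   # True
//--------------------------------------------------------------------------------------------------
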
o) EMAIL = uplegger@cern.ch,yumiceva@fnal.gov
   Comma separated list of people who will receive an e-mail in case of big trouble. There are some conditions that must be validated by
   a person, so typically the script stops working and sends an e-mail to the people in this list, who will have to take action
   (a sketch of such a notification is below).

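A minimal sketch of that kind of notification using smtplib (the sender address is hypothetical and
the real script may send mail through a different mechanism):
//--------------------------------------------------------------------------------------------------
import smtplib
from email.mime.text import MIMEText

def notify(addresses, subject, body):
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = "beamspot.workflow@cern.ch"   # hypothetical sender address
    msg["To"] = ", ".join(addresses)
    server = smtplib.SMTP("localhost")          # needs a local mail server
    server.sendmail(msg["From"], addresses, msg.as_string())
    server.quit()

# example call (commented out; it would really try to send mail):
# notify("uplegger@cern.ch,yumiceva@fnal.gov".split(","),
#        "BeamSpotWorkflow needs attention",
#        "A condition needs to be validated by a person before the upload can proceed.")
//--------------------------------------------------------------------------------------------------
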
3) Cron job shell script.
In python/tools there is the beamspotWorkflow_cron.sh shell script which runs the workflow automatically.

//--------------------------------------------------------------------------------------------------
export STAGE_HOST=castorcms.cern.ch
source /afs/cern.ch/cms/sw/cmsset_default.sh
cd /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/
logFileName="/afs/cern.ch/user/u/uplegger/www/Logs/MegaScriptLog.txt"
echo >> $logFileName
echo "Begin running the script on " `date` >> $logFileName
if [ ! -e .lock ]
then
    touch .lock
    eval `scramv1 runtime -sh`
    python $CMSSW_BASE/src/RecoVertex/BeamSpotProducer/scripts/BeamSpotWorkflow_T0.py -u -c BeamSpotWorkflow_T0.cfg >> $logFileName
    rm .lock
else
    echo "There is already a megascript running...exiting" >> $logFileName
fi
echo "Done on " `date` >> $logFileName
//--------------------------------------------------------------------------------------------------

REMEMBER:
a) cd /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/
   is the CMSSW area where your script is!
b) logFileName="/afs/cern.ch/user/u/uplegger/www/Logs/MegaScriptLog.txt"
   is my area which is web accessible, so I can check the output of the script once in a while.
c) python $CMSSW_BASE/src/RecoVertex/BeamSpotProducer/scripts/BeamSpotWorkflow_T0.py -u -c BeamSpotWorkflow_T0.cfg >> $logFileName
   runs the script WITH the BeamSpotWorkflow_T0.cfg cfg file and saves the output in the log file that I can check online.
d) if [ ! -e .lock ] then touch .lock
   creates a .lock file in /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/
   This lock file prevents 2 megascripts from running at the same time. It is in the shell script, so it should be removed 99.9% of the time,
   but it has already happened to me once that it was not removed (see the sketch below).

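One way to make the lock removal more robust, sketched here in Python with try/finally (an
alternative idea, not what the shell script above actually does), is to guarantee the lock is
removed even when the workflow crashes:
//--------------------------------------------------------------------------------------------------
import os, sys

LOCK = ".lock"
if os.path.exists(LOCK):
    sys.exit("There is already a megascript running...exiting")
open(LOCK, "w").close()
try:
    # stand-in for the real call to the workflow script
    os.system("python BeamSpotWorkflow_T0.py -u -c BeamSpotWorkflow_T0.cfg")
finally:
    os.remove(LOCK)   # runs even if the workflow raises an exception
//--------------------------------------------------------------------------------------------------
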
4) Running the cron job:
acrontab -e
lets you edit your cron jobs, while
acrontab -l
shows what your cron job file is.
//--------------------------------------------------------------------------------------------------
5 * * * * lxplus258 /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/RecoVertex/BeamSpotProducer/python/tools/beamspotWorkflow_cron.sh >& /afs/cern.ch/user/u/uplegger/www/Logs/CronJob.log
25 * * * * lxplus301 /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/RecoVertex/BeamSpotProducer/python/tools/beamspotWorkflow_cron.sh >& /afs/cern.ch/user/u/uplegger/www/Logs/CronJob.log
45 * * * * lxplus256 /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/RecoVertex/BeamSpotProducer/python/tools/beamspotWorkflow_cron.sh >& /afs/cern.ch/user/u/uplegger/www/Logs/CronJob.log
3 0,13 * * * lxplus301 /afs/cern.ch/user/u/uplegger/scratch0/CMSSW/CMSSW_3_6_1_patch4/src/RecoVertex/BeamSpotProducer/python/tools/mvLogFile_cron.sh
//--------------------------------------------------------------------------------------------------
Right now I am running the megascript cron job from 3 different machines every 20 minutes.
I am also running, twice a day, another script that moves the log files away to keep the one on the web small.

5) The way I run everything.
a) Every few days I run the workflow at T0. This is my crab cfg:
//--------------------------------------------------------------------------------------------------
[CRAB]
jobtype = cmssw
scheduler = caf
server_name = caf_test

[CAF]
queue = cmscaf1nd

[CMSSW]
#datasetpath = /MinimumBias/BeamCommissioning09-StreamTkAlMinBias-Dec19thReReco_341_v1/ALCARECO
#datasetpath = /MinimumBias/BeamCommissioning09-StreamTkAlMinBias-Dec19thReReco_341_v1/ALCARECO-TEST-1102
#datasetpath = /MinimumBias/BeamCommissioning09-StreamTkAlMinBias-Dec19thReReco_341_v1/ALCARECO-TEST-Run[0-9]*-1503
#datasetpath = /MinimumBias/BeamCommissioning09-StreamTkAlMinBias-Mar3rdReReco_v2/ALCARECO
#datasetpath = /StreamExpress/Commissioning10-StreamTkAlMinBias-v9/ALCARECO
#datasetpath = /StreamExpress/Run2010A-StreamTkAlMinBias-v1/ALCARECO
#datasetpath = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
#datasetpath = /StreamExpress/Run2010B-TkAlMinBias-v1/ALCARECO
datasetpath = /StreamExpress/Run2010B-TkAlMinBias-v2/ALCARECO

pset = BeamFit_LumiBased_Workflow.py

get_edm_output = 1
output_file = BeamFit_LumiBased_Workflow.txt,BeamFit_LumiBased_Workflow.root

[USER]
ui_working_dir = crab_LumiBased_express_T0_v3
# return data to local disk, change to 1
return_data = 0
#user_remote_dir = ShortWorkflow
# return data to SE, change to 1
copy_data = 1
storage_element = T2_CH_CAF
# area /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/RunBased
user_remote_dir = Workflows/381_patch3/express_T0_v3

[WMBS]
automation = 1
feeder = T0AST
#feeder = DBS
startrun = 149415
splitting_algorithm = RunBased
split_per_job = files_per_job
split_value = 1
processing = express
//--------------------------------------------------------------------------------------------------
b) I start the cron jobs:
acrontab -e
I uncomment the lines that I care about and save with Ctrl-O,
using the following cfg file (BeamSpotWorkflow_T0.cfg):
//--------------------------------------------------------------------------------------------------
[Common]
SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/381_patch3/express_T0_v3/
ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_tmp
DBTAG = BeamSpotObjects_2009_v13_offline
DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
FILE_IOV_BASE = lumibase
#DB_IOV_BASE = lumiid
DB_IOV_BASE = runnumber
DBS_TOLERANCE_PERCENT = 10
DBS_TOLERANCE = 20
RR_TOLERANCE = 10
MISSING_FILES_TOLERANCE = 6
MISSING_LUMIS_TIMEOUT = 14400
EMAIL = uplegger@cern.ch
//--------------------------------------------------------------------------------------------------

c) Either I receive some unwanted e-mails :( or in the morning I check what happened to the v13 tag using this script, which is in CVS:
checkPayloads.py 13
with 13 as argument.
This script compares the IOVs uploaded in the tag with the run registry. If there is a run registry entry and no corresponding IOV,
it prints out:
Run: 133509 is missing for DB tag BeamSpotObjects_2009_v14_offline
Run: 139363 is missing for DB tag BeamSpotObjects_2009_v14_offline

These are the only 2 runs that should have an entry in the DB but for some reason we didn't update.
Inside the script I keep a list of the runs that are missing in the DB, and if the megascript skips some of them I manually check in the run
registry why the run is missing. If, for example, the strips were bad, I write that run down and add it to the knownMissingRunList so it won't
be printed out (a sketch of this comparison is below, after the list).

#132573 Beam lost immediately
#132958 Bad strips
#133081 Bad pixels bad strips
#133242 Bad strips
#133472 Bad strips
#133473 Only 20 lumisections, run duration 00:00:03:00
#133509 Should be good!!!!!!!!!!
#136290 Bad pixels bad strips
#138560 Bad pixels bad strips
#138562 Bad HLT bad L1T, need to rescale the Jet Triggers
#139363 NOT in the bad list but only 15 lumis and stopped for DAQ problems
#139455 Bad pixels and strips and stopped because of HCAL trigger rate too high
#140133 Beams dumped
#140182 No pixels and strips with few entries
knownMissingRunList = [132573,132958,133081,133242,133472,133473,136290,138560,138562,139455,140133,140182]

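A hedged sketch of the core of that comparison (the run lists here are illustrative; checkPayloads.py
gets the real ones from the database and the run registry):
//--------------------------------------------------------------------------------------------------
knownMissingRunList = [132573, 132958, 133081, 133242, 133472, 133473,
                       136290, 138560, 138562, 139455, 140133, 140182]

def check_payloads(rr_runs, iov_runs, tag):
    # flag run registry runs that have no IOV in the tag,
    # skipping the ones already known to be bad
    for run in sorted(rr_runs):
        if run not in iov_runs and run not in knownMissingRunList:
            print("Run: %d is missing for DB tag %s" % (run, tag))

# illustrative inputs: prints exactly the two runs quoted above
check_payloads({133509, 139363, 140133}, set(), "BeamSpotObjects_2009_v14_offline")
//--------------------------------------------------------------------------------------------------
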
d) I check the v14 tag with the same script:
checkPayloads.py
If the two match, there were no new runs; otherwise, if I think the v13 was correctly updated with all runs, it means
that I have to update the v14.
So I just cut and paste the commands that are in this txt file:

more uploadTags.txt
./BeamSpotWorkflow.py -c BeamSpotWorkflow_run.cfg -z -u
./BeamSpotWorkflow.py -c BeamSpotWorkflow_run_sigmaz.cfg -u
./BeamSpotWorkflow.py -c BeamSpotWorkflow_lumi.cfg -z -u
./BeamSpotWorkflow.py -c BeamSpotWorkflow_lumi_sigmaz.cfg -u

#For prompt and express tags
./createPayload.py -d PayloadFile.txt -t BeamSpotObjects_2009_v1_prompt -z -u
./createPayload.py -d PayloadFile.txt -t BeamSpotObjects_2009_v1_express -z -u

I have 4 cfg files:
//-------------------BeamSpotWorkflow_run.cfg
[Common]
SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/381_patch3/express_T0_v3/
ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_run
DBTAG = BeamSpotObjects_2009_v14_offline
DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
FILE_IOV_BASE = lumibase
#DB_IOV_BASE = lumiid
DB_IOV_BASE = runnumber
DBS_TOLERANCE_PERCENT = 10
DBS_TOLERANCE = 25
RR_TOLERANCE = 10
MISSING_FILES_TOLERANCE = 2
MISSING_LUMIS_TIMEOUT = 0
EMAIL = uplegger@cern.ch
//--------------------------------------------------------------------------------------------------

//-------------------BeamSpotWorkflow_run_sigmaz.cfg
[Common]
SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/381_patch3/express_T0_v3/
ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_run_sigmaz
DBTAG = BeamSpotObjects_2009_SigmaZ_v14_offline
DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
FILE_IOV_BASE = lumibase
#DB_IOV_BASE = lumiid
DB_IOV_BASE = runnumber
DBS_TOLERANCE_PERCENT = 10
DBS_TOLERANCE = 25
RR_TOLERANCE = 10
MISSING_FILES_TOLERANCE = 2
MISSING_LUMIS_TIMEOUT = 0
EMAIL = uplegger@cern.ch
//--------------------------------------------------------------------------------------------------

//------------------BeamSpotWorkflow_lumi.cfg
[Common]
SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/381_patch3/express_T0_v3/
ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_lumi
DBTAG = BeamSpotObjects_2009_LumiBased_v14_offline
DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
FILE_IOV_BASE = lumibase
DB_IOV_BASE = lumiid
#DB_IOV_BASE = runnumber
DBS_TOLERANCE_PERCENT = 10
DBS_TOLERANCE = 25
RR_TOLERANCE = 10
MISSING_FILES_TOLERANCE = 2
MISSING_LUMIS_TIMEOUT = 0
EMAIL = uplegger@cern.ch
//--------------------------------------------------------------------------------------------------

//----------------BeamSpotWorkflow_lumi_sigmaz.cfg
[Common]
SOURCE_DIR = /castor/cern.ch/cms/store/caf/user/uplegger/Workflows/381_patch3/express_T0_v3/
ARCHIVE_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_archive/
WORKING_DIR = /afs/cern.ch/cms/CAF/CMSCOMM/COMM_BSPOT/automated_workflow/good_lumi_sigmaz
DBTAG = BeamSpotObjects_2009_LumiBased_SigmaZ_v14_offline
DATASET = /StreamExpress/Run2010A-TkAlMinBias-v4/ALCARECO
FILE_IOV_BASE = lumibase
DB_IOV_BASE = lumiid
#DB_IOV_BASE = runnumber
DBS_TOLERANCE_PERCENT = 10
DBS_TOLERANCE = 25
RR_TOLERANCE = 10
MISSING_FILES_TOLERANCE = 2
MISSING_LUMIS_TIMEOUT = 0
EMAIL = uplegger@cern.ch
//--------------------------------------------------------------------------------------------------

As you can see, the ARCHIVE_DIRs are all the same; what changes is just the DBTAG, the DB_IOV_BASE and the WORKING_DIR.
MISSING_LUMIS_TIMEOUT is set to 0 because I already know that everything went well with the v13, so I don't want to wait for a timeout!