Analysis code for UPC picoDST (Krakow format)
Should you have any questions/comments/remarks please contact rafal.sikora@fis.agh.edu.pl or
leszek.adamczyk@agh.edu.pl.
1. Introduction
2. Code structure
3. How to run
4. Configuration file (options)
5. Useful links
Introduction
On this page you can find a set of instructions that will enable you to develop, run, and share
ROOT-based C++ code for the picoDST analysis created and maintained by the Krakow group of the UPC PWG. The code is shared between all data analyzers via the CVS repository
http://www.star.bnl.gov/cgi-bin/protected/cvsweb.cgi/offline/UPC/.
Code structure
► Shared files - can be edited by all users:
rpAnalysis.cpp - analysis class (definitions)
rpAnalysis.hh - analysis class (header)
config.txt - configuration file
► Core files - do not edit those files and directories:
runRpAnalysis - launching script (recommended to use)
rpAnalysisLauncher.C - launching script
clearSchedulerFiles.sh - utility script which removes files created by the STAR scheduler
picoDstDescriptionFiles - folder with a core code describing picoDST content etc.
The skeleton of the analysis class rpAnalysis (which inherits from
TSelector) was created with the
ROOT built-in method MakeSelector() (some more information about MakeSelector() can be found here).
When the analysis starts, the methods rpAnalysis::Begin() and rpAnalysis::SlaveBegin() are invoked (the right place to create histograms etc.). Next, rpAnalysis::Process() is called for each event in the picoDST tree - this is where you put your selection algorithms, histogram filling, and so on. After all events in the picoDST are processed, the methods rpAnalysis::SlaveTerminate() and rpAnalysis::Terminate() are invoked, where e.g. the output file can be written.
The data of a single event, accessible in
rpAnalysis::Process(), is stored in a particle_event object. Click here to see all elements of this class.
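As an orientation, the sketch below shows how these methods typically fit together. It is an illustration only, not the actual repository code: the fPtHist histogram member and the commented-out particle_event field are assumptions.
#include "rpAnalysis.hh"
#include "TH1D.h"
#include "TFile.h"

// Illustrative flow of the rpAnalysis methods. fPtHist would be declared
// as a data member (TH1D *fPtHist;) in rpAnalysis.hh.
void rpAnalysis::SlaveBegin(TTree * /*tree*/)
{
   // Called before the event loop - the right place to create histograms.
   fPtHist = new TH1D("hPt", "track p_{T};p_{T} [GeV/c];entries", 100, 0., 5.);
   fOutput->Add(fPtHist); // fOutput is provided by the TSelector base class
}

Bool_t rpAnalysis::Process(Long64_t entry)
{
   GetEntry(entry); // loads the particle_event data of the current event
   // Event selection and histogram filling go here, e.g. (the field name
   // below is hypothetical - see the class description for the real ones):
   // if (particle_event->nTracks < 2) return kTRUE; // skip this event
   return kTRUE;
}

void rpAnalysis::Terminate()
{
   // Called once after the event loop - e.g. write the output file here.
   TFile outputFile("analysisOutput.root", "RECREATE");
   fOutput->Write();
   outputFile.Close();
}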
The analysis should be launched with the runRpAnalysis script (an
executable). The script can be run with one argument: the name of a configuration file that defines the trigger you would like to analyze and some analysis options (these can be used to control which parts of the code are executed). If the script is run without any arguments, the configuration file config.txt is used to launch the analysis.
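For example, to launch the analysis with a private configuration file (the file name myConfig.txt is illustrative):
runRpAnalysis myConfig.txt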
How to run
► Preparation of the analysis environment (first run)
- Set up the environment to stardev.
stardev
- Create and enter a directory where you want to work with the UPC analysis code. Let's denote it MY_PATH.
mkdir MY_PATH
cd MY_PATH
- Download the analysis code from the repository. Enter the code directory.
cvs co offline/UPC
cd offline/UPC
- Edit the configuration file config.txt. It is especially important to provide valid absolute paths under the options CODE_DIRECTORY and OUTPUT_DIRECTORY. You are encouraged to set the path for the analysis output outside offline/UPC.
CODE_DIRECTORY=/absolute/path/to/MY_PATH/offline/UPC
OUTPUT_DIRECTORY=/absolute/path/to/MY_PATH/output
OUTPUT_DIRECTORY does not have to exist; in that case it will be created automatically by the analysis launching script.
- Now you are prepared to run the analysis. For the first execution do not edit the SINGLE_RUN option in the configuration file (leave it set to "yes"). To start the analysis simply type
runRpAnalysis
If there are any problems with the configuration file, e.g. a wrong data directory, you will receive appropriate message(s) in the terminal.
If no problems are found by the launching script, you should see ROOT start and display messages about the compilation progress (please do not worry about the warnings related to the picoDST description files). When the compilation is done, the analysis code is finally executed. You can verify successful execution by checking the content of OUTPUT_DIRECTORY - you should find a ROOT file with the analysis output there.
► Regular code development/running
- Set up the environment to stardev.
stardev
- Enter the directory with the UPC analysis code (MY_PATH is where you have the offline/UPC directory).
cd MY_PATH/offline/UPC
- Update the content of the shared repository - this will ensure you are working with the latest version of all files in offline/UPC.
cvs update
- Now you are free to work on the analysis. You can change the analysis code (rpAnalysis.cpp, rpAnalysis.hh), edit the configuration file to run the analysis code over various triggers with different options etc., and launch the analysis using the runRpAnalysis script.
NOTE: Use comments (// or /* */) to describe the parts of the code you add, so that everybody can understand them.
- When you finish working with the code you should commit the changes you have made, so that all users are always working with the same version of the software. It is important to always commit whenever a change in the code has been made. Simply type
cvs commit rpAnalysis.cpp rpAnalysis.hh
NOTE: Before 'commit' always make sure that the code compiles and executes without errors! If the code doesn't work, but you would like to save all your work, you can comment out the lines you have added, commit, and work out the problem later.
NOTE 2: Do not commit files other than rpAnalysis.cpp and rpAnalysis.hh. It is especially important to avoid committing the configuration file, which is analyzer-specific.
NOTE 3: CVS is "smart", so if somebody does a commit before you do, it can merge (typically with success) the changes in the latest committed version and your version of the file. If after doing 'cvs commit' you receive a message similar to
cvs commit: Up-to-date check failed for `rpAnalysis.cpp'
cvs [commit aborted]: correct above errors first!
it means that the described conflict has occurred. In such a case simply do
cvs update
If you don't get any warnings, you can re-commit (the 'cvs commit' command shown above). However, if you find a warning like
rcsmerge: warning: conflicts during merge
cvs update: conflicts found in rpAnalysis.cpp
you need to resolve the conflict by manually editing the file you want to commit. Click here to learn about the details.
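For orientation, CVS marks each clashing region in the file with standard RCS conflict markers; they look like the following (the revision number 1.42 is illustrative):
<<<<<<< rpAnalysis.cpp
your locally modified lines
=======
the lines from the latest committed revision
>>>>>>> 1.42
Keep the variant you want (or combine the two), delete the three marker lines, and commit again.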
If you find any problems (the code does not compile or crashes at execution) and you suspect it is an issue with the core code, you are kindly requested to report it to the developers.
Configuration file (options)
Find below a list of the options available in the configuration file. The obligatory options are TRIGGER, SINGLE_RUN, RUN_NUMBER (only if SINGLE_RUN=yes), DATA_DIRECTORY, CODE_DIRECTORY and OUTPUT_DIRECTORY.
If you think more options/utilities are needed in the configuration file, contact developers.
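For orientation, a complete minimal configuration file could look like this (the trigger name and run number are purely illustrative - take a real trigger from the list referenced under TRIGGER below):
TRIGGER=RP_CPT2
SINGLE_RUN=yes
RUN_NUMBER=16066000
DATA_DIRECTORY=/gpfs01/star/pwg/UPCdst
CODE_DIRECTORY=/absolute/path/to/MY_PATH/offline/UPC
OUTPUT_DIRECTORY=/absolute/path/to/MY_PATH/output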
- TRIGGER
This option is the name of the trigger that you want to analyze. It should have the same form as at http://online.star.bnl.gov/rp/pp200/.
- SINGLE_RUN
• If set to "yes", forces analysis of a single run (the run number is defined by the RUN_NUMBER option). In this case the analysis is launched without the STAR Scheduler, on the node you are currently logged in to. The name of the output ROOT file in OUTPUT_DIRECTORY has the following form: analysisOutput.RUN_NUMBER.TRIGGER.root.
• If set to "no", the full available dataset for a TRIGGER is analyzed using the STAR Scheduler, with the jobs split across multiple RACF nodes.
NOTE: It is recommended to check the validity of the code (compilation and execution with no errors) using SINGLE_RUN=yes before you run the analysis over the full dataset using the STAR Scheduler with SINGLE_RUN set to "no".
The submission XML file is automatically created by the launching script. The number of files for a single job is set to 20 (this can be made configurable if needed), so typically a few dozen jobs are submitted. This results in plenty of scheduler files showing up in CODE_DIRECTORY, as well as log/error files in OUTPUT_DIRECTORY. If you want to clean the scheduler files out of CODE_DIRECTORY, use the clearSchedulerFiles.sh script. You can check the progress of job execution with the command
condor_q -submitter $USER
or, if you do not have any other jobs submitted, use
condor_q -submitter $USER | tail -n1
If the output is:
0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended
it means that all jobs are finished. If all jobs were successful, in your OUTPUT_DIRECTORY you should see a number of ROOT files called analysisOutput.SOME_LONG_NAME_WITH_VARIOUS_CHARACTERS.TRIGGER.root. Those are the output files from each single job (SOME_LONG_NAME_WITH_VARIOUS_CHARACTERS is the ID of the submission and the ID of the job, separated by an underscore "_"). To merge them into a single file type
hadd allRunsMerged.root analysisOutput.*
This will create a single file called allRunsMerged.root. Remember to merge files only from one submission! If you suspect something went wrong during job execution, you can check the log and error files of each single job, which are placed in OUTPUT_DIRECTORY and have the extensions .log and .err, respectively.
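If output files from more than one trigger have accumulated in OUTPUT_DIRECTORY, you can restrict the wildcard to a single trigger (TRIGGER stands for the trigger name, following the file-name pattern above); the one-submission rule still applies:
hadd allRunsMerged.TRIGGER.root analysisOutput.*.TRIGGER.root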
- RUN_NUMBER
is the ID number of the analyzed run (this option is ignored if SINGLE_RUN=no).
- DATA_DIRECTORY
Should contain the full path to the directory where the lists of available picoDST files are stored (the same place as the picoDSTs themselves). Currently it is /gpfs01/star/pwg/UPCdst.
- CODE_DIRECTORY
Should contain the full path to the directory where your private copy of the offline/UPC/ directory is placed.
- OUTPUT_DIRECTORY
Should contain the full path to the directory where you want the analysis output (ROOT files, log files) to be saved. In case OUTPUT_DIRECTORY does not exist, it is created.
- ANALYSIS_OPTIONS
This option is intended to contain a set of options separated by the "|" character, which are sent to the analysis program and can be used, for example, to control which parts of the code should be executed.
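For example (the option names FILL_TRACK_HISTOS and VERBOSE are purely hypothetical - their meaning is whatever your code in rpAnalysis.cpp assigns to them):
ANALYSIS_OPTIONS=FILL_TRACK_HISTOS|VERBOSE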
Useful links
UPC analysis code repository in STAR CVS: http://www.star.bnl.gov/cgi-bin/protected/cvsweb.cgi/offline/UPC/
CVS tutorial @ drupal: https://drupal.star.bnl.gov/STAR/comp/sofi/tutorials/cvs
Presentation on the Krakow picoDST: https://drupal.star.bnl.gov/STAR/system/files/talk_42.pdf
StMuRpsCollection documentation (write-up): https://drupal.star.bnl.gov/STAR/system/files/RomanPotsInStEvent_0.pdf
StMuRpsCollection documentation (doxygen): http://www.star.bnl.gov/webdata/dox/html/classStMuRpsCollection.html
Roman Pot alignment description: to be added