oleg's blog

 Instructions for submission of condor jobs
--------------------------------------------

0. Content
   condor/AAAA_README (this file)
   condor/condor.cmd (script for each process)
   get_file_list.csh (contains examples to create file lists)
   myutils.py (contains the submit_run method and condor.job information)
   submit_jobs.py (main python script)
   
1. Create file list (e.g. production=run12_pp200_trans)
   -> [production].files: list of MuDst files and events per file
   Examples for PRODUCTION tags are in get_file_list.csh.
   The number of events has to be included in the file list even if the
   -e option (maximum number of events) is not used in the submit_jobs.py script.
   (The list of runnumbers is created from the file list.)
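
   A minimal sketch of how the run numbers could be derived from the file list.
   This is not the code from myutils.py. It assumes that each line holds
   "<path> <number of events>" and that the run number is the 8-digit field in
   the MuDst file name (e.g. st_physics_13044126_raw_1010001.MuDst.root).
   The function name read_file_list is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: the run number is the 8-digit field between underscores
# in the file name, e.g. st_physics_13044126_raw_1010001.MuDst.root.
RUN_PATTERN = re.compile(r"_(\d{8})_")


def read_file_list(lines):
    """Return {runnumber: [(path, nevents), ...]} from file-list lines.

    Assumption: each line is "<path to MuDst file> <number of events>".
    """
    runs = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        path, nevents = fields[0], int(fields[1])
        # Search only the file name, not the directory part of the path.
        match = RUN_PATTERN.search(path.rsplit("/", 1)[-1])
        if match is None:
            continue  # no run number found in the file name
        runs[int(match.group(1))].append((path, nevents))
    return dict(runs)
```

   The keys of the returned dictionary then serve as the list of run numbers.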
   
2. Use "python submit_jobs.py [production]"
   [production] has to match the name of the file list ([production].files).
   It will also be used for the directory structure in the output:
   -> ../data/[production]
      ../data/[production]/[runnumber]/
      ../data/[production]/[runnumber]/condor.job
      ../data/[production]/[runnumber]/condor.cmd
      ../data/[production]/[runnumber]/files_[runnumber].list
      ../data/[production]/[runnumber]/proc_0/
      ../data/[production]/[runnumber]/proc_0/files.list
      ../data/[production]

   Jobs are submitted for each run.
   If a maximum number of events is set, each process is filled with as many input
   files as fit under that limit (starting with the largest file).
   Log files are written to the process directory of each process.
   The analysis writes the ROOT output to the local scratch directory.
   If the job crashes or gets evicted, the output will not be copied into
   the process directory.
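
   The grouping of files into processes can be sketched as follows. This is
   only one possible reading of the description (a first-fit strategy over
   files sorted by size), not necessarily what submit_jobs.py does. The
   function name group_files is hypothetical.

```python
def group_files(files, max_events=None):
    """Group (path, nevents) pairs into segments, one segment per process.

    Assumption: first-fit over files sorted from largest to smallest.
    A file that alone exceeds max_events still gets its own segment.
    """
    if max_events is None:
        # No limit given: one job per file.
        return [[f] for f in files]
    segments = []  # each entry is [total_events, [(path, nevents), ...]]
    for path, nevents in sorted(files, key=lambda f: f[1], reverse=True):
        for segment in segments:
            # Keep the segment total below the maximum number of events.
            if segment[0] + nevents < max_events:
                segment[0] += nevents
                segment[1].append((path, nevents))
                break
        else:
            # No existing segment has room: open a new one.
            segments.append([nevents, [(path, nevents)]])
    return [members for _, members in segments]
```

   Each returned segment would correspond to one proc_N directory and its
   files.list.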

   *** Change analysis-specific details in submit_jobs.py and condor.cmd (template):
   *** submit_jobs.py: output_root = "output.root"
   *** condor.cmd: OUTPUT (basename) and DIR_ANA (directory with .sl73_gcc485 libraries)

   -e: Maximum number of events per process
       Files are grouped into segments with less than the maximum number of events.
       If a single file has more than the maximum number, the whole file is still
       processed in a single job. If -e is not used, all files will be processed
       separately (one job per file).
   -f: Force all runs to process
       Any previous [production] will be removed and all runs are processed.
       If -f is NOT set, existing runs will be skipped.
   -m: Last runnumber to process
       Stops submission after [runnumber] to limit the number of concurrent jobs and
       database accesses. It can be used to submit ranges of run numbers incrementally,
       as long as -f is not set, so that previously existing runs are skipped.
   -s: Resubmit data set
       Resubmit all runs from [production].
       Runlists and scripts are not recreated. All other options will be ignored.
       Only incomplete segments will actually run the analysis, but all processes are 
       submitted (log files will be overwritten).
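
   For orientation, the options above could be declared with argparse roughly
   as below. The actual parsing in submit_jobs.py may differ. The attribute
   names (max_events, force, last_run, resubmit) and the function name
   build_parser are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the options described above (sketch only)."""
    parser = argparse.ArgumentParser(description="Submit condor jobs per run")
    parser.add_argument("production",
                        help="name of the file list, e.g. run12_pp200_trans")
    parser.add_argument("-e", dest="max_events", type=int, default=None,
                        help="maximum number of events per process")
    parser.add_argument("-f", dest="force", action="store_true",
                        help="remove any previous [production], process all runs")
    parser.add_argument("-m", dest="last_run", type=int, default=None,
                        help="last runnumber to process")
    parser.add_argument("-s", dest="resubmit", action="store_true",
                        help="resubmit all runs (other options are ignored)")
    return parser


# Example: at most 5000 events per process, stop after run 13044126.
example = build_parser().parse_args(
    ["-e", "5000", "-m", "13044126", "run12_pp200_trans"])
```

   Repeating the command with a larger -m value (and without -f) would then
   pick up where the previous submission stopped.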

3. At the completion of each segment, the script will try to merge the 
   output from all segments into a single root file for each run:
   ../data/[production]/[runnumber]/outfile_merge_[runnumber].root
   A merged file will only be created when all processes are complete.
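
   A sketch of the completeness check before merging. It assumes that a
   process counts as complete when its output file exists in the process
   directory, and that the merge is done with ROOT's hadd. Neither assumption
   is confirmed by the scripts. The function name merge_command and its
   arguments are hypothetical.

```python
import os


def merge_command(run_dir, runnumber, n_proc, output_root="output.root"):
    """Return the hadd command for one run, or None if a process is incomplete.

    Assumption: a process is complete when proc_N/<output_root> exists.
    """
    inputs = [os.path.join(run_dir, "proc_%d" % i, output_root)
              for i in range(n_proc)]
    if not all(os.path.isfile(path) for path in inputs):
        return None  # at least one process has not delivered its output yet
    target = os.path.join(run_dir, "outfile_merge_%d.root" % runnumber)
    # "hadd -f" overwrites an existing target file.
    return ["hadd", "-f", target] + inputs
```

   The returned list could be passed to subprocess.call() once it is not None.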

4. The condor jobs can be submitted manually for each run too.
   -> condor_submit condor.job
   The log files will be appended in this case.

Condor submission organized by run number