Fast Offline

Layout and how it (generally) works

Disk space and path name

First, FastOffline results are on

  • /star/data09/
  • /star/data10/
  • /star/data11/
  • /star/data12/

Note first, as a reminder, that production paths follow the convention

/star/dataXX/$TriggerSetupName/$FieldString/$production/$Year/$DayInYear

and that, starting from the 2006 data, an additional path element $RunNumber is appended before the file names. This was added to avoid having large directories (and therefore, slow directory lookups) on heavily loaded file systems.
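
As an illustration, a path could be assembled as in the short perl sketch below. The component values are hypothetical and only serve to show the layout; they are not taken from a real production.

  # Illustrative only: these component values are hypothetical examples
  my ($disk, $trgSetup, $field, $prod, $year, $day, $run) =
      ('/star/data09', 'AuAu200_production', 'FullField', 'dev', '2006', '050', '7050001');

  # Path convention: /star/dataXX/$TriggerSetupName/$FieldString/$production/$Year/$DayInYear
  my $path = join('/', $disk, $trgSetup, $field, $prod, $year, $day);

  # From 2006 data onward, the $RunNumber level is appended before the file names
  $path .= "/$run" if $year >= 2006;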

All log files go to /star/rcf/prodlog/dev/log/daq. FastOffline runs out of the dev library, which is associated with the production tag dev. Running from dev has its inconveniences (the code must at least run) but it allows code developers to see the result of their changes (hopefully improvements and bug corrections) the next day, as dev is released using the AutoBuild framework (see the Library release structure and policy for more information, as well as the code sanity pages and the AutoBuild last release report).

Conditions for jobs to start

FastOffline works by sampling the data as follows:

  • The system goes from the newest to the oldest run as they appear in HPSS and the online db. Several notes are in order
    • ANY runs marked bad by the Shift crew or RTS will NOT be picked nor processed by FastOffline.
      • Note: Runs marked questionable by the ShiftLeader or of unknown status could be picked by FastOffline if the other conditions below are met. Keep in mind that if the ShiftLeader marks the runs as bad late in the shift, FastOffline may have already processed the file at an earlier time when the status appeared to allow processing.
    • The file must have a minimum of MINEVT events to be sent for processing
      • Note: The rationale behind this cut-off was that, in the early days, too many "tests" were taken with valid names but would all end up containing about 100 events or so. This number was set to 200 from 2006 onward.
    • The files MUST be present in HPSS for FastOffline to pick a run.
      • Note: If the online to offline migration of data is not possible or not done, the system will not be able to process any data.
    • ANY network interruption will prevent FastOffline from updating its database (this is rare and inconsequential and is mentioned only for the sake of completeness) [this has not happened since 2011]
    • ANY problem with the online db, mainly the database known as RunLog, will make FastOffline stop as well. As soon as the RunLog “sanity” is restored, FastOffline resumes as normal, including the periodic bootstrapping of all records which may have appeared during the downtime [database consolidation has made this category of problems go away]
      • Note: “sanity” includes slowdowns due to db overload (anyone performing intense operations on the RunLog would kill or negatively affect FastOffline, so be attentive if you receive a request from the database leader regarding usage). In other words, RunLog is a fundamental part of the operational sanity of this system.
    • FastOffline relies on tables which are aggregated and migrated for the sole use of FastOffline. This allows greater scalability. However, if the migration daemons stop, FastOffline will cease getting the vital information it needs and hence ... stop as well.
      • Note: most tables needed by FastOffline are monitored here.
         
  • The system uses a finite number of nodes on the reconstruction farm. This number is adjustable within the limits of the farm size and the ongoing scheduled productions. It is also limited by the amount of disk space available for the system to work.
    Note that the FastOffline bypass (see the definition in the Modes section below) processes all events in a given file when the file is selected, which means that the average processing time depends on the type of collision, trigger mix, etc ... The number of events to be processed is otherwise adjustable and set by the NUMEVT variable; from 2006 to 2009, this was set to 200 events per file to increase sampling (see the sketch below).
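
As a minimal sketch of the two event-count knobs above (MINEVT and NUMEVT are real FastOffline parameters; the loop, the field names and the submit_job helper below are purely illustrative, not the actual RunDAQ.pm / JobSubmit.pl internals):

  # Illustrative sketch only; these structures are not the actual FastOffline internals
  my $MINEVT = 200;   # minimum number of events for a file to be picked (value used from 2006 onward)
  my $NUMEVT = -1;    # events to process per job; -1 = all events, 200 was used from 2006 to 2009

  my @candidate_files = ( { Name => 'st_physics_7050001_raw_1010001.daq', NumberOfEvents => 523 } );  # hypothetical
  sub submit_job { my ($file, $n) = @_; print "would submit $file->{Name} for $n event(s)\n"; }       # hypothetical

  foreach my $file (@candidate_files) {
      next if $file->{NumberOfEvents} < $MINEVT;   # too few events: the file is never submitted
      submit_job($file, $NUMEVT);
  }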

Modes

  • FastOffline implements several specialized algorithms and treatments:
    • There are three basic treatments: regular mode, bypass and calibration.
      • The three treatments do NOT share the same queues on the reconstruction farm.
      • Any mode can be adjusted separately in terms of the resources it takes.
      • Although the queues are separate, the reconstruction farm software checks the farm's sanity as a whole. Any job not moving in a queue will stall the system entirely to avoid the job “vacuum effect”. In particular, HPSS sanity can greatly influence the system's ability to perform, and a burst of jobs will lead to a pause.
      • Last, when the disk space available for FastOffline is beyond 95% occupancy, no more job submissions may occur (see the sketch after this list).
    • Calibration tasks have the highest priority over ALL FastOffline processes and are ALWAYS processed first to ensure calibration readiness when processing the real data.
    • Bypass is the ability to ask the system to bypass the automatic processing and deal with the processing of specific runs (with any arbitrary chain) immediately and with high priority.
    • Data pre-processing is third in the priority order. There is only one pre-processing available to date; it was introduced in 2005 and is known as “ezTree production”.
      • This pre-processing pass shares the same queue as the regular mode
      • Pre-processing passes are possible because they are fast and do not jeopardize the regular data production; too many or too-long jobs would prevent regular jobs from ever running (when the slots are filled, that is it)
      • Those passes may be set using their own chain options
      • This pre-processing stage was limited to a specific trigger setup name.
    • The regular mode comprises any other data processing task.
      Within this broad category, the resources were divided as follows up to 2007 (all numbers are adjustable).
      From 2008 onward, this queue division was disabled (mostly due to the appearance of too many streams).
      • 70% of the available slots will be used for express streams (if any exist). Note that express streams are defined as the file types:
        express, jpsi, upsilon, muon and gamma
      • 30% for zerobias in all other cases
      • The remainder for any other physics runs
    • FastOffline also allows for different chains for different species. Within this restriction, the following applies:
      • Files named pedestal are permanently skipped
      • Trigger setups (names or trigger set) equal to pedestal or pulser will be permanently skipped
      • Run numbers which do not have a trigger setup name or a set of triggers associated with them will be temporarily skipped until the system is able to auto-adjust (update) its information
      • In regular mode, trigger setups named test or tune (2010 addition) are skipped. If you need FastOffline to process those, do not name them test or tune. It is expected that individual test triggers included in production trigger setups are not funneled into the previously mentioned file streams processed by FastOffline.
      • Runs not containing TPC or TPX data are skipped in calibration and regular modes, with the exception of the pre-processing passes (like ezTree, etc ...)
  • FastOffline keeps on disk a 15 day (adjustable) worth of production sample.
    • Files older than 15 days are automatically deleted from disk leaving space for newer ones.
    • Results should be retrievable from HPSS (starting from 2005, saving a copy is the default)
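
As a minimal sketch of the 95% disk-occupancy gate mentioned in the list above (the helper below is illustrative and is not the actual FastOffline code):

  # Illustrative only: returns true when the disk is at or beyond 95% occupancy
  sub disk_is_full {
      my ($disk) = @_;
      my @df = `df -P $disk`;                  # the last line holds the usage summary
      return 0 unless @df;
      my ($pct) = $df[-1] =~ /(\d+)%/;
      return (defined($pct) && $pct >= 95);
  }

  # e.g. job submission would pause while disk_is_full('/star/data09') is true
  print "data09 full? ", (disk_is_full('/star/data09') ? "yes" : "no"), "\n";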


All of the above numbers are adjustable, so this pseudo-policy may change at any time. Note that ANY additional selection, such as a selection on the number of events in a file (minimum), the beam energy, the magnetic field type, the collision or the trigger setup name, may be used to sub-select a sample of files to be used in a pre-processing pass or otherwise a new algorithm.


The gory details

You don't need to read this unless you are planning to help with production or to understand how the system works ... It is simple: one needs to read perl only. But if you have read until here, it probably means you want to know, so here we go ...

2005 selection for ezTree

For example, ezTree processing in 2005 included

    • ppProductionMinBias trigger setup name only
    • Files NOT already processed by the regular FastOffline processing

This leads to the selection TrgSetup=='ppProductionMinBias' AND Status==0. The values of Status and their meanings are as follows:

    • 0: the file is “new”
    • 1: the file was submitted to the queue
    • 2: the file has been processed. All went OK
    • 3: the file has been processed AND has been QA-ed (note: the QA system sets this value)
    • 4: the file was skipped
    • 5: the file was submitted for calibration. This value will be overwritten by regular processing.
    • 666: ... the jobs have died, disappeared or otherwise indicating a problem.

Ideally, you should select ONLY on Status in (2,3). Those are the indicators of success (the rest being for internal bookkeeping only). Currently, the extended statuses for pre-processing are not updated to a value beyond 1. We could do it, but one would need to know what to search for and parse from the logs to define success (it is not a framework limitation per se, but any “decision” on OK or not implies a piece of code one has to write to validate the processing). Therefore, a safe approach for selecting pre-processed files from FastOffline is to check for Status in (2,3) AND XStatus$ID > 0 (where $ID is the ID assigned to you for your pre-processing pass, i.e. 1 for ezTree, etc ...) and to check for the presence of the expected output on disk.
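
As a sketch only, such a selection could be made directly in perl with DBI. The DAQInfo table and the Status / XStatus1 fields are the ones described on this page; the DSN, credentials and the file-name column (called "file" here) are placeholders to be adapted to the actual operation database.

  use DBI;

  # Placeholders: adjust the DSN, credentials and column names to the actual operation database
  my $dbh = DBI->connect("DBI:mysql:database=operation;host=DBSERVER", "reader", "",
                         { RaiseError => 1 });
  # XStatus1 corresponds to the ezTree pre-processing pass ($ID = 1)
  my $sth = $dbh->prepare(
      "SELECT file FROM DAQInfo WHERE Status IN (2,3) AND XStatus1 > 0");
  $sth->execute();
  while ( my ($file) = $sth->fetchrow_array() ) {
      # one should still check for the presence of the expected output on disk
      print "$file\n";
  }
  $dbh->disconnect();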

Scripts layout

All scripts for FastOffline are located in $STAR_SCRIPTS or $STAR_CGI depending on usage. FastOffline heavily relies on

  • The presence of the RunLog database on onldb.star.bnl.gov
  • The copy of daqFileTag table into that database (initial thanks to Jeff Porter for this copy which simplifies SQL queries)
  • The presence of the tables daqSummary, beamInfo, magField in that same database.
  • The field name in each of those tables.
  • Any of the first section caveats and/or requirements

Everything is concentrated in one unique perl module named RunDAQ.pm. The scripts developed around it merely use the methods available via this module, so one change in the module changes everything (including one mistake = all broken).

DAQFill.pl

Invoked in a cronjob via the DAQFill.csh wrapper, the sole purpose of this script is to read the online database and build condensed summary information related to new records as they come (table DAQInfo). The perl script SHOULD NOT be invoked from the crontab since it works in an infinite loop mode with a period of 60 seconds.
At every cycle, only new records are updated. However, the current scheme, meant to save precious time, also causes file sequences to be lost (this is an online/offline synchronization issue, nothing to do with FastOffline itself). Therefore, this script currently has an update mode where all records are scanned and missed ones are merged with the old ones.
There are currently 3 modes of operation for the csh wrapper :

  • ./DAQFill.csh Clean : kills all running jobs. The purpose of this mode is to clean up hung processes not listed by a ps -ef, as happens whenever AFS experiences a hiccup. This mode is executed twice a day. It has not really been necessary since 2004 but is kept for those rare events which tend to happen whenever no-one is looking ...
  • ./DAQFill.csh Update: Update mode scans all records in the online database since the last check and inserts into the FastOffline database any records it does not already have. This mode is run twice a day, just after the Clean mode. It starts the perl script with the first parameter=0 (which means no loop) without checking for the presence of another process. Logically, it should be placed after the Clean.
  • ./DAQFill.csh Run : This is executed once every 10 minutes. However, the wrapper detaches the DAQFill.pl script only if it does not find another process with the same name. The script is invoked with parameter=1, a loop mode with the period specified by the second argument of the wrapper (and the second argument of the perl script as well). The default is one scan every 60 seconds; the crontab currently specifies 300 seconds. It makes no sense to make this time parameter greater than the lapse between two wrapper executions.
    This mode updates the records based on the last entry. The last entry involves the run number only (that's why we may lose records: whenever we query for a run, some of the file sequences may not be ready yet, but our internal counter moves forward).

All cron-jobs are currently running under starreco on rcas6003. Note that the csh wrapper MUST be run on a Linux box or a box where the auwx options of the ps command are available.
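
For illustration only, crontab entries along the following lines would implement the three modes; the times and the script location are hypothetical, the actual starreco crontab being authoritative:

  # Hypothetical crontab entries for the starreco account on rcas6003
  0    6,18 * * *   /path/to/scripts/DAQFill.csh Clean       # twice a day: kill hung processes
  10   6,18 * * *   /path/to/scripts/DAQFill.csh Update      # just after Clean: recover missed records
  */10 *    * * *   /path/to/scripts/DAQFill.csh Run 300     # every 10 minutes, 300 second internal loop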

JobSubmit.pl

This script takes care of submitting job description files to the CRS queue system. It MUST therefore run under the following conditions

  • On one of the rcrsuser nodes
  • The crs_status.pl and crs_submit.pl scripts MUST function (hopefully, this is true).

The current cronjobs are running under starreco on rcrsuser4.
There are 2 useful lines in this crontab :

  • ./JobSubmit.pl dev -1 /star/data09/reco : tells FastOffline to use dev as the library version, process -1 (i.e. all) events found in a given file, with all output going to /star/data09/reco as a base path. Base path means that a directory structure will be created from that point.
  • Note that the syntax allows for disk spanning. Disk spanning can be specified by using a syntax similar to
    ./JobSubmit.pl dev -1 /star/data+09-11/reco : this will be submitted as-is (only the disk space will be checked); the disk-spanning resolution is delegated to the reconstruction wrapper code bfcca. In our example, any disk from /star/data09 to /star/data11 (that is, any number XX in [09,11] for /star/dataXX) will be used.
  • ./JobSubmit.pl dev -1 1 : this special syntax specifies that the queue should be scanned for terminated jobs and the jobfiles moved into the archive directory.
  • ./JobSubmit.pl dev -1 C/star/data09/reco . This line processes calibration passes. All priorities are set internally in the JobSubmit.pl script.
  • ./JobSubmit.pl dev -1 Z/star/data09/reco . This line is used for the special processing bypass (ezTree for example). All settings are internal to JobSubmit.pl . In particular, the trigger setup name to be used for this pass is a restrictive parameter.
  • ./JobSubmit.pl dev -1 X/star/data13/Magellan:balewski . Mode "X" was introduced in the 2011 data taking to pipe a portion of the data to a different disk. In this case, part of the data was moved to /star/data13/Magellan and the ownership was changed as indicated. This out-sourcing mode has been used only seldom.

Some syntax options while invoking this script :

  • You DO NOT have to know the number of jobs it can submit. There are NO CHANGES necessary. The only assumption is that FastOffline will be running on some queues of the CRS nodes. The script will figure out what that means ...
  • The default chain is declared per collision in an associative array named DCHAIN (default chain). This can be overridden by specifying the next argument on the command line (i.e. argument 4).
  • There is provision for a 5th argument, a collision tag, whose purpose is to exclude any collision tag different from the specified one. The default implemented values are AuAu and PPPP. Note that if the collider changes the collision tags, we are doomed (or need to declare extraneous chains to treat those).
  • There are 2 modes of submission even within what we named the regular mode
    • by default, the submission will go from top (last run entered in the database) to bottom (first run entered in the database). During periods when we have nothing new, the submission will proceed to submit EVERYTHING it finds. In heavy acquisition mode, this script will function as expected (sampling the data as it comes); however, at the end of the year of acquisition and/or in between active times, it will also proceed to process the rest of the runs.
    • This may be unwanted (especially at the end of the run, FastOffline not being a replacement for production). The second mode can therefore be switched on. This is achieved by specifying a restoration path preceded by a ^ character (a la perl, this symbol means "anchor at the start of"). This can of course be switched at the crontab level. In this mode of submission, only the latest files will be returned by the RunDAQ.pm module and the submission system will NOT re-run a job which has already run. FastOffline will therefore not crawl down from the latest to the earliest runs.
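
For example (a hypothetical invocation following the syntax just described), a crontab line such as ./JobSubmit.pl dev -1 ^/star/data09/reco would restrict the submission to the latest files only and never crawl back to older runs.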

FastOffCheck.pl

This script is still in development mode. It is meant to be executed regularly from a cronjob. Its purpose is to scan the archive directory for finished jobs, compare the list with what is available in the target/destination directory and mark the entries in the DAQInfo table as processed. This is used for hand-shaking with QA (to let them know what is done or not). The possible arguments are the library version, then the target directory to scan. Note that the standard directory structure is assumed. This script must be executed from a node which has access to both /star/u/starreco and the target directory. It is currently running under starreco on rcas6003.
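
A typical invocation (a hypothetical example following the argument order above) would be ./FastOffCheck.pl dev /star/data09/reco , i.e. the library version first, then the target directory to scan.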

RunDAQ.pm

RunDAQ.pm is the perl module at the heart of the FastOffline processing. This module does about everything by providing the other scripts (DAQFill.pl, JobSubmit.pl) with a function interface.

Function interfaces

All functions are documented in the module itself. Documentation is maintained in the code header and will not be replicated here. Click here to get more information.

How can I make a selection using RunDAQ.pm

In principle, you do not need to know SQL to use the module. Many selections are handled by defining a perl associative array of conditions with values and operators and passing it as-is to the methods accepting such an argument; all queries will be generated for you.

  • For example, setting $Cond{“Status”} = 0 implies a constraint where the db field Status will need to be equal to zero for the records to be selected.
  • When a value is assigned, you may use operators such as > (greater than), < (lesser than), or ! (not equal). For example, $Cond{“TrgSetup”} = “!12” would mean anything but 12.
  • The | (logical or) may also be used to select several values like 12|14|17.
  • Bitwise operations are also handled automatically with a syntax using the : character. For example, :2 means that, in the selected field, the second bit has to be true. Perhaps a problem to some extent, bitwise selection has to be on one specific value only (multiple-bit selection is not yet implemented). Also, it is done on numerical values ... but the fields using bitwise coding are fields which may potentially carry lots of possible values (like detector setup or trigger setup). Functions exist to convert strings to numbers, however, so this is not a problem (the multiple selection perhaps is). If needed, please send me an Email requesting the extension.
  • Round-offs are taken care of internally by conversion routines. You can, for example, specify an approximate beam energy as displayed by the FastOffline browser. This allows for consistency in the treatment of the precision.
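
Putting the above together, a condition hash could be built as in the sketch below. The Status and TrgSetup fields and the operator syntax are taken from the examples above; the other field names, and whether the hash is passed by value or by reference, are assumptions to be checked against the module's header documentation.

  # use RunDAQ;   # in the STAR environment; the module provides the query methods

  my %Cond;
  $Cond{"Status"}    = 0;            # db field Status equal to zero (file is "new")
  $Cond{"TrgSetup"}  = "!12";        # anything but trigger setup 12
  $Cond{"FileType"}  = "12|14|17";   # logical OR of several values (field name illustrative)
  $Cond{"Detectors"} = ":2";         # second bit of the (numerical) field must be set (field name illustrative)

  # The hash is then passed as-is to the RunDAQ.pm methods accepting a condition
  # argument; the corresponding SQL WHERE clause is generated for you.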

Table structures, initializing a new year

Tables exist in the operation database and are:

  • DAQInfo - the main table containing all information about files, runs, number of events and the status of the FastOffline processing (field Status). Note that the Status values are given by rdaq_status_string, which may take as value one of the following:
    • 0: 000 : unknown
    • 1: 001 : Submitted
    • 2: 010 : Processed
    • 3: 011 : QADone   
    • 4: 100 : Skipped 
    • 5         : SCalib (used for calibration)
    • 6         : FCalib (used for Fast calibration)
    • 111     : Marked bad
    • 666     : Died
    Ideally, those codes are internal or for external use (QA) and may be changed. In particular, the mask model has not been followed for statuses > 3 (calibration may set bits in the upper range later on).
  • Other tables are dictionary tables allowing for fast list building and indirect search. They are self-maintained by the system and should not be tampered with unless you really (really) know what you are doing. They are:
    • FOChains             - this will hold a historical list of chains used in the process
    • FODetectorTypes - this will hold all detectors found and build masks
    • FOFileType           - will hold the kind of files, i.e. physics, pedestal, etc ...
    • FOLocations         - this table will tell you where the files are processed
    • FOMessages        - this table will hold all of the system's internal messages broadcast via rdaq_set_message
    • FOruns                 - list of run numbers for the year
    • FOTriggerBits       - list of individual triggers contained in a run. Masks will be built from this information.
    • FOTriggerSetup    - list of run configuration trigger setups
    • DAQInfo               - this table will contain a copy of all information from the RunLog

 

bfcca

Because the CRS queue system is only a job-description based system, an extraneous script needs to be used in order to handle the running pass of root4star. There are many bfcXXX scripts around and we tried to make this one as general as possible.
The help was moved here: bfcca.