scheduler tricks

  1.  To see full details of the job queue do:
    • condor_q -long -submitter balewski
  2. To see all machines available to STAR do
    • condor_status -constraint 'Turn_Off==False' 
  3. STAR queue limits:
    +---------------------------------------------Queues-----------------------------------------------+
    +-------------------------+------+--------+-----------+--------+-------+--------+------------------+
    | ID                      | Name | Type   | TimeLimit | MaxMem | Local | S.O.P. | Cluster          |
    +-------------------------+------+--------+-----------+--------+-------+--------+------------------+
    | bnl_condor_short_quick  |      | CONDOR | 180min    | 350MB  | N     | 1      | rcrs.rcf.bnl.gov |
    | BNL_condor_medium_quick |      | CONDOR | 300min    | 400MB  | N     | 50     | rcrs.rcf.bnl.gov |
    | bnl_condor_long_quick   |      | CONDOR | 14400min  | 440MB  | N     | 100    | rcrs.rcf.bnl.gov |
    +-------------------------+------+--------+-----------+--------+-------+--------+------------------+
  4. Condor priority can be checked with condor_userprio

    % condor_userprio -allusers | grep "balewski"
    balewski@bnl.gov                       179.52

    With no activity your priority value is 5, and it grows from there...
  5. To see running jobs submitted from different rcas6nnn machines: 
    • condor_q -submitter balewski | grep "running" ; date
  6. You can remove all the jobs by using
    % condor_vacate -fast
    wait a few minutes
    % condor_rm $USER

    Alternatively, use the
    -pool condor02.rcf.bnl.gov:9664
    qualifier to be sure to remove jobs even if they were not
    submitted from the node where you are now.
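    The steps above can be combined into one recipe (a sketch: the pool
    address is the one quoted above, and the two-minute wait is an
    arbitrary choice standing in for "wait a few minutes"):

    % condor_vacate -fast
    % sleep 120      # give the jobs a few minutes to exit
    % condor_rm -pool condor02.rcf.bnl.gov:9664 $USER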
  7. To tell the scheduler to split files:
    1. <job datasetSplitting="eventBased" maxEvents="2000">

    2. root4star -q -b userMacro.C\($EVENTS_START,$EVENTS_STOP,\"$FILELIST\"\)

    (Note: there must be no spaces inside the escaped parentheses, or the
    shell will split the macro call into separate arguments.)
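    Putting the two pieces together, a minimal SUMS job description might
    look like the sketch below. Only the datasetSplitting/maxEvents
    attributes and the $EVENTS_START/$EVENTS_STOP/$FILELIST variables come
    from the example above; the catalog query, log path, and nFiles value
    are placeholders you would replace with your own:

    <?xml version="1.0" encoding="utf-8" ?>
    <job datasetSplitting="eventBased" maxEvents="2000">
      <command>root4star -q -b userMacro.C\($EVENTS_START,$EVENTS_STOP,\"$FILELIST\"\)</command>
      <stdout URL="file:/star/u/$USER/log/$JOBID.log" />
      <input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage=local" nFiles="100" />
    </job>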
  8. Job priority, by Leve:
    This is not really a SUMS issue, it's SOFI. It would depend on a few things:
    1) The resource requirements of the jobs.
    2) Your Condor user priority, which can be checked with condor_userprio.
    3) The queue the jobs are in.
    4) Also, if the queue gets very long, Condor will only check a section 
    of the jobs at once to try to match them to nodes, so it will depend 
    on which section it's matching. This is why we fear users submitting 
    very large numbers of jobs.
    5) The priority of the jobs relative to each other; this can force jobs 
    to start in the order you want. It can be set in SUMS or in the .condor 
    file. This will not affect overall priority.
    
    <ResourceUsage>
      <Priority>n</Priority>
    </ResourceUsage>

    See: http://www.star.bnl.gov/public/comp/Grid/scheduler/manual.htm
    See Section 3.4, Resource requirements: element and sub-elements
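    For point 5, the relative ordering can also be set directly in the 
    .condor submit file with the priority command (a sketch; the executable 
    name is a placeholder):

    universe   = vanilla
    executable = runMacro.csh
    priority   = 10
    queue

    Higher priority values are matched first among your own jobs only; this 
    does not change your standing against other users, which is governed by 
    the condor_userprio value above.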