scheduler tricks
Updated on Thu, 2012-06-21 14:50. Originally created by balewski on 2009-04-07 10:43.
- To see full details of the job queue, do:
- condor_q -long -submitter balewski
- To see all machines available to STAR, do:
- condor_status -constraint 'Turn_Off==False'
- STAR queues limits:
+---------------------------------------------Queues-----------------------------------------------+
+-------------------------+------+--------+-----------+--------+-------+--------+------------------+
| ID                      | Name | Type   | TimeLimit | MaxMem | Local | S.O.P. | Cluster          |
+-------------------------+------+--------+-----------+--------+-------+--------+------------------+
| bnl_condor_short_quick  |      | CONDOR | 180min    | 350MB  | N     | 1      | rcrs.rcf.bnl.gov |
| BNL_condor_medium_quick |      | CONDOR | 300min    | 400MB  | N     | 50     | rcrs.rcf.bnl.gov |
| bnl_condor_long_quick   |      | CONDOR | 14400min  | 440MB  | N     | 100    | rcrs.rcf.bnl.gov |
+-------------------------+------+--------+-----------+--------+-------+--------+------------------+
- Condor priority can be checked with condor_userprio:
% condor_userprio -allusers | grep "balewski"
balewski@bnl.gov 179.52
With no activity the value is 5, and it grows from there.
- Running jobs fired from different rcas6nnn machines:
- condor_q -submitter balewski | grep "running" ; date
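The grep above relies on the summary line that condor_q prints at the end of its listing. A minimal sketch, assuming that line contains a fragment like `7 running` (the exact layout can differ between Condor versions), pulls out just the number so it can be logged together with the date. The function name `running_count` is hypothetical, not part of Condor.

```shell
#!/bin/sh
# running_count: read condor_q output on stdin and print the number of
# running jobs found in the summary line.
# Assumption: the summary contains "<number> running"; prints 0 if not found.
running_count() {
    awk '{
        # look for a field starting with "running" preceded by a number
        for (i = 2; i <= NF; i++)
            if ($i ~ /^running/ && $(i-1) ~ /^[0-9]+$/)
                n = $(i-1)
    }
    END { print n + 0 }'
}

# Possible usage on an interactive node:
#   condor_q -submitter balewski | running_count ; date
```

Parsing stdin rather than calling condor_q inside the function keeps it usable on saved listings as well.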
- You can remove all the jobs by using
% condor_vacate -fast
wait a few minutes, then
% condor_rm $USER
Alternatively, use the
-pool condor02.rcf.bnl.gov:9664
qualifier to be sure to remove jobs even if they were not
submitted from the node where you are now.
- Tell the scheduler to split files: (PDF)
- <job datasetSplitting="eventBased" maxEvents="2000">
- root4star -q -b userMacro.C\($EVENTS_START, $EVENTS_STOP,\"$FILELIST\"\)
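With event-based splitting each job gets its own $EVENTS_START and $EVENTS_STOP. The sketch below only illustrates the arithmetic of cutting a sample into chunks of at most maxEvents events. It assumes 0-based, inclusive ranges, which may differ from what the scheduler actually passes to the macro, and `event_ranges` is a hypothetical helper, not part of SUMS.

```shell
#!/bin/sh
# event_ranges TOTAL MAX: print one "start stop" pair per chunk.
# Assumption: ranges are 0-based and inclusive; the scheduler's own
# convention for $EVENTS_START/$EVENTS_STOP may be different.
event_ranges() {
    total=$1
    max=$2
    start=0
    while [ "$start" -lt "$total" ]; do
        stop=$((start + max - 1))
        # the last chunk may be shorter than MAX
        if [ "$stop" -ge "$total" ]; then
            stop=$((total - 1))
        fi
        echo "$start $stop"
        start=$((stop + 1))
    done
}

# Possible usage: event_ranges 5000 2000
```

For 5000 events and maxEvents="2000" this gives three chunks, the last one holding the remaining 1000 events.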
- Job priority, by Leve:
This is not really a SUMS issue, it’s sofi. It would depend on a few things:
1) The resource requirements of the jobs.
2) Your condor priority, which can be checked with condor_userprio.
3) The queue the jobs are in.
4) If the queue gets really long, it will only check a section of the jobs at once to try to match them to nodes, so in condor it will depend on which section it is matching. This is why we fear users submitting really large numbers of jobs.
5) The priority of the jobs relative to each other. This can force jobs to start in the order you want; it can be set in SUMS or in the .condor file. This will not affect overall priority.
<ResourceUsage> <Priority>n</Priority> </ResourceUsage>
See: http://www.star.bnl.gov/public/comp/Grid/scheduler/manual.htm
See Section: 3.4 Resource requirements: element and sub-elements