SGE Job Manager patch
Updated on Tue, 2006-12-26 18:15. Originally created by stargrid on 2005-09-12 15:42.
We should use this page to draft what we want to send
to the VDT guys about the SGE Job Manager.
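The snippet of code quoted below from line 659 is missing a statement for the standard error. The end of that snippet, as it stands now and as it should read, is given after the list of problems.
Additionally, if deployed in a CHOS environment, the job manager should be
modified with additions at line 567; these are the last snippet on this page.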
- Missing environment variable definitions
  - In the BEGIN section, check that $SGE_ROOT, $SGE_CELL and the commands ($qsub, $qstat, etc.) are defined properly.
  - In the SUBMIT, POLL and CLEAR sections, locate the line
        $ENV{"SGE_ROOT"} = $SGE_ROOT;
    and add the line
        $ENV{"SGE_CELL"} = $SGE_CELL;
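The check suggested for the BEGIN section could look roughly like the sketch below. The helper name `check_sge_setup` and the idea of returning a list of problems are our own assumptions for illustration; nothing like it exists in the job manager, and the paths in the example call are made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the job manager): returns a list of
# problems found in the SGE settings; an empty list means nothing was found.
sub check_sge_setup {
    my ($sge_root, $sge_cell, %commands) = @_;
    my @problems;

    my $have_root = defined($sge_root) && length($sge_root);
    my $have_cell = defined($sge_cell) && length($sge_cell);

    push @problems, "SGE_ROOT is not set" unless $have_root;
    push @problems, "SGE_CELL is not set" unless $have_cell;
    push @problems, "SGE_ROOT '$sge_root' is not a directory"
        if $have_root && !-d $sge_root;

    # Each command (qsub, qstat, ...) must be a defined, executable path.
    for my $name (sort keys %commands) {
        my $path = $commands{$name};
        if (!defined($path) || !length($path)) {
            push @problems, "command $name is not defined";
        }
        elsif (!-x $path) {
            push @problems, "command $name ('$path') is not executable";
        }
    }
    return @problems;
}

# Example call with made-up values; in the job manager the arguments would
# be $SGE_ROOT, $SGE_CELL, qsub => $qsub, qstat => $qstat, and so on.
my @problems = check_sge_setup('/nonexistent/sge', '',
                               qsub => '/nonexistent/qsub');
print "$_\n" for @problems;
```

Collecting all problems before reporting them means a misconfigured site sees every missing setting at once rather than one per attempt.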
- Bug finding the correct job id when clearing jobs
  - In the CLEAR section, locate the line
        system("$qdel $job_id > /dev/null 2> /dev/null");
    and replace it with the following block:
        $ENV{"SGE_ROOT"} = $SGE_ROOT;
        $ENV{"SGE_CELL"} = $SGE_CELL;
        $job_id =~ /(.*)\|(.*)\|(.*)/;
        $job_id = $1;
        system("$qdel $job_id > /dev/null 2> /dev/null");
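One thing to watch in the replacement block for CLEAR: if $job_id does not contain the two '|' separators, the match fails and $1 does not hold the id. A guarded variant might look like the sketch below. The helper name `extract_sge_job_id` and the sample ids are assumptions for illustration only; the pattern is the one used in the block.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first '|'-separated field of the id the
# job manager hands us, or the id unchanged when it has no such fields.
sub extract_sge_job_id {
    my ($job_id) = @_;
    # Same pattern as in the replacement block for the CLEAR section.
    if ($job_id =~ /(.*)\|(.*)\|(.*)/) {
        return $1;
    }
    # No separators found: assume this is already a bare SGE job id.
    return $job_id;
}

# Made-up ids, only to show both branches.
print extract_sge_job_id('4711|field2|field3'), "\n";   # prints 4711
print extract_sge_job_id('4711'), "\n";                 # prints 4711
```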
- The SGE Job Manager modifies both the standard output and the standard error file names by appending .real. This fails when a user specifies /dev/null for either of those files. The problem occurs twice: once starting at line 318
      #####
      # Where to write output and error?
      #
      if(($description->jobtype() eq "single") && ($description->count() > 1))
      {
          #####
          # It's a single job and we use job arrays
          #
          $sge_job_script->print("#\$ -o " . $description->stdout() . ".\$TASK_ID\n");
          $sge_job_script->print("#\$ -e " . $description->stderr() . ".\$TASK_ID\n");
      }
      else
      {
          # [dwm] Don't use real output paths; copy the output there later.
          #       Globus doesn't seem to handle streaming of the output
          #       properly and can result in the output being lost.
          # FIXME: We would prefer continuous streaming. Try to determine
          #       precisely what's failing so that we can fix the problem.
          #       See Globus bug #1288.
          $sge_job_script->print("#\$ -o " . $description->stdout() . ".real\n");
          $sge_job_script->print("#\$ -e " . $description->stderr() . ".real\n");
      }
  and then again at line 659:
      if(($description->jobtype() eq "single") && ($description->count() > 1))
      #####
      # Jobtype is single and count>1. Therefore, we used job arrays. We
      # need to merge individual output/error files into one.
      #
      {
          # [dwm] Use append, not overwrite to work around file streaming issues.
          system ("$cat $job_out.* >> $job_out");
          system ("$cat $job_err.* >> $job_err");
      }
      else
      {
          # [dwm] We still need to append the job output to the GASS cache file.
          #       We can't let SGE do this directly because it appears to
          #       *overwrite* the file, not append to it -- which the Globus
          #       file streaming components don't seem to handle properly.
          #       So append the output manually now.
          system("$cat $job_out.real >> $job_out");
      }
  At the end of the snippet from line 659, instead of:
          #       So append the output manually now.
          system("$cat $job_out.real >> $job_out");
      }
  it should read:
          #       So append the output manually now.
          system("$cat $job_out.real >> $job_out");
          system("$cat $job_err.real >> $job_err");
      }
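Adding the missing stderr statement does not by itself address the /dev/null case described above. One possible way to handle it, sketched here under our own assumptions, is to leave /dev/null untouched when writing the `-o` and `-e` lines and to skip the append step for it. The helper names `sge_target_path` and `needs_append` are hypothetical and do not exist in the job manager.

```perl
use strict;
use warnings;

# Hypothetical helper: the path SGE should be told to write to for a
# requested stdout or stderr file.
sub sge_target_path {
    my ($requested) = @_;
    # Appending ".real" to /dev/null would give /dev/null.real, so pass
    # /dev/null through unchanged.
    return $requested if $requested eq '/dev/null';
    return "$requested.real";
}

# Hypothetical helper: true when the ".real" file has to be appended to
# the requested file once the job has finished.
sub needs_append {
    my ($requested) = @_;
    return $requested ne '/dev/null';
}

# Made-up paths, only to show both cases.
for my $path ('/tmp/job.out', '/dev/null') {
    printf "%s -> %s (append afterwards: %s)\n",
        $path, sge_target_path($path), needs_append($path) ? 'yes' : 'no';
}
# prints:
#   /tmp/job.out -> /tmp/job.out.real (append afterwards: yes)
#   /dev/null -> /dev/null (append afterwards: no)
```

In the job manager, the first helper would apply where the `-o` and `-e` lines are printed (line 318) and the second where the .real files are appended (line 659).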
For the CHOS environment, the additions at line 567:
    $ENV{"SGE_ROOT"} = $SGE_ROOT;
    if ( -r "$ENV{HOME}/.chos" ){
        $chos=`cat $ENV{HOME}/.chos`;
        $chos=~s/\n.*//;
        $ENV{CHOS}=$chos;
    }
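As a possible variation on the CHOS addition, the .chos file can be read with Perl's own `open` instead of shelling out to cat. The sketch below is an assumption on our side, not what the job manager does: the helper name `read_chos` is hypothetical, and it takes only the first line of the file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line of $home/.chos without its
# trailing newline, or undef if the file is missing, unreadable or empty.
sub read_chos {
    my ($home) = @_;
    my $file = "$home/.chos";
    return unless -r $file;
    open(my $fh, '<', $file) or return;
    my $chos = <$fh>;
    close($fh);
    return unless defined $chos;
    chomp $chos;
    return $chos;
}

# Set CHOS only when a value was actually found.
my $home = defined $ENV{HOME} ? $ENV{HOME} : '';
my $chos = read_chos($home);
$ENV{CHOS} = $chos if defined $chos;
```

Leaving $ENV{CHOS} unset when there is no readable file keeps the behaviour the same as the original `-r` test.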