FastOffline crashes on 2013-03-21
Just a documentation of the lifetimes of jobs that crashed in FastOffline on March 21, 2013.
Using the production logs, I took a quick look at the end time (vertical axis) versus the start time (horizontal axis) of jobs that ended on March 21. Red/magenta indicates that the job crashed, and blue/cyan that it did not. The files are only st_physics (blue/red) or st_physics_adc (cyan/magenta). Negative values on the horizontal axis are hours before midnight; otherwise the numbers are the hour in Eastern time (e.g. 6 = 6am, and 5.2 = 5:12am). The second plot zooms in on the jobs that started just after 5am but ended quickly.
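As a small illustration of the axis convention, here is a sketch of the hour conversion in plain C++. The function name plot_hour and its wrap_after parameter are made up for this example; the default threshold of 13 is taken from the sh>13 term in the plotting expressions, and the rest is just my reading of the convention described above.

```cpp
#include <cassert>

// Convert a clock time to the decimal-hour value used on the plot axes.
// Hours later than `wrap_after` are treated as belonging to the previous
// day and shifted by -24, so that 23:30 becomes -0.5.
// (plot_hour and wrap_after are hypothetical names for this sketch.)
double plot_hour(int hour, int minute, int wrap_after = 13) {
    assert(hour >= 0 && hour < 24 && minute >= 0 && minute < 60);
    double t = hour + minute / 60.0;   // e.g. 5:12 -> 5.2
    if (hour > wrap_after) {
        t -= 24.0;                     // evening hours go negative
    }
    return t;
}
```

With this, plot_hour(6, 0) gives 6, and a job that started at 23:30 the previous evening lands at -0.5 on the horizontal axis.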
Observations:
- All jobs which started before 2am and finished after ~5:12am crashed.
- Many jobs tried to start around 12:30am, and they all crashed, but took a long time to do so (longer than the typical run time for such non-adc or adc jobs).
- There are no log files (yet?) for jobs that started between ~12:30am and ~5:12am.
- Many jobs that started just after ~5:12am crashed, in two classes: quickly, or (again) longer than it typically took to run such jobs.
- Jobs that started after ~5:18am and have log files (i.e. have finished) all finished successfully. So far these are only adc jobs, because they finish more quickly than the non-adc jobs.
_____________________
Code (there are probably better ways to do this, but this is what I was able to do quickly):
Obtain time stamps:
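# Log files last modified on Mar 21 (colrm strips the ls -l columns before the path)
set ffs = `/bin/ls -l /star/rcf/prodlog/dev/log/daq/st_ph*.log.gz | grep "Mar 21" | colrm 1 50`
touch res
touch files
foreach ff ($ffs)
  # Number of lines mentioning "break" (non-zero means the job crashed)
  set brks = `zgrep -c -i break $ff`
  echo $ff >> files
  # Start time: time stamp on the first "Mar 2" line in the log
  set btime = `zgrep "Mar 2" $ff | head -1 | awk '{print $8}'`
  # End time: modification time of the log file
  set etime = `/bin/ls -l $ff | awk '{print $8}'`
  # 1 for st_physics_adc files, 0 otherwise
  set adc = `echo $ff | grep -c adc`
  echo $brks $adc $btime $etime >> res
end
# Split the time stamps into separate columns
sed -i 's/\:/ /g' res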
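For reference, here is a sketch of how one row of the resulting res file could be read back in plain C++. It assumes each row holds seven whitespace-separated integers (crash count, adc flag, start hour/minute/second, end hour/minute), which is what the column list in the plotting code suggests; the JobRecord struct and the parse_row and is_crash helpers are hypothetical names, not part of any STAR or ROOT library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One row of `res` after the sed step (hypothetical struct for this sketch).
struct JobRecord {
    int breaks = 0;              // number of log lines matching "break"
    int adc = 0;                 // 1 if the file name contains "adc", else 0
    int sh = 0, sm = 0, ss = 0;  // start hour, minute, second
    int eh = 0, em = 0;          // end hour, minute
};

// Parse one whitespace-separated row.
// Returns false if any of the seven fields is missing or not an integer.
bool parse_row(const std::string& line, JobRecord& job) {
    std::istringstream in(line);
    return static_cast<bool>(in >> job.breaks >> job.adc
                                >> job.sh >> job.sm >> job.ss
                                >> job.eh >> job.em);
}

// A job counts as crashed if its log mentions "break" at least once.
bool is_crash(const JobRecord& job) {
    assert(job.breaks >= 0);     // a grep -c count is never negative
    return job.breaks > 0;
}
```

Rows with fewer than seven fields (for example, a log with no matching time stamp line) are rejected rather than silently shifted into the wrong columns.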
Make plots:
TNtuple tt("tt","tt","break:adc:sh:sm:ss:eh:em");
tt.ReadFile("res");
tt.SetMarkerStyle(8);
gStyle->SetGridColor(kGray);
// Jobs that started within an hour of 5am and ended before 6am
TCut late = "abs(sh+(sm/60.)-24*(sh>14)-5)<1&&eh+(em/60.)<6";
// The cut variable is named "crashed" because break is a reserved C++ keyword
TCut crashed = "break>0";
TCut adc = "adc>0";
// First plot: all jobs (blue = st_physics, cyan = adc, red/magenta = crashed)
tt.SetMarkerColor(4); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)");
tt.SetMarkerColor(7); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",adc,"same");
tt.SetMarkerColor(2); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",crashed&&!adc,"same");
tt.SetMarkerColor(6); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",crashed&&adc,"same");
// Second plot: zoom on the "late" jobs; the axis range is fixed by the first Draw
tt.SetMarkerColor(4); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",late);
tt.SetMarkerColor(7); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",adc,"same");
tt.SetMarkerColor(2); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",crashed&&!adc,"same");
tt.SetMarkerColor(6); tt.Draw("eh+(em/60.):sh+(sm/60.)-24*(sh>13)",crashed&&adc,"same");
-Gene