Investigation CPUtime > RunTime


Lidia reported on April 1st that some jobs seem to have CPUTime > RunTime, throwing off balance our efficiency calculation.

Upon check, all jobs displaying this characteristsics did run on (only one node). An example provided follows

Job name CPU/events RealTime/events Time run
AuAu39_production_ReversedFullField_P10ik_st_mtd_11110011_raw_4170001 22.34 14.17 2011-04-02 02:50:00
AuAu39_production_ReversedFullField_P10ik_st_physics_11110028_raw_4040001 11.06 7.31 2011-04-03 12:21:00


Measurements and notes


The below table shows the bench mark measurements for the reference node (rcrs133) and a normally behaving ode (rcrs134). Both are Intel(R) Xeon(R) CPU X5560 @ 2.80 GHz. Bench mark were based on a 10 passes over a suite similar than gcc 4.3.2 compared to gcc 4.5. and averaged below

    Composite FFT SOR MC SparseMult LU Average 692.63 632.01 778.91 215.98 746.08 1090.19 Average 457.73 422.05 506.47 152.02 486.61 721.52
  Differences 33.91% 33.22% 34.98% 29.61% 34.78% 33.82%

The defective node is between 33-34% less efficient than a normal node but this may be due to a clock problem (if CPUTime is systematically greater than RunTime, marks will be skewed toward lower values).

In the above, our mark program was NOT muli-threaded.

Note on the executable

To the extent our application would contain ANY multi-threading, it is possible to have CPUTime (usually including the sum of all times used across all CPUs) greater than RealTime. To the extreme case where the workload spread evenly on two CPus for example, CPUTime wuld be exactly RealTime x 2.

 ldd /afs/ =>  (0xffffe000) => /afs/ (0xf776f000) => /afs/ (0xf6fca000) => /afs/ (0xf6ddf000) => /afs/ (0xf6ccb000) => /afs/ (0xf68c6000) => /afs/ (0xf66ce000) => /afs/ (0xf65bb000) => /afs/ (0xf64e1000) => /afs/ (0xf62e1000) => /afs/ (0xf62b2000) => /afs/ (0xf6262000) => /afs/ (0xf6020000) => /afs/ (0xf5f93000) => /afs/ (0xf5dc1000) => /afs/ (0xf5d76000) => /afs/ (0xf5a35000) => /afs/ (0xf58ef000) => /usr/lib/ (0xf57d6000) => /lib/ (0xf57bd000) => /lib/ (0xf578a000) => /lib/ (0xf5785000) => /usr/lib/ (0xf56cb000) => /usr/lib/ (0xf5674000) => /usr/lib/ (0xf5663000) => /lib/ (0xf565a000) => /usr/lib/ (0xf5570000) => /lib/ (0xf5547000) => /lib/ (0xf553b000) => /lib/ (0xf53e2000) => /lib/ (0xf53c9000) => /usr/lib/ (0xf53c5000) => /usr/lib/ (0xf53bf000)
        /lib/ (0xf7f01000) => /usr/lib/ (0xf53b6000) => /usr/lib/ (0xf539c000)

The way times are calculated could be found in this article for example (where an explaination of this issue also arises). See also the CPU Time article in Wikipedia (while not accademic, its simple and comprehensive explaination provides a good place to start).

Clock timing

Using the time system function on a sample executable (spending time calculating prime numbers greater than an initial value / no IO involved) BUT single thread application, I observed the following:

  • rcrs6134 (reference): always returns %tages with 100%
  • rcrs6133 returned twice in 10 tests %tages > 100% (once 200.0% and once 198.4%)
  • I could create the > 100% results at least half of the time if, before a a 'time', I would use 'sync' (stat over 10 tries only / perhaps coincidence)

A kernel and/or IO problem has to exist on this node (results > 100% are not possible for single threaded processes).

Sample results from rcrs6133:

0.132u 0.002s 0:00.13 100.0%    0+0k 0+0io 0pf+0w
27.983u 0.045s 0:14.01 200.0%   0+0k 0+0io 0pf+0w
1.877u 0.002s 0:01.91 97.9%     0+0k 0+0io 0pf+0w
19.045u 0.005s 0:19.07 99.8%    0+0k 0+0io 0pf+0w
20.267u 0.000s 0:20.28 99.9%    0+0k 0+0io 0pf+0w
20.425u 0.000s 0:20.45 99.8%    0+0k 0+0io 0pf+0w
20.401u 0.000s 0:20.41 99.9%    0+0k 0+0io 0pf+0w
etc ...
Using "sync" in between
40.580u 0.028s 0:20.46 198.4%   0+0k 0+0io 0pf+0w
19.884u 0.000s 0:19.90 99.8%    0+0k 0+0io 0pf+0w
19.002u 0.000s 0:19.01 99.9%    0+0k 0+0io 0pf+0w
36.679u 0.929s 0:20.10 187.0%   0+0k 0+0io 0pf+0w
19.709u 0.000s 0:19.71 99.9%    0+0k 0+0io 0pf+0w
39.912u 0.090s 0:20.05 199.4%   0+0k 0+0io 0pf+0w
etc ...