dongx's blog

To better understand how the choice of variable type for the NSigma variables affects precision and file size, I ran a test production with several different setups on a single file.

DEV version 20180802

I modified the NSigma definitions to use each of the following four types. The change was made only to the four NSigma variables in PicoTrack.

DEV: Float16_t // [-31,31,16]
DEV_Dmitri: Float16_t // [0,0,8]
DEV_Short: Short_t // *100
DEV_Float: Float_t
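
To get a feel for the precision each option implies, the storage schemes listed above can be emulated roughly in Python. This is only a sketch under assumptions: the function names are made up for illustration, the ranged Float16_t is modeled as a linear mapping of [-31, 31] onto 16-bit integer codes, the [0,0,8] variant as a float whose mantissa is cut to 8 bits (plain truncation here, which may differ from ROOT's exact rounding), and the Short_t option as round-to-nearest of 100*x (the rounding mode in the actual PicoTrack code is an assumption).

```python
import struct


def quantize_short(x, scale=100):
    """Emulate 'Short_t // *100': keep round(x * scale) in a 16-bit integer.

    Rounding to nearest is an assumption made for this sketch.
    """
    code = int(round(x * scale))
    code = max(-32768, min(32767, code))  # clamp to the Short_t range
    return code / scale


def quantize_range(x, xmin=-31.0, xmax=31.0, nbits=16):
    """Emulate a ranged Float16_t [xmin,xmax,nbits] as a linear integer code."""
    factor = (1 << nbits) / (xmax - xmin)
    x = max(xmin, min(xmax, x))  # values outside the range saturate
    code = int(0.5 + factor * (x - xmin))
    return xmin + code / factor


def quantize_mantissa(x, nbits=8):
    """Emulate Float16_t [0,0,nbits]: keep only nbits of the 23 mantissa bits.

    Plain truncation toward zero; a simplification for illustration.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = ~((1 << (23 - nbits)) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


# Illustrative values only, not taken from the test production.
for value in (-2.3456, 0.0123, 1.2345, 7.8912):
    print(
        f"{value:+.4f}  range16={quantize_range(value):+.5f}  "
        f"mant8={quantize_mantissa(value):+.5f}  "
        f"short={quantize_short(value):+.2f}"
    )
```

In this model the ranged Float16_t has a fixed step of 62/65536 (about 0.001), the Short_t option a fixed step of 0.01, and the 8-bit mantissa a relative precision of about 2^-8, so its absolute step grows with |NSigma|.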

Here is the output file size comparison from different setups.

rcas6006] ~/pwg/pico/20180802/> ls -l */*.root
-rw-r--r-- 1 dongx rhstar 11999988 Aug 2 12:32 DEV_Dmitri/st_physics_12130084_raw_5020002.picoDst.root
-rw-r--r-- 1 dongx rhstar 12375213 Aug 2 12:50 DEV_Float/st_physics_12130084_raw_5020002.picoDst.root
-rw-r--r-- 1 dongx rhstar 11910779 Aug 2 12:29 DEV_Short/st_physics_12130084_raw_5020002.picoDst.root
-rw-r--r-- 1 dongx rhstar 12158187 Aug 2 12:12 DEV/st_physics_12130084_raw_5020002.picoDst.root

DEV_Short gives the smallest file. DEV (Float16) is about 2% larger, and DEV_Dmitri is in between (under 1% larger than DEV_Short), as Dmitri predicted. DEV_Float is the largest, but only ~4% larger than DEV_Short. So the question is whether this is really worth the effort, or whether we should just save Float so we don't need to worry about the precision.
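
The percentages quoted here can be cross-checked directly from the file sizes in the listing. A minimal sketch, with the byte counts copied from the `ls -l` output above (the variable and function names are just for illustration):

```python
# File sizes in bytes, copied from the ls -l listing.
sizes = {
    "DEV_Dmitri": 11999988,
    "DEV_Float": 12375213,
    "DEV_Short": 11910779,
    "DEV": 12158187,
}


def percent_larger(size, reference):
    """Relative size increase of `size` over `reference`, in percent."""
    return 100.0 * (size - reference) / reference


# Every setup compared with the smallest file (DEV_Short).
relative_to_short = {
    name: percent_larger(size, sizes["DEV_Short"]) for name, size in sizes.items()
}

# Saving from switching Float -> Float16 (DEV), relative to the Float file.
saving_float_to_dev = (
    100.0 * (sizes["DEV_Float"] - sizes["DEV"]) / sizes["DEV_Float"]
)

for name in sorted(sizes, key=sizes.get):
    print(f"{name:<11s} {sizes[name]:>9d} bytes  +{relative_to_short[name]:.2f}% vs DEV_Short")
print(f"Float -> Float16 (DEV) saving: {saving_float_to_dev:.2f}%")
```

With these numbers DEV_Dmitri comes out about 0.75% above DEV_Short, DEV about 2.1%, and DEV_Float about 3.9%, while going from DEV_Float to DEV saves roughly 1.8% of the file size.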

I compared the NSigmaElectron distributions with two different choices of histogram binning:
A - (400,-10,10)
B - (289,-13,13) - suggested earlier by Kunsu, from his own setup, to avoid spikes for Short-type variables.

Here is the comparison of the NSigmaElectron distributions.
From left to right are the four setups: DEV, DEV_Dmitri, DEV_Short, DEV_Float.
The top row uses binning A and the bottom row binning B.

I took the ratio of each one w.r.t. the Float version (presumably the baseline). Error bars are calculated from a simple error propagation assuming independent histograms (probably not quite correct, since all setups were run on the same file, but I don't have a better assessment at this moment).
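
For reference, the simple error propagation for a bin-by-bin ratio can be sketched as follows. This assumes Poisson errors (sqrt(N)) on each bin and independent histograms; the function name and the bin contents are made up for illustration and are not the actual histogram data.

```python
import math


def ratio_with_error(n_test, n_base):
    """Ratio n_test / n_base with its uncertainty for independent Poisson counts.

    sigma_r = r * sqrt(1/n_test + 1/n_base). Returns None for empty bins.
    """
    if n_test <= 0 or n_base <= 0:
        return None
    r = n_test / n_base
    sigma = r * math.sqrt(1.0 / n_test + 1.0 / n_base)
    return r, sigma


# Hypothetical bin contents: one test setup and the Float baseline.
test_counts = [95, 410, 1020, 398, 0]
base_counts = [100, 400, 1000, 400, 0]

ratios = [ratio_with_error(a, b) for a, b in zip(test_counts, base_counts)]
for i, result in enumerate(ratios):
    if result is None:
        print(f"bin {i}: empty, skipped")
    else:
        r, sigma = result
        print(f"bin {i}: ratio = {r:.3f} +/- {sigma:.3f}")
```

Because the same tracks fill both histograms, the two counts in each bin are strongly correlated, so error bars computed this way likely overestimate the real uncertainty on the ratio.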

Conclusions:
- DEV and DEV_Short (with a good choice of binning) seem to preserve good precision w.r.t. the baseline.
- DEV_Dmitri still seems to show fluctuations with both binning choices investigated here.

The current DEV version looks OK for the NSigma variables. But the question is also whether it is worth it, considering that the size reduction from Float -> Float16 in the current picoDst is only 2%. It may be better to just keep Float to avoid any potential precision loss that could show up in various analyses.


The STAR experiment


PicoDst Test - 20180802