- jwebb's home page
AgML Checksum Nightly Tests
04/17/2014
Lidia deployed a nightly checksum of the ROOT geometry in the nightly tests. The resulting checksums, and the number of volumes which match the template checksums, are recorded in the AgML_checksum table of the LibraryJobs database.
04/18/2014
Checksums were evaluated after last night's autobuild. Comparing the checksums in the database shows errors: there are clear differences between the 04/17 and 04/18 checksums. Jerome suggested looking at the configuration, and indeed we see that the checksum changes depending on whether we run the tests at 64 bit vs 32 bit, and optimized vs non-optimized. So is this a real problem, or just machine precision / roundoff error? To investigate...
Create three geometries (y2014a) under three different conditions: optimized vs non-optimized at 32 bit, and non-optimized at 64 bit. Then evaluate the checksum under each of the three conditions, using each of the three geometries as input (9 values in total).

tag=y2014a-dbg-32b
  1) Run checksum in DEV, non optimized, 32 bits   HALL: 0521e4659c57773fdf15e554bdd5b6a9
  2) Run checksum in DEV, non optimized, 64 bits   HALL: f33eaae5cd1b645726fe0924ddf2f2c7
  3) Run checksum in DEV, optimized, 32 bits       HALL: 0521e4659c57773fdf15e554bdd5b6a9

tag=y2014a-opt-32b
  4) Run checksum in DEV, non optimized, 32 bits   HALL: bb586bb43e452c825aa16cb95f7684e3
  5) Run checksum in DEV, non optimized, 64 bits   HALL: 6ac0a6be950c16e98cdd4f2b4e1fc7ec
  6) Run checksum in DEV, optimized, 32 bits       HALL: bb586bb43e452c825aa16cb95f7684e3

tag=y2014a-dbg-64b
  7) Run checksum in DEV, non optimized, 32 bits   HALL: cc029dc9b3fbad8982195e8fe85f0c3e
  8) Run checksum in DEV, non optimized, 64 bits   HALL: 79ee228c82c31f7cc562f5f1afc4fabb
  9) Run checksum in DEV, optimized, 32 bits       HALL: cc029dc9b3fbad8982195e8fe85f0c3e

What do we see?
  o AgML checksum evaluation does not depend on optimization at 32 bits... we see no change when the same geometry is input. Compare 1 and 3.
  o AgML checksum evaluation does depend on 32 bits vs 64 bits. For the same input file, we get a different checksum. Compare 2 and 3.
  o The geometry which is produced is bitwise different between optimized and non optimized code. Compare 1 and 4.
  o The geometry which is produced is bitwise different between 32 and 64 bit. Compare 1 and 7.
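The four observations can also be checked mechanically against the nine recorded values. This is only a bookkeeping sketch: the checksum strings are copied from the runs above, while the dictionary layout, the short configuration labels (`dbg-32b`, `dbg-64b`, `opt-32b`) and the helper names are shorthand invented for this illustration.

```python
# HALL checksums from the nine runs above, keyed by
# (geometry tag, configuration in which the checksum was evaluated).
# The configuration labels are shorthand for this sketch only.
HALL = {
    ("y2014a-dbg-32b", "dbg-32b"): "0521e4659c57773fdf15e554bdd5b6a9",  # run 1
    ("y2014a-dbg-32b", "dbg-64b"): "f33eaae5cd1b645726fe0924ddf2f2c7",  # run 2
    ("y2014a-dbg-32b", "opt-32b"): "0521e4659c57773fdf15e554bdd5b6a9",  # run 3
    ("y2014a-opt-32b", "dbg-32b"): "bb586bb43e452c825aa16cb95f7684e3",  # run 4
    ("y2014a-opt-32b", "dbg-64b"): "6ac0a6be950c16e98cdd4f2b4e1fc7ec",  # run 5
    ("y2014a-opt-32b", "opt-32b"): "bb586bb43e452c825aa16cb95f7684e3",  # run 6
    ("y2014a-dbg-64b", "dbg-32b"): "cc029dc9b3fbad8982195e8fe85f0c3e",  # run 7
    ("y2014a-dbg-64b", "dbg-64b"): "79ee228c82c31f7cc562f5f1afc4fabb",  # run 8
    ("y2014a-dbg-64b", "opt-32b"): "cc029dc9b3fbad8982195e8fe85f0c3e",  # run 9
}

GEOMETRIES = sorted({geometry for geometry, _ in HALL})
CONFIGS = sorted({config for _, config in HALL})


def evaluation_agrees(config_a, config_b):
    """True if both configurations give the same checksum for every geometry."""
    return all(HALL[(g, config_a)] == HALL[(g, config_b)] for g in GEOMETRIES)


def distinct_geometries(config):
    """Number of distinct checksums seen when one configuration reads all geometries."""
    return len({HALL[(g, config)] for g in GEOMETRIES})


print("32-bit debug vs 32-bit optimized evaluation agree:",
      evaluation_agrees("dbg-32b", "opt-32b"))
print("32-bit vs 64-bit evaluation agree:",
      evaluation_agrees("dbg-32b", "dbg-64b"))
for config in CONFIGS:
    print(config, "sees", distinct_geometries(config), "distinct geometries")
```

Laid out this way, the table separates two effects: how the checksum is evaluated (columns) and which build produced the geometry file (rows).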
04/18/2014
Let's restrict ourselves to the 32-bit optimized geometry vs the 32-bit debug geometry. Looking at the volume-by-volume checksums, we see that most checksums differ. In particular, the magnet shows differences in volume PAWT...

[Checksum Mismatch PAWT 2314db7fc691d8a8278b6b0fc0a8ebdb 062183dc2da4dc8c2cbc360311424782]

PAWT has no children, so it will be easy to see whether (and where) it has real differences. Checksums are evaluated recursively down the volume tree. A volume's checksum depends on the checksums and positions of its daughters, plus the checksums of its material, medium parameters and shape.

Comparing geometry created under debug vs nodebug...

PAWT     Geometry.y2014a-dbg-32b.root              Geometry.y2014a-opt-32b.root

Material parameters:
  A:     (const Double_t)1.43218804507728983e+01   (const Double_t)1.43218804507729001e+01
  Z:     (const Double_t)7.21671159149344277e+00   (const Double_t)7.21671159149344366e+00
  D:     (const Double_t)1.00000000000000000e+00   (const Double_t)1.00000000000000000e+00
  R:     (const Double_t)3.57585804664578362e+01   (const Double_t)3.57585804664578362e+01
  I:     (const Double_t)7.55166047475367748e+01   (const Double_t)7.55166047475367606e+01

Medium parameters:
         0                                         0
         1                                         1
         20                                        20
         20                                        20
         10                                        10
         0                                         0
         0.01                                      0.01
         ...

Shape (tube) parameters:
  Rmin:  (const Double_t)2.66607147216796875e+02   (const Double_t)2.66607147216796875e+02
  Rmax:  (const Double_t)2.68107147216796875e+02   (const Double_t)2.68107147216796875e+02
  Dz:    (const Double_t)7.50000000000000000e-01   (const Double_t)7.50000000000000000e-01

So this looks like an issue with materials.
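To make the recursion concrete, here is a minimal sketch of a checksum built the way it is described above: hash the material, medium and shape parameters, then fold in each daughter's position and checksum. Everything in it is an assumption for illustration: the `Volume` class is a stand-in rather than a ROOT class, the `MOTHER` volume is made up, and MD5 over raw double bytes is a guess prompted only by the 32-hex-digit checksums, not the confirmed AgML implementation. The PAWT numbers are the ones printed above.

```python
import hashlib
import struct
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Volume:
    """Hypothetical stand-in for a geometry volume (not a ROOT class)."""
    name: str
    material: Tuple[float, ...]
    medium: Tuple[float, ...]
    shape: Tuple[float, ...]
    # Each daughter is stored together with its (x, y, z) position.
    daughters: List[Tuple["Volume", Tuple[float, float, float]]] = field(
        default_factory=list)


def pack(values):
    """Raw IEEE-754 bytes of the values, so every bit of every double counts."""
    return struct.pack(f"<{len(values)}d", *values)


def checksum(vol):
    """Hash own parameters, then each daughter's position and checksum."""
    md5 = hashlib.md5()
    for block in (vol.material, vol.medium, vol.shape):
        md5.update(pack(block))
    for daughter, position in vol.daughters:
        md5.update(pack(position))
        md5.update(checksum(daughter).encode("ascii"))
    return md5.hexdigest()


def pawt(a, z, intlen):
    """PAWT leaf using the parameters printed above; only A, Z and I vary."""
    return Volume(
        "PAWT",
        material=(a, z, 1.0, 3.57585804664578362e+01, intlen),
        medium=(0.0, 1.0, 20.0, 20.0, 10.0, 0.0, 0.01),
        shape=(2.66607147216796875e+02, 2.68107147216796875e+02, 0.75),
    )


def mother(leaf):
    """Made-up mother volume holding the leaf at the origin."""
    return Volume("MOTHER", (0.0,), (0.0,), (1.0,), [(leaf, (0.0, 0.0, 0.0))])


pawt_dbg = pawt(1.43218804507728983e+01, 7.21671159149344277e+00,
                7.55166047475367748e+01)
pawt_opt = pawt(1.43218804507729001e+01, 7.21671159149344366e+00,
                7.55166047475367606e+01)

print("leaf checksums equal:  ", checksum(pawt_dbg) == checksum(pawt_opt))
print("mother checksums equal:", checksum(mother(pawt_dbg)) == checksum(mother(pawt_opt)))
```

With a scheme like this, a last-digit difference in one leaf's material changes the checksum of every ancestor up to HALL, which would be consistent with most volumes reporting a mismatch.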
In this specific case it is a mixture:

Mixture MagpGeo_Water Material MagpGeo_Water last touched in block PAWT of module MagpGeo Aeff=14.3219 Zeff=7.21671 rho=1 radlen=35.7586 intlen=75.5166 index=25
  Element #0 : H Z= 1.00 A= 1.01 w= 0.112 natoms=2
  Element #1 : O Z= 8.00 A= 16.00 w= 0.888 natoms=1

HOLE:

Material:
  14.60999999999999943157      14.60999999999999943157
  7.299999999999999822364      7.299999999999999822364
  0.00120499999999999990799    0.00120499999999999990799
  30412.60851290595383034      30412.60851290595383034
  70037.71747003440395929      70037.71747003438940737   **

Medium:
  0 1 20 20 10 0

Shape:
  *** Shape TGeoBBox: TGeoBBox ***
  dX = 1666.45996
  dY = 45.72000
  dZ = 899.15997
  origin: x= 0.00000 y= 0.00000 z= 0.00000

Medium and shape the same... So it looks like
  (1) Evaluation of mixtures causes issues ...
  (2) Evaluation of derived material properties (intlen) causes problems ...

TCOO is also a mixture.

Material:
  25.78537218018238874606      25.78537218018238874606
  13.35006757009759326138      13.35006757009759326138
  2.321549999999999780442      2.321549999999999780442
  9.393737860498498903894      9.393737860498497127537
  44.18165789569228252276      44.18165789569227541733
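One way to judge whether these are real differences or roundoff is to count how many representable doubles separate each debug/optimized pair (the distance in ULPs, units in the last place). A small sketch, using the PAWT, HOLE and TCOO values printed above; the helper name `ulp_distance` and the labels are made up here, and the bit trick assumes positive, finite doubles.

```python
import struct


def ulp_distance(a, b):
    """Steps between two positive finite doubles, counted in representable values."""
    # For positive doubles the IEEE-754 bit pattern, read as an integer,
    # increases by exactly one from each double to the next.
    ia = struct.unpack("<q", struct.pack("<d", a))[0]
    ib = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(ia - ib)


# (debug, optimized) pairs copied from the printouts above.
PAIRS = {
    "PAWT A": (1.43218804507728983e+01, 1.43218804507729001e+01),
    "PAWT Z": (7.21671159149344277e+00, 7.21671159149344366e+00),
    "PAWT R": (3.57585804664578362e+01, 3.57585804664578362e+01),
    "PAWT I": (7.55166047475367748e+01, 7.55166047475367606e+01),
    "HOLE intlen": (70037.71747003440395929, 70037.71747003438940737),
    "TCOO radlen": (9.393737860498498903894, 9.393737860498497127537),
    "TCOO intlen": (44.18165789569228252276, 44.18165789569227541733),
}

for label, (debug, optimized) in PAIRS.items():
    print(f"{label:12s} {ulp_distance(debug, optimized)} ULP")
```

For these values every differing pair comes out one ULP apart, which is consistent with roundoff in how the mixture and its derived properties are computed rather than with a real change in the geometry.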
04/21/2014
First attempt. Apply a decimal shift and truncation algorithm to material properties, i.e. shift the decimal point 4 places past the most significant figure, then cast to an integer. The number 0.000012345/f becomes 12345/i.

This results in a new checksum for y2014a, which we add to the StarDb/AgMLChecksum DB...

[Checksum Validation y2014a] HALL 0521e4659c57773fdf15e554bdd5b6a9 0521e4659c57773fdf15e554bdd5b6a9
[Checksum Validation y2014a] HALL Geometry Checksums Agree
[Checksum Validation y2014a] Total number of volumes: 4872
[Checksum Validation y2014a] Number of same volumes: 4872
[Checksum Validation y2014a] Number of different volumes: 0

Running the code under the optimized library... seems to have failed, far worse than I would have thought...

[Checksum Validation y2014a] HALL bff661296520a246e71cb84bfb2aa332 0521e4659c57773fdf15e554bdd5b6a9
[Checksum Validation y2014a] HALL Geometry Checksums Mismatch
[Checksum Mismatch HOLE 8fbd050895a1346fcf0acc0b300bbd25 daa7de5141cbb27b0c87ff12b1758670]
...
[Checksum Mismatch PVAG 1a5dfa3e8335be8b1b26cba917acfef1 843e645af96c85f6389496437ec22fce]
[Checksum Validation y2014a] Total number of volumes: 4872
[Checksum Validation y2014a] Number of same volumes: 1
[Checksum Validation y2014a] Number of different volumes: 4871

Try making the checksum completely insensitive to materials... Then I get only 1656 volumes showing differences. Looking deeper, it looks like even shapes are a factor, e.g. in ECAL ESCI:

((TGeoTrd1 *)esci->GetShape())->GetDx2()
(const Double_t)4.31181628415218388e+00 (debug)
(const Double_t)4.31181588431256291e+00 (optimized)

So it looks like all parameters need to be degraded slightly... 5 digits past the decimal, maybe down to 4 digits past the decimal.

With all parameters @ 5 sig figs, down to 376 volumes difference. Mostly ESMD strips, but a few others. (Note: material @ 3 sig figs.) Adjust so that material params are @ 5 sig figs as well... still 376 volumes. So let's find out what's up...
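The shift-and-truncate step described at the top of this entry can be sketched as follows. This is one reading of the description, not the actual AgML code: the function name `shift_and_truncate` and the `places` parameter are made up, and the two inputs are the ESCI Dx2 values printed above.

```python
import math


def shift_and_truncate(x, places=4):
    """Keep `places` digits past the most significant figure, as an integer."""
    if x == 0.0:
        return 0
    # Decimal position of the leading digit, e.g. 0 for 4.31..., -5 for 1.2e-5.
    exponent = math.floor(math.log10(abs(x)))
    # Shift the decimal point, then let int() truncate toward zero.
    return int(x * 10.0 ** (places - exponent))


dx2_debug = 4.31181628415218388e+00
dx2_optimized = 4.31181588431256291e+00

for places in (4, 5, 6):
    print(places,
          shift_and_truncate(dx2_debug, places),
          shift_and_truncate(dx2_optimized, places))
```

At 4 and 5 places the two ESCI values collapse to the same integer, and at 6 places they separate again. Note that truncation still has hard boundaries: two inputs that sit on either side of one map to different integers however close they are, so a scheme like this can reduce the number of mismatches without being guaranteed to remove them all.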
DEBUG:

matrix pos_THX1_in_BBC1_1 - tr=1 rot=0 refl=0 scl=0
  1.000000    0.000000    0.000000    Tx =  9.640015
  0.000000    1.000000    0.000000    Ty = 50.091003
  0.000000    0.000000    1.000000    Tz =  0.000000
Info in <TGeoNodeMatrix::InspectNode>: Mother volume BBC1

OPTIMIZED:

matrix pos_THX1_in_BBC1_1 - tr=1 rot=0 refl=0 scl=0
  1.000000    0.000000    0.000000    Tx =  9.640015
  0.000000    1.000000    0.000000    Ty = 50.091000
  0.000000    0.000000    1.000000    Tz =  0.000000
Info in <TGeoNodeMatrix::InspectNode>: Mother volume BBC1

So we *can* have slight differences in volume positions... in this case, the triple hex module within the inner BBC annulus. It's off by a small amount in Y... but we're already shifting/truncating to take care of that.

At this point, we're pretty much going to need to degrade sensitivity to all parameters, not just materials. So I would rather just restrict the checksum test to a single compilation configuration (e.g. 32 bits debug) OR define a separate test for each config.
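The "separate test for each config" option amounts to keeping one reference checksum per build configuration and comparing only like with like. A minimal sketch, assuming hypothetical configuration labels and a hypothetical `validate` helper; the reference values are the HALL checksums from runs 1, 6 and 8 of the 04/18 study, where the geometry was built and checked in the same configuration, before any truncation was applied.

```python
# One reference HALL checksum for y2014a per build configuration.
# Labels and function name are illustrative only, not the nightly-test code.
REFERENCE = {
    "dbg-32b": "0521e4659c57773fdf15e554bdd5b6a9",  # run 1
    "opt-32b": "bb586bb43e452c825aa16cb95f7684e3",  # run 6
    "dbg-64b": "79ee228c82c31f7cc562f5f1afc4fabb",  # run 8
}


def validate(config, checksum):
    """True if `checksum` matches the reference for this build configuration."""
    if config not in REFERENCE:
        raise KeyError(f"no reference checksum for configuration {config!r}")
    return checksum == REFERENCE[config]


# The optimized build is compared against its own reference, not the debug one.
print(validate("opt-32b", "bb586bb43e452c825aa16cb95f7684e3"))
print(validate("dbg-32b", "bb586bb43e452c825aa16cb95f7684e3"))
```

The trade-off is that each configuration needs its own reference entry whenever the geometry legitimately changes, but the comparison can then stay bit-exact instead of relying on degraded precision.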
Groups:
- jwebb's blog