PXL database design

Pixel Status Mask Database Design


I. Overall stats:
  • 400 sensors (10 sectors, 4 ladders per sector, 10 sensors per ladder)
  • 960 columns x 928 rows per sensor
  • 400 x 960 x 928 = 356,352,000 (~356M) individual channels in total
( see: http://www.star.bnl.gov/cgi-bin/protected/cvsweb.cgi/offline/hft/StRoot/StPxlUtil/StPxlConstants.h?rev=1.4 )
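
A minimal sketch of these numbers as compile-time constants; the names are assumed to mirror the StPxlConstants.h header linked above and should be checked against it:

    // detector geometry; names assumed to follow StPxlConstants.h
    const int kNumberOfPxlSectors          = 10;
    const int kNumberOfPxlLaddersPerSector = 4;
    const int kNumberOfPxlSensorsPerLadder = 10;
    const int kNumberOfPxlColumnsOnSensor  = 960;
    const int kNumberOfPxlRowsOnSensor     = 928;

    const int  kTotalSensors  = kNumberOfPxlSectors
                              * kNumberOfPxlLaddersPerSector
                              * kNumberOfPxlSensorsPerLadder;    // 400
    const long kTotalChannels = (long)kTotalSensors
                              * kNumberOfPxlColumnsOnSensor
                              * kNumberOfPxlRowsOnSensor;        // 356,352,000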

II. Definitions:
  PXL status may have the following flags (from various sources):
  1. Sensors:
  • good   
    • perfect
    • good but hot (?)
  • bad (over 50% of pixels hot/missing)
    • hot (less than 5% of pixels hot?)
    • dead (no hits)
  • other
    • missing (not installed)
    • non-uniform (less than 50% of channels hot/missing)
    • low efficiency (number of entries very low after masking)
  2. Column / Row
  • good
  • bad (over 20% of pixels hot)
  • expectation: ~30 bad rows/columns per sensor
  3. Individual Pixels
  • good
  • bad (hot: fires > 0.5% of the time)
  4. Individual pixel masks are a problem with ~356M channels, hence the questions:
  • what exactly is a hot pixel? Is it electronic noise or just background (hits from soft curlers)?
  • is it persistent across runs, or does every run have its own set of hot pixels?
  • is it possible to suppress it at the DAQ level?
III. Status DB proposal:

  1. Generic observations:
  • PXL status changes a lot between runs, so the "only a few channels change" paradigm cannot be applied => no "indexed" tables possible
  • Original design is flawed:
    • we do not use .C or .root files in production mode; that path is designed for debugging purposes only and has severe limitations;
    • a real database performs lookups back in time for "indexed" tables, which means one has to insert "channel is good again now" entries => huge dataset;
    • hardcoded array sizes rely on preliminary results from Run 13 and guarantee nothing for Run 14 (e.g. what if > 2000 pixels per sector turn out hot in Run 14?);
  2. Sensor status:
  • uchar/ushort sensors[400]; // easy one: status is one or two bytes, a bitmask with 8/16 overlapping states;
  • matches the original design; hard to make any additional suggestions here (a possible bitmask layout is sketched below).
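
  A minimal sketch of how such a bitmask could be laid out; the bit names and assignments are illustrative, derived from the flags in section II, and are not part of the original design:

    // illustrative bit assignments for a one-byte sensor status
    enum PxlSensorStatusBits {
      kSensorGood       = 0x01,
      kSensorHot        = 0x02,
      kSensorDead       = 0x04,
      kSensorMissing    = 0x08,
      kSensorNonUniform = 0x10,
      kSensorLowEff     = 0x20
    };

    unsigned char sensors[400];              // one status byte per sensor
    // usage: sensors[42] = kSensorGood | kSensorHot;   // "good but hot"
    //        bool isHot = (sensors[42] & kSensorHot) != 0;
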
  3. Row/Column status:
  • observations:
    • seems to be either good or bad (binary flag)
    • we expect ~10 masked rows per sector on average (needs proof)
    • only bad rows recorded, others are considered good by default
  • ROWS: std::vector<int> of row IDs => BINARY OBJ / BLOB + length => serialized list like "<id1>,<id2>,..,<idN>" => "124,532,5556"
    • row_id => 928*<sensor_id> + k, where k is the local row index (0..927)
    • insert into std::map<int, bool> (row_id => flag) after deserialization
  • COLS: std::vector<int> of column IDs => BINARY OBJ / BLOB + length => ..same as rows..
    • col_id => 960*<sensor_id> + k, where k is the local column index (0..959)
    • insert into std::map<int, bool> (col_id => flag) after deserialization
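
  A sketch of the proposed round trip for rows (columns are identical, with the 960 multiplier); the function names are illustrative:

    #include <cstdlib>
    #include <map>
    #include <sstream>
    #include <string>
    #include <vector>

    // serialize bad-row IDs into the "124,532,5556" form for BLOB storage
    std::string serializeRows(const std::vector<int>& badRows) {
      std::ostringstream out;
      for (size_t i = 0; i < badRows.size(); ++i) {
        if (i) out << ',';
        out << badRows[i];
      }
      return out.str();
    }

    // deserialize the BLOB back into a map for fast lookups;
    // rows absent from the map are considered good by default
    std::map<int, bool> deserializeRows(const std::string& blob) {
      std::map<int, bool> badRows;
      std::istringstream in(blob);
      std::string token;
      while (std::getline(in, token, ','))
        badRows[std::atoi(token.c_str())] = true;  // true == bad
      return badRows;
    }

    // row_id encoding from above: 928 * sensor_id + local row (0..927)
    inline int rowId(int sensorId, int localRow) { return 928 * sensorId + localRow; }
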
  4. Individual pixels status:
  • observation:
    • seems to be either good or bad (binary flag)
    • no "easy'n'simple" way to serialize 357M entries in C++
    • 400 sensors, up to 2000 entries each, average is 1.6 channels per sector (???)
  • store std::vector<uint> of pixel IDs as BINARY OBJ / BLOB + length;
    • <pxl_id> = (<sensor_id>*928 + <row>)*960 + <col> => ~356M max, fits into a 32-bit uint
    • it is fairly easy to gzip the blob before storage, but it will hardly help, as the data is packed binary rather than text;
    • alternatively: use ROOT serialization, but this adds overhead and prevents non-ROOT readout, thus it is not recommended;
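
  A sketch of the pixel-ID packing above and the raw BLOB layout; the helper names are illustrative:

    #include <vector>

    // pack (sensor, row, col) into a unique ID; with 928 rows and 960
    // columns per sensor the maximum is 356,351,999, so a 32-bit uint fits
    inline unsigned int pxlId(unsigned int sensorId, unsigned int row, unsigned int col) {
      return (sensorId * 928 + row) * 960 + col;
    }

    inline void unpackPxlId(unsigned int id, int& sensorId, int& row, int& col) {
      col      = id % 960;
      row      = (id / 960) % 928;
      sensorId = id / (960 * 928);
    }

    // the vector's contiguous storage can go into the BLOB directly:
    // std::vector<unsigned int> badPixels = ...;
    // const char* blob = reinterpret_cast<const char*>(&badPixels[0]);
    // size_t blobLen   = badPixels.size() * sizeof(unsigned int);
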

Note: I investigated the source code of StPxlDbMaker and found that rows / columns are accessed by scanning an array of std::vectors.
It is highly recommended to use std::map instead, to be replaced later with std::unordered_map (as STAR upgrades to a newer gcc). Map access is much faster (one logarithmic lookup instead of a linear scan), and the code would be much simpler too; see the sketch below and the code examples at /star/u/dmitry/4PXL/
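
A minimal sketch of the recommended lookup pattern, assuming bad rows are keyed by row_id as proposed in section III; std::unordered_map would be a drop-in replacement once a newer gcc is available:

    #include <map>

    // one logarithmic lookup instead of a linear scan over a vector;
    // absence from the map means the row is good
    bool isRowBad(const std::map<int, bool>& badRows, int sensorId, int localRow) {
      return badRows.find(928 * sensorId + localRow) != badRows.end();
    }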