Software and Computing
Software and Computing phone meeting
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | Embedding opened requests and problem summary ( 00:20 ) 0 files | Andrew Rose (LBNL) |
12:20 | TRS tuning, status and path forward ( 00:10 ) 0 files | Yuri Fisyak (BNL) |
12:30 | Simulation issues ( 00:10 ) 0 files | Victor Perevoztchikov (BNL) |
12:40 | Review of production space, data samples and cleanup ( 00:10 ) 0 files | Lidia Didenko (BNL) |
12:50 | AOB ( 00:10 ) 0 files | All (All) |
Reconstruction and simulation issues
1-189, internal, at 17:00 (GMT), duration : 01:00
Previous meeting is available [node:14909, "here"].
- Review last meeting notes + Additional items thought of since the last meeting?
  * Reco/simu organization - thoughts?
  * Yuri: Cellular Automaton project & project description
Software and Computing phone meeting
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | Calibration overview for Year9 and other issues ( 00:20 ) 1 file | Gene Van Buren (BNL) |
12:20 | Minuit vertex improvement, current status ( 00:15 ) 0 files | Matthew Cervantes (TAMU) |
12:35 | BeamLine determination using 3D method ( 00:10 ) 1 file | Rosi Reed (UC-Davis) |
12:45 | p+p 500 GeV production request, status & motivations ( 00:15 ) 1 file | Jan Balewski (MIT) |
Review of calibration issues, tasks and plans
1-189, internal, at 19:00 (GMT), duration : 03:00
Chair: Jerome
Purpose:
- Review calibration issues for Run 9
- Review of the calibration status and overview for Year8, Year7 (h+/h-) and plan forward
- Plan and activities for the next 6 months to a year [general]
- Analysis meeting strategy / presentation and expectations
Internal only.
Meeting will be interrupted from 15:30 -> 16:30 by a STAR/BNL group meeting.
Reconstruction and simulation issues
1-189, internal, at 18:00 (GMT), duration : 01:00
Chair: Jerome
Invited: Yuri and Victor
Purpose:
- Review activities and progress in both areas
- Timeline would be the next 6 months to a year.
- Review and discuss overlaps between the two areas
Meeting is internal.
CSW4DB, status and path forward
1-189, HighSpeed conference bridge, at 18:00 (GMT), duration : 02:00
Meeting was in two parts. The first hour with Mark Green from Tech-X and the second internal.
User and computer service, progress
1-189, internal, at 20:00 (GMT), duration : 01:00
Chair: Jerome
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | SPIN PWG request for pp 500GeV ( 00:15 ) 2 files | Joseph Seele (MIT) |
12:15 | Update on the p and n side cluster in SSD ( 00:20 ) 1 file | Jonathan Bouchet (KSU) |
12:35 | iSCSI transfer, progress update ( 00:15 ) 1 file | Matthew Ahrenstein (BNL) |
12:50 | AOB ( 00:10 ) 0 files | All (All) |
UCM discussions
1-189, internal, at 18:00 (GMT), duration : 01:00
User and computer service, task review
1-189, internal, at 20:00 (GMT), duration : 01:00
Discussion on organization and tasks for the user and computer support team.
Review of plan until September/October.
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | SUMS updates and new release ( 00:30 ) 1 file | Levente Hajdu (BNL) |
12:15 | Database development progress ( 00:20 ) 1 file | Dmitry Arkhipkin (BNL) |
12:35 | Overview of recent production and library release ( 00:15 ) 0 files | Lidia Didenko |
12:50 | AOB ( 00:10 ) 0 files | All (All) |
Database development follow-up
1-189, internal, at 16:00 (GMT), duration : 01:00
Invited: Dmitry, Wayne, Gene
Attendees: Jerome (chair), Dmitry, Wayne, Leve, Gene
The meeting started with miscellaneous topics, such as Leve bringing up the existence of RAM disks at $1000 for 32 GB.
Dmitry presented slides (attached). General observations included:
- LB – missing a RR on low load (this should really be implemented)
- FC: most benefit comes from increasing the key buffer size
- Can different jobs alter the way we measure the cache speed / enhancement?
Specific discussion:
While discussing the FileCatalog, it was also pointed out that the MySQL query optimizer sometimes skips the key read (because a sequential read of the table is believed to be faster than random reads) and hence some SELECT statements may not use the index at all.
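The trade-off behind the optimizer's choice can be illustrated with a toy cost model. This is only a sketch: the per-row costs and the function names are hypothetical, and it is not how MySQL actually computes its estimates.

```python
# Toy cost model (all numbers are hypothetical, not measured on the FC).
SEQ_READ_COST = 1.0      # assumed cost of reading one row sequentially
RANDOM_READ_COST = 20.0  # assumed cost of one random (index-driven) row read


def scan_cost(total_rows: int) -> float:
    """Cost of reading the whole table sequentially."""
    return total_rows * SEQ_READ_COST


def index_cost(matching_rows: int) -> float:
    """Cost of fetching only the matching rows through the index."""
    return matching_rows * RANDOM_READ_COST


def choose_plan(total_rows: int, matching_rows: int) -> str:
    """Return 'index' or 'scan', whichever the toy model estimates cheaper."""
    return "index" if index_cost(matching_rows) < scan_cost(total_rows) else "scan"


if __name__ == "__main__":
    total = 1_000_000
    for selectivity in (0.001, 0.01, 0.1, 0.5):
        matching = int(total * selectivity)
        print(f"{selectivity:>6.1%} of rows match -> {choose_plan(total, matching)}")
```

With these assumed costs the break-even point is at 5% of the rows (1/20): a select matching more than that is estimated to be cheaper as a full table scan, which is the kind of case where the index would be skipped.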
Clearly, the FC is disk-speed limited, and one approach would be to attempt to solve this with beefier nodes with optimal IO.
Also discussed were the UPDATE race conditions observed in the FC. Dmitry stated that there may be a fix for the UPDATE lock in a recent minor revision of MySQL. Upgrading the Master is tricky, as it shares some services with the offline DB, but the idea was perhaps to add a slave and update it, if the versions can be mixed. That the mix would work was not certain / guaranteed.
Jerome asked if there was any benefit in turning on the slow query logs. In principle, we know what the slow queries are, and Dmitry also noted that "slow" is defined by a threshold (with a low default value of order 10 seconds); since most queries are minutes long, all queries may be returned. We will, though, think about this matter again once we have improved the IO speed (which seems to be the main problem at this time).
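Dmitry's point about the threshold can be sketched as follows. The query labels and durations are hypothetical, chosen only to mimic a workload where most queries run for minutes; in MySQL the threshold itself is the `long_query_time` setting.

```python
from typing import Iterable, List, Tuple

Query = Tuple[str, float]  # (label, duration in seconds)


def slow_queries(queries: Iterable[Query], threshold_s: float) -> List[Query]:
    """Return the queries whose duration exceeds the threshold."""
    return [q for q in queries if q[1] > threshold_s]


if __name__ == "__main__":
    # Hypothetical durations, not taken from the FileCatalog logs.
    sample = [("q1", 95.0), ("q2", 180.0), ("q3", 240.0), ("q4", 4.0)]
    for threshold in (10.0, 120.0):
        flagged = slow_queries(sample, threshold)
        print(f"threshold {threshold:>5.0f}s -> {len(flagged)}/{len(sample)} flagged")
```

With a low threshold nearly every query is flagged, so the log carries little information; raising the threshold closer to the typical query duration would be needed before it singles out real outliers.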
Wayne did spec a node – so far close to 7k$ - [atime disabled as well]
There was a lot more in the slides.
Recommended for the path forward:
- Get a performance test suite really settled (comparison as we go - it seems we are there, but we need to compare between configurations and hardware)
- We saw that CPU is not important ... whenever spec-ing a node, we should be conservative with the CPU speed (twice faster is not better so, let's not push)
- What is the overall network performance? If we speed up the IO and response of the DB, would we saturate the network? Dmitry presented some results on that and there is food for thought.
- We need to assess the API overheads soon enough too ... For now, the IO (cache and hardware reshape) seems to be a big gain, but we should not ignore the performance gain on the API side and should also consider this soon.
- Project to be done by July
- Concentrate first on db06, db11
- We make this a two-phase project: first, look at those two nodes and test in place. We test until the end of the run
- The second phase would be after the run – we proceed with a rolling upgrade (or rolling phase-in) as nodes are available
- We try to get it all done before T=(end-of-the-run+2 months) which is a typical rule-of-thumb for calibration and start of production. This would give a generous timeline of all db replaced by the RCF nodes by September.
- Another thing to do in parallel: get one (or more) beefy 8 GB memory node within 2 months as well
- The general thought was to try to get beefy nodes for the FileCatalog, but it was not at all clear what to do there. Wayne pointed out after the meeting that, if so, what is the plan for the Web-Server and a database as redundant swappable nodes?
- Answer is: it would be overkill with the FileCatalog but, as it stands, and considering Dmitry's results, getting 8 GB memory nodes + good IO is a way forward for the FC that requires purchasing only a few nodes (the FC needs 1 master and 2 slaves at the moment)
- Pricing is the key
- General goal would be (if we go that route) to have the new web server not before September (but also not much after)
Dmitry pointed out that for the DB-slave, there may not be enough space - thinking of a TB of storage.
- Jerome stated that we will NOT purchase additional storage for backup at this stage, as this (perhaps) overlaps with Matthew's project of providing an online file-server-like capability. We do not want to disperse in all directions.
- Suggested leveraging Legato AND pushing snapshots into HPSS, and revisiting this later
UCM discussions
2-187, at 18:00 (GMT), duration : 01:00
09W23
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | SVT simulator tuning and embedding ( 00:15 ) 1 file | Stephen Baumgart (Yale) |
12:15 | Geometry differential, an update ( 00:20 ) 0 files | Victor Perevoztchikov (BNL) |
12:35 | NPE analysis & embedding opened discussions ( 00:15 ) 0 files | TBC (TBC) |
12:40 | Dalitz decays in starsim ( 00:10 ) 1 file | Thomas Ullrich (BNL) |
12:50 | AOB ( 00:10 ) 0 files | All (All) |
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | Geometry differential, a follow-up ( 00:20 ) 4 files | Victor Perevoztchikov (BNL) |
12:20 | AOB ( 00:20 ) 0 files | All (All) |
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:30
Time | Talk | Presenter |
---|---|---|
12:00 | DCA resolution in Run7 (Silicon) ( 00:20 ) 1 file | Jonathan Bouchet (KSU) |
12:20 | TPC alignment in Year9 ( 00:15 ) 1 file | Na Li (IOPP) |
12:35 | TPC alignment in Year7 ( 00:15 ) 0 files | Hao Qiu (IMPCAS) |
12:50 | AOB ( 00:10 ) 0 files | All (All) |
Preparing TPC Anode HV data for the DB
There are 2 kinds of information about the Anode HVs which are important:
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 01:00
Time | Talk | Presenter |
---|---|---|
12:00 | Drupal upgrade, status and plan forward ( 00:15 ) 0 files | Zbigniew Chajecki (OSU) |
12:15 | Update on the early/prompt hits ( 00:20 ) 0 files | Pibero Djawotho (TAMU) |
12:35 | Vertex finding update for d+Au 200 GeV ( 00:20 ) 2 files | Anthony Timmins (WSU) |
12:55 | AOB ( 00:10 ) 0 files | All (All) |
Software and Computing phone meeting
1-189, HighSpeed conference bridge, at 16:00 (GMT), duration : 00:00
Time | Talk | Presenter |
---|---|---|
12:00 | Embedding post-QM overview ( 00:15 ) 0 files | Patricia Fachini (BNL) |
12:15 | Overview of the database consolidation & monitoring strategy ( 00:20 ) 1 file | Dmitry Arkhipkin (BNL) |
12:35 | iSCSI in STAR ( 00:15 ) 1 file | Matthew Ahrenstein (BNL) |
12:50 | AOB ( 00:10 ) 0 files | All (All) |