Software and Computing

Review of Past Issues and Current Understanding


Talk time : 15:20, Duration : 00:20

h+/h- in Run 10 and beyond

2010-12-03 15:00
2010-12-03 16:00
America/New York
Friday, 3 December 2010
EVO, at 20:00 (GMT), duration : 01:00
Time  Talk  Presenter
15:00  Description of Problem (00:20), 0 files
15:20  Review of Past Issues and Current Understanding (00:20), 1 file
15:40  Plan of Action Development (00:20), 0 files
16:00  Task Assignments (00:20), 0 files
16:20  AOB (00:20), 0 files

Run 11 preparation meeting #4

2010-12-03 14:00
2010-12-03 15:00
America/New York
Friday, 3 December 2010
1-189, at 19:00 (GMT), duration : 01:00

Minutes:

Attendees: Dmitry A., Leve H., Matt A., Wayne B., Jeff L., Jérôme L., G. Van Buren

Databases [Dmitry]:

  • Online backup now using the new NAS system
    • Daily backup of all three ports with a retention time of 7 days
    • Currently have 2+ TB of space, which is probably more than enough for even 14 days retention
    • Potential problem with permissions due to NFS mount and different user IDs on different systems, but not a problem presently
    • Email alerts of problems from the NAS go to Wayne & Dmitry
  • Flush of online DBs not yet done
    • Reasoning: testing is still under way at STAR, and the test data can add up to a significant amount
    • ...but we're not sure which test data people will want to keep associated to Run 11
    • Decision made to go forward with the flush and not continue waiting for testing to get further along
    • NB: ShiftLog (and some other) DBs and tables are skipped in the flush; ShiftLog is already recording for Run 11
  • Shift Signup GUI has been re-written
    • Demo shown
    • Some details of new features still need implementation (working with Jérôme)
    • All old features are in place; could replace old codes at any time (pending bug checks)
    • Deployment schedule not fixed by any deadlines
  • Isolated nodes for FastOffline in Run 11
    • Not in place, but Dmitry will write up a config file for this
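
The retention figures quoted for the online backup can be sanity-checked with a little arithmetic. The Python sketch below assumes that every daily backup is a full, uncompressed copy of each of the three database ports (the minutes do not say whether the backups are full or incremental) and treats the quoted "2+ TB" as exactly 2 TB. The function names are illustrative only.

```python
def max_daily_backup_gb(capacity_tb, retention_days, ports=3):
    """Largest average per-port daily backup (in GB) that fits on the NAS.

    Assumption: each daily backup is a full copy of every port and all
    retained copies share the same capacity. Uses 1 TB = 1000 GB.
    """
    if retention_days <= 0 or ports <= 0:
        raise ValueError("retention_days and ports must be positive")
    total_copies = retention_days * ports
    return capacity_tb * 1000.0 / total_copies


def fits(per_port_gb, capacity_tb, retention_days, ports=3):
    """True if daily backups of the given per-port size fit in the capacity."""
    return per_port_gb <= max_daily_backup_gb(capacity_tb, retention_days, ports)


if __name__ == "__main__":
    for days in (7, 14):
        limit = max_daily_backup_gb(2.0, days)
        print(f"{days} days retention: up to {limit:.1f} GB per port per day")
```

Under these assumptions, 2 TB holds 7 days of backups as long as each port averages below roughly 95 GB per day, and 14 days as long as each port stays below roughly 47 GB per day.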

 

ShiftLog [Leve]:

  • Nothing until new web server
    • No progress on new webserver

 

Online nodes [Wayne & Matt]:

  • OS upgrades:
    • FTPC and PMDsc done/replaced
    • To be done: Bond, EMCsc (coordinate with users), STARUtilities (coordinate with C-AD), EMC01, Beatrice, L3display (Jeff notes the need for QT4 on this node for display programs)
  • FUSE now working on all linux pool nodes
  • gcc standardized on all linux pool nodes
  • Recent rise in instability of the linux pool nodes: several have halted and/or needed rebooting in the past two weeks
    • Previous solution of disabling USB controller not helpful for this (that solution is still in effect)
    • No obvious environmental changes, but an environmental cause seems likely given the pattern
    • These nodes are ~5.5 years old (hard to believe they would show age problems within a couple weeks of each other)
    • Similar nodes are in use for offline DBs and not showing problems (located in BCF)
  • Newer EVP machine experiencing AFS issues
    • Access given to John McCarthy to help diagnose
    • OnlinePlots will be switched to use old EVP machine for the time being (Gene & Jeff will arrange)

 

Software and Computing phone meeting

2010-12-01 12:00
2010-12-01 13:00
America/New York
Wednesday, 1 December 2010
1-189, EVO, at 17:00 (GMT), duration : 01:00
Time  Talk  Presenter
12:00  Review of ticket 2036 (embedding timestamp) (00:20), 0 files, All (All)
12:20  Data production issues and projections (00:20), 0 files, Lidia Didenko (BNL)
12:40  AOB (00:20), 0 files, All (All)

Run 11 preparation meeting #3

2010-11-19 14:00
2010-11-19 15:00
America/New York
Friday, 19 November 2010
1-189, at 19:00 (GMT), duration : 01:00

Attendees: Wayne B., Leve H., Gene V.B., Matt A.

Leve:

  • ShiftLog exercised (successfully tried loading large 4+MB images)
  • Awaiting new online web server
  • No current action items regarding changing shift status after last week's meeting

Matt:

  • Windows 2000 upgrades to XP; involves replacing some computers:
    • ftpctemp, emcsc, pmdsc replacements exist, but need to be configured before the switch
    • Timescale: mid-December
  • OS upgrades on emc01, beatrice, l3display
    • l3display has GUI issues (independent of the old large display screen issue brought up a couple weeks ago by Wayne); hope that the OS update to SL53 will resolve the GUI problems
    • Timescale: next week (make sure systems are running again before the long weekend next week, perhaps time beatrice update with any plans by EMC people to be away)
    • It would be of interest to know whether the l3display can support the STAR environment (e.g. AGS, gcc, etc.)

Wayne:

  • Online linux pool gcc issue: an artifact of a known problem not corrected in Matt's installation script.
    • A few nodes already fixed, one node known to need fixing, and a few more need to be checked (low priority)
    • FUSE also stopped working on a few nodes during some of these re-installs (due to some package dependencies)
  • Ganglia metric to list number of users logged in now turned on for STAR gateways; will also add metric to the online linux pool machines
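
A metric such as the number of logged-in users can be pushed into Ganglia with its `gmetric` command-line tool. The Python sketch below is one assumed way of doing this, not the configuration actually deployed on the STAR gateways: it counts distinct user names in the output of `who` and builds a `gmetric` command line. The metric name `users_logged_in` is hypothetical.

```python
import subprocess


def count_users(who_output):
    """Count distinct login names in the text produced by `who`."""
    names = {line.split()[0] for line in who_output.splitlines() if line.strip()}
    return len(names)


def gmetric_command(n_users, name="users_logged_in"):
    """Build (but do not run) a gmetric invocation for the user count."""
    return ["gmetric", "--name", name, "--value", str(n_users),
            "--type", "uint16", "--units", "users"]


def publish():
    """Run `who`, then hand the count to gmetric (needs both tools installed)."""
    who_output = subprocess.run(["who"], capture_output=True, text=True,
                                check=True).stdout
    subprocess.run(gmetric_command(count_users(who_output)), check=True)


if __name__ == "__main__":
    # Made-up sample input so the example runs without `who` or Ganglia.
    sample = ("alice pts/0 2010-11-19 09:00\n"
              "bob pts/1 2010-11-19 09:05\n"
              "alice pts/2 2010-11-19 10:00\n")
    print(" ".join(gmetric_command(count_users(sample))))
```

Counting distinct names rather than sessions is a design choice made here for illustration; a per-session count would simply use the number of non-empty lines.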

Gene:

  • OnlinePlots running stably on newer EVP server
  • FMS group expressed interest in changing some OnlinePlots; they were directed to the codes
  • Contacts for components of online QA posted (node 19794)

AOB:

  • Next meeting in two weeks (holiday next week)

Run 11 preparation meeting #2

2010-11-12 14:00
2010-11-12 15:00
America/New York
Friday, 12 November 2010
1-189, at 19:00 (GMT), duration : 01:00

Attendees: Jeff L., Jérôme L., Wayne B., Gene V.B., Leve H., Dmitry A.

 

Minutes:

 

jevp : the new online QA package

  • Deployment intended before run starts
    • Will run concurrently with OnlinePlots package on the evp.starp machine
      • That machine is believed, at this point, to be sufficiently up to the task
  • Infrastructure ready at 90% (old adage: last 10% takes 90% of the time)
    • Using QtRoot
  • New architecture has 3 components
    • server: repository for plots
    • builder: stand-alone process that reads the event pool, fills histograms, and ships plots of them to the server in a ROOT container class (StJEvpPlotSet)
      • StJevpPlotSet contains the histograms, plus lines and texts, and axis scales
    • display: asks the server only for the plots (and tabs) it wants to present at any given time, not for the content of other plots
  • All components communicate over ethernet and can be on separate machines
    • Multiple builder instances will all read the same files simultaneously
    • Suggestion to possibly leverage other available online nodes
      • Would need to mount evp.starp: /a
      • Won't pursue until a need is shown
  • Deployed all existing plots to server (999 exactly!)
    • Divided into 17 builders: generic, daq, hlt, l3, each subsystem
  • Main features:
    • Easily configured plotting
      • There is an xml configuration file, plus an editor for managing the xml file
      • Hierarchy of tabs and plots easily changed (directory-like arrangement)
      • Top level is shift, HLT, other expert sets
      • Same plot can be in more than one place
      • Control over rows and columns, log scaling (part of plot definition, but over-ridable in the configuration), plot maxima and minima (setting can span multiple plots)
      • Display comes up with shift histograms by default
    • Reference histogram capability
      • Clicking on a hist brings up the data and reference histograms (and possible old references)
      • This page allows one to set the data as the reference along with a comment
      • Discussion of expanding this capability for automated analysis and alerts
        • Analytic comparisons would be done in the builder (that's where the analysis happens)
        • Suggestions to keep this in mind as programming continues
  • Future Planned Features:
    • Conditional suppression of unwanted plots
      • Distinction between not-in-run, and broken
      • Some plots shown only if some state exists
    • Fast trigger stream: possibly send all trigger data (not DAQ data) to the event pool
    • Monitors from within the package (e.g. how long does each builder take)
      • Needs to be coordinated in the server
  • Additional To-Do tasks:
    • Run stop/start isn't well-coordinated yet (when to clear histograms for start of a new run; codes need more understanding of state)
    • Basic shift histogram group needs to be set-up
      • Plan: present a set of plots to the trigger board for feedback & input
    • Need testing with real data for each subsystem for possible code-copying mistakes
    • Need extended operation testing (e.g. does it have memory leaks? eventually crashes?)
      • Already some concerns about ownership of objects in the plotting code (an issue we had with the OnlinePlots package)
    • Plan to run display component locally on rts01
      • Needs STAR environment to do so
      • Interest in a second display screen
      • Interest in upgrading rts01 hardware
        • Wayne agreed to spec out new hardware
  • Plan to learn as we go (from problems during the Run) which ways of presenting data make the problem(s) most evident
  • Plan for now is to demo the jevp package in 3 weeks' time
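
The three-component architecture described in these minutes (builders fill histograms and ship plot sets to a server; the display requests only the plots it is showing) can be illustrated with a toy in-process model. This is a Python sketch, not the jevp code: the real components are separate processes that exchange ROOT objects over the network, and every class and method name below (`PlotSet`, `Server`, `Builder`, `Display`) is hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class PlotSet:
    """Toy stand-in for the container shipped from a builder to the server."""
    name: str
    bins: list = field(default_factory=list)  # histogram bin contents
    log_y: bool = False                       # example display attribute


class Server:
    """Repository for plots: keeps the latest plot set under each name."""
    def __init__(self):
        self._plots = {}

    def store(self, plot_set):
        self._plots[plot_set.name] = plot_set

    def fetch(self, names):
        # Return only the requested plots that exist; nothing else is sent.
        return {n: self._plots[n] for n in names if n in self._plots}


class Builder:
    """Reads values, fills its own histogram, and ships it to the server."""
    def __init__(self, server, plot_name, n_bins, lo, hi):
        self.server = server
        self.plot = PlotSet(plot_name, [0] * n_bins)
        self.lo, self.hi = lo, hi

    def process(self, value):
        n_bins = len(self.plot.bins)
        if self.lo <= value < self.hi:
            index = int((value - self.lo) / (self.hi - self.lo) * n_bins)
            self.plot.bins[min(index, n_bins - 1)] += 1

    def ship(self):
        # A deep copy mimics sending a serialized snapshot over the network.
        self.server.store(copy.deepcopy(self.plot))


class Display:
    """Asks the server only for the plots on the currently selected tab."""
    def __init__(self, server, tabs):
        self.server = server
        self.tabs = tabs  # tab name -> list of plot names

    def show(self, tab):
        return self.server.fetch(self.tabs[tab])


if __name__ == "__main__":
    server = Server()
    builder = Builder(server, "tpc_hits", n_bins=4, lo=0.0, hi=4.0)
    for value in (0.5, 1.5, 1.7, 3.9, 7.0):
        builder.process(value)
    builder.ship()
    # The same plot may appear on more than one tab.
    tabs = {"shift": ["tpc_hits"], "expert": ["tpc_hits", "hlt_rate"]}
    display = Display(server, tabs)
    print(display.show("shift")["tpc_hits"].bins)
```

Because the display only names the plots it needs, the server never has to push the full plot repository to each viewer, which is the property the minutes highlight for the display component.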

 

Other discussion topics:

  • Rotations of DBs done for new year, but not all hyperlinks have been rotated
  • Online disk space usage needs: users need to be polled (e.g. spin QA programs and data), and who the users are needs to be determined
  • Solaris phase out: systems (slow controls and trigger) are not in S&C control; admins need to be convinced that support is gone and catastrophes loom (e.g. ITD can disconnect machines from the network)
  • Switch to using dbbak for daily DB backups not yet done
  • DB flush planned for Monday, and that should be it until online daemons start up (awaits EPICS ramp up)
  • Jérôme mentioned workflow tests are in order for QA-RunLog-ShiftLog interconnections
  • Discussion of RTS/ShiftLeader status editing/updating (continued post-meeting):
    • Conclusion seemed to be that the current set-up is what is needed, but that edits/updates should be in the hands of shift leaders. This policy should be made very clear and known.

 

Run 11 preparation meeting #1

2010-11-05 16:00
2010-11-05 17:00
America/New York
Friday, 5 November 2010
1-189, EVO, at 20:00 (GMT), duration : 01:00
For guidance, our flagship projects for Year 10 from last
year included:
* Online storage / central storage
* Revamping the online network
* Auto-calib and Auto-QA projects
* New online plots
* Try to make good on code in CVS

We would need to hear about
- QA (online and offline) - assume Gene will compile a summary
- Online tools ShiftSignup, RunLog, ShiftLog - Dmitry & Leve
- Production requests during the run - Lidia
- DB issues if any remaining - Dmitry
- Storage and other issues - Wayne

 As first action items, we will also need, as every year,
to (a) poll users for storage requirements online and (b) be sure
all DB back-ends supporting all tools are rotated and ready (and all
links / generic names are as well for Run 11). If you need storage
space online and have an opinion, please voice it, especially if your
needs diverge from past years'.

Software and Computing phone meeting

2010-11-03 12:00
2010-11-03 13:00
America/New York
Wednesday, 3 November 2010
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time  Talk  Presenter
12:00  Calibration status for run 10, 200 GeV (00:20), 0 files, Grant Webb (UKY)
12:20  Library readiness (00:20), 0 files, Lidia Didenko (BNL)
12:40  AOB (00:20), 1 file, All (All)

201009

Topics were:

Software and Computing phone meeting

2010-09-22 12:00
2010-09-22 13:00
America/New York
Wednesday, 22 September 2010
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time  Talk  Presenter
12:00  Production status for Run10 (00:20), 0 files, Lidia Didenko (BNL)
12:20  Calibration status for Au+Au 200 GeV, run 10 (00:15), 1 file, Grant Webb (UKY)
12:35  TPC calibration status details for Run 10 data (00:15), 0 files, Maxim Naglis (LBNL)
12:50  AOB (00:10), 0 files, All (All)

AOB


Talk time : 12:30, Duration : 00:10

Run 10 production status

Speaker : L. Didenko ( BNL )


Talk time : 12:20, Duration : 00:10

VPD-TOF Calibration for 39 GeV

Speaker : R. D. de Souza ( IFUSP )


Talk time : 12:00, Duration : 00:20

 Please see slides here.

Software and Computing phone meeting

2010-09-08 12:00
2010-09-08 13:00
America/New York
Wednesday, 8 September 2010
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time  Talk  Presenter
12:00  VPD-TOF Calibration for 39 GeV (00:20), 0 files, R. D. de Souza (IFUSP)
12:20  Run 10 production status (00:10), 0 files, L. Didenko (BNL)
12:30  AOB (00:10), 0 files

AOB


Talk time : 12:50, Duration : 00:10
  • Catching up on planned activities:
    • CuCu reproduction status
    • AuAu11 streams production status

Run 10 Calibrations Update

Speaker : G. Webb ( UK )


Talk time : 12:35, Duration : 00:15

TPC:

Limiting production of events with large hit counts

Speaker : G. Van Buren ( BNL )


Talk time : 12:20, Duration : 00:15

TOF in PPV

Speaker : R. Reed ( UC Davis )


Talk time : 12:00, Duration : 00:20

Software and Computing phone meeting

2010-09-01 12:00
2010-09-01 13:00
America/New York
Wednesday, 1 September 2010
1-189, EVO, at 16:00 (GMT), duration : 01:00
Time  Talk  Presenter
12:00  TOF in PPV (00:20), 1 file, R. Reed (UC Davis)
12:20  Limiting production of events with large hit counts (00:15), 0 files, G. Van Buren (BNL)
12:35  Run 10 Calibrations Update (00:15), 0 files, G. Webb (UK)
12:50  AOB (00:10), 0 files