CHEP talk on AutoQA

Talk drafts and proceedings are attached to this page.

_____________________________________________

Outline for discussion:

 

Goals

 

- Reduce the load on the QA shift crew per unit of data, thus enabling a greater amount of data to be examined per shift

+ Necessitated by the increase in the number of runs acquired per year & the increase in the number of subsystems (hence histograms) to examine

- Improve uniformity of issue reporting

+ More objective analysis

+ Tighter link between reported issues and observations

 

 

What's in place

 

- Reference histogram analysis codes (StAnalysisUtilities + macros)

+ Three inputs: data hist file, reference hist file, text file of histogram-specific analysis parameters

+ Three outputs: plots of data hists, plots of reference hists, text file of analysis results
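
As a concrete illustration of this input/output flow, below is a minimal ROOT macro sketch. The file names, histogram name, cut value, and the choice of a Kolmogorov test are hypothetical stand-ins for what the parameter text file actually specifies; this is not the StAnalysisUtilities code itself.

    // autoQaSketch.C -- minimal sketch of the comparison flow (names hypothetical)
    #include "TFile.h"
    #include "TH1.h"
    #include <cstdio>

    void autoQaSketch(const char* dataFileName = "data_hists.root",
                      const char* refFileName  = "ref_hists.root")
    {
      TFile dataFile(dataFileName);
      TFile refFile(refFileName);

      // In the real framework, the histogram list, comparison method & options,
      // and cut value come from the histogram-specific parameter text file.
      const char* histName = "someHist";   // hypothetical histogram name
      double cut = 0.05;                   // hypothetical cut value

      TH1* dataHist = (TH1*) dataFile.Get(histName);
      TH1* refHist  = (TH1*) refFile.Get(histName);
      if (!dataHist || !refHist) return;

      // One possible comparison method: a Kolmogorov-Smirnov compatibility test
      double result = dataHist->KolmogorovTest(refHist);

      // Third output: a text file of analysis results (plots of the data and
      // reference hists would be produced alongside)
      FILE* out = fopen("results.txt", "w");
      fprintf(out, "%s %f %s\n", histName, result, result > cut ? "pass" : "FAIL");
      fclose(out);
    }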

- Interface (led into from QA Browser) for...

+ Selecting reference sets for analysis

+ Viewing analysis results

+ Defining new reference histogram sets

+ Defining new histogram-specific analyses (comparison method & options, cut value)
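
For concreteness, a histogram-specific analysis parameter file along these lines might pair each histogram with a comparison method, method options, and a cut value. The format below is purely illustrative, not the actual file layout:

    # histogram        method        options    cut
    someTpcHist        chi2          UW         0.01
    someEmcHist        kolmogorov    -          0.05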

- Additional interface features:

+ Context-relevant help (a button next to each section brings up a new window)

+ Histogram descriptions

+ Combining of histogram files in the background while the user selects a reference (e.g. combining multiple runs)

+ Visual display of individual plots & their references, linked to PostScript versions (full plot files also available, as before)

+ Access to log files (as before)

+ Organization of individual histograms (including graphic layout) matching their arrangement in the full plot files (i.e. trigger type, page, cell)

+ Show all analysis results, or only failed ones (the default), in a table (a failed analysis roughly corresponds to a QA issue)

+ Trend viewer for results of analyses vs. time/run (see the sketch after this list)

+ Reference sets have tags for collision types, trigger setups, and versioning therein, plus a description

+ Disk caching of some database items (reference histogram sets, analysis cuts)
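
The trend viewer mentioned above could be rendered along the lines of the following ROOT sketch; the run numbers and results here are stand-ins for values read from the database:

    #include "TGraph.h"
    #include "TCanvas.h"

    // Sketch: trend of one histogram's analysis result vs. run number
    void trendSketch()
    {
      double run[5]    = {11001, 11002, 11003, 11004, 11005};
      double result[5] = {0.92, 0.88, 0.95, 0.40, 0.90};  // e.g. test probabilities

      TGraph* g = new TGraph(5, run, result);
      g->SetTitle("Analysis result vs. run;run number;result");
      new TCanvas;
      g->Draw("AL");   // axes + connecting line
    }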

- Additional automation

+ New cronjob set up by Elizabeth

+ Currently pre-combines histogram files (see the merging sketch below)

+ Tested at the end of Run 10

+ Possible tool to trigger analysis on processed/combined files
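
The pre-combining step amounts to merging ROOT histogram files, e.g. via the hadd utility or TFileMerger. The sketch below is generic (file names invented), not the actual cronjob script:

    #include "TFileMerger.h"

    // Sketch: merge per-job histogram files for a run into one combined file
    void combineSketch()
    {
      TFileMerger merger;
      merger.OutputFile("run11001_combined.root");
      merger.AddFile("run11001_file1.root");
      merger.AddFile("run11001_file2.root");
      merger.Merge();   // histograms with the same name are summed
    }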

 

 

Known work that needs to be done / thought about

 

- Possible automatic triggering of analysis [big item]

+ When to trigger (every time the cronjob runs and new files appear for a run number?)

+ Organization/presentation of results from automatic running (a web page in the same interface?)

+ Issue maintenance remains with the shift crew
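
One possible shape for the trigger check, purely as a sketch of the open question above: on each cron pass, compare the files present for a run against a record of what has already been processed, and trigger the analysis only when something new appears. The function and file names here are invented for illustration:

    #include <filesystem>
    #include <fstream>
    #include <set>
    #include <string>

    namespace fs = std::filesystem;

    // Sketch: true if histDir contains files not yet listed in seenListFile,
    // i.e. new files have appeared for this run since the last cron pass
    bool newFilesAppeared(const std::string& histDir, const std::string& seenListFile)
    {
      std::set<std::string> seen;
      std::ifstream in(seenListFile);
      for (std::string name; std::getline(in, name); ) seen.insert(name);

      bool foundNew = false;
      std::ofstream out(seenListFile, std::ios::app);
      for (const auto& entry : fs::directory_iterator(histDir)) {
        const std::string name = entry.path().filename().string();
        if (seen.insert(name).second) {   // not recorded before
          out << name << '\n';
          foundNew = true;                // caller would then trigger the analysis
        }
      }
      return foundNew;
    }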

 

- Access to trend plots without needing to select a dataset to examine (i.e. skip the browser)

- Back end for creating a new reference set as a copy of an old set with just a few specific histograms updated
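
One conceivable back end for this, sketched with invented file names: copy every histogram from the old reference file, preferring an updated version where one exists.

    #include "TFile.h"
    #include "TH1.h"
    #include "TKey.h"
    #include "TList.h"

    // Sketch: build a new reference set from an old one, replacing only the
    // histograms present in an "updates" file
    void updateReferenceSketch()
    {
      TFile oldRef("ref_v1.root");
      TFile updates("updated_hists.root");
      TFile newRef("ref_v2.root", "RECREATE");

      TIter nextKey(oldRef.GetListOfKeys());
      while (TKey* key = (TKey*) nextKey()) {
        TH1* hist = (TH1*) updates.Get(key->GetName());  // prefer the updated copy
        if (!hist) hist = (TH1*) oldRef.Get(key->GetName());
        newRef.cd();
        hist->Write();
      }
    }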

- "Marking" runs as examined (as per the QA Browser)

- Documentation:

+ Help text needs completion

+ Explanation of the code files needs to be added to the existing documentation

- Minor GUI glitches

 

- Get subsystems to select analysis parameters for their histograms at the start of Run 11 [big item]

 

 

Demo

Each link is the same analysis, but with a different username so that multiple people can run it simultaneously by clicking different links.

Link1

Link2

Link3

 

 

CHEP presentation

 

Title: "Automated QA framework for PetaScale data challenges – Overview and development in the RHIC/STAR experiment"

 

Abstract:

Over the lifetime of the STAR Experiment, a large investment of workforce time has gone into a variety of QA efforts, including continuous processing of a portion of the data for automated calibration, iterative convergence, and quality assurance purposes. A rotating workforce coupled with ever-increasing volumes of information to examine led to sometimes inconsistent or incomplete reporting of issues, eventually leading to additional work. The traditional approach of manually screening a data sample was no longer adequate and was doomed to eventual failure with the planned future growth in data extents. To prevent this collapse we have developed a new system employing user-defined reference histograms, permitting automated comparisons and flagging of issues. With the ROOT framework at its core, the system's front end is a Web-based service allowing shift personnel to visualize the results and to set test parameters and thresholds defining success or failure. The versatile and flexible approach allows for a slew of histograms to be configured and grouped into categories (results and thresholds may depend on experimental triggers and data types), ensuring framework evolution with the years of running to come. Historical information is also saved to track changes and allow for rapid convergence of future tuning. Database storage and processing of data are handled outside the web server for security and fault tolerance.

 

Allotted time: 15+3 minutes

[estimated number of slides in brackets]

 

Layout:

 

- Cover [1]

- Intro and challenges

+ Briefly introduce STAR & its "PetaScale" datasets [1]

+ Review the organization of QA (stages, shift crew) in STAR [1]

- Approach: automated comparison to reference histograms

+ Pros and cons of approach [1]

+ Methods involved in analysis [1]

+ Presentation of interface

* Defining references [1]

* Defining analysis parameters [1]

* Viewing analysis results [2]

* Viewing historical results (trends) [1]

- Future possibilities

+ Further automation of triggering [1]

- Summary [1]

 

Total: ~12 slides

 

Draft by Friday (Oct. 8)

Practice talk on Monday afternoon (Oct. 11)

 

__________________

 

Discussion on Oct. 6, 2010:

 

Attendees: Elizabeth, Gene

 

Feedback:

 

- Possibly have "questionable" in addition to pass/fail?

+ We both worried this might move back towards subjectivity

+ Complicates things for subsystems when deciding on parameters (must then also decide a range for questionable)

+ Best to set the cut such that questionable results are tagged as failed, so that there is at least an alert prompting the crew to take a closer look (at this time, a failure does not strictly have to be a QA issue)

+ One possibility: gradation of the red color used to indicate the severity of the result (i.e. a mild color when the result is just below the cut, but a strong color as the result approaches zero)
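
A tiny sketch of how such a gradation might be computed for the results table; the mapping is invented for illustration:

    #include <cstdio>
    #include <string>

    // Sketch: map a failed result (0 <= result < cut) to an HTML red shade,
    // pale just below the cut and fully saturated as the result approaches zero
    std::string severityColor(double result, double cut)
    {
      double severity = 1.0 - result / cut;        // 0 just below the cut, 1 at zero
      int level = (int)(200 * (1.0 - severity));   // green/blue components
      char buf[8];
      std::snprintf(buf, sizeof(buf), "#FF%02X%02X", (unsigned)level, (unsigned)level);
      return buf;   // e.g. "#FFC8C8" (pale) near the cut, "#FF0000" at zero
    }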

- Documentation also needs to be updated for the codes Elizabeth has maintained

+ I will point Elizabeth to the documentation I've made so far

- Trend plots could use small markers in addition to the lines to better show how many data points there are (a straight line may not make it obvious)
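
In ROOT this is a small change; assuming the trend is drawn as a TGraph (the variable name here is hypothetical):

    trendGraph->SetMarkerStyle(20);   // small filled circle at each point
    trendGraph->SetMarkerSize(0.5);
    trendGraph->Draw("ALP");          // A = axes, L = line, P = markers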

- The mechanism for marking reference histograms for update did not work with Firefox on Windows

+ Everything else appeared to work OK during the demo