2008

Background information

 

Projects and proposals

This page holds either requirements documents or project descriptions for R&D related activities in S&C (or defined activities, hopefully in progress).

 

  1. We proposed an R&D development within the ROOT framework to support full schema evolution as described in the project description
  2. We worked on the elaboration of the KISTI proposal to join STAR
  3. We supported a network upgrade for the RCF backbone (sent via Michael Ernst to Tom Ludlam, Physics department chair at BNL, on April 16th 2008; discussed at the Spokesperson meeting on April 4th 2008)
  4. For the 2008 Network Requirements Workshop of the DOE/SC Nuclear Physics Program Office, we provided background material as below (STAR sections and summary; the PHENIX portion was taken verbatim from their contribution)
  5. Trigger emulation / simulation framework: discussions from 20070528 and follow-up emails.

 

 

Ongoing activities

Internal projects and sub-systems task lists

 

Tasks and projects

Computing operation: IO performance measurements

Goals:

  • Provide a documented reference of IO performance tests made for several configurations in both disk formatting and RAID level, under non-constrained hardware considerations.
  • This baseline would help in making future configuration choices when it comes to hardware provisioning of servers (services) such as database servers, grid gatekeepers, network IO doors, etc.

Steps and tasks:

  1. Survey community work on the topic of IO performance of drives, especially topics concerning:
    1. Effect of disk format on performance
    2. Effect of parallelism on performance
    3. Effect of software RAID (Linux) on performance and responsiveness (load impact on the node under stress)
    4. Software RAID level and performance impacts
    5. Kernel parameter tweaks impacting IO performance (good examples are the efforts of the DAQ group; review their consequences)
       
  2. Prepare a baseline IO test suite for measuring IO performance (read and write) under the two modes below (a minimal sketch of such a two-mode measurement is given after this list). A possible test suite could follow what was used in the IO performance page. Other tools are welcome based upon the survey recommendations.
    • single stream IO
    • multi stream IO (parallel IO)
       
  3. Use a test node and measure IO performance under the diverse reviewed configurations. A few constraints on choice of hardware are needed to avoid biasing the performance results
    • The node should have sufficient memory to accommodate the tests (2 GB of memory or more is assumed to be sufficient for any of the tests)
    • OS must support software RAID
    • Disks used for the test should be isolated from the system drive to avoid performance degradation
    • Node should have more than two drives (including system disk) and ideally, at least 4 (3+1)
       
  4. Present result as a function of disk formatting, RAID level and/or number of drives added in both absolute values (values for each configuration) and differentials (gain when moving from one configuration to another).
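As an illustration of the two measurement modes in step 2, the sketch below times a fixed data volume written as a single stream and then split over several parallel streams. It is only a minimal example (hypothetical mount point and sizes, Python used for illustration); actual measurements would more likely rely on established tools such as dd, iozone or bonnie++.

    # Minimal sketch of the two IO measurement modes (single stream vs. parallel streams).
    # The mount point, data volume and stream counts are illustrative only.
    import os, time
    from multiprocessing import Process

    TEST_DIR = "/mnt/raidtest"          # hypothetical mount point of the array under test
    TOTAL_MB = 4096                     # total volume written; should exceed RAM to limit caching
    CHUNK    = b"\0" * (1024 * 1024)    # 1 MB write unit

    def write_file(path, size_mb):
        """Sequentially write size_mb megabytes to path, then flush to disk."""
        with open(path, "wb") as f:
            for _ in range(size_mb):
                f.write(CHUNK)
            f.flush()
            os.fsync(f.fileno())

    def measure(n_streams):
        """Return the aggregate write throughput (MB/s) for n_streams parallel writers."""
        per_stream = TOTAL_MB // n_streams
        paths = [os.path.join(TEST_DIR, "stream%d.dat" % i) for i in range(n_streams)]
        start = time.time()
        procs = [Process(target=write_file, args=(p, per_stream)) for p in paths]
        for p in procs: p.start()
        for p in procs: p.join()
        elapsed = time.time() - start
        for p in paths: os.remove(p)
        return TOTAL_MB / elapsed

    if __name__ == "__main__":
        os.makedirs(TEST_DIR, exist_ok=True)
        print("single stream : %.1f MB/s" % measure(1))
        print("4 streams     : %.1f MB/s" % measure(4))

A read test would follow the same pattern, with the page cache dropped between passes.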

Status: See results on Disk IO testing, comparative study 2008.

Open projects and activities listing

A summary of ongoing and upcoming projects was sent to the software coordinators for feedback. The document refers to projects listed in this section under Projects and proposals.

The list below does NOT include general tasks such as those described as part of the S&C core team roles defined in the Organization job descriptions documents. Examples would be global tracking with silicon (including the HFT), geometry maintenance and updates, or calibration and production tasks as typically carried out over the past few years. Neither does this list include improvements needed in areas such as online computing (many infrastructure issues, including networking, an area whose responsibility has been unclear at best), nor activities such as the development and enhancement of the Drupal project (requirements and plans sent here).

The list includes:

  • Closer look at Calorimetry issues, if any (follow-up on 2007 operation workshop feedback related to calibration being too "TPC centric" and not addressing Physics qualities). Proposed a workshop with goals to:
    • gather requirements from the PWG (statements from the operation workshop in 2007 seemed to have taken the EMC coordinators by surprise as to what resolution was needed to achieve the Physics goals)
    • discuss with experts technical details and implementation, unrolling / deployment and timing
  • Status: Underway, see report from a review as PSN0465 : EMC Calibrations Workshop report, fall 2008

  • Db related: load balancing improvements, monitoring and performance measurements, resource discovery, distributed database
    Status: underway.
    References: (access-restricted node)
     
  • Trigger simulations (somewhat fleshed out in May 2007 as mentioned in this S&C meeting and attached below). The general idea was to provide a framework allowing trigger emulation / simulation offline for studying rejection/selection effects, either by applying trigger algorithms to real data (minimum bias) or to true simulation, or by re-applying trigger algorithms to an already triggered sample (with a higher threshold for example)
    Status: nowhere close to where it should be
    References: trigger simulation discussions meeting notes and Email communications.
     
  • Embedding framework reshape.
    Status: underway (needs full evaluation with SVT and SSD integrated)
     
  • Unified online/offline framework including integration of the online reader offline and of offline tools online (leveraging knowledge, minimizing work). This task would address comments and concerns that whenever code is developed online (for PPlot purposes for example), it also needs to be developed offline within separate and very different reader approaches. At a higher level, a dramatic memory overwrite occurred offline in early 2007 due to the lack of synchronization between structure sizes (the information did NOT propagate and was not adjusted offline by the software sub-system coordinator of interest; an entire production had to be re-run).
    Status: tasked and underway, first version delivered in 2008, usage of "cons" and regression testing in principle in place (TBC in 2009 run)
     
  • EventDisplay revisited
    Status: underway (are we done? needs a new review follow-up after the pre-review meeting held in 2007)
     
  • VMC - realistic geometry / geometry description
    Status: Project on hold due to reconstruction issues, resumed July 2008.
     
  • Forward tracking (radial field issue). May be of importance for the FGT project, pending an understanding of its schedule.
    Status: depends on the previous item and would be tasked whenever the forward tracking needs are better defined.
     
  • Old framework cleanup, table cleanup, dropping old formats and historical baggage. In principle a framework task, this is bound to introduce instabilities during which assembling a production library would be challenging. This needs to be tasked outside major development projects.
    Status: depends only on the start-up of the Year 7/8 production
     
  • Multi-core CPU era - The task force assembled in 2007 (Multi-core CPU era task force) reached the unfortunate conclusion that the work would be too hard, hence not necessary. Unfortunately, market developments and the aggressive industry push toward ever more packed CPUs and cores indicate that the future must integrate this new paradigm. First attempts should target the "obvious".
    Status: First status and proposal made at ACAT08 (changing chains to accommodate possible parallelism). Investigated the possibility of parallelism at the library level and in core algorithms (tracking). Talks at ACAT08 were very informative.
     
  • Automated QA (project draft available, Kolmogorov etc... discussed and summarized here)
    Status: no project drafted yet, only live discussions and Email communications.
     
  • Automated calibration. The main project objective is to move toward a more automated calibration framework, in which migration from one chain to another (distortion correction) would be triggered by a criterion (resolution, convergence) rather than by a manual change (a schematic sketch of such a criterion-driven switch is given after this list). This work may leverage the FastOffline framework (which was a first attempt at making automated calibration a reality; it is currently modified by hand and the trigger mechanism is not present / implemented).
    Status: Project description available. Summer 08 service task.
     
  • IO schema evolution (reduction of file size by dropping redundant variables but with full transparency to users)
    Status: Project started as planned on July 16th with goals drafted on the page Projects and proposals. The project deliverables were achieved (tested with a custom ROOT version, now in the ROOT main CVS). A future release will include a fully functional schema evolution as specified in our document. Integration will be needed.
    Project team: Jerome Lauret (coordination), Valeri Fine (STAR testing), Philippe Canal (ROOT team)

     
  • Distributed storage improvement (efficient dynamic disk population). This project aims to restore the dynamic disk population of datasets on distributed disk, as well as to add a prioritization mechanism (and possibly bandwidth throttling) so users cannot over-subscribe storage, which in the past caused massive delete/restore cycles and a drop in efficiency.
    Status: undergraduate thesis done; a model to improve IO in/out of HPSS is defined and needs implementation.
     
  • Efficient multi-site data transfer (coordination of data movement). This project aims to address multi-Tier2 data transfer support and to help organize / best utilize the bandwidth out of BNL. A second part of this project aims at data placement on the Grid, in which a "task" working on a dataset is to be scheduled making use of files already staged at sites, or with possible pre-staging or migration of files from any site to any site (a bit ambitious).
    Status: Project started as a computer science PhD program (thesis submitted). Work is scheduled over a three-year period and the deliverables would need to be put in perspective with the Grid project deliverables.

     
  • Distributed production and monitoring system, job monitoring, centralized production requests interface
    Status: work tasked within the production team.
     
  • FileCatalog improvement. The FileCatalog in STAR was developed from in-house knowledge and support (starting from service work). The catalog now holds 15 million records (scalability beyond this is a concern) and its access is possibly inefficient. An initial design diverging from the Meta-Data Catalog / File Catalog / Replica Catalog separation allowed for a quick start and the development of additional infrastructure, but has also led to replication of the meta-data information, making it hard to maintain consistency of the catalogs across sites. Federating the catalogs and using all sites' information simultaneously has been marginal to not possible, making a global namespace (replicas) not possible. The lack of this component will directly affect grid realities.
    Status: Ongoing (see Catalog centralized load management, resolving slow queries).
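Referring to the automated calibration item above, a purely schematic sketch of a criterion-driven chain switch is shown below; all names, thresholds and the stubbed calibration pass are illustrative and do not correspond to actual STAR framework calls.

    import random

    # Schematic only: a calibration chain is advanced automatically once a
    # convergence criterion (here a residual resolution) is met, instead of a
    # manual change between passes. The stub stands in for a real
    # (FastOffline-like) calibration pass.
    CHAINS = ["tpc_t0", "space_charge", "distortion_correction"]   # illustrative order
    RESOLUTION_GOAL = 0.05        # illustrative figure-of-merit threshold
    MAX_ITERATIONS  = 10

    def run_pass_and_measure(chain, iteration):
        """Stub for one calibration pass; returns a fake, improving resolution."""
        return random.uniform(0.0, 1.0) / (iteration + 1)

    def calibrate(chain):
        """Iterate one chain until the criterion is met or the iteration cap is hit."""
        for iteration in range(MAX_ITERATIONS):
            resolution = run_pass_and_measure(chain, iteration)
            if resolution < RESOLUTION_GOAL:            # criterion reached, move on
                print("%s converged after %d pass(es)" % (chain, iteration + 1))
                return True
        print("%s did not converge, expert intervention needed" % chain)
        return False

    for chain in CHAINS:
        if not calibrate(chain):
            break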

Wish list (for now):

  • Online tracking & High Level Trigger. This may depend on a trigger simulation framework (it would certainly have benefited from one) or may be an opportunity to revive the issue and shape a new, focused (and reduced in scope) project.
    Status: How to fit this additional activity is under debate. First discussion held at BNL on 2008/07/10 and followed later by additional meetings. This activity moved to the "upgrade" activity.

 

STAR/RCF resource plans

 

 

General fund

 The level of funding planned for 2008 was:

  • According to the RHIC mid-term strategic planning for 2006-2011 document, the budget for 2008 was projected to be 2140 k$ (table 7-2), with a note that an additional 2 M$ would be needed between FY08 and FY10 (to accommodate network infrastructure, storage robotics and silo expansion, and general infrastructure changes)
  • The budget planned for FY08 in FY07 was 2.5 M$, accounting for a 0.5 M$ recovery of shortfalls already present in past years
  • The current budget available is 1.7 M$ with a 1.5 M$ usable base fund.

External funds

Following the previous years' "outsourcing" of funds approach, a note was sent to the STAR collaboration (Subject: RCF requirements & purchase) on 3/31/2008 12:18. The pricing offered was 4.2 $/GB, i.e. about 4.3 k$/TB of usable space. Based on the 2007 RCF requirement learning experience (pricing had been based on the vendor's total space rather than usable space), the price was firmed up, fixed and guaranteed as "not higher than 4.2 $/GB" by the facility director Michael Ernst at the March 27th liaison meeting.

The institutions external fund profile for 2008 is as follows:

 

STAR external funds
Institution   Paying account   TB requested   Price ($)
UCLA          UCLA             1              4,300.80
Rice          Rice             1              4,300.80
LBNL          LBNL             4              17,203.20
VECC          BNL              1              4,300.80
UKY           UKY              1              4,300.80
Totals                         8              34,406.40

Penn State University provided (late) funds for 1 TB worth.
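For reference, the per-TB amounts in the table above follow directly from the quoted 4.2 $/GB; a minimal check (illustrative script, not part of the original accounting):

    # 4.2 $/GB translates to 4.2 * 1024 = 4300.8 $/TB of usable space.
    PRICE_PER_GB = 4.2
    PRICE_PER_TB = PRICE_PER_GB * 1024                 # 4300.8 $/TB

    requests = {"UCLA": 1, "Rice": 1, "LBNL": 4, "VECC": 1, "UKY": 1}   # TB requested
    for inst, tb in requests.items():
        print("%-5s %d TB -> $%.1f" % (inst, tb, tb * PRICE_PER_TB))
    total_tb = sum(requests.values())
    print("Total %d TB -> $%.1f" % (total_tb, total_tb * PRICE_PER_TB))  # 8 TB -> $34406.4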

 

*** WORK IN PROGRESS ***

Requirements 

The requirements for FY08 are determined based on 

The initial STAR requirements provided for the RHIC mid-term strategic plan can be found here

 

STAR resource requirements FY05-FY12

 

The initial raw data projection was 870 TB (+310 TB).

The RAW data volume taken by STAR in FY08 (shorter run) is given by the HPSS usage (RAW COS) as shown below:


A total of 165 TB was accumulated, below the expected data projections by a factor of 2. The run was however declared as meeting (to exceeding) its goals compared to the STAR initial BUR.

Some notes:

  • STAR made extensive use this year of fast triggers
     
  • Based on those numbers, we assumed that
    • The CPU requirements of 1532 kSI2k (+1071 kSI2k) would scale equally, hence a minimal requirement of +215 kSI2k should be accounted for
    • A bigger pool of distributed storage would allow for more flexibility: it would allow re-considering placing multiple (if not most) of the datasets on disk in the Xrootd pool, and it would allow (modulo expanding beyond the 1.2 replication baseline) better load balancing of the resources.
    • The distributed disk planning accounted for 365 TB of storage (1 pass production, small fraction of past results on disk). We targeted 800 TB of disk space (about twice the initial amount).

Allocations within total budgets

scenario B = scenario A + external funds

 

Experiment Parameters STAR STAR
  Scenario A Scenario B
Sustained d-Au Data Rate (MB/sec) 70 70
Sustained p-p Data Rate (MB/sec) 50 50
Experiment Efficiency (d-Au) 90% 90%
Experiment Efficiency (p-p) 90% 90%
Estimated d-Au Raw Data Volume (TB) 130.8 130.8
Estimated p-p Raw Data Volume (TB) 41.5 41.5
Estimated Raw Data Volume (TB) 172.3 172.3
<d-Au Event Size> (MB) 1 1
<p-p Event Size> (MB) 0.4 0.4
Estimated Number of Raw d-Au Events 137,168,640 137,168,640
Estimated Number of Raw p-p Events 108,864,000 108,864,000
d-Au Event Reconstruction Time (sec) 9 9
p-p Event Reconstruction Time (sec) 16 16
SI2000-sec/event d-Au 5202 5202
SI2000-sec/event p-p 9248 9248
CPU Required (kSI2000-sec) 1.7E+9 1.7E+9
CRS Farm Size if take 1 Yr. (kSI2k) 54.6 54.6
CRS Farm Size if take 6 Mo. (kSI2k) 109.1 109.1
     
Estimated Derived Data Volume (TB) 200.0 200.0
Estimated CAS Farm Size (kSI2k) 400.0 400.0
     
Total Farm Size (1 Yr. CRS) (kSI2k) 454.6 454.6
Total Farm Size (6 Mo. CRS) (kSI2k) 509.1 509.1
     
Current Central Disk  (TB) 82 82
Current Distributed Disk (TB) 527.5 527.5
Current kSI2000 1819.4 1819.4
     
Central Disk to retire (TB) 0 0
# machines to retire from CAS 0 0
# machines to retire from CRS 128 128
Distributed disk to retire (TB) 27.00 27.00
CPU to retire (kSI2k) 120.00 120.00
     
Central Disk (TB) 49.00 57.00
     
Cost of Central Disk $205,721.60 $239,308.80
Cost of Servers to support Central Disk    
     
Compensation Disk entitled (TB) 0.00 0.00
Amount (up to entitlement) (TB) 0.00 0.00
Cost of Compensation Disk $0 $0
Remaining Funds $0 $0
     
Compensation count (1U, 4 GB below) 5 5
Compensation count (1U, 8 GB below) 0 0
CPU Cost $27,500 $27,500
Distributed Disk 27.8 27.8
kSI2k 114.5 114.5
     
     
# 2U, 8 cores, 5900 GB disk, 8 GB RAM 27 27
# 2U, 8 cores, 5900 GB disk, 16 GB RAM 0 0
CPU Cost $148,500 $148,500
Distrib. Disk on new machines (TB) 153.9 153.9
kSI2k new 618.2 618.2
Total Disk (TB) 813.2 821.2
Total CPU (kSI2000) 2432.1 2432.1
Total Cost $354,222 $387,809
Outside Funds Available $0 $34,406
Funds Available $355,000 $355,000
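The derived numbers in the table (SI2000-sec/event, total CPU needed and CRS farm sizes) are mutually consistent; the sketch below reproduces them assuming a reference CPU of about 578 SI2k and a processing year of about 3.15e7 seconds, both inferred from the table rather than stated explicitly.

    # Cross-check of the CRS farm sizing in the table above (scenarios A and B are identical here).
    # Assumptions inferred from the numbers, not stated in the source:
    #   - a reference CPU of ~578 SI2k (so SI2000-sec/event = reconstruction time * 578)
    #   - one year of processing ~ 3.15e7 seconds
    SI2K_PER_CPU = 578.0
    YEAR_SEC     = 3.15e7

    n_dau, t_dau = 137168640, 9       # raw d-Au events, reconstruction time (sec/event)
    n_pp,  t_pp  = 108864000, 16      # raw p-p events,  reconstruction time (sec/event)

    si2k_sec = (n_dau * t_dau + n_pp * t_pp) * SI2K_PER_CPU
    print("SI2000-sec/event d-Au : %.0f" % (t_dau * SI2K_PER_CPU))                  # ~5202
    print("SI2000-sec/event p-p  : %.0f" % (t_pp  * SI2K_PER_CPU))                  # ~9248
    print("CPU required          : %.2g kSI2000-sec" % (si2k_sec / 1e3))            # ~1.7e9
    print("CRS farm, 1 year      : %.1f kSI2k" % (si2k_sec / YEAR_SEC / 1e3))       # ~54.6
    print("CRS farm, 6 months    : %.1f kSI2k" % (si2k_sec / (YEAR_SEC / 2) / 1e3)) # ~109
    # Raw data volume check from event counts and event sizes:
    print("d-Au raw volume       : %.1f TB" % (n_dau * 1.0 / 1024**2))              # ~130.8
    print("p-p raw volume        : %.1f TB" % (n_pp  * 0.4 / 1024**2))              # ~41.5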

 

Post purchase actions

BlueArc disk layout before the new storage commissioning

 

Per file system: space allocated, available space and (where given) the BlueArc physical storage. Per volume: name, path and hard quota in TB (a "?" marks an unknown quota).

STAR-FS01 (BlueArc physical storage BA01): space allocated 16.50 TB, available space 19.00 TB
  star_institutions_bnl    /star_institution/bnl    3.50
  star_institutions_emn    /star_institution/emn    1.60
  star_institutions_iucf   /star_institution/iucf   0.80
  star_institutions_ksu    /star_institution/ksu    0.80
  star_institutions_lbl    /star_institution/lbl    9.80

STAR-FS02: space allocated 17.22 TB, available space 19.75 TB
  star_data03   /star_data03   1.80
  star_data04   /star_data04   1.00
  star_data08   /star_data08   1.00
  star_data09   /star_data09   1.00
  star_data16   /star_data16   1.66
  star_data25   /star_data25   0.83
  star_data26   /star_data26   0.84
  star_data31   /star_data31   0.83
  star_data36   /star_data36   1.66
  star_data46   /star_data46   6.60

STAR-FS03 (BlueArc physical storage BA02): space allocated 18.51 TB, available space 21.40 TB
  star_data05   /star_data05   2.24
  star_data13   /star_data13   1.79
  star_data34   /star_data34   1.79
  star_data35   /star_data35   1.79
  star_data48   /star_data48   6.40
  star_data53   /star_data53   1.50
  star_data54   /star_data54   1.50
  star_data55   /star_data55   1.50

STAR-FS04: space allocated 16.86 TB, available space 19.45 TB
  star_data18              /star_data18              1.00
  star_data19              /star_data19              0.80
  star_data20              /star_data20              0.80
  star_data21              /star_data21              0.80
  star_data22              /star_data22              0.80
  star_data27              /star_data27              0.80
  star_data47              /star_data47              6.60
  star_institutions_mit    /star_institutions/mit    0.96
  star_institutions_ucla   /star_institutions/ucla   1.60
  star_institutions_uta    /star_institutions/uta    0.80
  star_institutions_vecc   /star_institutions/vecc   0.80
  star_rcf                 /star_rcf                 1.10

STAR-FS05 (BlueArc physical storage BA4): space allocated 1.042 TB, available space 2.05 TB
  star_emc         /star_emc         ?
  star_grid        /star_grid        0.05
  star_scr2a       /star_scr2a       ?
  star_scr2b       /star_scr2b       ?
  star_starlib     /star_starlib     0.02
  star_stsg        /star_stsg        ?
  star_svt         /star_svt         ?
  star_timelapse   /star_timelapse   ?
  star_tof         /star_tof         ?
  star_tpc         /star_tpc         ?
  star_tpctest     /star_tpctest     ?
  star_trg         /star_trg         ?
  star_trga        /star_trga        ?
  star_u           /star_u           0.97
  star_xtp         /star_xtp         0.002

STAR-FS06: space allocated 14.94 TB, available space 16.90 TB
  star_data01   /star_data01   0.83
  star_data02   /star_data02   0.79
  star_data06   /star_data06   0.79
  star_data14   /star_data14   0.89
  star_data15   /star_data15   0.89
  star_data38   /star_data38   1.79
  star_data39   /star_data39   1.79
  star_data40   /star_data40   1.79
  star_data41   /star_data41   1.79
  star_data43   /star_data43   1.79
  star_simu     /star_simu     1.80

STAR-FS07: space allocated 16.40 TB, available space 19.15 TB
  star_data07   /star_data07   0.89
  star_data10   /star_data10   0.89
  star_data12   /star_data12   0.76
  star_data17   /star_data17   0.89
  star_data24   /star_data24   0.89
  star_data28   /star_data28   0.89
  star_data29   /star_data29   0.89
  star_data30   /star_data30   0.89
  star_data32   /star_data32   1.75
  star_data33   /star_data33   0.89
  star_data37   /star_data37   1.66
  star_data42   /star_data42   1.66
  star_data44   /star_data44   1.79
  star_data45   /star_data45   1.66

Reshape proposal

 

 

 

 

Action effect (+/- impact, in TB)

 

Each action is listed with its date (where given) and its per-file-system impact in TB.

2008/08/15  Move/backup data25, 26, 31, 36 to SATA: FS02 +4.56, SATA -4.56
2008/08/18  Drop 25, 26, 31, 36 from FS01 and expand on SATA to 5 TB: SATA -15.84
2008/08/22  Shrink 46 to 5 TB, move to SATA and make it available at 5 TB: FS02 +6.60, SATA -5.00
2008/08/19  Move institutions/ksu and institutions/iucf to FS02: FS01 +1.60, FS02 -1.60
2008/08/19  Expand ksu and iucf to 2 TB: FS02 -0.80
2008/08/22  Move institutions/bnl to FS02: FS01 +3.50, FS02 -3.50
            Expand bnl to 4 TB: FS02 -0.50
            Expand lbl by 4.2 TB (i.e. 14 TB): FS01 -4.20
            Expand emn to 2 TB: FS01 -0.40
            Expand data03 to 2.5 TB: FS02 -0.70
            Expand data04 to 2 TB: FS02 -1.00
            Expand data08 to 2 TB: FS02 -1.00
            Expand data16 to 2 TB: FS02 -0.34
            Expand data09 to 2 TB: FS02 -1.00
Checkpoint: FS01 +0.50, FS02 +0.72, FS03 0.00, FS04 0.00, FS05 0.00, FS06 0.00, FS07 0.00, SATA -25.40

2008/08/22  Shrink data48 to 5 TB, move to SATA: FS03 +6.40, SATA -5.00
            Expand data05 to 3 TB: FS03 -0.76
            Expand 13, 34, 35, 53, 54 and 55 to 2.5 TB: FS03 -5.13
2008/08/22  Shrink and move data47 to SATA: FS04 +6.60, SATA -5.00
            Move 18, 19, 20, 21 to SATA: FS04 +3.40, SATA -3.40
            Expand data18, 19, 20, 21 to 2.5 TB: SATA -6.60
            Add to FS02 a institutions/uky at 1 TB: FS04 -1.00
            Add to FS02 a institutions/psu at 1 TB: FS04 -1.00
            Add to FS02 a institutions/rice at 1 TB: FS04 -1.00
            Expand vecc to 2 TB: FS04 -1.20
            Expand ucla to 3 TB: FS04 -1.40
            Expand 22 and 27 to 1.5 TB: FS04 -1.40
            Expand /star/rcf to 3 TB: FS04 -1.90
Checkpoint: FS01 +0.50, FS02 +0.72, FS03 +0.51, FS04 +1.10, FS05 0.00, FS06 0.00, FS07 0.00, SATA -45.40

            Free (HPSS archive) emc, src2a, src2b, stsg, timelapse, tof: FS05 0.00
            Free (HPSS archive) tpc, tpctest, trg, trga: FS05 0.00
            Move 40, 41, 43 to SATA: FS06 +5.37, SATA -5.37
            Expand 01 to 2 TB: FS06 -1.17
            Expand 02 to 2 TB: FS06 -1.21
            Expand star_simu to 3 TB: FS06 -1.20
Checkpoint: FS01 +0.50, FS02 +0.72, FS03 +0.51, FS04 +1.10, FS05 0.00, FS06 +1.79, FS07 0.00, SATA -50.77
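The checkpoint rows above are running sums of the per-file-system deltas of the preceding actions; the minimal bookkeeping sketch below reproduces the idea using only the first three actions (with the complete action list the totals reproduce the checkpoint rows).

    # Running per-file-system balance for the reshape actions (impact values in TB, as
    # tabulated above; positive values appear to be space freed, negative space consumed).
    # Only the first three actions are listed as an example.
    actions = [
        ("2008/08/15", "Move/backup data25, 26, 31, 36 to SATA",         {"FS02": +4.56, "SATA": -4.56}),
        ("2008/08/18", "Drop 25, 26, 31, 36 and expand on SATA to 5 TB", {"SATA": -15.84}),
        ("2008/08/22", "Shrink 46 to 5 TB, move to SATA",                {"FS02": +6.60, "SATA": -5.00}),
    ]

    balance = {}
    for date, action, deltas in actions:
        for fs, delta in deltas.items():
            balance[fs] = balance.get(fs, 0.0) + delta

    for fs in sorted(balance):
        print("%s %+6.2f TB" % (fs, balance[fs]))    # FS02 +11.16, SATA -25.40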


 


Missing information and progress records:

  • 2008/08/14 13:44 - Answer from the RCF that the above plan is approved (with a comment that it seemed easy)
    • Two caveats: an ETA cannot be provided until a migration starts (one test example) to get a more accurate estimate
    • While virtual mount points are swapped from one storage pool to another, there may be a hiccup in access (institutions will need to be informed / production disks will be handled by a hard dismount)
       
  • 2008/08/14 13:42 - Sent an email requesting information regarding the disk manager and/or policies for PSU, UKY and Rice - email sent to the council rep and/or designated rep on August 14th 2008
    • Answer from UKY 2008/08/14 13:56 >> Disk space manager=Renee Fatemi, policy = MIT policy
    • Answer from PSU 2008/08/15 16:07 >> Policy is standard
       
  • 2008/08/18
    • Achieved actions are marked in italics with their date
    • Dates in italics are ongoing actions
    • If two dates appear, the first is the start of the action and the second the end