CPU and bulk storage purchase 2010

 

Institutional disk space summary

The announcement for institutional disk space was made on starmail on 2010/04/26 12:31.

To date, the following requests were made (either in $ or in TB):

 

| Institution | Contact | Date | Requested (k$) | TB equivalent | Final cost |
|---|---|---|---|---|---|
| LBNL | Hans Georg Ritter | 2010/04/26 15:24 | 20 | 5 | $17,006.00 |
| ANL | Harold Spinka | 2010/04/26 16:29 | - | 1 | $3,401.00 |
| UCLA | Huan Huang | 2010/04/26 16:29 | - | 1 | $3,401.00 |
| UTA | Jerry Hoffmann | 2010/04/27 14:59 | - | 1 | $3,401.00 |
| NPI | Michal Sumbera & Jana Bielcikova | 2010/04/20 10:00 | 30 | 8 | $27,210.00 |
| PSU | Steven Heppelmann | 2010/04/29 16:00 | - | 1 | $3,401.00 |
| BNL | Jamie Dunlop | 2010/04/29 16:45 | - | 5 | $17,006.00 |
| IUCF | Will Jacobs | 2010/04/29 20:18 | - | 2 | $6,802.00 |
| MIT | Bernd Surrow | 2010/05/08 18:07 | - | 2 | $6,802.00 |
| Totals | | | | 26 | $88,430.00 |

The storage cost for 2010 was estimated at 3.4 k$/TB. Detailed pricing is given below.

 

Central storage cost estimates

Since the existing storage is stretching the number of servers and their scalability, we would (must) buy a pair of Mercury servers, which recently cost us $95,937. The storage itself would be priced on a recent quote for the following configuration: (96) 1 TB SATA drives at $85,231, plus $2,500 for installation, yielding 54 TB usable. STAR's target is 50 TB for production +5+10 TB for institutions (it will fit and can be slightly expanded). The total cost is hence:

$95,937 + $85,231 + $2,500 = $183,668, and $183,668 / 54 TB ≈ $3,401/TB
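The per-institution costs in the summary table can be reproduced from this rate. The following is a minimal sketch, assuming each cost is the unrounded per-TB rate times the requested TB, truncated to whole dollars. The truncation rule is inferred from the table values and is not stated in the text; the variable and function names are illustrative only.

```python
import math

# Figures quoted in the text above
SERVER_PAIR = 95_937   # pair of servers ($)
DRIVES = 85_231        # (96) 1 TB SATA drives ($)
INSTALL = 2_500        # installation ($)
USABLE_TB = 54         # usable capacity of the quoted configuration

total_cost = SERVER_PAIR + DRIVES + INSTALL
rate_per_tb = total_cost / USABLE_TB   # about $3,401 per TB

# TB requested per institution, taken from the summary table
requests_tb = {
    "LBNL": 5, "ANL": 1, "UCLA": 1, "UTA": 1, "NPI": 8,
    "PSU": 1, "BNL": 5, "IUCF": 2, "MIT": 2,
}

def institution_cost(tb):
    """Cost in whole dollars; truncation is an assumption, see lead-in."""
    return math.floor(tb * rate_per_tb)

costs = {name: institution_cost(tb) for name, tb in requests_tb.items()}

for name, cost in costs.items():
    print(f"{name:5s} {requests_tb[name]:2d} TB  ${cost:,}")
print(f"Total {sum(requests_tb.values())} TB  ${sum(costs.values()):,}")
```

Under these assumptions the individual costs and their sum agree with the "Final cost" column.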

Detailed cost projections indicate that, depending on the global volume, better pricing may be possible: the installation price (half a day of work for a tech from BlueArc) is fixed, and each server pair could hold more than the planned storage (hence the cost of the two servers is also fixed). A few configurations are shown below:

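| Item ($) | 54 TB | 27 TB | 54+27 TB |
|---|---|---|---|
| Service installation | 2500 | 2500 | 2500 |
| Cost for 54+27 TB | | | 127846.5 |
| Cost per 54 TB | 85231 | | |
| Cost per 27 TB | | 42615.5 | |
| Two servers | 95937 | 95937 | 95937 |
| Price with servers | 183668 | 141052.5 | 226283.5 |
| Price per TB | 3401.3 | 5224.2 | 3187.1 |
| Price per MB | 0.003244 | 0.004982154 | 0.003039447 |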
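The "Price with servers" and "Price per MB" rows can be checked with a few lines of arithmetic. This is a sketch only: the names are illustrative, and the per-MB conversion assumes 1 TB = 1024 x 1024 MB, which is what the quoted figures appear to use. The per-TB prices are taken as quoted rather than recomputed.

```python
# Fixed costs shared by every configuration ($)
INSTALL = 2_500
SERVER_PAIR = 95_937

# Storage cost per configuration ($), from the table
storage_cost = {
    "54 TB": 85_231.0,
    "27 TB": 42_615.5,
    "54+27 TB": 127_846.5,
}

# Price per TB as quoted in the table ($)
quoted_price_per_tb = {
    "54 TB": 3401.3,
    "27 TB": 5224.2,
    "54+27 TB": 3187.1,
}

MB_PER_TB = 1024 * 1024  # assumption: binary units

def price_with_servers(storage):
    """Installation + storage + server pair."""
    return INSTALL + storage + SERVER_PAIR

totals = {cfg: price_with_servers(c) for cfg, c in storage_cost.items()}
per_mb = {cfg: p / MB_PER_TB for cfg, p in quoted_price_per_tb.items()}

for cfg in storage_cost:
    print(f"{cfg:9s} total ${totals[cfg]:>10,.1f}  per MB ${per_mb[cfg]:.6f}")
```

Because the installation and server costs are fixed, the small 27 TB configuration carries them over far fewer TB, which is why its per-TB price is the highest of the three.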

 

CPU estimates, choices, checks

Additional CPU need, projected / allowed under the funding guidance (see CSN0474: The STAR Computing Resource Plan, 2009): 7440 kSI2k / 2436 kSI2k, a projected 43% shortage.
Projected distributed storage under the same conditions (the dd model has hidden assumptions): 417 TB / 495 TB, projected at the acquired level to be 130% off the optimal solution.

The decision was to go for 1U machines and to switch to the 2 TB Hitachi HUA722020ALA330 SATA 3.0 Gbps drive to compensate for the loss of drive space (4 slots instead of 6 in a 2U). The number of network ports was verified to be adequate for the projection below. The 1U configuration allows recovering more CPU power / density. The goal is also to move to a boosted memory configuration and enable hyper-threading, growing from 8 batch slots to a consistent 16 slots per node (so another x2, although the performance scaling will not be x2 due to the nature of hyper-threading). Finally, it was decided NOT to retire the older machines this year but to keep them on until next year.

Planned numbers

  • Distributed storage additional: 1009.4 TB
    • Only 3/4 of this space is usable, and only 90% of that because of high watermarking; hence we end up with 681 TB of new storage. The assumption is that one of the 2 TB disks will go to support production and user analysis (the likely proper number is 1 TB, hence a 16% effect and margin, TBC).
    • The total required space for considering all production passes within a year is 1440 TB.
    • The accumulated total usable distributed storage is 277 TB; the total space is hence planned to be 958 TB with the assumptions above (possibly a 30% shortfall, or only 15% if we recover 1 TB from the OS+TEMP disk).
    • Conclusion: distributed storage will remain constrained as planned (nearly all productions will be available, but not all).
  • The total centralized disk needed for 2010 was 50 TB. The final number will be an 81 TB unit - 24 TB for institutional support = 57 TB of storage.
    • Conclusion: The central storage will have a small margin of flexibility allowing expansion of simu space and other similar areas
  • The total CPU required was projected to be 11634 kSI2k.
    • Within the current procurement, the total CPU will reach 8191 kSI2k with 1U nodes (it would have been 6827 kSI2k with 2U nodes).
    • Our shortfall will be ~30% relative to the theoretical projected needs. The initial projection was a 43% shortfall (so a 13% gain from balancing cost between storage, memory and CPU).
    • Assuming hyper-threading allows a gain factor of at least x1.4 (TBC, but evidence from beta-testing indicates this is likely), the shortfall may be as little as 16%. This number is within reach of an enhanced duty factor.
    • Conclusion: the shortfall, if the initial projections remain accurate, is assumed to be from 16 to 30%.
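The planned numbers above can be re-derived with simple arithmetic. The sketch below uses only figures quoted in the list; the names are illustrative. It does not attempt to reproduce the hyper-threading estimate (16%) or the 15% storage figure, since the text does not give the inputs behind them, and the computed shortfalls are rough values that land close to, not exactly on, the rounded percentages quoted.

```python
# Distributed storage (TB), figures from the list above
ADDED_RAW_TB = 1009.4
USABLE_FRACTION = 3 / 4    # one disk per node set aside
HIGH_WATERMARK = 0.90      # fill limit
EXISTING_USABLE_TB = 277
REQUIRED_TB = 1440

# CPU (kSI2k), figures from the list above
CPU_NEEDED = 11634
CPU_WITH_1U = 8191
CPU_WITH_2U = 6827

def shortfall(available, needed):
    """Fraction of the need that is not covered."""
    return 1 - available / needed

new_usable_tb = ADDED_RAW_TB * USABLE_FRACTION * HIGH_WATERMARK
total_usable_tb = EXISTING_USABLE_TB + new_usable_tb

storage_shortfall = shortfall(total_usable_tb, REQUIRED_TB)
cpu_shortfall_1u = shortfall(CPU_WITH_1U, CPU_NEEDED)
cpu_shortfall_2u = shortfall(CPU_WITH_2U, CPU_NEEDED)

print(f"New usable storage: {new_usable_tb:.0f} TB")
print(f"Total usable storage: {total_usable_tb:.0f} TB")
print(f"Storage shortfall: {storage_shortfall:.0%}")
print(f"CPU shortfall, 1U nodes: {cpu_shortfall_1u:.0%}")
print(f"CPU shortfall, 2U nodes: {cpu_shortfall_2u:.0%}")
```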

Reality checks:

 

Accounting check - post install

Since we have had many problems in past years with mismatches between the space purchased and the space provided by the RCF, keeping track of the space accounting is a good idea. Below is an account of where the space went (we should total 55 TB of production space and 26 TB of institution space).

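| Disk | Initial space (TB) | Added space (TB) | Total (TB) |
|---|---|---|---|
| lbl_prod | 5 | 5 | 10 |
| lbl | 14 | 0 | 14 |
| anl | 0 | 1 | 1 |
| mit | 3 | 2 | 5 |
| bnl | 6 | 5 | 11 |
| iucf | 1 | 2 | 3 |
| npiascr | 3 | 8 | 11 |
| psu | 1 | 1 | 2 |
| ucla | 4 | 1 | 5 |
| uta | 1 | 1 | 2 |
| Total added | | 26 | |
| data08 | 2 | 2.5 | 4.5 |
| data09 | 2 | 3 | 5 |
| data22 | 2 | 3.5 | 5.5 |
| data23 | 5 | 0.5 | 5.5 |
| data27 | 1.5 | 4 | 5.5 |
| data11 | (gone in 2009) | 5 | 5 |
| data23 | (gone in 2009) | 5 | 5 |
| data85 to 89 | N/A | 5*5 | 25 |
| data90 | N/A | 6 | 6 |
| Total added so far | | 54.5 | |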

 

About 0.5 TB should remain unallocated, scattered here and there.
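The accounting can be checked by summing the added space per disk and comparing it with the 81 TB unit (55 TB production + 26 TB institution). This is a small sketch using the values from the table; the dictionary keys are labels chosen here for illustration, and the two data23 rows are kept separate as they appear in the table.

```python
UNIT_TB = 81  # 55 TB production + 26 TB institution

# Space added per institutional disk (TB)
institution_added = {
    "lbl_prod": 5, "lbl": 0, "anl": 1, "mit": 2, "bnl": 5,
    "iucf": 2, "npiascr": 8, "psu": 1, "ucla": 1, "uta": 1,
}

# Space added per production disk (TB)
production_added = {
    "data08": 2.5, "data09": 3, "data22": 3.5, "data23": 0.5,
    "data27": 4,
    "data11 (gone in 2009)": 5,
    "data23 (gone in 2009)": 5,
    "data85 to 89": 5 * 5,
    "data90": 6,
}

institution_total = sum(institution_added.values())
production_total = sum(production_added.values())
unallocated = UNIT_TB - institution_total - production_total

print(f"Institution space added: {institution_total} TB")
print(f"Production space added:  {production_total} TB")
print(f"Unallocated:             {unallocated} TB")
```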