CPU and bulk storage purchase 2010
Institutional disk space summary
Announcement for institutional disk space was made in starmail on 2010/04/26 12:31.
To date, the following requests were made (either in $ or in TB):
Institution | Contact | Date | Requested (k$) | TB equivalent | Final cost |
LBNL | Hans Georg Ritter | 2010/04/26 15:24 | 20 | 5 | $17,006.00 |
ANL | Harold Spinka | 2010/04/26 16:29 | - | 1 | $3,401.00 |
UCLA | Huan Huang | 2010/04/26 16:29 | - | 1 | $3,401.00 |
UTA | Jerry Hoffmann | 2010/04/27 14:59 | - | 1 | $3,401.00 |
NPI | Michal Sumbera & Jana Bielcikova | 2010/04/20 10:00 | 30 | 8 | $27,210.00 |
PSU | Steven Heppelmann | 2010/04/29 16:00 | - | 1 | $3,401.00 |
BNL | Jamie Dunlop | 2010/04/29 16:45 | - | 5 | $17,006.00 |
IUCF | Will Jacobs | 2010/04/29 20:18 | - | 2 | $6,802.00 |
MIT | Bernd Surrow | 2010/05/08 18:07 | - | 2 | |
Totals | | | | 24 | |
The storage cost for 2010 was estimated at 3.4 k$ / TB. Detailed pricing is given below.
Central storage cost estimates
Since the existing storage is stretched in both the number of servers and scalability, we would (must) buy a pair of Mercury servers; such a pair recently cost us $95,937. The storage itself would be priced from a recent quote for a similar configuration: (96) 1 TB SATA drives at $85,231, plus $2,500 installation, yielding 54 TB usable. STAR's target is 50 TB for production +5+10 TB for institutions (it will fit and can be slightly expanded). The total cost is hence:
$95,937 + $85,231 + $2,500 = $183,668, and $183,668 / 54 TB = $3,401 / TB
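The arithmetic above can be reproduced with a minimal Python sketch. The dollar amounts and the 54 TB capacity come from the text; the function names are illustrative, and the truncation to whole dollars in `institution_cost` is an assumption (the page does not state a rounding rule) chosen because it reproduces the Final cost column of the request table.

```python
# Figures quoted in the text (US dollars, usable TB).
SERVER_PAIR_COST = 95_937
STORAGE_COST = 85_231
INSTALLATION_COST = 2_500
USABLE_TB = 54

TOTAL_COST = SERVER_PAIR_COST + STORAGE_COST + INSTALLATION_COST


def cost_per_tb() -> float:
    """Cost of one usable TB, servers and installation included."""
    return TOTAL_COST / USABLE_TB


def institution_cost(tb: int) -> int:
    """Whole-dollar cost of `tb` TB of institutional space.

    ASSUMPTION: the charge is the exact pro-rata share of the total,
    truncated to whole dollars. The source does not state this rule.
    """
    return tb * TOTAL_COST // USABLE_TB


if __name__ == "__main__":
    print(f"total cost: ${TOTAL_COST:,}")
    print(f"cost per TB: ${cost_per_tb():,.0f}")
    for tb in (1, 2, 5, 8):
        print(f"{tb} TB -> ${institution_cost(tb):,}")
```

Under that assumption, 1, 2, 5 and 8 TB come out at $3,401, $6,802, $17,006 and $27,210 respectively.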
Detailed cost projections indicate that, depending on the global volume, better pricing may be possible: the installation price (half a day of work for a tech from BlueArc) is fixed, and each server pair could hold more than the planned storage (hence the cost for the two servers is also fixed). A few configurations are compared below:
Item (costs in $) | 54 TB | 27 TB | 54+27 TB |
Service installation | 2500 | 2500 | 2500 |
Cost for 54+27 TB | | | 127846.5 |
Cost per 54 TB | 85231 | | |
Cost per 27 TB | | 42615.5 | |
Two servers | 95937 | 95937 | 95937 |
Price with servers | 183668 | 141052.5 | 226283.5 |
Price per TB | 3401.3 | 5224.2 | 3187.1 |
Price per MB | 0.003244 | 0.004982154 | 0.003039447 |
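As a hedged illustration of how the rows of this table relate, the sketch below rebuilds the "Price with servers" row from its components and converts a per-TB price into a per-MB price. The assignment of each storage cost to a configuration is inferred from the row labels, and the conversion factor of 1024 * 1024 MB per TB is an assumption that happens to reproduce the quoted per-MB values. The per-TB figure for the combined configuration is not recomputed here because the page does not state the usable capacity it was divided by.

```python
# Fixed costs quoted in the table (US dollars).
INSTALLATION = 2500.0
SERVER_PAIR = 95937.0

# Storage-only cost per configuration (mapping inferred from the row labels).
STORAGE_COST = {
    "54 TB": 85231.0,
    "27 TB": 42615.5,
    "54+27 TB": 127846.5,
}

# ASSUMPTION: binary units, 1 TB = 1024 * 1024 MB.
MB_PER_TB = 1024 * 1024


def price_with_servers(storage_cost: float) -> float:
    """Installation + storage + server pair for one configuration."""
    return INSTALLATION + storage_cost + SERVER_PAIR


def price_per_mb(price_per_tb: float) -> float:
    """Convert a per-TB price into a per-MB price."""
    return price_per_tb / MB_PER_TB


if __name__ == "__main__":
    for name, cost in STORAGE_COST.items():
        print(f"{name}: ${price_with_servers(cost):,.1f} with servers")
    # Per-TB and per-MB price for the two configurations of known capacity.
    for name, usable_tb in (("54 TB", 54), ("27 TB", 27)):
        per_tb = price_with_servers(STORAGE_COST[name]) / usable_tb
        print(f"{name}: ${per_tb:,.1f} / TB, ${price_per_mb(per_tb):.6f} / MB")
```

The fixed server and installation costs dominate the small configuration, which is why the 27 TB option is markedly more expensive per TB.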
CPU estimates, choices, checks
Projected / allowed additional CPU need, based on funding guidance (see CSN0474: The STAR Computing Resource Plan, 2009): 7440 kSI2k / 2436 kSI2k, projected to be a 43% shortage.
Projected distributed storage under the same conditions (the dd model has hidden assumptions): 417 TB / 495 TB, projected to be, at the acquired level, 130% off the optimal solution.
The decision was to go for 1U machines and to switch to the 2 TB Hitachi HUA722020ALA330 SATA 3.0 Gbps drive to compensate for the loss of drive space (4 slots instead of 6 in a 2U). The number of network ports was verified to be adequate for the projection below. The 1U configuration allows recovering more CPU power / density. The goal is also to move to a boosted memory configuration and enable hyper-threading, growing from 8 batch slots to a consistent 16 slots per node (so another x2, although the performance scaling will not be x2 due to the nature of hyper-threading). Finally, it was decided NOT to retire the older machines this year but to keep them on until next year.
Planned numbers
- Distributed storage additional: 1009.4 TB
- Only 3/4 of this space is usable, and a further factor of 90% applies for high watermarking; hence, we end up with 681 TB of new storage. The assumption is that one of the 2 TB disks will go to support production and user analysis (the likely proper number is 1 TB, hence a 16% effect and margin, TBC).
- The total required space for considering all production passes within a year is 1440 TB.
- The accumulated total usable distributed storage is 277 TB; the total space is hence planned to be 958 TB with the assumptions above (possibly a 30% shortfall, or only 15% if we recover 1 TB from the OS+TEMP disk).
- Conclusion: distributed storage will remain constrained as planned (not all productions will be available, but nearly all).
- The total centralized disk needed for 2010 was 50 TB. The final purchase will be an 81 TB unit; subtracting 24 TB for institutional support leaves 57 TB of storage.
- Conclusion: The central storage will have a small margin of flexibility allowing expansion of simu space and other similar areas
- The total CPU required was projected to be 11634 kSI2k.
- Within the current procurement, the total CPU will reach 8191 kSI2k with 1U nodes (would have been 6827 kSI2k for 2U nodes).
- Our shortfall will be ~ 30% relative to the theoretical projected needs. The initial projection was a shortfall of 43% (so a 13% gain from balancing cost between storage, memory and CPU).
- Assuming hyper-threading allows for a gain factor of at least x1.4 (TBC, but evidence from beta-testing indicates this is likely), the shortfall may be as little as 16%. This number is within reach of an enhanced duty factor.
- Conclusion: the shortfall, if the initial projections remain accurate, is assumed to be from 16 to 30%.
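The planning arithmetic in the list above can be checked with a short sketch. All input numbers are taken from the bullets; the variable and function names are illustrative, and the shortfall is assumed to be defined as 1 - available / required. The 16% hyper-threading estimate is not reproduced because the text does not say how the x1.4 factor is applied across old and new nodes.

```python
# Distributed storage inputs (TB), as quoted in the list.
ADDED_RAW_TB = 1009.4
USABLE_FRACTION = 0.75      # only 3/4 of the raw space is usable
HIGH_WATERMARK = 0.90       # 90% high-watermark factor
EXISTING_USABLE_TB = 277.0
REQUIRED_TB = 1440.0

# CPU inputs (kSI2k), as quoted in the list.
REQUIRED_CPU = 11634.0
CPU_WITH_1U = 8191.0
CPU_WITH_2U = 6827.0


def shortfall(available: float, required: float) -> float:
    """Fraction of the requirement that is not covered (assumed definition)."""
    return 1.0 - available / required


new_usable_tb = ADDED_RAW_TB * USABLE_FRACTION * HIGH_WATERMARK
total_usable_tb = EXISTING_USABLE_TB + new_usable_tb

if __name__ == "__main__":
    print(f"new usable storage:   {new_usable_tb:.0f} TB")
    print(f"total usable storage: {total_usable_tb:.0f} TB")
    print(f"storage shortfall:    {shortfall(total_usable_tb, REQUIRED_TB):.0%}")
    print(f"CPU shortfall (1U):   {shortfall(CPU_WITH_1U, REQUIRED_CPU):.0%}")
    print(f"CPU shortfall (2U):   {shortfall(CPU_WITH_2U, REQUIRED_CPU):.0%}")
```

With these inputs the new usable storage rounds to 681 TB and the total to 958 TB, the storage shortfall comes out at roughly a third, and the CPU shortfall for 1U nodes rounds to 30%.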
Reality checks:
- Event size estimates (DAQ) + event size estimates (reco): implicitly done in a separate, access-restricted page. The bottom line is that the space is tracking fine but, overall, we exceeded our goals not by 50% as initially thought but by 79% +/- 2%.
- Processing time estimates: extracted from FastOffline. Final times are unclear due to slowdowns related to database access (see Speeding up DB access using SSD or Memory; see also Effect of stream data on database performance, a 2010 study).
Accounting check - post install
Since we had many problems in past years with mismatches between the space purchased and the space provided by the RCF, keeping track of the space accounting is a good idea. Below is an account of where the space went (the totals should come to 55 TB of production space and 26 TB of institution space).
Disk | Initial space (TB) | Added space (TB) | Total (TB) |
lbl_prod | 5 | 5 | 10 |
lbl | 14 | 0 | 14 |
anl | 0 | 1 | 1 |
mit | 3 | 2 | 5 |
bnl | 6 | 5 | 11 |
iucf | 1 | 2 | 3 |
npiascr | 3 | 8 | 11 |
psu | 1 | 1 | 2 |
ucla | 4 | 1 | 5 |
uta | 1 | 1 | 2 |
Total added | | 26 | |
data08 | 2 | 2.5 | 4.5 |
data09 | 2 | 3 | 5 |
data22 | 2 | 3.5 | 5.5 |
data23 | 5 | 0.5 | 5.5 |
data27 | 1.5 | 4 | 5.5 |
data11 | (gone in 2009) | 5 | 5 |
data23 | (gone in 2009) | 5 | 5 |
data85 to 89 | N/A | 5*5 | 25 |
data90 | N/A | 6 | 6 |
Total added so far | | 54.5 | |
About 0.5 TB of the 55 TB of production space should remain unallocated, scattered here and there.
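A minimal sketch of this accounting check follows. It assumes that the third column of the table holds the space added in 2010 (it equals Total minus Initial space in every row); the values are copied from the table, and the names are illustrative. The production entries are kept as a list of pairs because `data23` appears twice in the table.

```python
# Space added per institutional disk (TB), copied from the table.
INSTITUTION_ADDED = {
    "lbl_prod": 5, "lbl": 0, "anl": 1, "mit": 2, "bnl": 5,
    "iucf": 2, "npiascr": 8, "psu": 1, "ucla": 1, "uta": 1,
}

# Space added per production disk (TB); a list, since data23 is listed twice.
PRODUCTION_ADDED = [
    ("data08", 2.5), ("data09", 3), ("data22", 3.5), ("data23", 0.5),
    ("data27", 4), ("data11", 5), ("data23", 5),
    ("data85 to 89", 5 * 5), ("data90", 6),
]

# Targets stated in the text (TB).
EXPECTED_INSTITUTION_TB = 26
EXPECTED_PRODUCTION_TB = 55

institution_total = sum(INSTITUTION_ADDED.values())
production_total = sum(tb for _, tb in PRODUCTION_ADDED)
unallocated = EXPECTED_PRODUCTION_TB - production_total

if __name__ == "__main__":
    print(f"institution space added: {institution_total} TB "
          f"(expected {EXPECTED_INSTITUTION_TB})")
    print(f"production space added:  {production_total} TB "
          f"(expected {EXPECTED_PRODUCTION_TB})")
    print(f"unallocated production:  {unallocated} TB")
```

Keeping the per-disk numbers in one place makes it easy to rerun the check whenever a volume is resized.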