Disk space for FY09

Institution disk space

The table below summarizes the responses gathered from the call sent to starsoft, "Inquiry - institutional disk space for FY09" (a copy was sent, with some delay, to starmail on April 14th 2009). The deadline was the end of Tuesday April 14th 2009; feedback was accepted until Wednesday the 15th (anything received afterward could have been ignored).

 

Institution   # TB   Confirmed
LBNL          5      April 21st 17:30
BNL hi        2      [self]
BNL me        1      [self]
NPI/ASCR      3      April 22nd 05:54
UCLA          1
Rice          4      April 21st 18:47
Purdue        1      April 22nd 15:12
Valpo         1      April 22nd 17:59
MIT           2      April 22nd 15:56
Total         20

The pricing for the table above is as initially advertised, i.e. a BlueArc Titan 3200 based solution at 4.3 k$/TB for fiber channel based storage. For a discussion of fiber channel versus SATA, please consult this posting in starsoft. A quick performance overview of the Titan 3200 is shown below:

                       Titan 3200
  IOPS                 200,000
  Throughput           Up to 20 Gbps (2.5 GB/sec)
  Scalability          Up to 4 PB in a single namespace
  Ethernet Ports       2 x 10GbE or 6 x GbE
  Fibre Channel Ports  Eight 4Gb
  Clustering Ports     Two 10GbE

The solution enables over 60,000 user sessions and thousands of compute nodes to be served concurrently.

The first scalability figure (4 PB in a single namespace) is well beyond RHIC/STAR needs, but the second (thousands of compute nodes served concurrently) is easily reached in the RCF environment.
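
As a quick check on the budget implied by the request table, here is a minimal Python sketch tallying the requests against the advertised fiber channel price; the per-institution numbers and the 4.3 k$/TB figure are the ones quoted above, and nothing else is assumed.

  # Rough cost tally for the FY09 institutional disk requests, using only the
  # numbers quoted above (4.3 k$/TB for the BlueArc fiber channel solution).
  FC_PRICE_KUSD_PER_TB = 4.3

  requests_tb = {
      "LBNL": 5, "BNL hi": 2, "BNL me": 1, "NPI/ASCR": 3, "UCLA": 1,
      "Rice": 4, "Purdue": 1, "Valpo": 1, "MIT": 2,
  }

  total_tb = sum(requests_tb.values())               # 20 TB total, as in the table
  total_cost_kusd = total_tb * FC_PRICE_KUSD_PER_TB  # 20 * 4.3 = 86 k$

  print(f"Total requested: {total_tb} TB")
  print(f"Cost at {FC_PRICE_KUSD_PER_TB} k$/TB: {total_cost_kusd:.1f} k$")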

Production space

A SATA based solution will be priced at 2.2 k$/TB. While the price is lower than the fiber channel solution (and may be tempting), this solution is NOT recommended for institutional disk, as its scalability for read IO at the level we are accustomed to is doubtful ("doubtful" is probably an understatement: we know from our experience five years ago that we would have to apply IO throttling).

As a space for production however (considering that resource constraints demand cheaper solutions, coupled with an Xrootd fast-IO based aggregation solution which will remain the primary source of data access for users), the bet is that it will work if used as a buffer space: production jobs write locally to the worker nodes, then move their files to central disk at the end as an additional copy alongside the HPSS data migration. There will be minimal guarantees of read performance for analysis access on this "production reserved" storage.
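
To make the intended buffer usage concrete, here is a minimal Python sketch of the stage-out step described above; the paths and file pattern are hypothetical placeholders, and the actual production framework and HPSS migration tooling are not represented.

  # Minimal sketch of the intended buffer-space usage, with hypothetical paths;
  # the real production scripts and the HPSS migration are not shown here.
  import shutil
  from pathlib import Path

  LOCAL_SCRATCH = Path("/tmp/production_job")   # hypothetical worker-node local disk
  CENTRAL_BUFFER = Path("/star/data_buffer")    # hypothetical "production reserved" SATA space

  def stage_out(output_files):
      """Copy job outputs from the worker node to the central buffer.

      The central copy is an additional copy kept alongside the HPSS
      migration; analysis read performance on it is not guaranteed.
      """
      CENTRAL_BUFFER.mkdir(parents=True, exist_ok=True)
      for f in output_files:
          shutil.copy2(f, CENTRAL_BUFFER / f.name)

  if __name__ == "__main__":
      stage_out(sorted(LOCAL_SCRATCH.glob("*.root")))  # files produced locally by the job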

One unit of Thumper at 20 k$ / 33 TB usable will also be purchased and tried out in a special context. This solution is even less scalable and hence requires a reduced number of users and reduced IO. The space targeted at this lower end may include (TBC):

  • data06 & data07 (2 TB) - reserved for specific projects and not meant for analysis, performance would not an issue
  • data08                (2 TB) - meant for Grid, IO is minimal there but we may need to measure data transfers compatible with KISTI based production
  • /star/rcf               (5 TB) - production log space (delayed IO, mostly a one time saving and will be fine)
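
For reference, a minimal Python sketch comparing cost per usable TB across the three options discussed above; the only inputs are the prices quoted in this posting (the Thumper figure derives from the quoted 20 k$ for 33 TB usable).

  # Cost per usable TB for the three storage options quoted above (values in k$).
  options_kusd_per_tb = {
      "BlueArc Titan 3200 (fiber channel)": 4.3,  # institutional disk
      "BlueArc SATA": 2.2,                        # production / buffer space
      "Thumper": 20.0 / 33.0,                     # 20 k$ for 33 TB usable, ~0.61 k$/TB
  }

  for name, price in options_kusd_per_tb.items():
      print(f"{name}: {price:.2f} k$/TB")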

Final breakdown