PowerEdge R610 disk/RAID setup for File Catalog servers (and other purposes)

With this hardware, we have a couple of primary goals in mind:

1.  High performance of the File Catalog Servers, which to date has largely been limited by disk I/O

2.  Redundancy of both server load capacity and hardware to handle failures gracefully and quickly (within reason).

 

Operating System:

The initial OS installation was Red Hat Enterprise Linux 5.3 Server, 64-bit version.  However, using the RHEL Server version with the BNL RHN Satellite server would have cost extra, so I switched all the nodes to Scientific Linux 5.3 (also 64-bit).

 

Basic hardware info:

Dell PowerEdge R610

Quad Core Intel Xeon E5520, 2.26GHz, 8M cache, with Hyper-Threading

12 GB (6 x 2 GB) 1333MHz RAM (DDR3-1333 aka PC3-10600R)

4 Gigabit ethernet ports

6 x 72GB SAS 15K RPM disk drives (2.5" enclosure).

Dell PERC 6/i RAID Controller (based on an LSI chipset; uses the megaraid_sas driver)

Disk configuration is 2 disks in a RAID 1 array, with the remaining 4 disks in a RAID 5 array.  

System partitions (/, /boot and swap + Dell utilities) will be on the RAID 1

The RAID 5 array will have one partition mounted on /db01 (with the noatime option) and the db temp space will reside on this partition

2 GB of swap (no particular reason for this number - we want little or no swap to be used)
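
As a quick sanity check on the layout above, here is a minimal sketch of the nominal usable capacity of the two arrays.  The function name is purely illustrative, and the figures ignore controller metadata, partition overhead and GB-vs-GiB differences, so the sizes the OS reports will be somewhat smaller.

```python
def usable_capacity_gb(level: int, disks: int, disk_gb: float) -> float:
    """Nominal usable capacity of a RAID array built from identical disks."""
    if level == 1:
        # A two-disk mirror stores one disk's worth of data.
        if disks != 2:
            raise ValueError("this sketch only handles two-disk RAID 1 mirrors")
        return disk_gb
    if level == 5:
        # RAID 5 gives up one disk's worth of space to parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


system_gb = usable_capacity_gb(1, 2, 72)  # the two-disk system array
db_gb = usable_capacity_gb(5, 4, 72)      # the four-disk /db01 array
print(f"RAID 1 (system): {system_gb:.0f} GB, RAID 5 (/db01): {db_gb:.0f} GB")
# prints: RAID 1 (system): 72 GB, RAID 5 (/db01): 216 GB
```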

 

The RAID controller:

This RAID controller, according to the documentation, is able to import foreign arrays: if, for instance, one of these systems suffers a catastrophic motherboard failure, then the disks should be transferable to one of the sister nodes.  I have successfully installed the disks from one machine into another (twice) and imported the foreign arrays (<phew!>), so it looks like this requirement is satisfied.   (The steps to import the configuration are straightforward, automated even: the foreign disks were detected at boot, the controller offered to import the configuration with a push of the 'f' key, and the system booted normally.)

Here is some useful information about this RAID controller:

(somewhat generic, but with links) http://www.dell.com/content/topics/topic.aspx/global/products/pvaul/topics/en/us/raid_controller?c=us&cs=555&l=en&s=biz

(more detailed) http://support.dell.com/support/edocs/storage/RAID/PERC6/en/PDF/en_ug.pdf

 

Interacting with the RAID controller can be done in several ways:

1)  During system boot, there is an opportunity to press <Ctrl-R> to enter the configuration utility.  

 

2)  LSI has a Java-based GUI, backed by a full-time process called "mrmonitor" (presumably short for MegaRAID monitor).  Using this interface, arrays and disks can be managed, and mrmonitor can be configured to send emails to designated recipients on various events.  There are four levels of events (info, warning, critical and fatal).  Info events are numerous and there's no point in receiving notices of these, but anything above that is likely to be of interest.  This system can also monitor controllers on multiple nodes, though I am not planning to use this feature: introducing the network element and the accompanying security issues is not worth it.

 

3)  There is a command line interface, MegaCli, which by default is installed in /opt/MegaRAID/MegaCli/  (the 64-bit version is /opt/MegaRAID/MegaCli/MegaCli64).  For instance, this command prints info about the controller itself:  "MegaCli64 -AdpAllInfo -aALL".  "MegaCli64 -help" provides a list of possible commands, though it is somewhat cryptic.  For more info on the available commands, a good starting point is here:  http://tools.rapidsoft.de/perc/perc-cheat-sheet.html
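
For scripted checks, the CLI output can be captured and turned into a dictionary.  What follows is only a sketch under some assumptions: the binary path is a guess at the default install location and may need adjusting, the function names are made up for illustration, and the parser simply assumes the report is mostly "Key : Value" lines.  The sample text is hand-written to show the shape of the input; it is not captured controller output.

```python
import subprocess

# Assumed install location; adjust to wherever MegaCli64 lives on the node.
MEGACLI = "/opt/MegaRAID/MegaCli/MegaCli64"


def parse_key_values(text: str) -> dict:
    """Collect 'Key : Value' lines into a dict, keeping the first value per key."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only, so values such as times stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            info.setdefault(key.strip(), value.strip())
    return info


def adapter_info(binary: str = MEGACLI) -> dict:
    """Run 'MegaCli64 -AdpAllInfo -aALL' and parse its output (normally needs root)."""
    result = subprocess.run(
        [binary, "-AdpAllInfo", "-aALL"],
        capture_output=True, text=True, check=True,
    )
    return parse_key_values(result.stdout)


# Hand-written sample in the assumed format, so the parser can be tried
# on a machine without the controller.
sample = """\
Adapter #0
Product Name    : Example Controller
Current Time    : 10:20:30
"""
print(parse_key_values(sample))
```

On a node with the controller, calling adapter_info() would return the same kind of dictionary built from the real report.  With more than one adapter, only the first value seen for each key is kept, so a real monitoring script would want to split the report per adapter first.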

 

The GUI and CLI software can be downloaded from here:  http://www.lsi.com/storage_home/products_home/internal_raid/megaraid_sas/megaraid_sas_8480e/index.html?remote=1&locale 

Here is LSI's documentation on this:  http://www.lsi.com/DistributionSystem/AssetDocument/80-00156-01_RevF.pdf

One note on the GUI installation:  the README for the GUI indicates that a libstdc++ package (which is included in the bundle) must be installed on RHEL 5.  However, on the test node, this package conflicted with an already installed package.  Nonetheless, the GUI appears to work, and I successfully received several monitoring emails after configuring the monitor (Tools -> Configure -> Monitor Configurator).
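
The notification policy described above (ignore info events, send everything more severe) boils down to a threshold on the four event levels.  The sketch below only illustrates that policy; it is not mrmonitor's configuration format, and the names are hypothetical.

```python
# The four event levels, from least to most severe.
SEVERITY_ORDER = ["info", "warning", "critical", "fatal"]


def should_notify(level: str, threshold: str = "warning") -> bool:
    """True if an event at 'level' is at or above the notification threshold."""
    # list.index raises ValueError for a level that is not one of the four.
    return SEVERITY_ORDER.index(level.lower()) >= SEVERITY_ORDER.index(threshold.lower())


for level in SEVERITY_ORDER:
    print(level, should_notify(level))
```

With the default threshold, only "info" events are filtered out.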

 

Update, Sept. 14, 2011:  The original installations of the LSI Storage Manager had a number of GUI oddities that made it awkward (at best) to use: mouse control was peculiar, and pop-up windows were too small to display any elements and couldn't be resized or scrolled.  Moreover, I recently discovered it had stopped working altogether on fc1-4 (see STAR RT # 2179).  So I went to LSI and downloaded a recent release (August 11, 2011), which is working much better.  Note though that the installation is still peculiar:  some RPMs ("sas_*") have to be updated by hand while others are updated automatically, the compat-libstdc++ note is still in the README and still seems irrelevant, and some rpm pre and/or post scripts fail.  In the end the GUI works anyway, which is good enough for my purposes (keeping in mind that I am only using the local installation, with no remote management).  I will see whether installing this new version also fixes the more serious problem(s) on fc1-4.