Backups

<placeholder> Describes STAR's database backup system. </placeholder>

 

Sept. 5, 2008, WB -- seeing no documentation here, I am going to "blog" some as I go through dbbak to clean up from a full disk and restore the backups.  (For the full disk problem, see http://www.star.bnl.gov/rt2/Ticket/Display.html?id=1280 .)  I expect to get the details in and then come back to clean it up...

dbbak.starp.bnl.gov runs Perl scripts from cron jobs that take MySQL dumps of nearly all of STAR's databases and store them daily. The four nightly (cron) backup scripts are attached (as of Nov. 24, 2008).

The scripts keep the last 10 dumps in /backups/dbBackups/{conditions,drupal,duvall,robinson,run}.  Additionally, on the 1st and 15th of each month, the backups are put in "old" directories within each respective backup folder.
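As a rough illustration of that retention policy (this is not the attached Perl scripts), here is a minimal shell sketch. The function names `is_archive_day` and `dumps_to_prune` are made up for this example, and it assumes that the dump file names in a directory share one prefix, sort chronologically, and contain no whitespace.

```shell
#!/bin/sh
# Hypothetical sketch of the retention policy; NOT the attached Perl scripts.

# Succeed when the given day of month (e.g. "01" or "15") is a day on
# which the backup also goes into the "old" directory.
is_archive_day() {
    case "$1" in
        01|1|15) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the dumps in directory $1 that fall outside the newest $2 dumps
# (default 10). Assumption: names sort chronologically (for example a
# name-YYYYMMDD.sql.gz pattern) and the "old" subdirectory is skipped.
dumps_to_prune() {
    dir=$1
    keep=${2:-10}
    ls "$dir" | grep -v '^old$' | sort -r | tail -n +"$((keep + 1))"
}
```

A nightly job could call `is_archive_day "$(date +%d)"` to decide whether to also place the new dump in "old", and remove whatever `dumps_to_prune` prints.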

As Mike left it, the plan of action was to move the "old" folders to RCF once or twice each year, and from there put them into HPSS (apparently into his own HPSS space).

We have decided to revive the stardb RCF account and store the archives via that account.  /star/data07/db has been suggested as the temporary holding spot, so we have a two-step  sequence like this:

  1. As stardb on an rcas node, scp the "old" directories to /star/data07/db/<name>
  2. Use htar to tar each "old" directory and store in HPSS.

Well, that's "a" plan anyway.  Other plans have been mentioned, but this is what I'm going to try first.  Let's see how it goes in reality...

  1. Log in to an interactive rcas node as stardb. There are several possible ways to do this; for this manual run, I log in to rssh as myself, then ssh to stardb@rcas60XX. I added my ssh key to the stardb authorized_keys file, so no password is required, even for the scp from dbbak.
  2. Oops, first problem: /star/data07/db/ is owned by deph and does not have group or world write permission, so "mkdir /star/data07/db_backup_temp", then make five subdirectories in it (conditions, drupal, duvall, robinson, run).
  3. [rcas6008] ~/> scp -rp root@dbbak.starp.bnl.gov:/backups/dbBackups/drupal/old /star/data07/db_backup_temp/drupal/
  4. [rcas6008] /star/data07/db_backup_temp/> htar -c -f dbbak_htars/drupal.09_05_2008 drupal/old
  5. Do a little dance for joy, because the output is:  "HTAR: HTAR SUCCESSFUL"
  6. Verify with hsi:  [rcas6008] /star/data07/db_backup_temp/> hsi
    Username: stardb  UID: 3239  CC: 3239 Copies: 1 [hsi.3.3.5 Tue Sep 11 19:31:24 EDT 2007]
    ? ls
    /home/stardb:
    drupal.09_05_2008        drupal.09_05_2008.index 
    ? ls -l
    /home/stardb:
    -rw-------   1 stardb    star       495516160 Sep  5 19:37 drupal.09_05_2008
    -rw-------   1 stardb    star           99104 Sep  5 19:37 drupal.09_05_2008.index
     
  7. So far so good.  Repeat for the other 4 database bunches (which are a couple of orders of magnitude bigger).
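The manual steps above could be repeated for all five groups with a small loop. The sketch below only prints the scp and htar commands rather than running them, so they can be reviewed first. The host, paths and flags are the ones used in the steps above; the function name `archive_commands` and the `date +%m_%d_%Y` stamp (inferred from names like drupal.09_05_2008) are assumptions of this example.

```shell
#!/bin/sh
# Sketch only: prints the scp and htar commands for each backup group
# instead of executing them.

STAGE=/star/data07/db_backup_temp
SRC=root@dbbak.starp.bnl.gov:/backups/dbBackups

# Print the two commands for one group ($1) and one date stamp ($2).
archive_commands() {
    name=$1
    stamp=$2
    echo "scp -rp $SRC/$name/old $STAGE/$name/"
    echo "(cd $STAGE && htar -c -f dbbak_htars/$name.$stamp $name/old)"
}

# Assumed stamp format: month_day_year, as in the archive names above.
stamp=$(date +%m_%d_%Y)
for name in conditions drupal duvall robinson run; do
    archive_commands "$name" "$stamp"
done
```

Printing instead of executing is a deliberate choice here: each htar run should still be checked for "HTAR: HTAR SUCCESSFUL" and verified with hsi before moving on.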

 

Update, Sept. 18, 2008:

This was going fine until the last batch (the "run" databases).  Attempting to htar "run/old" resulted in an error:

[rcas6008] /star/data07/db_backup_temp/> htar -c -f dbbak_htars/run.09_16_2008 run/old
ERROR: Error -22 on hpss_Open (create) for dbbak_htars/run.09_16_2008
HTAR: HTAR FAILED
 

 

I determined this to be a limit in *OUR* HPSS configuration: there is a 60 GB maximum file size, which the run databases exceeded at 87 GB. Another limit to be aware of is an 8 GB limit on member files (see the "File Size" bullet here: https://computing.llnl.gov/LCdocs/htar/index.jsp?show=s2.3 ). This restriction was removed in versions of htar after Sept. 10, 2007 (see the changelog here: https://computing.llnl.gov/LCdocs/htar/index.jsp?show=s99.4 ), but HTAR on the rcas node is no newer than August 2007, so I believe this limit is present.

 

There was in fact one file exceeding 8 GB in these backups (RunLog-20071101.sql.gz, at 13 GB).   I used hsi to put this file individually into HPSS (with no tarring).
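Both limits could be checked before running htar. The following is a minimal sketch, not something that was used here: the function names are invented for this example, the limits are passed in bytes, and treating "8 GB" and "60 GB" as binary gigabytes is an assumption.

```shell
#!/bin/sh
# Sketch of a size check to run before htar. Limits are given in bytes.
# Assumption: 1 GB = 1024^3 bytes for both limits.
MEMBER_LIMIT=$((8 * 1024 * 1024 * 1024))     # htar member file limit
ARCHIVE_LIMIT=$((60 * 1024 * 1024 * 1024))   # HPSS maximum file size

# Print every regular file under $1 that is larger than $2 bytes.
oversized_members() {
    find "$1" -type f -size +"$2"c
}

# Print the total size in bytes of all regular files under $1.
total_bytes() {
    find "$1" -type f -exec wc -c {} + |
        awk '$2 != "total" { sum += $1 } END { printf "%.0f\n", sum }'
}

# Report what would need special handling for directory $1.
# Optional $2 and $3 override the member and archive limits.
check_dir() {
    dir=$1
    member_limit=${2:-$MEMBER_LIMIT}
    archive_limit=${3:-$ARCHIVE_LIMIT}
    big=$(oversized_members "$dir" "$member_limit")
    if [ -n "$big" ]; then
        printf 'too large for htar, store individually with hsi:\n%s\n' "$big"
    fi
    if [ "$(total_bytes "$dir")" -gt "$archive_limit" ]; then
        echo "$dir: exceeds the archive limit, split into several htar archives"
    fi
}
```

For example, `check_dir /star/data07/db_backup_temp/run/old` would be expected to flag RunLog-20071101.sql.gz and report that the directory needs to be split.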

Then I archived the run database backups piecemeal.  All in all, this makes a small mess of the structure and naming convention.  It could be improved, but for now, here is the explanation:

 

DB backup structure in HPSS

| HPSS file (relative to /home/stardb) | Corresponding dbbak path (relative to /backups/dbBackups) | Description | Single file or tarball? |
|---|---|---|---|
| dbbak_htars/conditions.09_08_2008 | conditions/old/ | twice monthly conditions database backups (Jan. - Aug. 2008) | tarball |
| dbbak_htars/drupal.09_05_2008 | drupal/old | twice monthly drupal database backups (Nov. 2007 - Aug. 2008) | tarball |
| dbbak_htars/duvall.09_13_2008 | duvall/old | twice monthly duvall database backups (Jan. - Aug. 2008) | tarball |
| dbbak_htars/robinson.09_15_2008 | robinson/old | twice monthly robinson database backups (Jan. 2007 - Aug. 2008) | tarball |
| RunLog-20071101.sql.gz | run/old/RunLog-20071101.sql.gz | RunLog database backup (Nov. 1, 2007) | single file (13 GB) |
| dbbak_htars/RunLog_2007.09_18_2008 | run/old/RunLog-2007* | twice monthly RunLog database backups (Jan. - Mar. 2007, Nov. 15, 2007 - Dec. 2007) | tarball |
| dbbak_htars/RunLog_Jan-Feb_2008.09_18_2008 | run/old/RunLog-20080[12]* | twice monthly RunLog database backups (Jan. - Feb. 2008) | tarball |
| dbbak_htars/run.09_18_2008 | run/old/RunLog-20080[345678]* | twice monthly run database backups (Mar. - Aug. 2008) | tarball |
| dbbak_htars/Missing_Items.txt | N/A | a text file explaining that there are no backups for Sept. 1 or Sept. 15, 2008 | single file |