FileCatalog

The STAR FileCatalog is an a set of tools and API providing users access to the MeataData, File and Replica information pertaining to all data produced by the RHIC/STAR experiment.  The STAR FileCatalog in other words provides users access to meta-data, file and replica information through a unified schema-agnostic API. The user never needs to know the details of the relation between elements (or keywords) but rather, is provided with a flexible yet powerful query API allowing them to request any combination of 'keywords' based on sets of conditions composed of sequences of keyword operation values combinations. The user manual provides a list of keywords.

The STAR FIleCatalog also provides multi-site support through the same API. In other words, the same set of tools and programmatic interface allows to register, update, maintain a global catalog for the experiment and serve as a core component to the Data Management system. To date, the STAR FileCatalog holds information on 22 Million files and 52 Million active replicas.

 

The history & version information7

Manual

XML(s) examples

Examples and other documentation

 

A few examples will be left here to guide users and installer.

Data dictionary

This dictionary was created on 2012/03/12.

CollisionTypes

Field Type Null Default Comments
collisionTypeID smallint(6) No    
firstParticle varchar(10) No    
secondParticle varchar(10) No    
collisionEnergy float No 0  
collisionTypeIDate timestamp No CURRENT_TIMESTAMP  
collisionTypeCreator smallint(6) No 1  
collisionTypeCount int(11) Yes NULL  
collisionTypeComment text Yes NULL  

Creators

Field Type Null Default Comments
creatorID bigint(20) No    
creatorName varchar(15) Yes unknown  
creatorIDate timestamp No CURRENT_TIMESTAMP  
creatorCount int(11) Yes NULL  
creatorComment varchar(512) Yes NULL  

DetectorConfigurations

Field Type Null Default Comments
detectorConfigurationID int(11) No    
detectorConfigurationName varchar(50) Yes NULL  
dTPC tinyint(4) Yes NULL  
dSVT tinyint(4) Yes NULL  
dTOF tinyint(4) Yes NULL  
dEMC tinyint(4) Yes NULL  
dEEMC tinyint(4) Yes NULL  
dFPD tinyint(4) Yes NULL  
dFTPC tinyint(4) Yes NULL  
dPMD tinyint(4) Yes NULL  
dRICH tinyint(4) Yes NULL  
dSSD tinyint(4) Yes NULL  
dBBC tinyint(4) Yes NULL  
dBSMD tinyint(4) Yes NULL  
dESMD tinyint(4) Yes NULL  
dZDC tinyint(4) Yes NULL  
dCTB tinyint(4) Yes NULL  
dTPX tinyint(4) Yes NULL  
dFGT tinyint(4) Yes NULL  

DetectorStates

Field Type Null Default Comments
detectorStateID int(11) No    
sTPC tinyint(4) Yes NULL  
sSVT tinyint(4) Yes NULL  
sTOF tinyint(4) Yes NULL  
sEMC tinyint(4) Yes NULL  
sEEMC tinyint(4) Yes NULL  
sFPD tinyint(4) Yes NULL  
sFTPC tinyint(4) Yes NULL  
sPMD tinyint(4) Yes NULL  
sRICH tinyint(4) Yes NULL  
sSSD tinyint(4) Yes NULL  
sBBC tinyint(4) Yes NULL  
sBSMD tinyint(4) Yes NULL  
sESMD tinyint(4) Yes NULL  
sZDC tinyint(4) Yes NULL  
sCTB tinyint(4) Yes NULL  
sTPX tinyint(4) Yes NULL  
sFGT tinyint(4) Yes NULL  

EventGenerators

Field Type Null Default Comments
eventGeneratorID smallint(6) No    
eventGeneratorName varchar(30) No    
eventGeneratorVersion varchar(10) Yes 0  
eventGeneratorParams varchar(200) Yes NULL  
eventGeneratorIDate timestamp No CURRENT_TIMESTAMP  
eventGeneratorCreator smallint(6) No 1  
eventGeneratorCount int(11) Yes NULL  
eventGeneratorComment varchar(512) Yes NULL  

FileData

Field Type Null Default Comments
fileDataID bigint(20) No    
runParamID int(11) No 0  
fileName varchar(255) No    
baseName varchar(255) No   Name without extension
sName1 varchar(255) No   Will be used for name+runNumber
sName2 varchar(255) No   Will be used for name before runNumber
productionConditionID mediumint(9) Yes NULL  
numEntries mediumint(9) Yes 0  
md5sum varchar(32) Yes 0  
fileTypeID smallint(6) No 0  
fileSeq smallint(6) Yes NULL  
fileStream smallint(6) Yes 0  
fileDataIDate timestamp No CURRENT_TIMESTAMP  
fileDataCreator smallint(6) No 1  
fileDataCount int(11) Yes NULL  
fileDataComment text Yes NULL  

FileLocations

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocationsID

Field Type Null Default Comments
fileLocationID bigint(20) No    

FileLocations_0

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_1

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_2

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_3

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileParents

Field Type Null Default Comments
parentFileID bigint(20) No 0  
childFileID bigint(20) No 0  

FilePaths

Field Type Null Default Comments
filePathID bigint(6) No    
filePathName varchar(255) No    
filePathIDate timestamp No CURRENT_TIMESTAMP  
filePathCreator smallint(6) No 1  
filePathCount int(11) Yes NULL  
filePathComment varchar(512) Yes NULL  

FileTypes

Field Type Null Default Comments
fileTypeID smallint(6) No    
fileTypeName varchar(30) No    
fileTypeExtension varchar(15) No    
fileTypeIDate timestamp No CURRENT_TIMESTAMP  
fileTypeCreator smallint(6) No 1  
fileTypeCount int(11) Yes NULL  
fileTypeComment varchar(512) Yes NULL  

Hosts

Field Type Null Default Comments
hostID smallint(6) No    
hostName varchar(30) No localhost  
hostIDate timestamp No CURRENT_TIMESTAMP  
hostCreator smallint(6) No 1  
hostCount int(11) Yes NULL  
hostComment varchar(512) Yes NULL  

ProductionConditions

Field Type Null Default Comments
productionConditionID smallint(6) No    
productionTag varchar(10) No    
libraryVersion varchar(10) No    
productionConditionIDate timestamp No CURRENT_TIMESTAMP  
productionConditionCreator smallint(6) No 1  
productionConditionCount int(11) Yes NULL  
productionConditionComment varchar(512) Yes NULL  

RunParams

Field Type Null Default Comments
runParamID int(11) No    
runNumber bigint(20) No 0  
dataTakingStart timestamp No 0000-00-00 00:00:00  
dataTakingEnd timestamp No 0000-00-00 00:00:00  
dataTakingDay smallint(6) Yes 0  
dataTakingYear smallint(6) Yes 0  
simulationParamsID int(11) Yes NULL  
runTypeID smallint(6) No 0  
triggerSetupID smallint(6) No 0  
detectorConfigurationID mediumint(9) No 0  
detectorStateID mediumint(9) No 0  
collisionTypeID smallint(6) No 0  
magFieldScale varchar(50) No    
magFieldValue float Yes NULL  
runParamIDate timestamp No CURRENT_TIMESTAMP  
runParamCreator smallint(6) No 1  
runParamCount int(11) Yes NULL  
runParamComment varchar(512) Yes NULL  

RunTypes

Field Type Null Default Comments
runTypeID smallint(6) No    
runTypeName varchar(255) No    
runTypeIDate timestamp No CURRENT_TIMESTAMP  
runTypeCreator smallint(6) No 1  
runTypeCount int(11) Yes NULL  
runTypeComment varchar(512) Yes NULL  

SimulationParams

Field Type Null Default Comments
simulationParamsID int(11) No    
eventGeneratorID smallint(6) No 0  
simulationParamIDate timestamp No CURRENT_TIMESTAMP  
simulationParamCreator smallint(6) No 1  
simulationParamCount int(11) Yes NULL  
simulationParamComment varchar(512) Yes NULL  

StorageSites

Field Type Null Default Comments
storageSiteID smallint(6) No    
storageSiteName varchar(30) No    
storageSiteLocation varchar(50) Yes NULL  
storageSiteIDate timestamp No CURRENT_TIMESTAMP  
storageSiteCreator smallint(6) No 1  
storageSiteCount int(11) Yes NULL  
storageSiteComment varchar(512) Yes NULL  

StorageTypes

Field Type Null Default Comments
storageTypeID mediumint(9) No    
storageTypeName varchar(6) No    
storageTypeIDate timestamp No CURRENT_TIMESTAMP  
storageTypeCreator smallint(6) No 1  
storageTypeCount int(11) Yes NULL  
storageTypeComment varchar(512) Yes NULL  

TriggerCompositions

Field Type Null Default Comments
fileDataID bigint(20) No 0  
triggerWordID mediumint(9) No 0  
triggerCount mediumint(9) Yes 0  

TriggerSetups

Field Type Null Default Comments
triggerSetupID smallint(6) No    
triggerSetupName varchar(50) No    
triggerSetupComposition varchar(255) No    
triggerSetupIDate timestamp No CURRENT_TIMESTAMP  
triggerSetupCreator smallint(6) No 1  
triggerSetupCount int(11) Yes NULL  
triggerSetupComment varchar(512) Yes NULL  

TriggerWords

Field Type Null Default Comments
triggerWordID mediumint(9) No    
triggerWordName varchar(50) No    
triggerWordVersion varchar(6) No V0.0  
triggerWordBits varchar(6) No    
triggerWordIDate timestamp No CURRENT_TIMESTAMP  
triggerWordCreator smallint(6) No 1  
triggerWordCount int(11) Yes NULL  
triggerWordComment varchar(512) Yes NULL  

Tables creation and attributes

#use FileCatalog;

#
# All IDs are named after their respective table. This MUST
# remain like this.
#  eventGeneratorID        -> eventGenerator+ID       in 'EventGenerators'
#  detectorConfigurationID ->detectorConfiguration+ID in 'DetectorConfigurations'
#
# etc...
#

DROP TABLE IF EXISTS EventGenerators;
CREATE TABLE EventGenerators
(
  eventGeneratorID      SMALLINT     NOT NULL    AUTO_INCREMENT,
  eventGeneratorName    VARCHAR(30)        NOT NULL,
  eventGeneratorVersion VARCHAR(10)     NOT NULL,
  eventGeneratorParams  VARCHAR(200),

  eventGeneratorIDate   TIMESTAMP       NOT NULL,
  eventGeneratorCreator CHAR(15)        DEFAULT 'unknown' NOT NULL,
  eventGeneratorCount   INT,
  eventGeneratorComment TEXT,
  UNIQUE        EG_EventGeneratorUnique (eventGeneratorName, eventGeneratorVersion, eventGeneratorParams),
  PRIMARY KEY (eventGeneratorID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS DetectorConfigurations; CREATE TABLE DetectorConfigurations
(
  detectorConfigurationID               INT             NOT NULL        AUTO_INCREMENT,
  detectorConfigurationName             VARCHAR(50)        NULL           UNIQUE,
  dTPC                                  TINYINT,
  dSVT                                  TINYINT,
  dTOF                                  TINYINT,
  dEMC                                  TINYINT,
  dEEMC                                 TINYINT,
  dFPD                                  TINYINT,
  dFTPC                                 TINYINT,
  dPMD                                  TINYINT,
  dRICH                                 TINYINT,
  dSSD                                  TINYINT,
  dBBC                                  TINYINT,
  dBSMD                                 TINYINT,
  dESMD                                 TINYINT,
  PRIMARY KEY (detectorConfigurationID)
) TYPE=MyISAM;


# Trigger related tables
DROP TABLE IF EXISTS TriggerSetups; CREATE TABLE TriggerSetups
(
   triggerSetupID               SMALLINT     NOT NULL    AUTO_INCREMENT,
   triggerSetupName             VARCHAR(50)        NOT NULL       UNIQUE,
   triggerSetupComposition      VARCHAR(255) NOT NULL,

   triggerSetupIDate            TIMESTAMP       NOT NULL,
   triggerSetupCreator          CHAR(15)       DEFAULT 'unknown' NOT NULL,
   triggerSetupCount            INT,
   triggerSetupComment          TEXT,
   PRIMARY KEY                  (triggerSetupID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS TriggerCompositions; CREATE TABLE TriggerCompositions
(
  fileDataID                    BIGINT          NOT NULL,
  triggerWordID                 INT             NOT NULL,
  triggerCount                  MEDIUMINT       DEFAULT 0,
  PRIMARY KEY                   (fileDataID, triggerWordID)
) TYPE=MyISAM;



DROP TABLE IF EXISTS TriggerWords;
CREATE TABLE TriggerWords (
  triggerWordID         mediumint(9)   NOT NULL auto_increment,
  triggerWordName       varchar(50)  NOT NULL default '',
  triggerWordVersion    varchar(6)        NOT NULL default 'V0.0',
  triggerWordBits       varchar(6)   NOT NULL default '',
  triggerWordIDate      timestamp(14)       NOT NULL,
  triggerWordCreator    varchar(15)       NOT NULL default 'unknown',
  triggerWordCount      int(11)     default NULL,
  triggerWordComment    text,
  PRIMARY KEY           (triggerWordID),
  UNIQUE KEY TW_TriggerCharacteristic (triggerWordName,triggerWordVersion,triggerWordBits)
) TYPE=MyISAM;




DROP TABLE IF EXISTS CollisionTypes; CREATE TABLE CollisionTypes
(
  collisionTypeID SMALLINT NOT NULL AUTO_INCREMENT,
  firstParticle VARCHAR(10) NOT NULL,
  secondParticle VARCHAR(10) NOT NULL,
  collisionEnergy FLOAT NOT NULL,
  PRIMARY KEY (collisionTypeID)
) TYPE=MyISAM;


#
# A few dictionary tables
#
DROP TABLE IF EXISTS ProductionConditions; CREATE TABLE ProductionConditions
(
  productionConditionID         SMALLINT       NOT NULL      AUTO_INCREMENT,
  productionTag                 VARCHAR(10)   NOT NULL,
  libraryVersion                VARCHAR(10)   NOT NULL,

  productionConditionIDate      TIMESTAMP       NOT NULL,
  productionConditionCreator    CHAR(15)        DEFAULT 'unknown' NOT NULL,
  productionConditionCount      INT,
  productionConditionComments   TEXT,
  PRIMARY KEY                   (productionConditionID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS StorageSites; CREATE TABLE StorageSites
(
  storageSiteID                 SMALLINT      NOT NULL     AUTO_INCREMENT,
  storageSiteName               VARCHAR(30)  NOT NULL,
  storageSiteLocation           VARCHAR(50),

  storageSiteIDate              TIMESTAMP       NOT NULL,
  storageSiteCreator            CHAR(15)       DEFAULT 'unknown' NOT NULL,
  storageSiteCount              INT,
  storageSiteComment            TEXT,
  PRIMARY KEY                   (storageSiteID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FileTypes; CREATE TABLE FileTypes
(
  fileTypeID                    SMALLINT NOT NULL        AUTO_INCREMENT,
  fileTypeName                  VARCHAR(30)    NOT NULL   UNIQUE,
  fileTypeExtension             VARCHAR(15)        NOT NULL,

  fileTypeIDate                 TIMESTAMP       NOT NULL,
  fileTypeCreator               CHAR(15) DEFAULT 'unknown' NOT NULL,
  fileTypeCount                 INT,
  fileTypeComment               TEXT,
  PRIMARY KEY                   (fileTypeID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FilePaths; CREATE TABLE FilePaths
(
  filePathID                    BIGINT         NOT NULL         AUTO_INCREMENT,
  filePathName                  VARCHAR(255)   NOT NULL         UNIQUE,

  filePathIDate                 TIMESTAMP       NOT NULL,
  filePathCreator               CHAR(15) DEFAULT 'unknown' NOT NULL,
  filePathCount                 INT,
  filePathComment               TEXT,
  PRIMARY KEY                   (filePathID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS Hosts; CREATE TABLE Hosts
(
  hostID                        SMALLINT       NOT NULL         AUTO_INCREMENT,
  hostName                      VARCHAR(30)    NOT NULL DEFAULT 'localhost' UNIQUE,

  hostIDate                     TIMESTAMP       NOT NULL,
  hostCreator                   CHAR(15)     DEFAULT 'unknown' NOT NULL,
  hostCount                     INT,
  hostComment                   TEXT,
  PRIMARY KEY                   (hostID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS RunTypes; CREATE TABLE RunTypes
(
  runTypeID                     SMALLINT  NOT NULL AUTO_INCREMENT,
  runTypeName                   VARCHAR(255)    NOT NULL   UNIQUE,

  runTypeIDate                  TIMESTAMP       NOT NULL,
  runTypeCreator                CHAR(15)  DEFAULT 'unknown' NOT NULL,
  runTypeCount                  INT,
  runTypeComment                TEXT,
  PRIMARY KEY                   (runTypeID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS StorageTypes; CREATE TABLE StorageTypes
(
  storageTypeID                 MEDIUMINT       NOT NULL    AUTO_INCREMENT,
  storageTypeName               VARCHAR(6)   NOT NULL  UNIQUE,

  storageTypeIDate              TIMESTAMP       NOT NULL,
  storageTypeCreator            CHAR(15)       DEFAULT 'unknown' NOT NULL,
  storageTypeCount              INT,
  storageTypeComment            TEXT,
  PRIMARY KEY                   (storageTypeID)
) TYPE=MyISAM;





DROP TABLE IF EXISTS SimulationParams; CREATE TABLE SimulationParams
(
  simulationParamsID            INT             NOT NULL     AUTO_INCREMENT,
  eventGeneratorID              SMALLINT    NOT NULL,
  detectorConfigurationID       INT             NOT NULL,
  simulationParamComments       TEXT,
  PRIMARY KEY                   (simulationParamsID),
  INDEX         SP_EventGeneratorIndex          (eventGeneratorID),
  INDEX         SP_DetectorConfigurationIndex   (detectorConfigurationID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS RunParams;
CREATE TABLE RunParams
(
  runParamID                  INT        NOT NULL AUTO_INCREMENT,
  runNumber                   BIGINT     NOT NULL UNIQUE,
  dataTakingStart             TIMESTAMP,
  dataTakingEnd               TIMESTAMP,
  simulationParamsID          INT       NULL,
  runTypeID                   SMALLINT     NOT NULL,
  triggerSetupID              SMALLINT      NOT NULL,
  detectorConfigurationID     INT            NOT NULL,
  collisionTypeID             SMALLINT             NOT NULL,
  magFieldScale               VARCHAR(50)    NOT NULL,
  magFieldValue               FLOAT,
  runComments                 TEXT,
  PRIMARY KEY                          (runParamID),
  INDEX RP_RunNumberIndex              (runNumber),
  INDEX RP_DataTakingStartIndex        (dataTakingStart),
  INDEX RP_DataTakingEndIndex          (dataTakingEnd),
  INDEX RP_MagFieldScaleIndex          (magFieldScale),
  INDEX RP_MagFieldValueIndex          (magFieldValue),
  INDEX RP_SimulationParamsIndex       (simulationParamsID),
  INDEX RP_RunTypeIndex                (runTypeID),
  INDEX RP_TriggerSetupIndex           (triggerSetupID),
  INDEX RP_DetectorConfigurationIndex  (detectorConfigurationID),
  INDEX RP_CollisionTypeIndex          (collisionTypeID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FileData; CREATE TABLE FileData
(
  fileDataID                    BIGINT          NOT NULL AUTO_INCREMENT,
  runParamID                    INT             NOT NULL,
  fileName                      VARCHAR(255)       NOT NULL,
  baseName                      VARCHAR(255)       NOT NULL COMMENT 'Name without extension',
  sName1                        VARCHAR(255) NOT NULL COMMENT 'Will be used for name+runNumber',
  sName2                        VARCHAR(255) NOT NULL COMMENT 'Will be used for name before runNumber',
  productionConditionID         INT             NULL,
  numEntries                    MEDIUMINT,
  md5sum                        CHAR(32)     DEFAULT 0,
  fileTypeID                    SMALLINT NOT NULL,
  fileSeq                       SMALLINT,
  fileStream                    SMALLINT,
  fileDataComments              TEXT,
  PRIMARY KEY                   (fileDataID),
  INDEX         FD_FileNameIndex                (fileName(40)),
  INDEX         FD_BaseNameIndex                (baseName),
  INDEX         FD_SName1Index                  (sName1),
  INDEX         FS_SName2Index                  (sName2),
  INDEX         FD_RunParamsIndex               (runParamID),
  INDEX         FD_ProductionConditionIndex     (productionConditionID),
  INDEX         FD_FileTypeIndex                (fileTypeID),
  INDEX         FD_FileSeqIndex                 (fileSeq),
  UNIQUE        FD_FileDataUnique               (runParamID, fileName, productionConditionID, fileTypeID, fileSeq)
) TYPE=MyISAM;



# FileParents
DROP TABLE IF EXISTS FileParents; CREATE TABLE FileParents
(
  parentFileID                  BIGINT          NOT NULL,
  childFileID                   BIGINT          NOT NULL,
  PRIMARY KEY                   (parentFileID, childFileID)
) TYPE=MyISAM;

# FileLocations
DROP TABLE IF EXISTS FileLocations; CREATE TABLE FileLocations
(
  fileLocationID                BIGINT          NOT NULL      AUTO_INCREMENT,
  fileDataID                    BIGINT          NOT NULL,
  filePathID                    BIGINT          NOT NULL,
  storageTypeID                 MEDIUMINT       NOT NULL,
  createTime                    TIMESTAMP,
  insertTime                    TIMESTAMP       NOT NULL,
  owner                         VARCHAR(30),
  fsize                         BIGINT,
  storageSiteID                 SMALLINT      NOT NULL,
  protection                    VARCHAR(15),
  hostID                        BIGINT          NOT NULL DEFAULT 1,
  availability                  TINYINT         NOT NULL DEFAULT 1,
  persistent                    TINYINT         NOT NULL DEFAULT 0,
  sanity                        TINYINT         NOT NULL DEFAULT 1,
  PRIMARY KEY                   (fileLocationID),
  INDEX         FL_FilePathIndex                (filePathID),
  INDEX         FL_FileDataIndex                (fileDataID),
  INDEX         FL_StorageTypeIndex             (storageTypeID),
  INDEX         FL_StorageSiteIndex             (storageSiteID),
  INDEX         FL_HostIndex                    (hostID),
  UNIQUE        FL_FileLocationUnique           (fileDataID, storageTypeID, filePathID, storageSiteID, hostID)
) TYPE=MyISAM;

XML configuration

 

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE SCATALOG [
   <!ELEMENT SCATALOG (SITE*)>
       <!ATTLIST SCATALOG VERSION CDATA #REQUIRED>
   <!ELEMENT SITE (SERVER+)>
       <!ATTLIST SITE name (BNL | LBL) #REQUIRED>
       <!ATTLIST SITE description CDATA #IMPLIED>
       <!ATTLIST SITE URI CDATA #IMPLIED>
   <!ELEMENT SERVER (HOST+)>
       <!ATTLIST SERVER SCOPE (Master | Admin | User) #REQUIRED>
   <!ELEMENT HOST (ACCESS+)>
       <!ATTLIST HOST NAME CDATA #REQUIRED>
       <!ATTLIST HOST DBTYPE CDATA #IMPLIED>
       <!ATTLIST HOST DBNAME CDATA #REQUIRED>
       <!ATTLIST HOST PORT CDATA #IMPLIED>
   <!ELEMENT ACCESS EMPTY>
       <!ATTLIST ACCESS USER CDATA #IMPLIED>
       <!ATTLIST ACCESS PASS CDATA #IMPLIED>
]>



<SCATALOG VERSION="1.0.1">
        <SITE name="BNL">
                <SERVER SCOPE="Master">
                        <HOST NAME="mafata.wherever.net" DBNAME="Catalog_XXX" PORT="1234">
                                <ACCESS USER="Moi" PASS="HelloWorld"/>
                        </HOST>
                        <HOST NAME="mafata.wherever.net" DBNAME="Catalog_YYY" PORT="1235">
                                <ACCESS USER="Moi" PASS="HelloWorld"/>
                        </HOST>
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog" PORT="">
                                <ACCESS USER="FC_master" PASS="AllAccess"/>
                        </HOST>
                </SERVER>
                <SERVER SCOPE="Admin">
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog_BNL" PORT="">
                                <ACCESS USER="FC_admin" PASS="ExamplePassword"/>
                        </HOST>
                </SERVER>
                <SERVER SCOPE="User">
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog_BNL" PORT="">
                                <ACCESS USER="FC_user" PASS="FCatalog"/>
                        </HOST>
                </SERVER>
        </SITE>
</SCATALOG>

Migration and notes from V01.265 to V01.275

This document is intended for FileCatalog managers only who have previously deployed an earlier version of API and older database table layout. It is NOT intended for users.

Reasoning for this upgrade and core of the upgrade

One of the major problem with the preceding database layout started to show itself when we reached 4 Million entries (for some reason, we seem to have magic numbers). A dire restriction was the presence of the field 'path' and 'nodename' in the FileLocations table. This table became unnecessarily large (of the order of GB) and sorting and queries would become slow and IO demanding (regardless of our careful indexing). The main action was to move both field to separate tables. This change requires a two step modification :

  1. reshape of the database (leaving the old field), deployment of the database API in cross mode support
  2. run the normalization scripts filling the new table and fields, deployment of the final API and drop of the obsolete columns (+ index rebuild)

The steps are more carefully described below ...

Step by step migration instructions

Has to be made in several steps for safety a least interruption of service (although a pain to the manager). Note that you can do that much faster by cutting the Master/slave relationship, disabling all daemons auto-updating the database, proceed with table reshape and normalization script execution, drop and rebuild index, deploy the point-of-no-return API and restore Master/slave relation).

This upgrade is best if you have perl 5.8 or upper. Note that this transition will be the LAST one using perl 5.6 (get ready for a perl upgrade on your cluster).

We will assume you know how to connect to your database from an account able to manipulate and create any tables in the FileCatalog database.

Steps in Phase I

  1. (0) Create the following tables
      DROP TABLE IF EXISTS FilePaths; CREATE TABLE FilePaths
      (
        filePathID                    BIGINT         NOT NULL         AUTO_INCREMENT,
        filePathName                  VARCHAR(255)   NOT NULL         UNIQUE,
        filePathCount                 INT,
        PRIMARY KEY                   (filePathID)
      ) TYPE=MyISAM;
    
      DROP TABLE IF EXISTS Hosts; CREATE TABLE Hosts 
     (
        hostID      smallint(6) NOT NULL auto_increment,
        hostName    varchar(30) NOT NULL default 'localhost',
        hostIDate   timestamp(14) NOT NULL,
        hostCreator varchar(15) NOT NULL default 'unknown',
        hostCount   int(11) default NULL,
        hostComment text,
        PRIMARY KEY (hostID),
        UNIQUE KEY  hostName (hostName)
      ) TYPE=MyISAM;
    
    
  2. Modify some table and recreate one
         
         ALTER TABLE `FileLocations` ADD `filePathID` bigint(20) NOT NULL default '0' AFTER `fileDataID`;
         ALTER TABLE `FileLocations` ADD `hostID` bigint(20) NOT NULL default '1' AFTER `protection`;
         UPDATE TABLE `FileLocations` SET hostID=0;
    
         # note that I did that one from the Web interface (TBC)
         INSERT INTO Hosts VALUES(0,'localhost',NOW()+0,'',0,'Any unspecified node'); 
    
         ALTER TABLE `FileLocations` ADD INDEX ( `filePathID` )  
    
         ALTER TABLE `FilePaths` ADD `filePathIDate` TIMESTAMP NOT NULL AFTER `filePathName` ;
         ALTER TABLE `FilePaths` ADD `filePathCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `filePathIDate` ;
         ALTER TABLE `FilePaths` ADD `filePathComment` TEXT AFTER `filePathCount`;
    
         ALTER TABLE `StorageSites` ADD  `storageSiteIDate` TIMESTAMP NOT NULL AFTER `storageSiteLocation` ;
         ALTER TABLE `StorageSites` ADD  `storageSiteCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `storageSiteIDate` ;
         ALTER TABLE `StorageSites` DROP `storageComment`;
         ALTER TABLE `StorageSites` ADD  `storageSiteComment` TEXT AFTER `storageSiteCount`;
    
         ALTER TABLE `StorageTypes` ADD `storageTypeIDate` TIMESTAMP NOT NULL AFTER `storageTypeName` ;
         ALTER TABLE `StorageTypes` ADD `storageTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `storageTypeIDate` ;
    
    
         ALTER TABLE `FileTypes` ADD `fileTypeIDate` TIMESTAMP NOT NULL AFTER `fileTypeExtension` ;
         ALTER TABLE `FileTypes` ADD `fileTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `fileTypeIDate` ;
         ALTER TABLE `FileTypes` ADD `fileTypeComment` TEXT AFTER `fileTypeCount`;
    
    
         ALTER TABLE `TriggerSetups` ADD `triggerSetupIDate` TIMESTAMP NOT NULL AFTER `triggerSetupComposition` ;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `triggerSetupIDate`;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupCount`   INT AFTER `triggerSetupCreator`;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupComment` TEXT  AFTER `triggerSetupCount`;
    
         ALTER TABLE `EventGenerators` ADD `eventGeneratorIDate` TIMESTAMP NOT NULL AFTER `eventGeneratorParams` ;
         ALTER TABLE `EventGenerators` ADD `eventGeneratorCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `eventGeneratorIDate` ;
         ALTER TABLE `EventGenerators` ADD `eventGeneratorCount`   INT AFTER `eventGeneratorCreator`;
    
         ALTER TABLE `RunTypes` ADD `runTypeIDate` TIMESTAMP NOT NULL AFTER `runTypeName` ;
         ALTER TABLE `RunTypes` ADD `runTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `runTypeIDate` ;
    
         ALTER TABLE `ProductionConditions` DROP `productionComments`; 
         ALTER TABLE `ProductionConditions` ADD  `productionConditionIDate`   TIMESTAMP NOT NULL AFTER `libraryVersion`;
         ALTER TABLE `ProductionConditions` ADD  `productionConditionCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `productionConditionIDate`;
         ALTER TABLE `ProductionConditions` ADD  `productionConditionComment` TEXT AFTER `productionConditionCount`;
    
    
    
         #
         # This table was not shaped as a dictionary so needs to be re-created
         # Hopefully, was not filled prior (but will be this year)
         #
         DROP TABLE IF EXISTS TriggerWords; CREATE TABLE TriggerWords
         (
            triggerWordID           MEDIUMINT       NOT NULL        AUTO_INCREMENT,
            triggerWordName         VARCHAR(50)     NOT NULL,
            triggerWordVersion      CHAR(6)         NOT NULL DEFAULT "V0.0",
            triggerWordBits         CHAR(6)         NOT NULL,  
            triggerWordIDate        TIMESTAMP       NOT NULL,
            triggerWordCreator      CHAR(15)        DEFAULT 'unknown' NOT NULL,
            triggerWordCount        INT,
            triggerWordComment      TEXT,
            UNIQUE   TW_TriggerCharacteristic (triggerWordName, triggerWordVersion, triggerWordBits),
            PRIMARY KEY             (triggerWordID)
         ) TYPE=MyISAM;
  3. Deploy the new API CVS version 1.62 of FileCatalog.pm

  4. Run the following utility scripts

    util/path_convert.pl
    util/host_convert.pl

    Note that those scripts use a new method $fC->connect_as("Admin"); which assumes that the Master Catalog will be accessed using the XML connection description. Also, it should be obvious that

    use lib "/WhereverYourModulAPIisInstalled"; should be replaced by the appropriate path for your site (or test area). Finally, it uses API CVS version 1.62 which supports Xpath and Xnode transitional keywords allowing us to transfer the information from one field to one table.

  5. Check that Hosts table was filled properly and automatically with Creator/IDate
  6. Paranoia step : Re-run the scripts mentioned 2 steps ago

    At this stage and ideally, nothing should happen (as you have already modified the records).
    A few tips prior from doing that
    • % fC_cleanup.pl -modif node=localhost -cond node='' -doit
      would hopefully do nothing but if you have messed something up in the past, hostName would be NULL and the above would be necessary.
    • After a full update, the following queries should return NOTHING
      % get_file_list.pl -keys flid -cond rfpid=0 -all -alls -as Admin
      % get_file_list.pl -keys flid -cond rhid=0 -all -alls -as Admin

      Those are equivalent to the SQL statements
      >SELECT FileLocations.fileLocationID FROM FileLocations WHERE FileLocations.filePathID = 0 LIMIT 0, 100
      >SELECT FileLocations.fileLocationID FROM FileLocations WHERE FileLocations.hostID = 0 LIMIT 0, 100 
    If it does return anything, contact me for further investigation and database repairs. As a side note, the -as keyword was introduced recently and you should update your get_file_list.pl script if not available.
  7. Make a backup copy of the database for security (optional but safer) Backup can be done by easer a dump of mysql or more trivially, a cp -r of the database directory.
  8. Leave it running for a few days (should be fine) for confidence consolidation ;-)

You are ready for phase II. Hang on tight now ...

Steps in Phase II

Those steps are no VERY intrusive and potentially destructive. Be careful from here on ...

  1. Stop all daemons, be sure that during the rest of the operations, NO command attempts to manipulate the database. If you want to shield your users from the upgrade, stop all Master/slave relations.
  2. Connect to the master FileCatalog as administrator for that database and execute the following SQL commands
      > ALTER TABLE `FileLocations` ADD INDEX FL_HostIndex (hostID);
      > ALTER TABLE `FileLocations` DROP INDEX `FL_FileLocationUnique`, ADD UNIQUE (fileDataID, storageTypeID, filePathID, storageSiteID, hostID);
    
      # drop the columns not in use anymore / should also get rid of the associated
      # indexes.
      > ALTER TABLE `FileLocations` DROP COLUMN nodeName;
      > ALTER TABLE `FileLocations` DROP COLUMN filePath;
    
      # "rename" index / was created with a name difference to avoid clash for transition
      # now renamed for consistency
      > ALTER TABLE `FileLocations` DROP INDEX `filePathID`, ADD INDEX  FL_FilePathIndex (filePathID);
  3. OK, you should be done. Deploy either CVS version 1.63 which correspond to the FileCatalog API version V01.275 and above ... (by the way, get_file_list.pl -V gives the API version).


 

A few notes

  • The new API is XML connection aware via a non-mandatory module named XML::Simple . You should install that module but there are some limitations if you are using perl 5.6 i.e., you MUST use the schema with ONLY one choice per category (Admin, Master or User).
  • Your scripts will likely need to change if your Database Master and Slave are not on the same node (i.e. the administration account for the FileCatalog can be used only on the database Master and the regular user account on the Slave). There are a few forms of this such as the one below
    # Get connection fills the blanks while reading from XML
    # However, USER/PASSWORD presence are re-checked
    #$fC->debug_on();
    ($USER,$PASSWD,$PORT,$HOST,$DB) = $fC->get_connection("Admin");
    $port = $PORT if ( defined($PORT) );
    $host = $HOST if ( defined($HOST) );
    $db   = $DB   if ( defined($DB) );
    
    
    if ( defined($USER) ){   $user = $USER;}
    else {                   $user = "FC_admin";}
    
    if ( defined($PASSWD) ){ $passwd = $PASSWD;}
    else {                   print "Password for $user : ";
                             chomp($passwd = );}
    
    #
    # Now connect using a fully specified user/passwd/port/host/db
    #
    $fC->connect($user,$passwd,$port,$host,$db);

    or counting on the full definition in the XML file

    $fC    = FileCatalog->new();
    $fC->connect_as("Admin");
  • Note a small future convenience when XML is ON. connect_as() does not only select as who you want to connect to but where as well. In fact, the proper syntax is intent=SITE::User (for example BNL::Admin is valid as well as LBL::User). This is only partly supported however.
  • The new version of the API automatically add information in dictionary tables. Especially, the account under which a new dictionary value was inserted (Creator) and the insertion date (IDate) are filled automatically. A side effect being that the new API is NOT compatible with previous database table layout (no backward support will be attempted).

Migration and notes from V01.275 to V01.280

This document is intended for FileCatalog managers only who have previously deployed an earlier version of API and older database table layout. It is NOT intended for users.

Reasoning for this upgrade and core of the upgrade

This upgrade is a minor one, making support for two more detector sub-systems. The new API supports this modification. You need to alter the table DetectorConfigurations and add two columns. API are always forward compatible in that regard so it is completely safe to alter the table and deploy the API later.

ALTER TABLE `DetectorConfigurations` ADD dBSMD TINYINT;
ALTER TABLE `DetectorConfigurations` ADD dESMD TINYINT;
UPDATE `DetectorConfigurations` SET dBSMD=0;
UPDATE `DetectorConfigurations` SET dESMD=0;

And deploy the API V01.280 or later. You are done.