This section contains various pages explaining how to do typical STAR DB administrative tasks. Warning: this section and its recipes are NOT for users but for administrators; following these recipes may be harmful to STAR if used improperly.
To ease maintenance of databases, a common storage structure is used where applicable. Such a structure has been devised for the Conditions, Configurations, Calibrations, and Geometry database types. Other database types (e.g. RunLog & Production Database) require additional low-level structures specific to the access requirements for the data they contain. These shall be documented independently as they are developed. The base table structures are discussed according to their function. Within each sub-section, the real table names are written with purple letters.
Schema Definition
The data objects stored in the DB are pre-defined by c-structs in header files (located in the $STAR/StDb/include directories). The database is fed the information about these c-structs, which it stores in a pair of tables used to re-construct the c-struct definitions used for storing the data. This information can also be used to provide the memory allocation for the tables returned in the C++ API, but it can also be over-ridden to request any c-struct type allowed by the request. That is, as long as the requested table contained (at some time) the items in the user-input c-struct description, the data will be returned in the format requested. To perform such operations, the two DB tables used are the schema and structure tables described below.
The schema table holds the names of all elements associated with a c-struct, their relative order in the c-struct, and in which schema-versions each element exists. The table contains the fields:
*** This will replace the "position" field and add information for total binary storage of the structure, but it requires that the schema-evolution model be modified. This will come in the next iteration of the database storage structure.
Named References
There are c-structs which are used for more than one purpose. For example, Daq's detector structure is common to all types of detectors but is referenced by different names when a detector-specific set of data is requested or stored. Thus a request for a structure is actually a request for a named reference, which in many cases is simply the name of the structure but can be different. This information is kept in the namedRef table, which contains:
Index
Data is accessed from the database via references contained in an index table called dataIndex. The purpose of this table is to provide access to specific rows of specific storage tables per the request information of name, timestamp, and version. Additional requests for specific element identifiers (e.g. Sector01 of the tpc) are also allowed via this index. The table contains the following fields:
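As a hypothetical sketch only (the column names nodeID, version, beginTime, elementID, and dataID below are illustrative, not the actual dataIndex schema), a versioned, time-stamped lookup through dataIndex could look like:

-- hypothetical sketch: column names are illustrative
SELECT elementID, dataID
  FROM dataIndex
 WHERE nodeID = 42
   AND version = 'default'
   AND beginTime <= '2005-11-10 00:00:00'
 ORDER BY beginTime DESC
 LIMIT 1;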
Data Storage
There are two general types of storage tables used in the MySQL database. The first type makes use of MySQL basic storage types to store the data as specified by the schema of the c-struct. That is, each element of the c-struct is given a column in which values are recorded and can subsequently be queried against. The name of the table is the same as the name of the structure obtained from the structure table and referenced directly from the namedRef table. Within this structure, large arrays can be handled using binary storage (within MySQL) by defining the storage type as binary. Each row of each storage table contains a unique row id that is used by the dataIndex table. Example: the tpcElectronics table contains the following fields:
An additional storage table is used for those tables which are more easily stored in purely binary form. This is different from the note above, where some "columns" of a c-struct are large arrays and can be independently (column-specific) stored in binary form. What is meant by purely binary form is that the entire c-struct (&/or many rows) is stored as a single binary blob of data. This is (to be) handled via the bytes table. The rows in the bytes table are still pointed to via the dataIndex table.
The content of the stored data is extracted using the information about how it was stored from the namedRef, schema, & structure tables. The bytes table contains the following fields:
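Since the field list is not reproduced here, the following is only a hypothetical sketch of what such a pure-binary storage table could look like; the real bytes table columns may differ:

-- hypothetical sketch; real bytes table fields may differ
CREATE TABLE bytes (
  dataID INT NOT NULL AUTO_INCREMENT,  -- unique row id referenced from dataIndex
  bytes  LONGBLOB NOT NULL,            -- entire c-struct (&/or many rows) as one binary blob
  PRIMARY KEY (dataID)
) ENGINE=MyISAM;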
Configurations
The term Configurations within the STAR databases refers to ensembles of data that are linked together in a common "named" structure. The extent of a Configuration may span many STAR databases (i.e. a super-configuration) to encompass an entire production configuration, but it is also complete at each database. For example, there is a data ensemble for the Calibrations_tpc database that can be requested explicitly but that can also be linked together with Calibrations_svt, Calibrations_emc, ... configurations to form a Calibrations configuration. The data ensembles are organized by a pair of tables found in each database. These tables are Nodes and NodeRelation.
The Nodes table contains the following fields:
The NodeRelation table contains the following fields:
Catalog
The Catalog structure in the database is used to organize different configurations into understandable groupings, much like a folder or directory system of configurations. Though very similar to the configuration database structure, it differs in use. That is, the Nodes and NodeRelation tables de-reference configurations as a unique data ensemble per selection, while the Catalog is used to de-reference many similar configurations into groupings associated with tasks. The Catalog table structure has only seen limited use in Online databases and is still under development. Currently there are two tables associated with the catalog:
The Nodes table contains the following fields:
The NodeRelation table contains the following fields:
Comment
The Offline production chain does not use the Configurations & Catalog structures provided by the database. Rather, an independent configuration is stored and retrieved via an offline table, tables_hierarchy, that is kept within the St_db_Maker. This table contains the fields:
This table is currently stored as a binary blob in the params database.
HOW TO ADD A NEW TABLE TO THE DATABASE:
create a working directory.
$> cvs co StDb/idl   (overkill - but so what)
$> cp StDb/idl/svtHybridDriftVelocity.idl .
$> mkdir include && mkdir xml
$> cp svtHybridDriftVelocity.idl include/
$> mv include/svtHybridDriftVelocity.idl include/svtHybridDriftVelocity.h
$> cp -r ~deph/updates/pmd/070702/scripts/ .   # this directory is in cvs
$> vim include/svtHybridDriftVelocity.h
### MAKE SURE your .idl file comments are less than 80 characters due to a STIC limitation. "/* " and " */" are also counted, so the real comment line should be less than 74 chars.
### Check and change datatypes - octets become unsigned char and long becomes int, see table for mapping details:
IDL | C++ | MySQL |
short, 16 bit signed integer | short, 2 bytes | SMALLINT (-32768...32767) |
unsigned short, 16 bit unsigned integer | unsigned short, 2 bytes | SMALLINT UNSIGNED (0...65535) |
long, 32 bit signed integer | int, 4 bytes | INT (-2147483648...2147483647) |
unsigned long, 32 bit unsigned integer | unsigned int, 4 bytes | INT UNSIGNED (0...4294967295) |
float, 32 bit IEEE float | float, 4 bytes | FLOAT |
double, 64 bit IEEE double | double, 8 bytes | DOUBLE |
char, 8 bit ISO latin-1 | char, 1 byte | CHAR |
octet, 8 bit byte (0x00 to 0xFF) | unsigned char, 1 byte | TINYINT UNSIGNED (0...255) |
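For example, a struct containing one field of each IDL type above would map onto MySQL columns roughly as follows (the table and column names here are made up purely for illustration):

-- illustrative only: one column per IDL type from the mapping table above
CREATE TABLE typeMapExample (
  aShort   SMALLINT,           -- idl short
  aUShort  SMALLINT UNSIGNED,  -- idl unsigned short
  aLong    INT,                -- idl long
  aULong   INT UNSIGNED,       -- idl unsigned long
  aFloat   FLOAT,              -- idl float
  aDouble  DOUBLE,             -- idl double
  aChar    CHAR(1),            -- idl char
  anOctet  TINYINT UNSIGNED    -- idl octet
);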
NOW execute a bunch of scripts; output is shown below each command (all scripts provide HELP when nothing is passed):
1) ./scripts/dbTableXml.pl -f include/svtHybridDriftVelocity.h -d Calibrations_svt
inputfile= include/svtHybridDriftVelocity.h
input database =Calibrations_svt
******************************
*
* Running dbTableXML.pl
*
outputfile = xml/svtHybridDriftVelocity.xml
******************************
2) ./scripts/dbDefTable.pl -f xml/svtHybridDriftVelocity.xml -s robinson.star.bnl.gov -c
#####output will end with create statement####
3) ./scripts/dbGetNode.pl -s robinson.star.bnl.gov -d Calibrations_svt
### returns a file called svtNodes.xml
4) vim svtNodes.xml
###add this line <dbNode> svtHybridDriftVelocity <StDbTable> svtHybridDriftVelocity </StDbTable> </dbNode>
### this defines the node in the Nodes table
5) ./scripts/dbDefNode.pl -s robinson.star.bnl.gov -f svtNodes.xml
###garbage output here....
### now do the same for the NodeRelation table
6) ./scripts/dbGetConfig.pl -s robinson.star.bnl.gov -d Calibrations_svt
### might return a couple of different configurations (svt definitely does)
### we're interested in Calibrations_svt_reconV0_Config.xml
7) vim Calibrations_svt_reconV0_Config.xml
###add this line <dbNode> svtHybridDriftVelocity <StDbTable> svtHybridDriftVelocity </StDbTable> </dbNode>
8) ./scripts/dbDefConfig.pl -s robinson.star.bnl.gov -f Calibrations_svt_reconV0_Config.xml
IF YOU NEED TO UPDATE A TABLE WHOSE NAME DIFFERS FROM THE C++ STRUCTURE NAME:
$> ./scripts/dbDefTable.pl -f xml/<table_xml>.xml -s <dbserver>.star.bnl.gov -n <table_name>
How to check whether the table name equals the C++ struct name:
mysql> select * from Nodes where structName = 'c++ struct name';
This will output "name", which is the real db table name, and "structName", which is the C++ struct name.
IF YOU NEED TO CREATE A NEW PROTO-STRUCTURE (no storage):
./scripts/dbDefTable.pl -f xml/<structure>.xml -s robinson.star.bnl.gov -e
IF YOU NEED TO CREATE A NEW TABLE BASED ON THE EXISTING STRUCTURE:
checkout the proto-structure IDL file, convert it to .h, then to .xml, then do:
$> ./scripts/dbDefTable.pl -f xml/<structure>.xml -s <dbserver>.star.bnl.gov -n <table-name>
then add the mapping entry to the tableCatalog (manually)
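The exact tableCatalog layout is not shown here, so this is only a hypothetical sketch of such a manual entry:

-- hypothetical: the column names tableName/structName are assumed,
-- check the real tableCatalog schema before inserting
INSERT INTO tableCatalog (tableName, structName) VALUES ('<table-name>', '<structure>');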
IMPORTANT: WHEN A NEW TPC TABLE IS CREATED, A NEW LOG TRIGGER SHOULD BE DEFINED!
please check the tmp.dmitry/triggers directory at robinson for details.
-Dmitry
A full description of the deactivation field can be found here:
http://drupal.star.bnl.gov/STAR/comp/db/time-stamps
Briefly, it is the field that removes a record from a particular query based on time. This allows records that are found to be erroneous to remain available to older libraries that may have used the values.
The deactive field in the database is an integer that contains a UNIX timestamp.
IMPORTANT - entryTime is a timestamp which will get changed by an update; it needs to remain the same, so set entryTime = entryTime.
To deactivate a record:
1) You must have write privileges on robinson
2) There is no STAR GUI or C++ interface for this operation, so it must be done from either the MySQL command line or one of the many MySQL GUIs, e.g. phpMyAdmin.
3) update <<TABLE>> set entryTime = entryTime, deactive = UNIX_TIMESTAMP(now()) where <<condition, e.g. beginTime between "nnnn" and "nnnn">>
4) do a SELECT to check if OK, that's it!
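A worked example, using the tpcElectronics table mentioned earlier (the table choice and time range are made up for illustration; note the entryTime = entryTime trick from above):

mysql> update tpcElectronics set entryTime = entryTime, deactive = UNIX_TIMESTAMP(now()) where beginTime between "2005-01-01 00:00:00" and "2005-02-01 00:00:00";
mysql> select beginTime, deactive from tpcElectronics where beginTime between "2005-01-01 00:00:00" and "2005-02-01 00:00:00";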
PHILOSOPHY
Element IDs are part of the OFFLINE Calibrations data primary key.
This is the mechanism that allows more than one row of data to be returned per time stamp (beginTime).
If the elementID is not set to 0 but is set to n, the API will return between 1 and n rows, depending on how many rows have been inserted into the database.
A change INCREASING the elementID is backward compatible with a table that had elementID originally set to 0. The only thing to watch out for is a message that will read something like "n rows looked for, 1 returned".
IMPLEMENTATION
In order for this to work, three actions need to be taken:
1) Insert the rows with the element ID incrementing 1,2,3...
2) a new table called blahIDs needs to be created - the syntax is important, and blah should be replaced with something descriptive.
3) in the Nodes table, find the relevant node (table) and update its indexName field to blah
For example, the pmd SMCalib table wants to return 24 rows:
A table is created and filled (the only things different will be the table name, the third "descriptive" column, and the number of rows):
mysql> select * from pmdSmIDs;
+----+-----------+----+
| ID | elementID | SM |
+----+-----------+----+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 7 | 7 | 7 |
| 8 | 8 | 8 |
| 9 | 9 | 9 |
| 10 | 10 | 10 |
| 11 | 11 | 11 |
| 12 | 12 | 12 |
| 13 | 13 | 13 |
| 14 | 14 | 14 |
| 15 | 15 | 15 |
| 16 | 16 | 16 |
| 17 | 17 | 17 |
| 18 | 18 | 18 |
| 19 | 19 | 19 |
| 20 | 20 | 20 |
| 21 | 21 | 21 |
| 22 | 22 | 22 |
| 23 | 23 | 23 |
| 24 | 24 | 24 |
+----+-----------+----+
24 rows in set (0.01 sec)
The Nodes table is then updated to read as follows:
mysql> select * from Nodes where name = 'pmdSMCalib' \G
*************************** 1. row ***************************
name: pmdSMCalib
versionKey: default
nodeType: table
structName: pmdSMCalib
elementID: None
indexName: pmdSm
indexVal: 0
baseLine: N
isBinary: N
isIndexed: Y
ID: 6
entryTime: 2005-11-10 21:06:11
Comment:
1 row in set (0.00 sec)
+++++++++++++++++++++++++++++
Note: the indexName field reads pmdSm, not the default None.
HOW-TO enable 'elementID' support for a given offline table
Let's see how elementID support can be added, using the [bemc|bprs|bsmde|bsmdp]Map tables as an example.
1. Let's try the [bemc|bprs]Map tables first:
1.1 Create table 'TowerIDs' (4800 channels)
CREATE TABLE TowerIDs (
  ID smallint(6) NOT NULL auto_increment,
  elementID int(11) NOT NULL default '0',
  Tower int(11) NOT NULL default '0',
  KEY ID (ID),
  KEY Tower (Tower)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
1.2 Fill this table with values (spanning from 1 to [maximum number of rows]):
INSERT INTO TowerIDs VALUES (1,1,1);
INSERT INTO TowerIDs VALUES (2,2,2);
INSERT INTO TowerIDs VALUES (3,3,3);
...
INSERT INTO TowerIDs VALUES (4800,4800,4800);
Sample bash script to generate ids
#!/bin/bash
# note: the arithmetic for-loop below is a bashism, so use bash rather than plain sh
echo "USE Calibrations_emc;"
for ((i=1;i<=4800;i+=1)); do
  echo "INSERT INTO TowerIDs VALUES ($i,$i,$i);"
done
for ((i=1;i<=18000;i+=1)); do
  echo "INSERT INTO SmdIDs VALUES ($i,$i,$i);"
done
1.3 Update the 'Nodes' table to make it aware of the new index:
UPDATE Nodes SET Nodes.indexName = 'Tower' WHERE Nodes.structName = 'bemcMap';
UPDATE Nodes SET Nodes.indexName = 'Tower' WHERE Nodes.structName = 'bprsMap';
2. Now, the smd channel IDs (18000 channels):
2.1 Create the db index table:
CREATE TABLE SmdIDs (
  ID smallint(6) NOT NULL auto_increment,
  elementID int(11) NOT NULL default '0',
  Smd int(11) NOT NULL default '0',
  KEY ID (ID),
  KEY Smd (Smd)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
2.2 Fill this table with values:
INSERT INTO SmdIDs VALUES (1,1,1);
INSERT INTO SmdIDs VALUES (2,2,2);
INSERT INTO SmdIDs VALUES (3,3,3);
...
INSERT INTO SmdIDs VALUES (18000,18000,18000);
(see helper bash script above)
2.3 Update 'Nodes' to make it aware of the additional index:
UPDATE Nodes SET Nodes.indexName = 'Smd' WHERE Nodes.structName = 'bsmdeMap';
UPDATE Nodes SET Nodes.indexName = 'Smd' WHERE Nodes.structName = 'bsmdpMap';
LAST STEP: a simple root4star macro to check that everything is working fine (courtesy of Adam Kocoloski):
{
  gROOT->Macro("LoadLogger.C");
  gROOT->Macro("loadMuDst.C");
  gSystem->Load("StDbBroker");
  gSystem->Load("St_db_Maker");

  St_db_Maker dbMk("StarDb", "MySQL:StarDb");
  // dbMk.SetDebug(10); // Could be enabled to see full debug output
  dbMk.SetDateTime(20300101, 0);
  dbMk.Init();
  dbMk.GetInputDB("Calibrations/emc/y3bemc");
  dbMk.GetInputDB("Calibrations/emc/map");
}
The process for defining new database tables has several steps, many predicated on making the data available to the offline infrastructure. The steps are listed here and detailed further down. Currently, performing these steps is limited by access protection, either for writing to the database or for updating the cvs repository. Such permissions are given to specific individuals responsible per domain level (e.g. Tpc, Svt, ...).
Several of the steps involve running perl scripts. Each of these scripts requires arguments; supplying no arguments will generate a description of what the script will do and what arguments are needed/optional.
Schema Definition in Database
The data objects stored in the DB are pre-defined by c-structs in header files (located in the $STAR/StDb/include directories). The database is fed the information about these c-structs, which it stores for use during later read and write operations. The process is performed by a set of perl scripts which do the following:
These steps are done pointing to a specific database (and a specific MySQL server). That is, the schema used to store data in a database is kept within that database so there is no overlap between, say, a struct defined for the tpc Calibrations and that defined for the emc Calibrations.
Example of taking a new c-struct through to making it available in a Configuration (such as used in Offline).
Data Loading into Database
The data loading into the database can take many forms. The most typical is via the C++ API and can be done via a ROOT-CINT macro once the c-struct has been compiled and exists in a shared library.
Idl definitions in the Offline Repository
For data to be accessed in the offline chain, a TTable class for the database struct must exist. The current prerequisite for this is the existence of an idl file for the structure; this is the reason that dbMakeFiles.pl was written. Once the dbMakeFiles.pl script is run for a specific structure to produce the idl file from the database record of the schema, this file must be placed in a repository which the Offline build scripts know about in order for the appropriate TTable to be produced. This repository is currently:
For testing new tables in the database with the TTable class:
Table (Query) List for St_db_Maker (Now Obsolete)
To access a set of data in the offline chain, the St_db_Maker needs to know what data to request from the database. This is predetermined at the St_db_Maker's construction and Init() phases. Specifically, the St_db_Maker is constructed with a "const char* maindir" argument of the form,
The structure of this list is kept only inside the St_db_Maker and stored within a binary blob inside the params database. Updates must be done only on "www.star.bnl.gov".
The C++ API was written to understand the Configuration tables structure and to provide access to an ensemble of data with a single query. For example, the configuration list is requested by version name and provides "versioning" at the individual table level. It also prepares the list with individual element identifiers so that different rows (e.g. tpc-sectors, emc-modules, ...) can be accessed independently, as well as whether the table carries a "baseLine" or "time-Indexed" attribute, which provides information about how one should reference the data instances.
Storage of a Configuration can be done via the C++ API (see StDbManager::storeConfig()) but is generally done via the perl script dbDefConfig.pl. The idea behind the configuration is to allow sets of structures to be linked together in an ensemble. Many such ensembles could be formed and identified uniquely by name. However, we currently do not use this flexibility in the Offline access model, hence only 1 named ensemble exists per database. In the current implementation of the Offline access model, the St_db_Maker requests this 1 configuration from the StDbBroker, which passes the request to the C++ API. This sole configuration is returned to the St_db_Maker as an array of c-structs identifying each table by name, ID, and parent association.
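As an illustration, the table-level members of a database's ensemble can be listed straight from its Nodes table (the columns below appear in the pmdSMCalib listing shown earlier):

mysql> select ID, name, structName, versionKey, indexName from Nodes where nodeType = 'table';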
Currently, STAR is able to provide nightly Offline and FC snapshots to allow easy external db mirror setup.
Generally, one will need three things:
Current STAR policy is that every institution must have a local maintainer with full access to db settings.
Here are a few steps to get an external mirror working:
Steps 1-8 should complete the replication setup. The following checks are highly recommended:
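For example, replication health can be verified from the mirror's mysql prompt; Slave_IO_Running and Slave_SQL_Running should both read 'Yes', and Seconds_Behind_Master shows the replication lag:

mysql> SHOW SLAVE STATUS \G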
If you require direct assistance with MySQL mirror setup from the STAR DB Administrator, you should allow the DB admin's ssh key to be installed for root@your-db-mirror.edu.
At this moment, the STAR DB Administrator is Dmitry Arkhipkin (arkhipkin@bnl.gov) - please feel free to contact me about ANY problems with STAR databases or STAR external mirrors.
Typical MySQL config for STAR OFFLINE mirror (/etc/my.cnf):

[mysqld]
user = mysql
basedir = /db01/offline/mysql
datadir = /db01/offline/data
port = 3316
socket = /tmp/mysql.3316.sock
pid-file = /var/run/mysqld/mysqld.pid
skip-locking
log-error = /db01/offline/data/offline-slave.log.err
max_allowed_packet = 16M
max_connections = 4096
table_cache = 1024
query-cache-limit = 1M
query-cache-type = 1
query-cache-size = 256M
sort_buffer = 1M
myisam_sort_buffer_size = 64M
key_buffer_size = 256M
thread_cache = 8
thread_concurrency = 4
relay-log = /db01/offline/data/offline-slave-relay-bin
relay-log-index = /db01/offline/data/offline-slave-relay-bin.index
relay-log-info-file = /db01/offline/data/offline-slave-relay-log.info
master-info-file = /db01/offline/data/offline-slave-master.info
replicate-ignore-db = mysql
replicate-ignore-db = test
replicate-wild-ignore-table = mysql.%
read-only
report-host = [your-host-name]
report-port = 3316
server-id = [server PID]
# line below should be commented out after 'CHANGE MASTER TO ...' statement is applied.
skip-slave-start

[mysqld_safe]
log = /var/log/mysqld.log
pid-file = /var/run/mysqld/mysqld.pid
timezone = GMT
Typical 'CHANGE MASTER TO ...' statement for an OFFLINE db mirror:
CHANGE MASTER TO
  MASTER_HOST='robinson.star.bnl.gov',
  MASTER_USER='starmirror',
  MASTER_PASSWORD='[PASS]',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='robinson-bin.00000[ZYX]',
  MASTER_LOG_POS=[XYZ],
  MASTER_CONNECT_RETRY=60;

START SLAVE;

SHOW SLAVE STATUS \G
You can find [PASS] in the master.info file, and robinson-bin.00000[ZYX] and [XYZ] in the relay-log.info file; both files are in the offline-snapshot.tgz archive.
Database query used by Gene to calculate [run start - first event] time differences:
select ds.runNumber, ds.firstEventTime,
       (ds.firstEventTime - rd.startRunTime),
       (rd.endRunTime - ds.lastEventTime)
from runDescriptor as rd
left join daqSummary as ds on ds.runNumber = rd.runNumber
where rd.endRunTime > 1e9
  and ds.lastEventTime > 1e9
  and ds.firstEventTime > 1e9;
This query was moved to a protected page because direct online database queries are not recommended for anyone but online experts.