Temporary placeholder for two presentations made on :
a) Online data processing review
b) FileCatalog performance improvement
Please see attached files
Work in progress. Page is located here .
The purpose of the C++ API is to provide a set of standardized access methods to Star-DB data from within (client) codes that is independent of specific software choices among groups within STAR. The standardized methods hide the "client" code from most details of the storage structure, including all references to the low-level DB infrastructure such as (My)SQL query strings. Specifically, the DB-API reformulates requests for data by name, timestamp, and version into the necessary query structure of the databases in order to retrieve the data requested. None of the low-level query structure is seen by the client.
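To make the idea of "reformulating a request by name, timestamp, and version" concrete, here is a minimal sketch in Python (used only for brevity; the real API is C++). The table layout and the column names `beginTime`, `version` and `payload` are hypothetical and do not describe the actual Star-DB schema.

```python
import sqlite3


def build_query(table_name, timestamp, version):
    """Return (sql, params) selecting the newest row valid at `timestamp`.

    Hypothetical schema: every table has beginTime, version and payload columns.
    """
    if not table_name.isidentifier():
        raise ValueError("unexpected table name: %r" % table_name)
    sql = (
        "SELECT payload FROM " + table_name +
        " WHERE version = ? AND beginTime <= ?"
        " ORDER BY beginTime DESC LIMIT 1"
    )
    return sql, (version, timestamp)


def fetch(conn, table_name, timestamp, version):
    """Client-facing call: no SQL is visible to the caller."""
    sql, params = build_query(table_name, timestamp, version)
    row = conn.execute(sql, params).fetchone()
    return None if row is None else row[0]


# Small in-memory demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tpcGain (beginTime TEXT, version TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO tpcGain VALUES (?, ?, ?)",
    [
        ("2009-01-01 00:00:00", "default", "gain-A"),
        ("2009-03-01 00:00:00", "default", "gain-B"),
    ],
)
```

The caller only supplies a table name, a timestamp and a version string; the query text stays inside `build_query`, which is the separation the paragraph above describes.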
The API is contained within a shared library, StDbLib.so. It has two versions built from a common source. The version in Offline (under $STAR/lib) contains additional code generated by the "rootcint" preprocessor in order to provide command-line access to some methods. The Online version does not contain the preprocessed "rootcint" code.

In addition to standard access methods, the API provides the tools needed to facilitate those non-standard access patterns that are known to exist. For example, there will be tasks that need special SQL syntax to be supplied by client codes. Here, a general-use C++ MySQL object can be made available to the user code on an as-needed basis.

The following write-up is intended as a starting point for understanding the C++ API components. Since most clients of database data have an additional software layer between their codes and the DB-API (e.g. St_db_Maker in Offline), none of these components will be directly seen by the majority of such users. There will, however, be a number of clients that need to access the API directly in order to perform some unique database read/write tasks.

Click here to view a block diagram of how the C++ API fits general STAR code access. Click here to view a block diagram of how the C++ API classes work together to provide data to client codes.

The main classes which make up the C++ DB-API are divided here into four categories.
|
StDbManager | StDbServer | tableQuery & mysqlAccessor | StDbDefs
StDbManager: (Available at Root CLI)
The StDbManager class acts as the principal connection between the DB-API and the client codes. It is a singleton class that is responsible for finding servers and databases, providing the information to the StDbServer class so that it can connect to the requested database, and forwarding all subsequent (R/W) requests to the appropriate StDbServer object. Some public methods that are important to using the DB-API via the manager:
Some public methods that are primarily used internally in the DB-API:
The StDbServer class acts as the contact between the StDbManager and the specific server and database in which the requested data resides. It is initialized by the StDbManager with all the information needed to connect to the database, and it contains an SQL query object that is specifically structured to navigate the requested database. It is NOT really a user object except in specific situations that require access to a real SQL-interface object, which can be retrieved via this object. Public methods accessed from the StDbManager and forwarded to the SQL query object:
The tableQuery object is an interface for database queries, while the mysqlAccessor object is a concrete implementation based on access to MySQL. The real methods in mysqlAccessor are those that contain the specific SQL content needed to navigate the database structures. Public methods passed from StDbServer:
Not a class but a header file containing enumerations of StDbType and StDbDomain that are used to make contact to specific databases. Use of such enumerations may disappear in favor of a string lookup but the simple restricted set is good for the initial implementation.
|
StDbTable: (Available at Root CLI)
The StDbTable class contains all the information needed to access a specific table in the database. Specifically, it contains the "address" of the table in the database (name, version, validity-time, ...), the "descriptor" of the c-struct used to fill the memory, the void* to the memory, the number of rows, and whether the data can be retrieved without a time-stamp ("BaseLine" attribute). Any initial request for a table, either in an ensemble list or one-by-one, sets up the StDbTable class instance for the future data request without actually retrieving any data. Rather, the database-name, table-name, version-name, and perhaps number of rows & id for each row, are assigned either by the ensemble query via the StDbConfigNode or simply by a single request. In addition, a "descriptor" object can also be requested from the database or set from the client code. After this initial "request", the table can be used with the StDbManager's timestamp information to read/write data from/to the database. If no "descriptor" is in the StDbTable class, the database provides one (the most recent one loaded in the database) upon the first real data access attempted. Some useful public methods in StDbTable:
StDbConfigNode: (Available at Root CLI) The StDbConfigNode class provides 2 functions to the C++ API. The first is as a container for a list of StDbTable objects over which codes can iterate. In fact, the StDbTable constructor need not be called directly in the user codes, as the StDbConfigNode class has a method to construct the StDbTable object, add it to its list, and return to the user a pointer to the StDbTable object created. The destructor of the StDbConfigNode will delete all tables within its list. The second is the management of ensembles of data (StDbTables) in a list structure for creation (via a database configuration request) and update. The StDbConfigNode can build itself from the database and a single "Key" (version string). The result of such a "ConfigNode" query will be several lists of StDbTables prepared with the necessary database addresses of name, version, & elementID as well as any characteristic information such as the "descriptor" and the baseline attribute. Some useful public methods in StDbConfigNode:
|
The MysqlDb class gives infrastructure (and sometimes client) codes easy use of SQL queries without exposing them to any of the particular implementation details of the MySQL C API. That is, the MySQL C API has specific C function calls returning MySQL-specific C structs (arrays) and return flags. Handling of these functions is hidden by this class.
Essentially there are 3 public methods used in MysqlDb
The StDbBuffer class inherits from the pure virtual StDbBufferI class and implements MySQL I/O. The syntax of the methods was made similar to that of TBuffer as an aid to possible expanded use of this interface. The buffer handles binary data and performs byte-swapping as well as direct ASCII I/O with MySQL. The binary data handler writes all data in Linux format into MySQL. Thus, when accessing the buffer from the client side, one should always set it to "ClientMode" to ensure that data is presented in the architecture of the process.
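The byte-swapping idea can be illustrated with a short Python sketch. It assumes that "Linux format" means little-endian (as on x86 Linux) and that a row is a flat array of 32-bit integers; neither the function names nor the layout come from StDbBuffer itself.

```python
import struct
import sys


def pack_row_for_storage(values):
    """Serialize 32-bit ints in little-endian order, whatever the host is."""
    return struct.pack("<%di" % len(values), *values)


def unpack_row_for_client(blob):
    """Decode a stored little-endian blob into native Python ints."""
    count = len(blob) // 4
    return list(struct.unpack("<%di" % count, blob))


def host_needs_swap():
    """True on big-endian hosts, where stored bytes must be reordered."""
    return sys.byteorder != "little"
```

Because the storage order is fixed, only big-endian clients pay the cost of swapping; the explicit `<` in the format string plays the role that "ClientMode" plays in the text above.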
Public methods used in StDbBufferI
STAR MySQL API: SSL (AES 128/AES 256), Compression tests.
IDEAS:
a) SSL encryption will allow us to catch mysterious network problems early (integrity checks);
b) Data compression will allow more jobs to run simultaneously (limited network bandwidth);
BFC chain used to measure db response time: bfc.C(5,"pp2009a,ITTF,BEmcChkStat,btofDat,Corr3,OSpaceZ2,OGridLeak3D","/star/rcf/test/daq/2009/085/st_physics_10085024_raw_2020001.daq")
time is used to measure 20 sequential BFC runs :
1. first attempt:
SSL OFF, COMPRESSION OFF : 561.777u 159.042s 24:45.89 48.5% 0+0k 0+0io 6090pf+0w
WEAK SSL ON, COMPRESSION OFF : 622.817u 203.822s 28:10.64 48.8% 0+0k 0+0io 6207pf+0w
STRONG SSL ON, COMPRESSION OFF : 713.456u 199.420s 28:44.23 52.9% 0+0k 0+0io 11668pf+0w
STRONG SSL ON, COMPRESSION ON : 641.121u 185.897s 29:07.26 47.3% 0+0k 0+0io 9322pf+0w
2. second attempt:
SSL OFF, COMPRESSION OFF : 556.853u 159.315s 23:50.06 50.0% 0+0k 0+0io 4636pf+0w
WEAK SSL ON, COMPRESSION OFF : 699.388u 202.783s 28:27.83 52.8% 0+0k 0+0io 3389pf+0w
STRONG SSL ON, COMPRESSION OFF : 714.638u 212.304s 29:54.05 51.6% 0+0k 0+0io 5141pf+0w
STRONG SSL ON, COMPRESSION ON : 632.496u 157.090s 28:14.63 46.5% 0+0k 0+0io 1pf+0w
3. third attempt:
SSL OFF, COMPRESSION OFF : 559.709u 158.053s 24:02.37 49.7% 0+0k 0+0io 9761pf+0w
WEAK SSL ON, COMPRESSION OFF : 701.501u 199.549s 28:53.16 51.9% 0+0k 0+0io 7792pf+0w
STRONG SSL ON, COMPRESSION OFF : 715.786u 203.253s 30:30.62 50.2% 0+0k 0+0io 4560pf+0w
STRONG SSL ON, COMPRESSION ON : 641.293u 164.168s 29:06.14 46.1% 0+0k 0+0io 6207pf+0w
Preliminary results (elapsed time relative to the reference, attempts 1 / 2 / 3) :
SSL OFF, COMPRESSION OFF : 1.0 (reference time)
"WEAK" SSL ON, COMPRESSION OFF : 1.138 / 1.193 / 1.201
"STRONG" SSL ON, COMPRESSION OFF : 1.161 / 1.254 / 1.269
"STRONG" SSL ON, COMPRESSION ON : 1.176 / 1.184 / 1.210
Compression check:
1. bfc 100 evts, compression ratio : 0.74 [compression enabled / no compression]. Not quite what I expected; I probably need to measure longer runs to see the effect - schema queries cannot be compressed well...
First impression: SSL encryption and data compression do not significantly affect operations. For only a ~15-20% slow-down per job, we get a data integrity check (SSL) and 1.5x network bandwidth...
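The relative times quoted above can be re-derived from the raw `time` output. This is a minimal sketch, assuming the third field of each measurement is the elapsed wall-clock time in MM:SS.ss form and that each ratio is taken against the SSL-off, compression-off run of the same attempt:

```python
def elapsed_seconds(stamp):
    """Convert an elapsed time such as '24:45.89' (or 'H:MM:SS') to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed times copied from the three attempts listed above.
ATTEMPTS = {
    "SSL OFF, COMPRESSION OFF": ["24:45.89", "23:50.06", "24:02.37"],
    "WEAK SSL ON, COMPRESSION OFF": ["28:10.64", "28:27.83", "28:53.16"],
    "STRONG SSL ON, COMPRESSION OFF": ["28:44.23", "29:54.05", "30:30.62"],
    "STRONG SSL ON, COMPRESSION ON": ["29:07.26", "28:14.63", "29:06.14"],
}


def relative_times(attempts, reference="SSL OFF, COMPRESSION OFF"):
    """Return, per configuration, elapsed time divided by the reference time."""
    ref = [elapsed_seconds(s) for s in attempts[reference]]
    return {
        name: [elapsed_seconds(s) / r for s, r in zip(stamps, ref)]
        for name, stamps in attempts.items()
    }


if __name__ == "__main__":
    for name, ratios in relative_times(ATTEMPTS).items():
        print(name, ":", " / ".join("%.3f" % x for x in ratios))
```

The computed ratios agree with the listed values to within about 0.002, which suggests the quoted numbers are wall-clock ratios rather than CPU-time ratios.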
WORK IN PROGRESS...
Addendum :
1. Found an interesting article at mysql performance blog:
http://www.mysqlperformanceblog.com/2007/12/20/large-result-sets-vs-compression-protocol/
"...The bottom line: if you’re fetching big result sets to the client, and client and MySQL are on different boxes, and the connection is 100 Mbit, consider using compression. It’s a matter of adding one extra magic constant to your application, but the benefit might be pretty big..."
listed items are linked to more detailed discussions on:
|
Description:
The Conditions Database serves to record the experimental running conditions. The database system is a set of "subsystem" independent databases written to by Online "subsystems" and used to develop calibration and diagnostic information for later analyses. Some important characteristics of the Conditions/DB are:
There are essentially 4 types of use scenarios for the Conditions/DB. (1) Online updates: Each Online sub-system server (or subsystem code directly) needs the capability to update its database with the data representing the sub-system operation. These updates can be triggered by periodic (automated) sub-system snap-shots, a manually requested snap-shot, or an alarm-generated recording of relevant data. In any of these cases, the update record includes the TimeStamp associated with the measurement of the sub-system data for future de-referencing. (2) Online diagnostics: The snap-shots, which may include a running histogram of Conditions data, should be accessible from the sub-system monitors to diagnose the stability of the control & detector systems and correlations between detector performance and system conditions. (3) Offline diagnostics: The same information as (2) is needed from Offline codes (running, for example, in the Online-EventPool environment) to perform more detailed analyses of the detector performance. (4) Offline calibrations: The Conditions/DB data represent the finest-grained & most fundamental set of data from which detector calibrations are evaluated (excepting, of course, the event data). The Conditions data is input to the production of Calibration data and, in some cases, Calibration data may simply be the re-time-stamp of Conditions data by the time interval of the Calibration's stability rather than that of some automated snap-shot frequency. |
Description:
The Configuration Database serves as the repository of detector-hardware "set-points". This repository is used to quickly configure the systems via several standard "named" configurations. The important characteristics of the Configuration/DB are:
There are essentially 3 types of use scenarios for the configurations database. (1) Online Registration: A configuration for a sub-system is created through the Online SubSystem Server control by direct manipulation by the subsystem operator. That is, a "tree-structured" object catalog is edited to specify the "named" objects included in the configuration. The individual named objects can, if new, have input values specified or, if old, will be loaded as they are stored. The formed configuration can then be registered for later access by configuration name. There exists an Online GUI to perform these basic tasks. (2) Online Configuration: Once registered, a configuration is made available for enabling the detector subsystem as specified by the configuration's content. The Online RunServer control can request a named configuration under the Online protocols and pass the subsystem Keys (named collection) to the subsystem servers, which access the Configurations/DB directly. (3) Offline use: In general, Offline codes will use information derived from the Conditions/DB to perform reconstruction/analysis tasks (i.e. not the set-points but the measured points). However, some general information about the setup can be quickly obtained from the Configurations/DB as referenced from the RunLog. This will tell, for example, what set of detectors were enabled for the period in question. |
Description:
The Calibration/DB contains data used to correct signals from the detectors into their physically interpretable form. This data is largely derived from the Conditions/DB information and event data by reformulating such information into useful quantities for reconstruction &/or analysis tasks. There are essentially 3 types of use scenarios for the calibrations database. (1) Offline in Online: It is envisioned that much of the calibration data will be produced via Offline processing running in the Online Event-Pool. These calibration runs will be fed some fraction of the real data produced by the DAQ Event-Builder. This data is then written or migrated into the Calibration/DB for use in Offline production and analyses. (2) Offline in Offline: Further reprocessing of the data in the Offline environment, again with specific calibration codes, can be done to produce additional calibration data. This work will include refinements to original calibration data with further iterations or via access to data not available in the Online Event Pool. Again, the calibration data produced is written or migrated to the calibration database, which resides in Offline. (3) Offline reconstruction & analyses: The calibration data is used during production and analysis tasks in, essentially, a read-only mode. |
Description:
The Geometry database stores the geometrical description of the STAR detectors and systems. It is really part of the calibration database, except that the time-constant associated with valid entries into the database will be much longer than for typical calibration data. Also, it is expected that many applications will need geometrical data from a variety of sub-systems while not needing similar access to detector-specific (signal) calibration data. Thus the access interface to the geometry database should be segregated from the calibration database in order to optimize access to its data. There are a few generic categories of geometry uses that, while not independent, may indicate different access scenarios. (1) Offline Simulations: (2) Offline Reconstruction: (3) Offline Analyses & Event Displays: |
Description:
The RunLog holds the summary information describing the contents of an experimental run and "pointers" to the detailed information (data files) that are associated with the run. A more complete description can be found on the Online web pages. (1) Online Generation: The RunLog begins in Online, where the run is registered; it is stored when the run is enabled and updated when the run has ended. Further updates from Online may be necessary, e.g. once verification is made as to the final store of the event data on HPSS. (2) Online (& Offline) Summaries: The RunLog can be loaded and properties displayed in order to ascertain progress toward overall goals that span Runs. (3) Offline Navigation: The RunLog will have a transient representation in the offline infrastructure which will allow the processes to navigate to other database entities (i.e. Configurations & Scalers). |
METHOD | DESCRIPTION | NOTES |
---|---|---|
standard methods | ||
GET | get latest entry => either storage or sensor may reply via personal REPLY | |
PUT | store new entry | |
POST | sensor entry change update | |
DELETE | delete [latest] entry | |
HEAD | request schema descriptor only, without data | |
PATCH | modify schema descriptor properties | |
OPTIONS | get supported methods | |
extra methods | ||
REPLY | personal reply address in REQUEST / RESPONSE pattern. Ex. topic: DCS / REPLY / <CLIENT_UID>. Example: COMMAND acknowledgements, GET replies | |
COMMAND | commands from control system: ON / OFF / REBOOT / POWEROFF | |
STATUS | retrieve status of the device: ON / OFF / REBOOT / POWEROFF / BUSY |
Search (essentially, Filter) Capabilities and Use-Cases
To request filtering of the result, a special field can be added to the request body: "dcs_filter". The contents of "dcs_filter" define the filtering rules - see below.
-------------------------------------------------------------
[x] 1. Constraint: WHERE ( A = B )
dcs_filter: { "A" : B }
[x] 2. Constraint: WHERE ( A = B && C = "D" )
dcs_filter: { "A": B, "C": "D" }
[x] 3. Constraint: WHERE ( A = B || C = "D" )
dcs_filter: {
'_or': { "A": B, "C": "D" }
}
[x] 4. Constraint: WHERE ( A = B || A = C || A = D )
dcs_filter: {
"A": {
'_in': [ B, C, D ]
}
}
[x] 5. Constraint: WHERE ( A = B && ( C = D || E = F ) )
dcs_filter: {
"A": B,
"_or" : { C: D, E: F }
}
-------------------------------------------------------------
[x] 6.1 Constraint: WHERE ( A > B )
dcs_filter: {
A: { '_gt': B }
}
[x] 6.2 Constraint: WHERE ( A >= B )
dcs_filter: {
A: { '_ge': B }
}
[x] 7.1 Constraint: WHERE ( A < B )
dcs_filter: {
A: { '_lt': B }
}
[x] 7.2 Constraint: WHERE ( A <= B )
dcs_filter: {
A: { '_le': B }
}
[x] 8. Constraint: WHERE ( A > B && A < C )
dcs_filter: {
A: { '_gt': B, '_lt': C }
}
-------------------------------------------------------------
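The filter rules above can be read as a small recursive predicate. The following Python sketch evaluates a "dcs_filter" against a single record held as a dictionary. It is only one interpretation of the examples: top-level entries are ANDed, "_or" holds alternatives, and `_le` is assumed by analogy with `_ge` and `_lt`. The function names are illustrative and not part of any existing implementation.

```python
import operator

# Comparison operators seen in (or inferred from) the use cases above.
OPERATORS = {
    "_gt": operator.gt,
    "_ge": operator.ge,
    "_lt": operator.lt,
    "_le": operator.le,  # assumed by analogy with _ge
    "_in": lambda value, options: value in options,
}


def field_matches(value, rule):
    """A scalar rule means equality; a dict rule ANDs all of its operators."""
    if isinstance(rule, dict):
        return all(OPERATORS[op](value, arg) for op, arg in rule.items())
    return value == rule


def matches(record, dcs_filter):
    """Return True if `record` satisfies every top-level entry of the filter."""
    for key, rule in dcs_filter.items():
        if key == "_or":
            # At least one alternative inside "_or" must hold.
            if not any(matches(record, {k: r}) for k, r in rule.items()):
                return False
        elif key not in record or not field_matches(record[key], rule):
            return False
    return True
```

For example, use case 5 becomes `matches(record, {"A": 5, "_or": {"C": "X", "E": 1}})`, which holds when `A` equals 5 and at least one of the two alternatives is true.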
...To Be Continued
---
Load Balancer Configuration File
This file is for sites that have a pool of database servers they would like to load balance between (e.g., BNL, PDSF).
This file should be pointed to by the environment variable DB_SERVER_LOCAL_CONFIG.
Please replace the DNS names of the nodes in the pools with your nodes/slaves. Pools can be added and removed without any problems, but there needs to be at least one pool of available slaves for general load balancing.
Below is a sample XML file with annotations:
<!-- Below is a pool of servers accessible only by user priv2 in read-only mode.
This pool would be used for production or any other type of operation that needs
exclusive access to a group of nodes.
-->
<Server scope="Production" user="priv2" accessMode="read">
<Host name="db02.star.bnl.gov" port="3316"/>
<Host name="db03.star.bnl.gov" port="3316"/>
<Host name="db04.star.bnl.gov" port="3316"/>
<Host name="db05.star.bnl.gov" port="3316"/>
</Server>
<!-- Below is a pool of servers accessible by ANYBODY in read-only mode.
This pool is for general consumption.
-->
<Server scope="Analysis" accessMode="read">
<Host name="db07.star.bnl.gov" port="3316"/>
<Host name="db06.star.bnl.gov" port="3316"/>
<Host name="db08.star.bnl.gov" port="3316"/>
</Server>
<!-- Below is an example of a pool (one node in this case) that only becomes active at "night".
Night is between 11 pm and 7 am relative to the local system clock.
-->
<Server scope="Analysis" whenActive="night" accessMode="read">
<Host name="db01.star.bnl.gov" port="3316"/>
</Server>
<!-- Below is an example of a pool (one node in this case) that is reserved for the users assigned to it.
This is useful for a development node.
-->
<Server scope="Analysis" user="john,paul,george,ringo" accessMode="read">
<Host name="db01.star.bnl.gov" port="3316"/>
</Server>
<!-- Below is an example of a pool (one node in this case) that is reserved for write access. Outside of BNL, this should only be allowed on
nodes used ONLY for development and debugging. At BNL this is reserved for the MASTER. The attribute accessMode corresponds
to an environment variable which is set to read by default.
-->
<Server scope="Analysis" accessMode="write">
<Host name="robinson.star.bnl.gov" port="3306"/>
</Server>
</Scatalog>
The label assigned to scope does not matter to the code; it is for bookkeeping purposes only.
Nodes can be moved in and out of pools at the administrator's discretion. A node can also be a member of more than one pool.
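To show how such a file could be consumed, here is a minimal Python sketch that parses a reduced copy of the sample and returns the hosts a given user may connect to. The selection rules are assumptions drawn from the annotations (a pool without a `user` attribute is open to everybody; "night" is 11 pm to 7 am local time); the real load balancer may apply further rules.

```python
import xml.etree.ElementTree as ET
from datetime import time

# Reduced copy of the sample configuration, without the annotations.
SAMPLE = """
<Scatalog>
  <Server scope="Production" user="priv2" accessMode="read">
    <Host name="db02.star.bnl.gov" port="3316"/>
  </Server>
  <Server scope="Analysis" accessMode="read">
    <Host name="db07.star.bnl.gov" port="3316"/>
    <Host name="db06.star.bnl.gov" port="3316"/>
  </Server>
  <Server scope="Analysis" whenActive="night" accessMode="read">
    <Host name="db01.star.bnl.gov" port="3316"/>
  </Server>
</Scatalog>
"""


def is_night(now):
    """Night runs from 11 pm to 7 am on the local clock."""
    return now >= time(23, 0) or now < time(7, 0)


def candidate_hosts(xml_text, user, access_mode, now):
    """Return (name, port) pairs from every pool open to this user right now."""
    hosts = []
    for server in ET.fromstring(xml_text).findall("Server"):
        if server.get("accessMode") != access_mode:
            continue
        users = server.get("user")
        if users is not None and user not in users.split(","):
            continue  # pool reserved for other users
        if server.get("whenActive") == "night" and not is_night(now):
            continue  # pool only active at night
        for host in server.findall("Host"):
            hosts.append((host.get("name"), int(host.get("port"))))
    return hosts
```

Calling `candidate_hosts(SAMPLE, "alice", "read", time(12, 0))` returns only the two general-purpose Analysis hosts; the same call at 23:30 also includes db01.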
A list of possible features is as follows:
for Server - attributes are:
for Host - attributes are:
Machine power is a weighting mechanism that determines the share of jobs an administrator wants to direct to a particular node. The default value is 1, so
a machinePower of 100 means most requests will go to that node, while a machinePower of 0.1 means that, in proportion to the other nodes, very few requests will go to that node.
For example
<Server scope="Analysis" whenActive="night" accessMode="read">
<Host name="db1.star.bnl.gov" port="3316" machinePower="90"/>
<Host name="db2.star.bnl.gov" port="3316"/>
<Host name="db3.star.bnl.gov" port="3316" machinePower="10"/>
</Server>
This says that node db1 will get most requests,
db2 almost none (default value = 1), and
db3 very few requests.
Cap is a limit on the number of connections allowed on a particular node.
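As a rough check of the example, the sketch below treats machinePower as a simple proportional weight. That interpretation is an assumption; the real load balancer may combine the weight with other inputs such as connection counts.

```python
import random


def request_shares(hosts):
    """Map host name -> expected fraction of requests for the given weights."""
    total = sum(power for _, power in hosts)
    return {name: power / total for name, power in hosts}


def pick_host(hosts, rng=random):
    """Draw one host at random, with probability proportional to its weight."""
    names = [name for name, _ in hosts]
    weights = [power for _, power in hosts]
    return rng.choices(names, weights=weights, k=1)[0]


# Weights from the example: db2 has no machinePower, so it gets the default of 1.
POOL = [
    ("db1.star.bnl.gov", 90.0),
    ("db2.star.bnl.gov", 1.0),
    ("db3.star.bnl.gov", 10.0),
]
```

Under this reading the total weight is 101, so db1 receives about 89% of requests, db3 about 10%, and db2 about 1%.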
Please refer to the attached paper for detailed discussion about each of these attributes/features.
The load balancer makes its decision as to which node to connect to, based on the number of active connections on each node.
It will choose the node with the least number of connections.
In order to do this it must make a connection to each node in a group.
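The selection step itself is simple once the connection counts are known. A minimal Python sketch follows, assuming the counts have already been collected from each node and that a node at or above its cap is skipped; the tie-breaking rule (alphabetical) is an arbitrary choice made for this illustration.

```python
def choose_node(active_connections, caps=None):
    """Return the eligible node with the fewest active connections.

    active_connections: dict of node name -> current connection count.
    caps: optional dict of node name -> maximum allowed connections.
    Returns None when every node has reached its cap.
    """
    caps = caps or {}
    eligible = {
        node: count
        for node, count in active_connections.items()
        if node not in caps or count < caps[node]
    }
    if not eligible:
        return None
    # Sorting first makes ties resolve to the alphabetically first node.
    return min(sorted(eligible), key=eligible.get)
```

For example, `choose_node({"db06": 12, "db07": 3, "db08": 7})` returns `"db07"`. How the counts are gathered is not shown here; the `process` privilege requested for the loadbalancer account is what MySQL requires for `SHOW PROCESSLIST` to list all connections, so that is one plausible source.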
The load balancer will need an account on the database server with a password associated with it.
The account is:
user = loadbalancer
please contact an administrator for the password associated with the account.
631-344-2499
The load this operation creates is minimal.
The grant would look something like:
grant process on *.* to 'loadbalancer'@'%.bnl.gov' identified by 'CALL 631-344-2499';
Of course, the host pattern should match your local site.
---
This page holds STAR database API v2 progress. Here are major milestones :
--------------------------------------------------------------------------------------------------------------
New Load Balancer (abstract interface + db-specific modules) :
Should we support <databases></databases> tag with new configuration schema?
<StDbServer>
<server> run2003 </server>
<host> onldb.starp.bnl.gov </host>
<port> 3501 </port>
<socket> /tmp/mysql.3501.sock </socket>
<databases> RunLog, Conditions_rts, Scalers_onl, Scalers_rts </databases>
</StDbServer>
--- HYPERCACHE ---
Definitions :
1. persistent representation of STAR Offline db on disk;
2. "database on demand" feature;
Each STAR Offline DB request is :
3. data on disk is to be partitioned by :
a) "db path" AND ("validity time" OR "run number");
POSSIBLE IMPLEMENTATIONS:
a) local sqlite3 database + data blobs as separate files, SINGLE index file like "/tmp/STAR/offline.sqlite3.cache" for ALL requests;
b) local sqlite3 database + data blobs as separate files, MULTIPLE index files, one per request path. Say, request is "bemc/mapping/2003/emcPed", therefore, we will have "/tmp/STAR/sha1(request_path).sqlite3.cache" file for all data entries;
c) local embedded MySQL server (possibly, standalone app) + data blobs as separate files;
d) other in-house developed solution;
-----------------------------------------------------------------------------------------------------------------------------
SQLITE 3 database table format : [char sha1( path_within_subsystem ) ] [timestamp: beginTime] [timestamp: endTime] [seconds: expire] [char: flavor]
SQLITE 3 table file : /tmp/STAR_OFFLINE_DB/ sha1 ( [POOL] / [DOMAIN] / [SUBSYSTEM] ) / index.cache
SQLITE 3 blob is located at : /tmp/STAR_OFFLINE_DB/ sha1 ( [POOL] / [DOMAIN] / [SUBSYSTEM] ) / [SEGMENT] / sha1( [path_within_subsystem][beginTime][endTime] ).blob.cache
[SEGMENT] = int [0]...[N], for faster filesystem access
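The on-disk layout described above can be sketched in a few lines of Python. The way the hashed strings are joined, the number of segments, and the rule that assigns a blob to a segment are not specified in these notes, so the choices below (`N_SEGMENTS = 16`, segment taken from the blob hash) are purely illustrative.

```python
import hashlib
import posixpath

CACHE_ROOT = "/tmp/STAR_OFFLINE_DB"
N_SEGMENTS = 16  # hypothetical value of N


def sha1_hex(text):
    """Hex-encoded SHA-1 digest of a text string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def subsystem_dir(pool, domain, subsystem):
    """Directory named after sha1([POOL]/[DOMAIN]/[SUBSYSTEM])."""
    return posixpath.join(CACHE_ROOT, sha1_hex("/".join([pool, domain, subsystem])))


def index_path(pool, domain, subsystem):
    """Location of the SQLite index file for one subsystem."""
    return posixpath.join(subsystem_dir(pool, domain, subsystem), "index.cache")


def blob_path(pool, domain, subsystem, path_within_subsystem, begin_time, end_time):
    """Location of one data blob, spread over [SEGMENT] subdirectories."""
    blob_key = sha1_hex(path_within_subsystem + begin_time + end_time)
    segment = int(blob_key[:8], 16) % N_SEGMENTS  # assumed segment rule
    return posixpath.join(
        subsystem_dir(pool, domain, subsystem),
        str(segment),
        blob_key + ".blob.cache",
    )
```

Deriving the segment from the blob hash keeps the mapping deterministic, so a reader can locate a blob without consulting the index; a real implementation might instead store the segment number in the index table.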
Name | Backends | C/C++ | Linux/Mac version | Multithreading lib/drivers | RPM available | Performance | Licence |
---|---|---|---|---|---|---|---|
OpenDBX | Oracle, MySQL, PostgreSQL, Sqlite3 + more | yes/yes | yes/yes | yes/yes | yes/authors | fast, close to native drivers | LGPL |
libDBI | MySQL, PostgreSQL, Sqlite3 | yes/external | yes/yes | yes/some | yes/Fedora | fast, close to native drivers | LGPL/GPL |
SOCI | Oracle, MySQL, PostgreSQL | no/yes | yes/yes | no/partial | yes/Fedora | average to slow | Boost |
unixODBC | ALL known RDBMS | yes/external | yes/yes | yes/yes | yes/RHEL | slow | LGPL |
While other alternatives exist (e.g. OTL, QT/QSql), I'd like to keep the abstraction layer as thin as possible, so my choice is OpenDBX. It supports all the databases we plan to use in the mid-term (MySQL, Oracle, PostgreSQL, Sqlite3) and provides both C and C++ APIs in a single package with minimal dependencies on other packages.
STAR @ RHIC is a cross-platform application, available for Web, Android, WinPhone, Windows, Mac and Linux, which provides convenient aggregated access to various STAR Online tools publicly available at the STAR collaboration website(s).
STAR @ RHIC is an HTML5 app, packaged for the Android platform using Crosswalk, an HTML application runtime optimized for performance (ARM+x86 packages, ~50 MB installed). The Crosswalk project was founded by Intel's Open Source Technology Center. Alternative packaging (Android universal, ~5 MB; WinPhone, ~5 MB) is done via the PhoneGap build service; it is not hardware-dependent, as it does not package any HTML engine, but it could be affected by system updates. Desktop OS packaging is done using NodeWebKit (NW.js).
Security: This application requires the STAR protected password to function (it asks the user at start). All data transmissions are protected by Secure Socket Layer (SSL) encryption.
Note: sorry, iOS users (iPhone, iPad) - Apple is very restrictive and does not allow packaging or installing applications using self-signed certificates. While it is technically possible to make an iOS package of the application, it would cost ~$100/year to buy iOS developer access, which includes a certificate.
STAR ONLINE Status Viewer
Location: development version of this viewer is located here : http://online.star.bnl.gov/dbPlots/
Rationale: STAR has many standalone scripts written to show the online status of some specific subsystem during Run time. Usually, plots or histograms are created using data from the Online Database, the Slow Controls Archive, or the CDEV interface. Almost every subsystem expert writes her own script, because no unified API/approach is available at the moment. I decided to provide generic plots for various subsystems' data, recorded in the online db by online collector daemons, plus some data fetched directly from the SC Archive. The SC Archive is used because the STAR online database contains primarily subsystem-specific data like voltages, and completely ignores basic parameters like hall temperature and humidity (not required for calibrations). There is no intention to provide highly specialized routines for end-users (thus replacing existing tools like the SPIN monitor); only a basic display of collected values is presented.
Implementation:
- online.star.bnl.gov/dbPlots/ scripts use a single configuration file to fetch data from the online db slave, thus creating no load on the primary online db;
- to reduce CPU load, the gnuplot binary is used instead of a php script to process data fetched from the database and draw plots. This reduces overall CPU usage by a factor of 10, to only 2-3% of CPU per gnuplot call (negligible);
- Slow Controls data is processed in this way: a) a cron-based script, located at onldb2 (slave), polls the SC Archive every 5 minutes and writes data logs in a [timestamp] [value] format; b) gnuplot is called to process those files; c) images are shipped to dean.star.bnl.gov via an NFS-exported directory;
Maintenance: the resulting tool is really easy to maintain, since it has only one config file for the database and only one config file for graph information - setup time for a new Run is about 10 minutes. In addition, it is written in a forward-compatible way, so a php version upgrade should not affect this viewer.
Browser compatibility: Firefox, Konqueror, Safari, IE7+, Google Chrome are compatible. Most likely, old netscape should work fine too.
STAR database improvements proposal
D. Arhipkin
1. Monitoring strategy
The proposed database monitoring strategy calls for simultaneous host (hardware), OS and database monitoring in order to catch db problems early. Database service health and response time depend strongly on the health of the underlying OS and hardware; therefore, a solution covering all of the aforementioned aspects needs to be implemented. While there are many tools available on the market today, I propose to use the Nagios host and service monitoring tool.
Nagios is a powerful monitoring tool, designed to inform system administrators of the problems before end-users do. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via web browser.
Nagios is already in use at RCF. Combined with the Nagios server's ability to work in a slave mode, this will allow STAR to integrate into the BNL ITD infrastructure smoothly.
Some of the Nagios features include:
- Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
- Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
- Monitoring of environmental factors such as temperature
- Simple plugin design that allows users to easily develop their own host and service checks
- Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
- Contact notifications when service or host problems occur and get resolved (via email, pager, or other user-defined method)
- Optional escalation of host and service notifications to different contact groups
- Ability to define event handlers to be run during service or host events for proactive problem resolution
- Support for implementing redundant and distributed monitoring servers
- External command interface that allows on-the-fly modifications to be made to the monitoring and notification behavior through the use of event handlers, the web interface, and third-party applications
- Retention of host and service status across program restarts
- Scheduled downtime for suppressing host and service notifications during periods of planned outages
- Ability to acknowledge problems via the web interface
- Web interface for viewing current network status, notification and problem history, log file, etc.
- Simple authorization scheme that allows you to restrict what users can see and do from the web interface
2. Backup strategy
There is an obvious need for a unified, flexible and robust backup system for the STAR database array. Databases are part of the growing STAR software infrastructure, and a new backup system should be easy to manage and scalable enough to perform well under such circumstances.
Zmanda Recovery Manager (MySQL ZRM, Community Edition) is suggested, as it would provide a fully automated, reliable and uniform database backup and recovery method across all nodes. It can also restore from backup using tools included with the standard MySQL package (for convenience). ZRM CE is a freely downloadable version of ZRM for MySQL, covered by the GPL license.
ZRM allows you to:
- Schedule full and incremental logical or raw backups of your MySQL database
- Manage backups centrally
- Perform the backup that is the best match for your storage engine and your MySQL configuration
- Get e-mail notification about the status of your backups
- Monitor and obtain reports about your backups (including RSS feeds)
- Verify your backup images
- Compress and encrypt your backup images
- Implement site- or application-specific backup policies
- Recover a database easily to any point in time or to any particular database event
- Use custom plugins to tailor MySQL backups to your environment
ZRM CE is dedicated to use with MySQL only.
3. Standards compliance
OS compliance. Scientific Linux distributions comply with the Filesystem Hierarchy Standard (FHS), which consists of a set of requirements and guidelines for file and directory placement under UNIX-like operating systems. The guidelines are intended to support interoperability of applications, system administration tools, development tools, and scripts, as well as greater uniformity of documentation for these systems. All MySQL databases used in STAR should be configured according to the underlying OS standards, like FHS, to ensure effective OS and database administration during the db lifetime.
MySQL configuration recommendations. STAR MySQL servers should be configured in compliance with both the MySQL for Linux recommendations and the MySQL server requirements. All configuration files should be complete (no parameters should be required from outside sources), and should contain supplementary information about the server's primary purpose and dependent services (like database replication slaves).
References
http://www.zmanda.com/backup-mysql.html
http://proton.pathname.com/fhs/
INTRODUCTION
This page will provide a summary of SSD vs SAS vs DRAM testing using the SysBench tool. Results are grouped like this :
Filesystem IO results are important to understand several key aspects, like :
The simulated MySQL load test is critical to understand the strengths and weaknesses of MySQL itself, deployed on different types of storage. This test should provide a baseline for real data (STAR Offline DB) performance measurements and tuning - MySQL settings could be different for SAS and SSD.
Finally, STAR Offline API load tests should represent real system behavior using SAS and SSD storage, and provide an estimate of the potential benefits of moving to SSD in our case of ~1000 parallel clients per second per db node (we have a dozen Offline DB nodes at the moment).
While we do not expect DRAM to become our primary storage component, these tests will allow us to estimate the benefit of partial migration of our most intensively used tables to volatile but fast storage.
Basic results of filesystem IO testing :
Summary :
Simulated MySQL Load testing :
Summary : quite surprising results were uncovered.
STAR Offline DB testing :
TBD
Summary :
CONCLUSIONS:
TBD
SysBench parameters: table with 20M rows, read-only. No RAM limit, /dev/shm was used as the MySQL data files location.
SysBench parameters: table with 20M rows, read-only. Allowed RAM limit: 2 GB to reduce fs caching effects.
Simulated DB Load : Solid State Disk
SysBench parameters: table with 20M rows, read-only. Allowed RAM limit: 2 GB to reduce fs caching effects.