Infrastructure

The pages in this tree relates to the Infrastructure sub-group of the S&C team.

The areas comprise: General infrastructure (software, web service, security,...), Online computing, operations and user support.

 

Online Computing

General

The online Web server front page is available here. This Drupal section will hold complementary informations.
A list of all operation manuals (beyond detector sub-systems) is available at You do not have access to view this node.
Please use it a startup page.

Detector sub-systems operation procedures - Updated 2008, requested confirmation for 2009

 

Online computing run preparation plans

This page will list by year action items, run plans and opened questions. It will server as a repository for documents serving as basis for drawing the requirements. To see documents in this tree, you must belong to the Software and Computing OG (the pages are not public).

Run 19

Feedback from software coordinators

Active feedback

Sub-system Coordinator Calibration POC Online monitoring POC
MTD Rongrong Ma - same - - same -
EMC

Raghav Kunnawalkam Elayavalli

Nick Lukow

- same -

Note: L2algo, bemc and  bsmdstatus

EPD Prashant Shanmuganathan N/A - same -
BTOF Frank Geurts - same - Frank Geurts
Zaochen Ye
ETOF Florian Seck - same - Florian Seck
Philipp Weidenkaff
HLT Hongwei Ke - same - - same -

Other software coordinators

sub-system Coordinator
iTPC (TPC?) Irakli Chakaberia
Trigger Akio Ogawa
DAQ Jeff Landgraf
...  

Run 20

Status of calibration timeline initialization

In RUN: EEMC, EMC, EPD, ETOF, GMT, TPC, MTD, TOF
Test: FST, FCS, STGC (no tables)
Desired init dates where announced to all software coordinators:

- Geometry tag has a timestamp of 20191120
- Simulation timeline [20191115,20191120[
- DB initialization for real data [20191125,...]

     Please initialize your table content appropriate yi.e.
sim flavor initial values are entered at 20191115 up to 20191119
(please exclude the edge),  ofl initial values at 20191125
(run starting on the 1st of December, even tomorrow's cosmic
and commissioning would pick the proper values).

 

 

Status - 2019/12/10

EMC  = ready
ETOF = ready - initialized at 2019-11-25, no sim (confirming)
TPC  = NOT ready [look at year 19 for comparison]
MTD  = ready
TOF  = Partially ready? INL correction, T0, TDC, status and alignement tables initialized
EPD  = gain initialized at 2019-12-15 (!?), status not initialized, no sim

EEMC = ready? (*last init at 2017-12-20)
GMT  = ready (*no db tables)



Status - 2019/12/09

EMC  = ready
ETOF = ready? initialized at 2019-11-25, no sim
TPC  = NOT ready
MTD  = ready
TOF  = NOT ready
EPD  = gain initialized at 2019-12-15 (!?), status not initialized, no sim

EEMC = ready? (*last init at 2017-12-20)
GMT  = ready (*no db tables)

 

 

Software coordinator feedback for Run 20 - Point of Contacts

Sub-system Coordinator Calibration POC Online monitoring POC
MTD Rongrong Ma - same - - same -
EMC
EEMC

Raghav Kunnawalkam Elayavalli

Nick Lukow

- same -

Note: L2algo, bemc and  bsmdstatus

EPD [ TBC] - same - - same -
BTOF Frank Geurts - same - Frank Geurts
Zaochen Ye
ETOF Florian Seck - same - Florian Seck
Philipp Weidenkaff
HLT Hongwei Ke - same - - same -
TPC Irakli Chakaberia - same -
Flemming Videbaek
Trigger detectors Akio Ogawa - same - - same -
DAQ Jeff Landgraf N/A  


---




Run 21

Status of calibration timeline initialization

- Geometry tag has a timestamp of 20201215
- Simulation timeline [20201210, 20201215]
- DB initialization for real data [20201220,...]

Status - 2020/12/10

 

Software coordinator feedback for Run 21 - Point of Contacts

Sub-system Coordinator Calibration POC Online monitoring POC
MTD Rongrong Ma - same - - same -
EMC
EEMC

Raghav Kunnawalkam Elayavalli

Nick Lukow

- same -

Note: L2algo, bemc and  bsmdstatus

EPD Prashanth Shanmuganathan (TBC) Skipper Kagamaster - same -
BTOF Zaochen - same - Frank Geurts
Zaochen Ye
ETOF Philipp Weidenkaff - same - Philipp Weidenkaff
HLT Hongwei Ke - same - - same -
TPC Yuri Fisyak - same - Flemming Videbaek
Trigger detectors Akio Ogawa - same - - same -
DAQ Jeff Landgraf N/A  
Forward Upgrade Daniel Brandenburg - same - FCS - Akio Ogawa
sTGC - Daniel Brandenburg
FST - Shenghui Zhang/Zhenyu Ye
       

---

Run 22

 

Status of calibration timeline initialization

- Geometry tag has a timestamp of 20211015
- Simulation timeline [20211015, 20211020[
- DB initialization for real data [20211025,...]

Status - 2021/10/13

 

Software coordinator feedback for Run 22 - Point of Contacts (TBC)

Sub-system Coordinator Calibration POC Online monitoring POC
MTD Rongrong Ma - same - - same -
EMC
EEMC

Raghav Kunnawalkam Elayavalli
Navagyan Ghimire

- same -

Note: L2algo, bemc and  bsmdstatus

EPD Prashanth Shanmuganathan (TBC) Skipper Kagamaster - same -
BTOF Zaochen - same - Frank Geurts
Zaochen Ye
ETOF Philipp Weidenkaff - same - Philipp Weidenkaff
HLT Hongwei Ke - same - - same -
TPC Yuri Fisyak - same - Flemming Videbaek
Trigger detectors Akio Ogawa - same - - same -
DAQ Jeff Landgraf N/A  
Forward Upgrade Daniel Brandenburg - same - FCS - Akio Ogawa
sTGC - Daniel Brandenburg
FST - Shenghui Zhang/Zhenyu Ye
       

---

Run XIII

Preparation meeting minutes

Database initialization check list

TPC Software  – Richard Witt          NO
GMT Software  – Richard Witt          NO
EMC2 Software - Alice Ohlson          Yes
FGT Software  - Anselm Vossen         Yes
FMS Software  - Thomas Burton         Yes
TOF Software  - Frank Geurts          Yes
Trigger Detectors  - Akio Ogawa       ??
HFT Software  - Spyridon Margetis     NO (no DB interface, hard-coded values in preview codes)

 

Calibration Point of Contacts per sub-system

If a name is missing, the POC role falls onto the coordinator.
                Coordinator           Possible POC
                ------------          ---------------
TPC Software  – Richard Witt          
GMT Software  – Richard Witt          
EMC2 Software - Alice Ohlson          Alice Ohlson  
FGT Software  - Anselm Vossen         
FMS Software  - Thomas Burton         Thomas Burton    
TOF Software  - Frank Geurts          
Trigger Detectors  - Akio Ogawa       
HFT Software  - Spyridon Margetis     Hao Qiu

Online Monitoring POC

The final list from the SPin PWGC can be found at You do not have access to view this node . The table below includes the Spin PWGC feedback and other feedbacks merged.

  Directories we inferred are being used (as reported in the RTS Hypernews)
  scaler Len Eun and Ernst Sichtermann (LBL) This directory usage was indirectly reported
  SlowControl James F Ross (Creighton)  
  HLT Qi-Ye Shou The 2012 directory had a recent timestamp but owned by mnaglis. Aihong Tang contacted 2013/02/12
Answer from  Qi-Ye Shou 2013/02/12 - will be POC.

  fmsStatus Yuxi Pan (UCLA) This was not requested but the 2011 directory is being overwritten by user=yuxip
FMS software coordinator contacted for confirmation 2013/02/12
Yuxi Pan confirmed 2013/02/13 as POC for this directory

     
Spin PWG monitoring related directories follows
  L0trg Pibero Djawotho (TAMU)  
  L2algo Maxence Vandenbroucke (Temple)  
  cdev Kevin Adkins (UKY)  
  zdc Len Eun and Ernst Sichtermann (LBL)  
  bsmdStatus Keith Landry (UCLA)  
  emcStatus Keith Landry (UCLA)  
  fgtStatus Xuan Li (Temple) This directory is also being written by user=akio causing protection access and possible clash problems.
POC contacted on 2013/02/08, both Akio and POC contacted again 2013/02/12 -> confirmed as OK.

  bbc Prashanth (KSU)  



Run XIV


Preparation meeting meetings, links

  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node
  • You do not have access to view this node

Notes

  • 2013/11/15
    • Info gathering begins (directories/areas and Point of Contacts)
      Status:
      2013/11/22, directory structure, 2 people provided feedback, Renee coordinated the rest
      2013/11/25, calibration POC, 3 coordinators provided feedback - Closed 2013/12/04
      2013/12/04, geometry for Run 14,
       
    • Basic check: CERT for online is old if coming from the Wireless
      Status: fixed at ITD level, 2013/11/18 - the reverse proxy did not have the proper CERT
  • 2013/1125

Database initialization check list

This actions suggested by this section has not started yet.

Sub-system Coordinator Check done
DAQ
Jeff Landgraf  
TPC Richard Witt  
GMT Richard Witt  
EMC2 Mike Skoby
Kevin Adkins
 
FMS Thomas Burton  
TOF Daniel Brandenburg  
MTD Rongrong Ma  
HFT Spiros Margetis (not known)
Trigger Akio Ogawa  
FGT Xuan Li  


Calibration Point of Contacts per sub-system

"-" indicates no feedback was provided. But if a name is missing, the POC role falls onto the coordinator.

Sub-system Coordinator Calibration POC
DAQ Jeff Landgraf -
TPC Richard Witt -
GMT Richard Witt -
EMC2 Mike Skoby
Kevn Adkins
-
FMS Thomas Burton -
TOF Daniel Brandenburg -
MTD Rongrong Ma Bingchu Huan
HFT Spiros Margetis Jonathan Bouchet
Trigger Akio Ogawa -
FGT Xuan Li N/A


Online Monitoring POC


scaler   Not needed 2013/11/25
SlowControl Chanaka DeSilva OKed on second Run preparation meeting
HLT Zhengquia Zhang  Learn incidently on 2014/01/28
HFT Shusu Shi Learn about it on 2014/02/26
fmsStatus   Not needed 2013/11/25
L0trg Zilong Chang
Mike Skoby
 
Informed 2013/11/10 and created 2013/11/15
L2algo  Nihar Sahoo Informed 2013/11/25
cdev   Not needed 2013/11/25
zdc   may not be used (TBC)
bsmdStatus  Janusz Oleniacz Info will be passed from Keith Landry 2014/01/20
Possible backup, Leszek Kosarzewski 2014/03/26
emcStatus  Janusz Oleniacz Info will be passed from Keith Landry 2014/01/20
Possible backup, Leszek Kosarzewski 2014/03/26
fgtStatus   Not needed 2013/11/25
bbc
 Akio Ogawa Informed 2013/11/15, created same day


Run XV

Run 15 was preapred essentiallydiscussing with indviduals and a comprehensive page not maintained.

Run XVI


This page will contain feedback related to the preparation of the online setup.

 

Notes



 

Online Monitoring POC

scaler    
SlowControl    
HLT Zhengqiao Feedback 2015/11/24
HFT Guannan Xie Spiros: Feedback 2015/11/24
fmsStatus   Akio: Possibly not needed (TBC). 2016/01/13 noted this was not used in Run 15 and wil probably never be used again.
fmsTrg   Confirmed neded 2016/01/13
fps   Akio: Not neded in Run 16? Perhaps later.
L0trg Zilong Chang Zilong: Feedback 2015/11/24
L2algo Kolja Kauder Kolja: will be POC - 2015/11/24
cdev Chanaka DeSilva  
zdc    
bsmdStatus Kolja Kauder Kolja: will be POC - 2015/11/24
bemcTrgDb Kolja Kauder Kolja: will be POC - 2015/11/24
emcStatus Kolja Kauder Kolja: will be POC - 2015/11/24
fgtStatus   Not needed since Run 14 ... May drop from the list
bbc
Akio Ogawa Feedback 2015/11/24, needed
rp    

 

Calibration Point of Contacts per sub-system

Sub-system Coordinator Calibration POC
DAQ Jeff Landgraf -
TPC Richard Witt
Yuri Fisyak
-
GMT Richard Witt -
EMC2 Kolja Kauder
Ting Lin
-
FMS Oleg Eysser -
TOF Daniel Brandenburg -
MTD Rongrong Ma (same confirmed 2015/11/24)
HFT Spiros Margetis Xin Dong
HLT Hongwei Ke (same confirmed 2015/11/24)
Trigger Akio Ogawa -
RP Kin Yip -

 

Database initialization check list



 

Shift Accounting

This page will now hold the shift accounting pages. They complement the Shift Sign-up process by documenting it.

Run 18 shift dues


Run 18 Shift Dues & Notes


Period coordinators

As usual, period coordinators are pre-assigned, as arranged by the Spokespersons.

Special arrangements and requests

  1. Under the family-related policy, the following 6 weeks of offline QA shifts were pre-assigned:
    MAR 27 Kevin Adkins (Kentucky)
    APR 03 Kevin Adkins
    APR 10 Sevil Salur (Rutgers)
    APR 17 Richard Witt (USNA/Yale)
    MAY 22 Juan Romero (UC Davis)
    JUN 12 Terry Tarnowsky (Michigan State)
     
  2. Lanny Ray (UT Austin), as QA coordinator, always is pre-assigned the first QA week.
     
  3. FIAS remains in “catch-up mode” and is taking extra shifts above their dues. Pre-assigned shifts can be requested in this scenario. FIAS has been pre-assigned 4 Detector Op shifts.
     
  4. Bob Tribble (TAMU) requests the evening Shift leader slot during Apr 10-17.

Run 19 special requests

The following pre-assigned slot requests were made.
    9 WEEKS PRE-ASSIGNED QA AS FOLLOWS
    ==================================
    Lanny Ray (UT Austin) QA Mar 5
    Richard Witt (USNA/Yale) QA Mar 19
    Sevil Salur (Rutgers) QA Apr 16
    Wei Li (Rice) QA Apr 23
    Kevin Adkins (Kentucky) QA May 14
    Juan Romero (UC Davis) QA May 21
    Jana Bielcikova (NPI, Czech Acad of Sci) QA May 28  
    Yanfang Liu (TAMU) QA June 25 
    Yanfang Liu (TAMU) QA July 02
    
    8 WEEKS PRE-ASSIGNED REGULAR SHIFTS AS FOLLOWS
    ==================================
    Bob Tribble (BNL) Feb 05 SL evening 
    Daniel Kincses (Eotvos) Mar 12  DO Trainee Day
    Daniel Kincses (Eotvos) Mar 19  DO Day
    Mate Csanad (Eotvos) Mar 12 SC Day
    Ronald Pinter (Eotvos) Mar 19 SC Day
    Carl Gagliardi (TAMU)  May 14  SL day
    Carl Gagliardi (TAMU)  May 21 SL day 
    Grazyna Odyniec (LBNL) July 02 SL evening
    
    

Shift Dues and Special Requests Run 20

For the calculation of shift dues, there are two considerations.
1) The length of time of the various shift configurations (2 person, 4 person no trainees, 4 person with trainees, plus period coordinators/QA shifts)
2) The percent occupancy of the training shifts

For many years, 2) has hovered about 45%, which is what we used to calculate the dues.  Since STAR gives credit for training shifts (as we should) this needs to be factored in or we would not have enough shifts.

The sum total of shifts needed are then divided by the total number of authors minus authors from Russian institutions who can not come to BNL.

date                  weeks           crew           training           PC           OFFLINE          
11/26-12/10    2                  2                      0                  0           0           
12/10-12/24    2                  4                      2                 1            0   
12/24-6/30      27                4                      2                 1            1   
7/02-7/16        2                  4                      0                 1            1   

Adding these together (3x a shift for crew, 3x45% for training, plus pc plus offline) gives a total of 522 shifts.
The total number of shifters is 303 - 30 Russian collaborators = 273 people
Giving a total due of 1.9 per author.

For a given institution, their load is calculated as # of authors - # of expert credits x due -> Set to an integer value as cutting collaborators into pieces is non-collegial behavior.

However, this year, this should have been:
date                  weeks           crew           training           PC           OFFLINE          
11/26-12/10    2                  2                      0                  0           0           
12/10-12/24    2                  4                      2                 1            0   
12/24-6/02      23                4                      2                 1            1   
6/02-6/16        2                  4                      0                 1            1   

Adding these together (3x a shift for crew, 3x45% for training, plus pc plus offline) gives a total of 456 shifts for a total due of 1.7 per author.

We allowed some people to pre-sign up, due to a couple different reasons.

Family reasons so offline QA:
James Kevin Adkins
Jana BielĨíková
Sevil Selur
Md. Nasim
Yanfang Liu

Additionally, Lanny Ray is given the first QA shift of the year as our experience QA shifter.

This year, to add an incentive to train for shift leader, we allowed people who were doing shift leader training to sign up for both their training shift and their "real" shift early:
Justin Ewigleben
Hanna Zbroszczyk
Jan Vanek
Maria Zurek
Mathew Kelsey
Kun Jiang
Yue-Hang Leung

Both Bob Tribble and Grazyna Odyniec sign up early for a shift leader position in recognition of their schedules and contributions

This year because of the date of Quark Matter and the STAR pre-QM meeting, several people were traveling on Tuesday during the sign up.  These people I signed up early as I did not want to punish some of our most active colleagues for the QM timing:
James Daniel  Brandenburg
Sooraj Radhakrishnan

3 other cases that were allowed to pre-sign up:
Panjab University had a single person who had the visa to enter the US, and had to take all of their shifts prior to the end of their contract in March.  So that the shifter could have some spaces in his shifts for sanity, I signed up:
Jagbir Singh
Eotvos Lorand University stated that travel is complicated for their group, and so it would be good if they could insure that they were all on shift at the same time.  Given that they are coming from Europe I signed up:
Mate Csanad
Daniel Kincses
Roland Pinter
Srikanta Tripathy
Frankfurt Institute for Advanced Studies (FIAS) wanted to be able to bring Masters students to do shift, but given the training requirements and timing with school and travel for Europe, this leaves little availability for shift.  So I signed up:
Iouri Vassiliev
Artemiy Belousov
Grigory Kozlov

Tools

This is to serve as a repository of information about various STAR tools used in experimental operations.

Implementing SSL (https) in Tomcat using CA generated certificates

The reason for using a certificate from a CA as opposed to a self-signed  certificate is that the browser gives a warning screen and asks you to except the certificate in the case of a self-signed  certificate. As there already exists a given list of trusted CAs in the browser this step is not needed.
 
The following list of certificates and a key are needed:

/etc/pki/tls/certs/wildcard.star.bnl.gov.Nov.2012.cert – host cert.
/etc/pki/tls/private/wildcard.star.bnl.gov.Nov.2012.key – host key (don’t give this one out)
/etc/pki/tls/certs/GlobalSignIntermediate.crt – intermediate cert.
/etc/pki/tls/certs/GlobalSignRootCA_ExtendedSSL.crt –root cert.
/etc/pki/tls/certs/ca-bundle.crt – a big list of many cert.

Concatenate the following certs into one file in this example I call it: Global_plus_Intermediate.crt
cat /etc/pki/tls/certs/GlobalSignIntermediate.crt > Global_plus_Intermediate.crt
cat /etc/pki/tls/certs/GlobalSignRootCA_ExtendedSSL.crt >> Global_plus_Intermediate.crt
cat /etc/pki/tls/certs/ca-bundle.crt >> Global_plus_Intermediate.crt

Run this command. Note that -name tomcat” and -caname root should not be changed to any other value. The command will still work but will fail under tomcat. If it works you will be asked for a password, that password should be set to "changeit".

 openssl pkcs12 -export -in wildcard.star.bnl.gov.Nov.2012.cert -inkey wildcard.star.bnl.gov.Nov.2012.key -out mycert.p12 -name tomcat -CAfile Global_plus_Intermediate.crt -caname root -chain

Test the new p12 output file with this command:

keytool -list -v -storetype pkcs12 -keystore mycert.p12

Note it should say: "Certificate chain length: 3"


In tomcat’s the server.xml file add a connector that looks like this:
 

<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
           maxThreads="150" scheme="https" secure="true"
           keystoreFile="/home/lbhajdu/certs/mycert.p12" keystorePass="changeit"
           keystoreType="PKCS12" clientAuth="false" sslProtocol="TLS"/>


Note the path should be set to the correct path of the certificate.  And the p12 file should only be readable by the Tomcat account because it holds the host key. 

Online Linux pool

March 15, 2012:

THIS PAGE IS OBSOLETE!  It was written as a guide in 2008 for documenting improvements in the online Linux pool, but has not been updated to reflect additional changes to the state of the pool, so not all details are up to date. 

One particular detail to be aware of:  the name of the pool nodes is now onlNN.starp.bnl.gov, where 01<=NN<=14.  The "onllinuxN" names were retired several years ago.

 

Historical page (circa 2008/9):

Online Linux pool for general experiment support needs

 

GOAL: 

Provide a Linux environment for general computing needs in support of the experiemental operations.

HISTORY (as of approximately June 2008):

A pool of 14 nodes, consisting of four different hardware classes (all circa 2001) has been in existence for several years.  For the last three (or more?) years, they have had Scientific Linux 3.x with support for the STAR software environment, along with access to various DAQ and Trigger data sources.  The number of significant users has probably been less than 20, with the heaviest usage related to L2.  User authentication was originally based on an antique NIS server, to which we had imported the RCF accounts and passwords.  Though still alive, we have not kept this NIS information maintained over time.  Over time, local accounts on each node became the norm, though of course this is rather tedious.  Home directories come in three categories:  AFS, NFS on onllinux5, and local home directories on individual nodes.  Again, this gets rather tedious to maintain over time.

There are several "special" nodes to be aware of:

  1. Three of the nodes (onllinux1, 2 and 3) are in the Control Room for direct console login as needed.  (The rest are in the DAQ room.)
  2. onllinux5 has the NFS shared home directories (in /online/users).  (NB.  /online/users is being backed up by the ITD Networker backup system.)
  3. onllinux6 is (was?) used for many online database maintenance scripts (check with Mike DePhillps about this -- we had planned to move these scripts to onldb).
  4. onllinux1 was configured as an NIS slave server, in case the NIS master (starnis01) fails.

 

PLAN:

For the run starting in 2008 (2009?), we are replacing all of these nodes with newer hardware.

The basic hardware specs for the replacement nodes are:

Dual 2.4 GHZ Intel Xeon processors

1GB RAM

2 x 120 GB IDE disks

 

These nodes should be configured with Scientific Linux 4.5 (or 4.6 if we can ensure compatibility with STAR software) and support the STAR software environment.

They should have access to various DAQ and Trigger NFS shares.  Here is a starter list of mounts:

 

Shared DAQ and Trigger resources

SERVER DIRECTORY on SERVER LOCAL MOUNT PONT MOUNT OPTIONS
 evp.starp  /a  /evp/a  ro
 evb01.starp  /a  /evb01/a  ro
 evb01  /b  /evb01/b  ro
 evb01  /c  /evb01/c  ro
 evb01  /d  /evb01/d  ro
 evb02.starp  /a  /evb02/a  ro
 evb02  /b  /evb02/b  ro
 evb02  /c  /evb02/c  ro
 evb02  /d  /evb02/d  ro
 daqman.starp  /RTS  /daq/RTS  ro
 daqman  /data  /daq/data  rw
 daqman  /log  /daq/log  ro
 trgscratch.starp  /data/trgdata  /trg/trgdata  ro
 trgscratch.starp  /data/scalerdata  /trg/scalerdata  ro
 startrg2.starp  /home/startrg/trg/monitor/run9/scalers  /trg/scalermonitor  ro
 online.star  /export  /onlineweb/www  rw

 

 

WISHLIST Items with good progress:

  • <Uniform and easy to maintain user authentication system to replace the current NIS and local account mess.  Either a local LDAP, or a glom onto RCF LDAP seems most feasible> -- An ldap server (onlldap.starp.bnl.gov) has been set-up and the 15 onllinux nodes are authenticating to it *BUT* it is using NIS!
  • <Shared home directories across the nodes with backups> -- onlldap is also hosting the home directories and sharing them via NFS.  EMC Networker is backing up the home directories and Matt A. is recieving the email notifications.
  • <Integration into SSH key management system (mechanism depends upon user authentication method(s) selected).> --  The ldap server has been added to the STAR SSH key management system, and users are able to login to the new onlXX nodes with keys now.
  • <Common configuration management system> -- Webmin is in use.
  • <Ganglia monitoring of the nodes> -- I think this is done...
  • <Osiris monitoring of the nodes> -- I think this is done - Matt A. and Wayne B. are receiveing the notices...

WISHLIST Items still needing significant work:

  • None?

 

SSH Key Management

Overview 

An SSH public key management system has been developed for STAR (see D. Arkhipkin et al 2008 J. Phys.: Conf. Ser. 119 072005), with two primary goals stemming from the heightened cyber-security scrutiny at BNL:

  • Use of two-factor authentication for remote logins
  • Identification and management of remote users accessing our nodes (in particular, the users of "group" accounts which are not tied to one individual) and achieve accountability

A benefit for users also can be seen in the reduction in the number of passwords to remember and type.

 

In purpose, this system is similar to the RCF's key management system, but is somewhat more powerful because of its flexibility in the association of hosts (client systems), user accounts on those clients, and self-service key installation requests.

Here is a typical scenario of the system usage: 

  1. A sysadmin of a machine named FOO creates a user account named "JDOE" and, if not done already, installs the key_services client.
  2. A user account 'JDOE' on host 'FOO' is configured in the Key Management system by a key management administrator.
  3. John Doe uploads (via the web) his or her public ssh key (in openssh format).
  4. John Doe requests (via the web) that his key be added to JDOE's authorized_keys file on FOO.
  5. A key management administrator approves the request, and the key_services client places the key in ~JDOE/.ssh/authorized_keys.

At this point, John Doe has key-based access to JDOE@FOO.  Simple enough?  But wait, there's more!  Now John Doe realizes that he also needs access to the group account named "operator" on host BAR.  Since his key is already in the key management system he has only to request that his key be added to operator@BAR, and voila (subject to administrator approval), he can now login with his key to both JDOE@FOO and operator@BAR.  And if Mr. Doe should leave STAR, then an administrator simply removes him from the system and his keys are removed from both hosts.

Slightly Deeper...

There are three things to keep track of here -- people (and their SSH keys of course), host (client) systems, and user accounts on those hosts:

People want access to specific user accounts at specific hosts.

So the system maintains a list of user accounts for each host system, and a list of people associated with each user account at each host.
(To be clear -- the system does not have any automatic user account detection mechanism at this time -- each desired "user account@host" association has to be added "by hand" by an administrator.)

This Key Management system, as seen by the users (and admins), consists simply of users' web browsers (with https for encryption) and some PHP code on a web server (which we'll call "starkeyw") which inserts uploaded keys and user requests (and administrator's commands) to a backend database (which could be on a different node from the web server if desired). 

Behind the scenes, each host that is participating in the system has a keyservices client installed that runs as a system service.  The keyservices_client periodically (at five minute intervals by default) interacts a different web server (serving different PHP code that we'll call starkeyd).  The backend database is consulted for the list of approved associations and the appropriate keys are downloaded by the client and added to the authorized_keys files accordingly.

In our case, our primary web server at www.star.bnl.gov hosts all the STAR Key Manager (SKM) services (starkeyw and starkeyd via Apache, and a MySQL database), but they could each be on separate servers if desired.

Perhaps a picture will help.  See below for a link to an image labelled "SKMS in pictures".

Deployment Status and Future Plans

We have begun using the Key Management system with several nodes and are seeking to add more (currently on a voluntary basis).  Only RHEL 3/4/5 and Scientific Linux 3/4/5 with i386 and x86_64 kernels have been tested, but there is no reason to believe that the client couldn't be built on other Linux distributions or even Solaris.  We do not anticipate "forcing" this tool onto any detector sub-systems during the 2007 RHIC run, but we do expect it (or something similar) to become mandatory before any future runs.  Please contact one of the admins (Wayne Betts, Jerome Lauret or Mike Dephillips) if you'd like to volunteer or have any questions.

User access is currently based on RCF Kerberos authentication, but may be extended to additional authentication methods (eg., BNL LDAP) if the need arises.

Client RPMs (for some configurations) and SRPM's are available, and some installation details are available here: 

http://www.star.bnl.gov/~dmitry/skd_setup/

An additional related project is the possible implementation of a STAR ssh gateway system (while disallowing direct login to any of our nodes online) - in effect acting much like the current ssh gateway systems role in the SDCC.  Though we have an intended gateway node online (stargw1.starp.bnl.gov, with a spare on hand as well), it's use is not currently required.

 

Anxious to get started? 

Here you go: https://www.star.bnl.gov/starkeyw/ 

You can use your RCF username and Kerberos password to enter.

When uploading keys, use your SSH public keys - they need to be in OpenSSH format. If not, please consult SSH Keys and login to the SDCC.

 
 

Software Infrastructure

On the menu today ...

 

General Information

SOFI stands for SOFtware infrastructure and Infrastructure. It includes any topics related to code standards, tools compiling your code, problems with base code and Infrastructure. SOFI also addresses (or try to address) your need in terms of monitoring or easily manage activities and resources in the STAR environment.

 

Infrastructure & Software

Reporting problems

  • General RCF problems should be reported using Computing Facility Issue reporting system (RT).
    You should NOT use this system to report STAR-specific problems. Instead, use the STAR Request Tracking system described below.
  • To report STAR specific problems: Request Tracking (bug and issues tracker) system, 
    Submitting a problem (bug), help request or issue to the Request Tracking system using Email
    You can always submit a report to the bug tracking system by sending an Email directly. There is no need for a personalized account and using the Web Interface is not mandatory. For each BugTracking category, an equivalent @www.star.bnl.gov mailing list exists.
    The currently available queues are
    bugs-high problem with ANY STAR Software with a need to be fixed without delay
    bugs-medium problem with ANY STAR Software and must be fixed for the next release
    bugs-low problem with ANY STAR Software. Should be fixed for the next release
    comp-support General computing operation support (user, hardware and middleware provisioning)
    issues-infrstruct Any Infrastructure issues (General software and libraries, tools, network)
    issues-scheduler Issues related to the SUMS project (STAR Unified Meta-Scheduler)
    issues-xrootd Issues related to the (X)rootd distributed data usage
    issues-simu Issues related to Simulation
    grid-general STAR VO general Grid support : job submission, infrastructure, components, testing problem etc ...
    grid-bnl STAR VO, BNL Grid Operation support
    grid-lbl STAR VO, LBNL Grid Operation support
    wishlist Use it for or suggesting what you would wish to see soon, would be nice to have etc ...

    You may use the guest account for accessing the Web interface. The magic word is here.
    • To create a ticket, select the queue (drop down menu after the create-ticket button). Queues are currently sorted by problem priority. Select the appropriate level. A wishlist queue has been created for your comments and suggestions. After the queue is selected, click on the create-ticket button and fill the form. Please, do not forget the usual information i.e. the result of STAR_LEVELS and of uname -a AND a description of how to reproduce the problem.
    • If you want to request a private account instead of using the guest account, send a message to the wishlist queue. There are 2 main reasons for requesting a personalized account :
      1. If you are planning to be an administrator or a watcher of the bug tracking system (that is, receive tickets automatically, take responsibility for solving them etc ...) you MUST have a private account.
      2. If you prefer to see the summary and progress of your own submitted tickets at login instead of seeing all tickets submitted under the guest account, you should also ask for a private account.
    • At login, the left side panels shows the tickets you have requested and the tickets you own. The right panel shows a the status of all queues. Having a private account setup does NOT mean that you cannot browse the other users tickets. It only affects the left panels summary.
    • To find a particular bug, click on search and follow the instructions.
    • Finally, if you would like to have a new queue created for a particular purpose (sub-system specific problems), feel free to request to setup such a queue.

 

General Tools 

Data location tools

Several tools exists to locate data both on disk and in HPSS. Some tools are available from the production page and we will list here only the tools we are developing for the future.
    • FileCatalog (command line interface get_file_list.pl and Perl module interface).
    • You do not have access to view this node

Resource Monitoring tools

Browsers

Web Sanity, Software & documentation tools

Web based access and tools

Web Sanity

Software & documentation auto-generation

  • Our STAR Software CVS Repositories browser
    Allows browsing the full offline and online CVS repositories with listings showing days since last modification, modifier, and log message of last commit, display and download (checkout) access to code, access to all file versions and tags, and diff'ing between consecutive or arbitrary versions, direct file-level access to the cross-referenced presentation of a file, ... You can also sort by
    1. by user
    2. recent history
  • Doxygen Code documentation (what is already doxygenized )
    and the User documentation (a quick startup ...) Our current Code documentation is generated using the doxygen generator. Two utilities exists to help you with this new documentation scheme :
    1. doxygenize is a utility which takes as argument a list of header files and modify them to include a "startup" doxygen tag documentation. It tries to guess the comment block, the author and the class name based on content. The current version also documents struct and enum lists. Your are INVITED TO CHECK the result before committing anything. I have tested on several class headers but there is always the exception where the parsing fails ...
    2. An interface to doxygen named doxycron.pl was created and installed on our Linux machines to satisfy the need of users to generate the documentation by themselves for checking purposes. That same generator interface is used to produce our Code documentation every day so, a simple convention has been chosen to accomplish both tasks. But why doxycron.plinstead of directly using doxygen? If you are a doxygen expert, the answer is 'indeed, why ?'. If not, I hope you will appreciate that doxycron.pl not only takes care of everything for you (like creating the directory structure, a default actually-functional configuration file, safely creating a new documentation set etc ....) but also adds a few more tasks to its list you normally have to do it yourself when using doxygen base tools (index creation, sorting of run-time errors etc ...). This being said, let me describe this tool now ...

      The syntax for doxycron.pl is
      % doxycron.pl [-i] PathWhereDocWillBeGenerated Path(s)ToScanForCode Project(s)Name SubDir(s)Tag

      The arguments are:
      • -i is used here to disable the doxytag execution, a useless pass if you only want to test your documentation.
      • PathWhereDocWillBeGenerated is the path where the documentation tree will be or TARGETD
      • Path(s)ToScanForCode is the path where the sources are or INDEXD (default is the comma separated list /afs/rhic.bnl.gov/star/packages/dev/include,/afs/rhic.bnl.gov/star/packages/dev/StRoot)
      • Project(s)Name is a project name (list) or PROJECT (default is the comma separated include,StRoot)
      • SubDir(s)Tag an optional tag (list) for an extra tree level or SUBDIR. The default is the comma separated list include, . Note that the last element is null i.e. "". When encountered, the null portion of a SUBDIR list will tell doxycron.pl to generate an searchable index based all previous non-null SUBDIR in the list.

      Note that if one uses lists instead of single values, then, ALL arguments MUST be a list and the first 3 are mandatory.
      To pass an empty argument in a list, you must use quotations as in the following example

      % doxycron.pl /star/u/jeromel/work/doxygen /star/u/jeromel/work/STAR/.$STAR_HOST_SYS/DEV/include,/star/u/jeromel/work/STAR/DEV/StRoot include,StRoot 'include, '

      In order to make it clear what the conventions are, let's describe a step by step example as follow:

      Examples 1 (simple / brief explaination):
      % doxycron.pl `pwd` `pwd`/dev/StRoot StRoot
      would create a directory dox/ in `pwd` containing the code documentation generated from the relative tree dev/StRoot for the project named StRoot. Likely, this (or similar) will generate the documentation you need.

      Example 2 (fully explained):
      % doxycron.pl /star/u/jeromel/work/doxygen /star/u/jeromel/work/STAR/DEV/StRoot Test
      In this example, I scan any source code found in my local cvs checked-out area /star/u/jeromel/work/STAR/DEV starting from StRoot. The output tree structure (where the documentation will end) are requested to be in TARGETD=/star/u/jeromel/work/doxygen. In order to accomplish this, doxycron.pl will check and do the following:
      • Check that the doxygen program is installed
      • Create (if it does not exists) $TARGETD/dox directory where everything will be stored and the tree will start
      • Search for a $TARGETD/dox/$PROJECT.cfg file. If it does not exists, a default configuration file will be created. In our example, the name of the configuration file defaults to /star/u/jeromel/work/doxygen/dox/Test.cfg. You can play with several configuration file by changing the project name. However, changing the project name would not lead to placing the documents in a different directory tree. You have to play with the $SUBDIR value for that.
      • The $SUBDIR variable is not used in our example. If I had chosen it to be, let's say, /bof, the documentation would have been created in $TARGETD/dox/bof instead but the template is still expected to be $TARGETD/dox/$PROJECT.cfg.

      The configuration file should be considered as a template file, not a real configuration file. Any item appearing with a value like Auto-> or Fixed-> will be replaced on the fly by the appropriate value before doxygen is run. This ensures keeping the conventions tidy and clean. You actually, do not have to think about it neither, it works :) ... If it does not, please, let me know. Note that the temporary configuration file will be created in /tmp on the local machine and left there after running.

      What else does one need to know : the way doxycron.pl works is the safest I could think off. Each new documentation set is re-generated from scratch, that is, using temporary directories, renaming old ones and deleting very old ones. After doxycron.pl has completed its tasks, you will end up with the directories $TARGETD/dox$SUBDIR/html and $TARGETD/dox$SUBDIR/latex. The result of the preceding execution of doxycron.pl will be in directories named html.old and latex.old.
      One thing will not work for users though : the indexing. The installation of indexing mechanism in doxygen is currently not terribly flexible and fixed values were chosen so that clicking on the Search index link will go to the cgi searching the entire main documentation pages.

      As a last note, doxygen understands ABSOLUTE path names only and therefore, doxycron.pl will die out if you try to use relative paths as the arguments. Just as a reminder, /titi/toto is an absolute path while things like ./or ./tata are relative path.

HPSS tools & services

  • How to retrieve files from HPSS. Please, use the Data Carousel and ONLY the DataCarousel.
    Note: DO NOT use hsi to
    retrieve files from HPSS - this access mode locks tape drives for exclusive use (only you, not shared with any other user) and have dire impacts on STAR;s operations from production to data restores. If you are caught using it, you will be banned from accessing HPSS (your privilege to access HPSS resources will be revoked).
    Again - Please, use the Data Carousel.
  • Archiving into HPSS
    Several utilities exists. You can find the reference on the RCF HPSS Service page. Those utilities will bring you directly in the Archive class of service. Note that the DataCarousel can retrieve files from ANY class of service. The prefered mode for archining is the use of htar.
    NOTE: You should NOT abuse those services to retreive massive amount of files from HPSS (your operation will otherwise clash with other operations, including stall or slow down data production). Use the DataCarousel instead for massive file retreival. Abuse may lead to supression of access to archival service.
    • For rftp, history is in an Hypernews post Using rftp . If you save individual files and have lots of files in a directory, please avoid causing a Meta_data lookup. Meta-data lookup happens when you 'ls -l'. As a reminder, please keep in mind that HPSS is NOT made for neither small files and large amount of files in directories but for massive large files storage (on 2007/10/10 for example, a user crashed HPSS with a single 'ls -l' lookup of a 3000 files directory). In that regard, rftp is most useful if you create first an archive of your files yourself (tar, zip,...) and push the archive into HPSS afterward. If this is not your mode of operation, the preferred method is the use of htar which provides a command-line direct HPSS archive creation interface.
    • htar is the recommended mode for archining into HPSS. This utility provides a tar-like interface allowing for bundling together several files or an entire directory tree. Note the syntax of htar and especiallythe extract below from this thread:
      If you want the file to be created in /home/<username>/<subdir1> and <subdir1> does not existed yet, use
      % htar -Pcf /home/<username>/<subdir1>/<filename> <source>
      
      If you want the file to be created into /home/<username>/<subdir2> and <subdir2> already exists, use
      % htar -cf /home/<username>/<subdir2>/<filename> <source>
      
      Please consult the help on the web for more information about htar.
    • File size is limited to be <55GB, and if exceeded you will get Error -22. In this case consider using split-tar. A simple example/syntax on how to use split-tar is:
      % split-tar -s 55G -c blabla.tar blabla/
      This will at least create blabla-000.tar but also the next sequences (001, 002, ...) each of 55 GBytes until all files from directory blabla/ are packed. The magic 55 G suggested herein and in many posts works for any generation of drive for the past decade. But a limit of 100-150 GB should also  work on most media at BNL as per 2016. See this post for a summary of past pointers.
    • You may make split-tar archive cross-compatible with htar by creating afterward the htar indexes. To do this, use a command such as 
      % htar -X -E -f blabla-000.tar
      this will create blabla-000.tar.idx you will need to save in HPSS along the archive.

       

Batch system, resource management system

SUMS (aka, STAR Scheduler)


SUMS, the product of the STAR Scheduler project, stands for Star Unified Meta-Scheduler. This tool is currently documented on its own pages. SUMS provides a uniform user interface to submitting jobs on "a" farm that is, regardless of the batch system used, the language it provides (in XML) is identical. The scheduling is controlled by policies handling all the details on fitting your jobs in the proper queue, requesting proper resource allocation and so on. In other words, it isolates users from the infrastructure details.

You would benefit from starting with the following documents:

LSF

LSF was dropped from BNL facility support in July 2008 due to licensing cost. Please, refer to the historical revision for information about it. If a link brought you here, please update or send a note to the page owner. Information have been kept un-published You do not have access to view this node.

Condor

Quick start ...

Condor Pools at BNL

The condor pools are segmented into four pools extracted from this RACF page:

production jobs +Experiment = "star" +Job_Type = "crs" high priority CRS jobs, no time limit, may use all the slots on CRS nodes and up to 1/2 available job slots per system on CAS ; the CRS portion is not available to normal users and using this Job_Type for user will fail
users normal jobs +Experiment = "star" +Job_Type = "cas" short jobs, 3 to 5 hour soft limit (when resources are requested by others), 40 hour hard limit - this has higher priority than the "long" Job_Type.
user long jobs +Experiment = "star" +Job_Type = "long" long running jobs, 5 day soft limit (when resources are requested by others), 10 day hard limit, may use 1 job slot per system on a subset of machines
general queue +Experiment = "general"
+Experiment = "star"
  General queue shared by multiple experiments, 2 hours guaranteed time minimum (can be evicted afterward by any experiment's specific jobs claiming the slot)

 

The Condor configurations do not have create a simple notion of queues but generates a notion of pools. Pools are group of resources spanning all STAR machines (RCAS and RCRS nodes) and even other experiment's nodes. The first column tend to suggest four of such pools although we will see below that life is more complicated than that.

First, it is important to understand that the +Experiment attribute is only used for accounting purposes and what makes the difference between a user job or a production job or a general job is really the other attributes. 

Selection of how your jobs will run is the role of +Job_Type attribute. When it is unspecified, the general queue (spanning all RHIC machines at the facility) is assumed but your job may not have the same time limit. We will discuss the restriction later. The 4th column of the table above shows the CPU time limits and additional constraints such as the number of slots within a given category one may claim. Note that the +Job_type="crs" is reserved and its access will be enforced by Condor (only starreco may access this type).

In addition of using +Job_type which as we have seen controls what comes as close as possible to a queue in Condor, one may need to restrict its jobs to run on a subset of machines by using the CPU_Type attribute in the Requirements tag (if you are not completely lost by now, you are good ;-0 ).  An example to illustrate this:

+Experiment = "star"
+Job_type = "cas"
Requirements = (CPU_type != "crs") && (CPU_Experiment == "star")

In this example, a cas job (interpret this as "a normal user analysis job") is being run on behalf of the experiment star. The CPU / nodes requested are the CPU belonging to the star experiment and the nodes are not RCRS nodes. By specifying those two requirements, the user is trying to make sure that jobs will be running on RCAS nodes only (or != "crs") AND, regardless of a possible switch to +Experiment="general", the jobs will still be running on the nodes belonging to STAR only.

In this second example

+Experiment = "star"
+Job_type = "cas"
Requirements = (CPU_Experiment == "star")

we have pretty much the same request as before but the jobs may also run on RCRS nodes. However, if data production runs (+Job_type="crs" only starreco may start), the user's job will likely be evicted (as production jobs will have higher priorities) and the user may not want to risk that hence specifying the first Requirements tag.

Pool rules

A few rules apply (or summarized) below:

  • Production jobs cannot be evicted on their claimed slots ... since they have higher priority than user jobs even on CAS nodes, this means that as soon as production jobs starts, its pool of slots will slowly but surely be taken - user's jobs may use those slots at low-downs of utlization.
  • Users jobs can be evicte. Eviction happens after 3 hours of runtime from the time they start but only if the slot they are running in is claimed by other jobs. For example, if a production job wants a node being used by a user job that has been running for two hours then that user job has one hour left before it gets kicked out ...
  • This time limit comes into effect when a higher priority job wants the slot (i.e. production vs. user or production)
  • general queue jobs are evicted after two hours of guaranteed time when the slot is wanted by ANY STAR job (production, user)
  • general queue jobs will be evicted however if they consume more than 750 MB of memory

This provides the general structure of the Condor policy in place for STAR. The other policy options in place goes as follows:

  1. The following options apply to all machines: the 1 mn load has to be less than 1.4 on a two CPU node for a job to start
  2. General queue jobs will not start on any node unless 1 min < 1.4, swap > 200M, memory > 100M.
  3. User fairshare is in place.

In the land of confusion ...

Also, users are often confused of the meaning of their job priority. Condor will consider a user's job priority and submit jobs in priority order (where the larger the number, more likely the job willl start) but those priorities have NO meaning across two distinct users. In other words, it is not because user A sets job priorities larger by an order of magitude comparing to user B that his job will start first. Job priority only providesa mechanism for a user to specify which of their idle jobs in the queue are most important. Jobs with higher numerical priority should run before those with lower priority, although because jobs can be submitted from multiple machines, this is not always the case. Job prioritties are listed by the condor_q command in the PRIO column.

The effective user priority is dynamic, on the other hand, and changes as a user has been given access to resources over a period of time. A lower numerical effective user priority (EUP) indicates a higher priority. Condor's fairshare mechanism is implemented via EUP. The condor_userprio command hence provides an indication of your faireshareness.

You should be able to use condor_qedit to manually modify the "Priority" parameter, if desired. If a job does not run for weeks, there is likely a problem with its submitfile or one of its input, and in particular its Requirements line. You can use condor_q -analyze JOBID, or condor_q -better-analyze JOBID to determine why it cannot be scheduled.

 

What you need to know about Condor

First of all, we recommend you use SUMS to submit to Condor as we would take care of adding codes, tricks, tweaks to make sure your jobs run smoothly. But if you really don't want to, here are a few issues you may encounter:

  • Unless you use the GetEnv=true  datacard directive in your condor job description, Condor jobs will start with a blank set of environment variables unlike a shell startup. Especially, none of
    SHELL, HOME, LOGNAME, PATH, TERM and MAIL
    will be define. The absence of $HOME will have for side effect that, whenever a job starts, your .cshrc and .login will not be seen hence, your STAR environment will not be loaded. You must take this into account and execute the STAR login by hand (within your job file).
    Note that using GetEnv=true has its own sde effects which includes a full copy of the environment variables as defined from the submitter node. This will NOT be suitable for distributed computing jobs. The use of getenv() C primitive in your code is especially questionable (it will unlikely return a valid value) and
    • STAR user may look at this post for more information on how to use ROOT function calls for defining some of the above.
    • You may also use getent shell command (if exists) to get the value of your home directory
    • A combinations of getpwuid(), getpwnam() would allow to define $USER and $HOME
       
  • Condor follows a multi-submitter node model with no centralized repository for all jobs. As a consequence, whenever you use a command such as condor_rm, you would kill the jobs you have submitted from that node only. To kill jobs submitted from other submitter nodes (any interactive node at BNL is a potential submitter node), you need to loop over the possibilities and use the -name command line option.
     
  • Condor will keep your jobs indefinitely in the Pool unless you either remove the jobs or specify a condition allowing for jobs to be automatically removed upon status and expiration time. A few examples below could be used for the PeriodicRemove Condor datacard
    • To automatically remove jobs which have been in the queue for more than 2 days but marked as status 5 (held for a reason or another and not moving) use
      (((CurrentTime - EnteredCurrentStatus) > (2*24*3600)) && JobStatus == 5)
    • To automatically remove jobsruning the the queue for more than 2 days but using less than 10% of the CPU (probably looping or inefficient jobs blocking a job slot), use
      (JobStatus == 2 && (CurrentTime - JobCurrentStartDate > (54000)) && 
                          ((RemoteUserCpu+RemoteSysCpu)/(CurrentTime-JobCurrentStartDate) < 0.10))
    The full current condition SUMS add to each job is
    PeriodicRemove  = (JobStatus == 2 && (CurrentTime - JobCurrentStartDate > (54000)) && 
                       ((RemoteUserCpu+RemoteSysCpu)/(CurrentTime-JobCurrentStartDate) < 0.10)) || 
                      (((CurrentTime - EnteredCurrentStatus) > (2*24*3600)) && JobStatus == 5)

Some condor commands

This is not meant to be an exhaustive set of commands nor a tutorial. You are invited to read to the manpages for condor_submit, condor_rm, condor_q, condor_status. Those will be most of what you will need to use on a daily basis. Help for version 6.9 is available online.

  • Query and information
    • condor_q -submitter $USER
      List jobs of specific submitter $USER from all the queues in the pool
    • condor_q -submitter $USER -format "%s\n" ClusterID
      Shows the JobID for all jobs of $USER. This command may succeed although an unconstrained condor_q may fell if we had a large amount of jobs
    • condor_q -analyze $JOBID
      Perform an approximate analysis to determine how many resources are available to run the requested jobs.
    • condor_status -submitters
      shows the numbers of running/idle/held jobs for each user on all machines
    • condor_status -claimed
      Summarize jobs by servers as claimed
    • condor_status -avail
      Summarize resources which are available
       
  • Removing jobs, controlling them
    • condor_rm $USER
      removes all of your jobs submitted from this machine
    • condor_rm -name $node $USER
      removed all jobs for $USER submitted from machine $node
    • condor_rm -forcex $JOBID
      Forces the immediate local removal of jobs in undefined state (only affects jobs already being removed). This is needed if condor_q -submitter shows your job but condor_q -analyze $JOBID does not (indicating an out of sync information at Condor level).
    • condor_release $USER
      releases all of your held jobs back into the pending pool for $USER
    • condor_vacate -fast
      may be used to remove all jobs from the submitter node job queue. This is a fast mode command (no checks) and applies to running jobs (not pending ones)
       
  • More advanced
    • condor_status -constraint 'RemoteUser == "$USER@bnl.gov"'
      lists the machines on which your jobs are currently running
    • condor_q -submitter username -format "%d" ClusterId -format "  %d" JobStatus -format "   %s\n" Cmd
      shows the job id,  status, and command for all of your jobs.  1==Idle, 2==Running for Status.  I use something like this because the default output of condor_q truncates the command at 80 characters and prevents you from seeing the actually scheduler job ID associated with the Condor job.  I'll work on improving this command, but this is what I've got for now.
    • To access the reason for job 26875.0 to be held from a submitter node advertized to be rcas6007, use the following command to have a human readable format
      condor_q -pool condor02.rcf.bnl.gov:9664 -name rcas6007 -format "%s\n" HoldReason 26875.0
       

 

Computing Environment


The pages below will give you a rapid overview of the computing environment at BNL, including information for visitors and employees, accessible printers, best practices, recommended tools for managing Windows.

FAQs and Tips

Software Site Licenses

Do we have a site license for software package XYZ?

The answer is (almost) always: No!
Neither STAR nor BNL have site licences for any Microsoft product, Hummingbird Exceed, WinZIP, ssh.com's software or much of anything intended to run on individual users' desktops. Furthermore, for most purposes BNL-owned computers do not qualify for academic software licenses, though exceptions do exist.

FAQ: PDF creation

How can I create a file in pdf format?

Without Adobe Acrobat (an expensive bit of software), this can be a daunting question. I am researching answers, some of which are available in my Windows software tips. Here is the gist of it in a nutshell as I write this -- there are online conversion services and OpenOffice is capable of exporting PDF documents.

FAQ: X Servers

What X server software should I use in Windows?

I recommend trying the X Server that is available freely with Cygwin, for which I have created some documentation here: Cygwin Tips. If you can't make that work for you, then I next recommend a commercial product called Xmanager, available from http://www.netsarang.com. Last time I checked, you could still download a fully functional version for a time-limited evaluation period.

TIP: Windows Hibernation trick

Hibernate or Standby -- There is a difference which you might find handy: 
  • "Standby" puts the machine in a low power state from which it can be woken up nearly instantly with some stimulus, such as a keystroke or mouse movement (much like a screensaver) but the state requires a continuous power source.  The power required is quite small compared to normal running, but it can eventually deplete the battery (or crash hard if the power is lost in the case of a desktop).
  • "Hibernate" actually dumps everything in memory to disk and turns off the computer, then upon restarting it reloads the saved memory and basically is back to where it was.  While hibernating, no power source is required.  It can't wake up quickly (it takes about as long as a normal bootup), but when it does wake up, (almost) everything is just the way you left it.  One caveat about networking is in order here:  Stateful connections (eg. ssh logins) are not likely to survive a hibernation mode (though you may be able to enable such a feature if you control both the client and server configurations), but most web browsing activity and email clients, which don't maintain an active connection, can happily resume where they left off.

Imagine:  the lightning is starting, and you've got 50 windows open on your desktop that would take an hour to restore from scratch.  You want to hibernate now!  Here's how to enable hibernating if it isn't showing up in the shutdown box: 
Open the Control Panels and open "Power Options".  Go to the "Hibernate" tab and make sure the the box to enable Hibernation is checked.  When you hit "Turn Off Computer" in the Start menu, if you still only see a Standby button, then try holding down a Shift key -- the Standby button should change to a Hibernate button.  Obvious, huh?

For the curious:
There are actually six (or seven depending on what you call "official") ACPI power states, but most motherboards/BIOSes only support a subset of these.  To learn more, try Googling "acpi power state", or you can start here as long as this link works.  (Note there is an error in the main post -- the S5 state is actually "Shutdown" in Microsoft's terminology). 
From the command line, you can play around with these things with such straightforward commands as:

%windir%\System32\rundll32.exe powrprof.dll,SetSuspendState 1 

Even more obvious, right?  If you like that, then try this on for size.

TIP: My new computer is broken!:

It's almost certainly true - your new computer is faulty and the manufacturer knows it!  Unfortunately, that's just a fact of life.  Straight out of the box, or after acquiring a used PC, you might just want to have a peek at the vendor's website for various updates that have been released.  BIOS updates for the motherboard are a good place to start, as they tend to fix all sorts of niggling problems.  Firmware updates for other components are common as are driver updates and software patches for pre-installed software.  I've solved a number of problems applying these types of updates, though it can take hours to go through them thoroughly and most of the updates have no noticeable effect.  And it is dangerous at times.  One anecdote to share here -- we had a common wireless PC Card adapter that was well supported in both Windows and Linux.  The vendor provided an updated firmware for the card, installed under Windows.  But it turned out that the Linux drivers wouldn't work with the updated firmware.  So back we went to reinstall a less new firmware.  You'll want to try to be intelligent and discerning in your choices.  Dell for instance does a decent job with this (your Dell Service Tag is one very useful key here), but still requires a lot from the updater to help ensure things go smoothly.  This of course is in addition to OS updates that are so vital to security and discussed elsewhere.

Printers


STAR's publicly available printers are listed below. 


IP name
Wireless (Corus) CUPS URL
IP address Model Location rcf2 Queue Name Features
lj4700.star.bnl.gov

http://cups.bnl.gov:631/printers/HP_Color_LaserJet_4700_2
130.199.16.220 HP Color LaserJet 4700DN 510, room M1-16 lj4700-star color, duplexing, driver download site
(search for LaserJet 4700, recommend the PCL driver)
lj4700-2.star.bnl.gov

http://cups.bnl.gov:631/printers/lj4700-2.star.bnl.gov
130.199.16.221 HP Color LaserJet 4700DN 510, room M1-16 lj4700-2-star color, duplexing, driver download site
(search for LaserJet 4700, recommend the PCL driver)
hp510hall.star.bnl.gov

http://cups.bnl.gov:631/printers/hp510hall
130.199.16.222 HP LaserJet 2200DN 510, outside 1-164 hp510hall B&W, duplexing
starhp2.star.bnl.gov

http://cups.bnl.gov:631/printers/starhp2.star.bnl.gov
130.199.16.223 HP LaserJet 8100DN 510M, hallway starhp2_p B&W, duplexing
onlprinter1.star.bnl.gov

http://cups.bnl.gov/printers/onlprinter1.star.bnl.gov
130.199.162.165 HP Color LaserJet 4700DN 1006, Control Room staronl1 color, duplexing
chprinter.star.bnl.gov

N/A
130.199.162.178 HP Color LaserJet 3800dtn 1006C, mailroom n/a color, duplexing

There are additional printing resources available at BNL, such as large format paper, plotters, lamination and such.  Email us at starsupport 'at' bnl.gov and we might be able to help you locate such a resource.

 

Printing from the wireless (Corus) network

The "standard" way of printing from the wireless network is to go through ITD's CUPS server on the wireless network.  How to do this varies from OS to OS, but here is a Windows walkthrough.  The key thing is getting the URI for the printer into the right place:
 

  • Open the Printers Control Panel and click "Add a Printer". 
  • Select the option to add a network printer.  (Ignore the list of printers that it generates automatically).
  • Click on the button or option for "the printer that I want isn't listed". 
  • Select the option for a shared printer and enter the green URL from the list above for the printer you want.
    eg. http://cups.bnl.gov:631/printers/HP_Color_LaserJet_4700_2
  • On the next window, select the hardware manufacturer and model (if not listed, let Windows search for additional models).
  • Print a test page and cross your fingers... 
  • If your test print does not come out, it doesn't necessarily mean your configuration is wrong - sometimes a problem occurs on the the CUPS server that prevents printing - it isn't always easy to tell where the fault lies.

 

Since printing through ITD's CUPS servers at BNL has not been very reliable, here are some less convenient alternatives to using the printers that you may find handy.  (Note that with these, you can even print on our printers while you are offsite - probably not something to do often, but might come in handy sometimes.)
 

1.  Use VPN.  But if you are avoiding the internal network altogether for some reason, or can't use the VPN client, then keep reading...

2.  Get your files to rcf2.rhic.bnl.gov and print from there.  Most of printers listed above have rcf print queues (hence the column "rcf2 queue name").  But if you want to use a printer for which there is no queue on rcf2, or you have a format or file type that you can't figure out how to print from rcf2, then the next tip might be what you need.

3.  SSH tunnels can provide a way to talk directly (sort-of) to almost any printer on the campus wired network.  At least as far as your laptop's print subsystem is concerned, you will be talking directly to the printer.  (This is especially nice if you want to make various configuration changes to the print job through a locally installed driver.)  But if you don't understand SSH tunnels, this is gonna look like gibberish:

Here is the basic idea, using the printer in the Control Room.
It assumes you have access to both the RSSH and STAR SSH gateways.

The ITD SSH gateways might also work in place of rssh (I haven't
tried them yet).  If they can talk directly to our printers,
then it would eliminate step C below.

A.  From your laptop:

ssh -A -L 9100:127.0.0.1:9100 <username>@rssh.rhic.bnl.gov

(Note 1:  -A is only useful if you are running an ssh-agent with a
loaded key, which I highly recommend)

(Note 2:   Unfortunately, the rssh gateways cannot talk directly to our
printers, so we have to create another tunnel to a node that can...  If the
ITD SSH gateways can communicate directly with the printers, then the
next hop would be unnecessary...)

B.  From the rssh session:

ssh -L 9100:130.199.162.165:9100 <username>@stargw1.starp.bnl.gov

(Note 1: 130.199.162.165 is the IP address of onlprinter1.star.bnl.gov -
it could be replaced with any printer's IP address on the wired network.)
(Note 2:  port 9100 is the HP JetDirect default port - non-HP printers
might not use this, and there are other ways of communicating with HP
network printers, so ymmv - but the general idea will work with most TCP 
communications, if you know the port number in use. 

C.  On your laptop, set up a local print queue as if you were going to
print directly to the printer over the network (with no intermediate
server), but instead of supplying the printer's IP address, use
127.0.0.1 instead.

D. Start printing...


If you close either of the ssh sessions above, you will have to
re-establish them before you can print again. 

The two ssh commands can be combined into one and you can create an alias to
save typing the whole thing each time.  (Or use PuTTY or some other GUI SSH client
wrapper to save these details for reuse.)

You could set up multiple printers this way, but to use them
simultaneously, you would need to use unique port numbers for each one
(though the port number at the end of the printer IP would stay 9100).

 

Direct connection, internal network

You can use direct connections to access them over the network.

  • Direct:  These printers accept direct TCP/IP connections, without any intermediate server. 
  • JetDirect (AppSocket) and lpd usually work under Linux. 
  • For Windows NT/2K/XP, a Standard TCP/IP port is usually the way to go. 

How to configure this varies with OS and your installed printing software.

Tips

What follows are miscellaneous tips and suggestions that will be irregularly maintained.

  • The 2-sided printers are configured to print 2-sided by default, but the default for many printer drivers will override this and specify 1-sided.  If you are printing from Windows, you can usually choose your preferences for this in the printer preferences or configuration GUI.  You may need to look in the Advanced Settings and/or Printing Defaults to enable 2-sided printing in Windows.
  • Depending on the print method and drivers used, from the Linux command line you may be able to specify various options for things like duplex printing.  To see available options for a given print queue, try the "lpoptions" command.  For instance, on rcf2 you could do "lpoptions -d xerox7300 -l".  In the output, you will find a line like this:  "Duplex/2-Sided Printing: DuplexNoTumble *DuplexTumble None"  (DuplexNoTumble is the same as flip on long edge, while DuplexTumble is the same as flip on short edge, and the * indicates the default setting.)  So to turn off duplex printing, you could do "lp -d xerox7300 -o Duplex=None <filename>".  Keep in mind that not all options listed by lpoptions may actually be supported by the printer, and the defaults (especially in the rcf queues) may not be what you'd like.  There are so many print systems, options and drivers in Linux/Unix that there's no way to quickly describe all the possible scenarios.
  • There is a handy utility called a2ps that is available on most Linux distributions. It is an "Any to PostScript" filter that started as a Text to PostScript converter, with pretty printing features and all the expected features from this kind of program. But it is also able to deal with other file types (PostScript, Texinfo, compressed, whatever...) provided you have the necessary tools installed.

  • psresize is another useful utility in Linux for dealing with undesired page sizes. If you are given a PostScript file that specifies A4 paper, but want to print it on US Letter-sized paper, then you can do:
    psresize -PA4 -pletter in.ps out.ps
    See the man page for more information.
  • Some of the newer printers have installation wizards for Windows that can be accessed through their web interfaces. I've had mixed success with the HP IPP installation wizards. The Xerox wizard (linked above) has worked well, though it pops up some unnecessary windows and is a bit on the slow side.

  • Windows 9x/Me users will likely have to install software on their machines in order to print directly to these printers. HP and Xerox have such software available for download from their respective support websites, but who uses these OSes anymore?

  • For linux users setting up new machines, CUPS at least for recent distros is the default printing system (unless upgrading from an older distribution, in which case LPRng may still be in use).  Given an appropriate PPD file, CUPS is capable of utilizing various print options, such as tray selection and duplexing, or at least you can create different queues with different options to a single printer.

  • There are other potentially useful printers around that are not catalogued here. Some are STAR printers out of the mainstream (like in 1006D), and some belong to other groups in the physics department.

Quick (?) start guide for visitors with laptops

So you brought a laptop to BNL… and the first thing you want to do is get online, right?
Ok, here's a quick (?) guide to getting what you want without breaking too many rules.

Wired Options:

  • Visitors' network: Dark purple jacks (usually labeled VNxx) are on a visitors' network and are effectively outside of the BNL firewall. They support DHCP and do not require any sort of registration to use. Being outside the firewall can be advantageous, but will prevent you from
    using some network services within BNL (printing, for instance). (The rest of this page is largely irrelevant if you are using the visitors' network.)

  • BNL network: If it isn't dark purple (and it isn't a phone jack) then it is on the BNL network, which supports DHCP on most subnets. (NB. The 60/61 subnet (available in parts of 1006, including the WAH) has a locally managed DHCP server -- contact Wayne Betts to be added to the access list). All devices on the BNL networks are required to be registered based on the MAC address that is unique to each network interface. To help enforce this policy, if you request a DHCP
    address from an unregistered node, you will be assigned a restricted address. With a restricted IP address, your web browser will be automatically redirected to the BNL registration page, and you will be unable to surf anywhere else until you are registered.

    When registering a laptop, fill in "varies" for the location fields. For the computer name field, I recommend using "DHCP Client" (unless you have a static IP address of course).

    Previously registered users are encouraged to verify and update their registration information by going to http://register.bnl.gov from the machine to be updated.

    There you can also find out more about the registration system and find links to some useful information for network users.

     

Windows

This area is intended to provide information for STAR members to assist in configuring and using typical desktop/laptop PCs at BNL.

  Windows 2000/XP and Scientific Linux/Redhat Enterprise Linux are the preferred Operating Systems within STAR at BNL for desktop computing, though there is no formal requirement to use any particular OS.

  These pages are intended to be dynamic, subject to the constantly changing software world and user input.   Feedback from users -- what you find indispensable; what is misleading, confusing or flat-out wrong; and what is missing that you wish was here -- can help to significantly increase the value of these pages.

  Additional pages that are under consideration for creation:

  • Windows installation checklist (the basic software and configuration that should probably be on every Windows PC)
  • Linux installation checklist
  • Common Linux details and useful links, such as Linux equivalents to software for Windows.
  • Resources specific to the experiment operations (eg. common DAQ NFS mounts)
  • Publically useable terminals

Cygwin installation and tips

To quote from the Cygwin website:  "Cygwin is a Linux-like environment for Windows."

The Linux-like nature is quite comprehensive...  You can *almost* forget that you are using a Windows OS -- most utilities and software that you are familiar with from your Linux experience are available in Cygwin.  For example, the Cygwin distribution has available an openssh client (and the server too, but I don't recommend you use it), PostScript and PDF viewers and editors, compression (eg. zip) utilities, software development tools and X Windows packages (more on X below). 

Using the Cygwin X server

An example of Cygwin's usefulness and cost-saving potential is the X server.  The Cygwin X server is, in most cases, easy and convenient to use in place of commercial X servers such as Hummingbird Exceed.  Here is the short version for those familiar with Cygwin installations:
  1. You need the xorg-x11-base and X-startup-scripts packages (and whatever dependencies they have, which the setup routine should solve for you).  You'll probably also want the xwinclip package.  All of these are in the X11 Category in the Cygwin Setup.
  2. Execute "startxwin.bat" (in <cygwin_root>/usr/X11R6/bin/).  That will start a stand-alone X Server and an xterm with a cygwin shell.   Edit this batch file as you see fit -- it includes documentation for a number of options. 
  3. If you are displaying windows from a remote session over ssh, be sure you have X tunneling enabled in your ssh client configuration.  Please do not try to open up your X server to the entire world with anything like "xhost +".  That is a *VERY BAD IDEA*.
  4. In light of step 3 above:  If you have a local firewall that asks about blocking access to the Xserver, you can usually block it without a problem -- if you have X forwarding enabled and working, then you are usually ok.  (If you believe a localhost-based firewall is interfering with X, try allowing only connections from the loopback/localhost address (127.0.0.1)).
Long version:  Walkthrough of a Cygwin installation (MS Word doc).

Subsidiary recommendation:

There is a handy tool for initiating shell connections to remote hosts (such as via ssh) and starting the Cygwin X server called Mortens Cygwin X-Launcher.  Coming soon (?): screenshots of the X-Launcher configuration that are most likely to be useful...

Installation Tip:

A Cygwin mirror is available at http://mirror.bnl.gov/cygwin/ making the installation go quite quickly if you are at BNL.  This is quite handy for the cygwin installation and any subsequent use of the setup utility.  One potential catch for onsite users -- even if you intend to use the local mirror, you must still configure a BNL proxy server during Setup, as shown in  this walkthrough of a Cygwin installation (MS Word format).
Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

FAQS and Tips that don't fit well elsewhere

Software Site Licenses:

Do we have a site license for software package XYZ?

The answer is (almost) always: No!
Neither STAR nor BNL have site licences for any Microsoft product, Hummingbird Exceed, WinZIP, ssh.com's software or much of anything intended to run on individual users' desktops. Furthermore, for most purposes BNL-owned computers do not qualify for academic software licenses, though exceptions do exist.

FAQ: PDF creation:

How can I create a file in pdf format?

Without Adobe Acrobat (an expensive bit of software), this can be a daunting question. I am researching answers, some of which are available in my Windows software tips. Here is the gist of it in a nutshell as I write this -- there are online conversion services and OpenOffice is capable of exporting PDF documents.

FAQ: X Servers:

What X server software should I use in Windows?

I recommend trying the X Server that is available freely with Cygwin, for which I have created some documentation here: Cygwin Tips. If you can't make that work for you, then I next recommend a commercial product called Xmanager, available from http://www.netsarang.com. Last time I checked, you could still download a fully functional version for a time-limited evaluation period.

TIP: Windows Hibernation trick:

Hibernate or Standby -- There is a difference which you might find handy: 
  • "Standby" puts the machine in a low power state from which it can be woken up nearly instantly with some stimulus, such as a keystroke or mouse movement (much like a screensaver) but the state requires a continuous power source.  The power required is quite small compared to normal running, but it can eventually deplete the battery (or crash hard if the power is lost in the case of a desktop).
  • "Hibernate" actually dumps everything in memory to disk and turns off the computer, then upon restarting it reloads the saved memory and basically is back to where it was.  While hibernating, no power source is required.  It can't wake up quickly (it takes about as long as a normal bootup), but when it does wake up, (almost) everything is just the way you left it.  One caveat about networking is in order here:  Stateful connections (eg. ssh logins) are not likely to survive a hibernation mode (though you may be able to enable such a feature if you control both the client and server configurations), but most web browsing activity and email clients, which don't maintain an active connection, can happily resume where they left off.

Imagine:  the lightning is starting, and you've got 50 windows open on your desktop that would take an hour to restore from scratch.  You want to hibernate now!  Here's how to enable hibernating if it isn't showing up in the shutdown box: 
Open the Control Panels and open "Power Options".  Go to the "Hibernate" tab and make sure the the box to enable Hibernation is checked.  When you hit "Turn Off Computer" in the Start menu, if you still only see a Standby button, then try holding down a Shift key -- the Standby button should change to a Hibernate button.  Obvious, huh?

For the curious:
There are actually six (or seven depending on what you call "official") ACPI power states, but most motherboards/BIOSes only support a subset of these.  To learn more, try Googling "acpi power state", or you can start here as long as this link works.  (Note there is an error in the main post -- the S5 state is actually "Shutdown" in Microsoft's terminology). 
From the command line, you can play around with these things with such straighforward commands as:
%windir%\System32\rundll32.exe powrprof.dll,SetSuspendState 1
Even more obvious, right?  If you like that, then try this on for size.

TIP: My new computer is broken!:

It's almost certainly true - your new computer is faulty and the manufacturer knows it!  Unfortunately, that's just a fact of life.  Straight out of the box, or after acquiring a used PC, you might just want to have a peek at the vendor's website for various updates that have been released.  BIOS updates for the motherboard are a good place to start, as they tend to fix all sorts of niggling problems.  Firmware updates for other components are common as are driver updates and software patches for pre-installed software.  I've solved a number of problems applying these types of updates, though it can take hours to go through them thoroughly and most of the updates have no noticeable effect.  And it is dangerous at times.  One anecdote to share here -- we had a common wireless PC Card adapter that was well supported in both Windows and Linux.  The vendor provided an updated firmware for the card, installed under Windows.  But it turned out that the Linux drivers wouldn't work with the updated firmware.  So back we went to reinstall a less new firmware.  You'll want to try to be intelligent and discerning in your choices.  Dell for instance does a decent job with this (your Dell Service Tag is one very useful key here), but still requires a lot from the updater to help ensure things go smoothly.  This of course is in addition to OS updates that are so vital to security and discussed elsewhere.



Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

Networking Software

Networking
Software

  • PuTTY:
     This is the preferred SSH client for Windows.  It is free, easy to use
    and well maintained for both security and bug issues.
     (As with everything, it is only "maintained" if you regularly check
    for updated versions!)
     Please note that most other SSH clients for Windows are NOT free for
    use on government computers or in the pursuit of lab business, though
    they might function just fine without payment.

  • WinSCP:  This is a fine graphical SFTP and SCP client utility with some additional features built in.

  • X servers (no, Exceed doesn't make the cut because of the high monetary cost):

    • Cygwin:  Please look at the separate Cygwin page for information on installing and configuring the Cygwin X server.

    • Xmanager:  I
      recommend that you use the Cygwin X server, but if you find something
      that it can't handle, then this is the recommended alternative. 
      It isn't free (but it does have fully functional time-limited
      evaluation license if you want to try it out.) 
      It is much cheaper than Exceed and seemingly just as capable, but
      without quite as much overhead. 
      I'm particularly interested in hearing about X Server alternatives, so
      let me know if you have a favorite!

  • Alternatives to Microsoft's Internet Explorer and Outlook Express:

     As
    the leading web browser and mail client, these two apps are the target
    of prolific viruses, trojans, malware and other nasties. 
    In addition to avoiding many of these, you may also like some of the
    features available in the alternatives (eg. tabbed browsing is a
    popular feature unavailable in IE). 
    Four alternatives are in common use (three of them share much of the
    same code-base -- Mozilla, Netscape Navigator and Firefox). 
     This review
    might help sort you out the differences.
     As with anything, your preference is yours to decide (and also, as
    with everything else here, feature and security updates are released
    quite often, so you might try to check for new versions regularly): 
    They are listed here from highest recommendation to lowest:

    1. Firefox/Thunderbird: 
      Though frequently mentioned as a pair, Firefox and Thunderbird are
      stand-alone applications. 
      Firefox is a web browser, and Thunderbird is an email client. 
      "Stand alone" here means that these can be installed separately from
      each other. 
      You can configure them to work with alternative software as you wish
      (eg. use Firefox for surfing, but set Outlook as your default mail
      client). Actually, you can generally mix and match pieces from all of
      these alternatives, but most of them start out with defaults tied to
      their suite companions. 
      Slight thumbs up to Firefox over the other alternatives because it has
      almost every feature found in the corresponding Mozilla suite, plus
      additional add-ons. 
      Vast numbers of independently produced add-ons and customizations are
      available as well.
    2. Mozilla Suite: 
      A suite that includes the big three:  a browser, email client and HTML
      editor. 
      This is a fine alternative, but as a browser alternative, this author
      gives the bigger thumbs up to its sibling, Firefox, listed above.
    3. Opera. 
      It is available in a free version with a "branding" bar that contains
      advertisements, or you can buy the product to remove this minor
      annoyance.  (Branding/non-branding examples.)
    4. Netscape: 
      The Netscape suite includes a browser (Navigator), email client (Mail),
      HTML editor (Composer) and other tidbits. 
      Of the three Mozilla-based browsers, this is probably the least used
      and has the most extraneous stuff thrown in, which is one of several
      reasons it gets last place in this list.
        It is good enough to recommend, but just not quite as highly as the
      others.

  • Java, WebStart, JRE, J2RE, JSDK,
    Microsoft VM and all that Jazz...: The author of this segment finds
    this to be very puzzling and sometimes frustrating stuff to understand,
    keep up with, and especially to try to explain clearly and succinctly. 
    <Melodrama> Imagine Sun, IBM and Microsoft all walked into a bar
    and had a few drinks. Heck, let Netscape walk in a few minutes later
    for good measure. 
    Fifty states' attorneys general plus the US AG and DOJ are to act as a
    referee. 
    Now imagine that you, a mere passerby on the street were harangued into
    cleaning up the inevitable bar fight, complete with broken bottles,
    flying bar stools and blood everywhere all while it is still going on. 
    That's not even close to how awful it is...</Melodrama>   Details
    to be filled in here!

  • OpenAFS, MIT Kerberos, Wake and Leash: Details to be filled in here!

  • Google Toolbar :
    This is a very convenient interface to initiate Google searches, plus a
    decent pop-up blocker. Unfortunately, it is only available for Internet
    Explorer (though other browsers may support similar features natively).

Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

Office applications and productivity software

Productivity software and viewers/utilities for various file types

  • OpenOffice -- Free and available on multiple platforms.  Perhaps the single best reason to use it is that it natively creates PDF format.  In addition to its own formats, it can read (and write) MS Word, Excel and PowerPoint files (usually -- sometimes formatting details go haywire, but they are constantly updating it.)

  • Adobe Reader -- Used for viewing PDF documents.  (You will probably want to install it with the very useful text search feature.) (Linux users can try xpdf as an alternative which is part of many distributions.)

  • Ghostscript and GSview: PostScript interpreter and viewer (and PDF too) that you probably want to have.

  • Online Document Conversion Services:  Neevia Technology and CERN Document Conversion Service both have file convertors that allow you to submit a variety of common (and uncommon) file formats in small numbers and produce files in different formats (PDF being of most interest probably).  Though not convenient for many files or very large files (and certainly inappropriate for confidential or non-public information), they are good to know about.  (Don't forget -- OpenOffice is able to export documents in PDF format too and handles a lot of file types.)

  • Graphics and Image Manipulation software:  The GIMP and ImageMagick are both quite capable tools available for free for multiple platforms.  Perhaps not perfect replacements for Adobe PhotoShop, but pretty darn good.  (If you are a PhotoShop veteran, then you'll have to spend some time learning the ropes, but it will probably be worth it.)

  • Compression Utilities:  WinZip is not free (though many, many people use it without payment).  Fortunately, there are freeware alternatives.  For instance:
    • 7Zip:  This is the current recommendation of this page, the reasons for which may be included in the future..
    • FreeZip (but not "FreeZip!" which is reported to contain spyware and/or adware)
    • ZipCentral
    • ZipItFast
    • ExtractNow
    • CAMUnZip
    • ZipWrangler
    • Freebyte Zip

  • If you've ever spent a few minutes waiting for MS Windows Search function to find a file on your system, then you might find the following can save you some time. The basic idea is similar to most internet search engines: index your files (while the computer would otherwise be idle so as not to slow things down for the user) and then consult the indexes when a search is requested:

    • Yahoo! Desktop Search:  This is a free version of a well respected product from X1 with a few features removed, such as indexing of remote drives, Eudora and Mozilla-based email.
    • Google Desktop Search:  Use Google's Desktop Search to quickly search for files on your computer using an indexing system much like Google's web indexing.  Not all file types are supported, but most common ones are, such as Outlook mail, MS Office documents and so on.
    • Copernic Desktop Search:  This is similar to the Google Desktop Search, but appears to be a bit more capable, though as of this writing I have not had time or cause to test it much.  User comments would be appreciated.
    • Windows 2000 and XP include an "Indexing Service" which (according to Microsoft) is "a base service [...] that extracts content from files and constructs an indexed catalog to facilitate efficient and rapid searching."  To configure the Indexing Service open Control Panels -> Administrative Tools -> Computer Management.  In the left pane, click the plus sign next to "Services and Applications", then right-click on the "Indexing Service" icon.  In the popup menu, select "All Tasks | Tune Performance".  The "Indexing Service Usage" dialog box will appear.  The Indexing Service is actually quite customizable, though doing so can add significantly to the resources required by the service.  A warning: it can eat up a surprising amount of disk space to maintain the indexes.  It has sped up basic searches for this author, but your mileage may vary in both search efficiency gains and overall performance penalty.

  • Cygwin: Cygwin has a number of utilities for handling, viewing and transforming file formats, so I have a separate page of Cygwin tips
  • Multimedia Players (work related, of course!)

    Pick one.  Use it.  If you find a format it doesn't support, try a different one, or go to the vendor's site and look for a download of an update or add-on (plug-in, patch, codec, etc.) for your format.  This isn't the place to go into the details, but some quick thoughts are included here:
    • Microsoft's Media Player -- you've almost certainly already got it, so why not use it? 
    • Real Player:  complaint -- by default it runs background processes continuously, pops up annoying little messages and practically begs you to register it, though it isn't nessecary for full functionality..  It isn't a big deal to disable these annoyances, but why should you have to?
    • Winamp:  There is a free version and an inexpensive "Pro" version that has CD burning.  It has been up and down over the years, with some versions much quirkier than others.  Currently it seems to be on par with the rest.
    • Apple's iTunes:  Though intended to suck you into Apple's music store, you can use the application without using the store.  In keeping with most Apple stuff, it seems to be well liked by those who like it.  Enough said.


Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

Performance and Security enhancement

Utilities for Security and Performance

If your computer seems to be running slower than it used to, pop-up advertising is appearing at an alarming rate, your web browser's settings keep changing in undesired ways or you just want a better idea what your computer is up to (eg. "What the heck is PRPCUI.exe?"), here are some resources for understanding what's going on and making things better, presented in roughly the order from those that require the lease detailed understanding to the most:
  • Ad-Aware:

    Ad-Aware was, not very long ago, *the* place to start for malware detection and removal, with the added bonus that it was free.  Alas, recent versions of Ad-Aware (even the Personal version) are no longer licensed quite so freely (let's be clear -- a DOE-owned computer shouldn't have it installed without a paid license.)  It is still free for personal use, so it is highly recommended for home and personal laptop use, though it may not be keeping up with the constantly expanding field, which is a common problem with this type of software.  One thing to keep in mind:  you must be sure to keep your definitions up-to-date, just like a virus scanner, in order to get the most benefit.
  • Spybot - Search&Destroy:

    This is the historical alternative to Ad-Aware, with similar good results "in the early days", but it too may be failing to keep up.  Unlike Ad-Aware, it's license is quite liberal, so it can be installed as desired.  It has an "Advanced" mode, with a variety of additional tools beyond the basic malware scanner, (but keep in mind that some of these features are indeed "Advanced" and not to be played with lightly).  Broken record time:  you must be sure to keep your definitions up-to-date.  You should also consider using the "Immunize" feature to prevent some infestations, and to blacklist some sites known to host various forms of malware.

     

  • Microsoft's Malicious Software Removal Tool:  This is a regularly updated (but far from comprehensive) online removal tool for Windows 2000 and Windows XP.  It isn't a bad idea to run this scanner once a month or whenever you suspect you might have caught "something".

     

  • Microsoft's AntiSpyware Beta:  Though called a Beta product, this is essentially a re-GUIed and slightly modified version of a long standing and respected commercial product that Microsoft recently purchased.   Some recent tests by more-or-less independent testers have shown this tool to be better even than the old reliables, Ad-Aware and SpyBot.

     

  • Defragmenting your hard drive is something to put on the calendar 2-4 times a year.  Because Windows' built-in defragmenter seems especially slow, and modern disk drives hold so much, this is something usually left running overnight.  Third-party alternatives exist that may do a better job in various ways.  Let's hope I get around to listing one or two here in the not-too-distant future...

     

  • CrapCleaner:  This is a system optimization tool for removing unnecesary temporary files and registry entries. The default installation creates a "Run CCleaner" entry in the Recycle Bin's context (right-click) menu.
  • Monitoring startup activity and services.

    Programs that start when you boot or login to your computer can be big performance drains, in addition to doing unwanted things.  The following may help you understand and control what's going on.  (N.B. Some of the following are capable of rendering your system unusable if not handled with care!  They may require significant understanding of Windows' internals to be most useful):

     

    • StartUp Monitor and Startup Control Panel.  These are separate utilities, but they are from the same source and complement each other nicely.  (The author of these has additional utilities that you may find worthwhile as well.)
    • msconfig.exe:  This is Windows' very own "System Configuration Utility", with which one can look at and configure system startup paramters and files, which is especially useful to see the effects of individual changes. You can hose things up quite good in here however, so be careful!
    • services.msc:  This provides a Management Console to configure the startup of various registered services.  This is useul for disabling unnecessary or unused Windows services.  A potentially informative feature in this Console is the "Description" column, though it can still be quite cryptic (or blank).  
    • Merijn.org's website provides several downloads that you might find useful, such as HijackThis ("a general homepage hijack detector and remover"), CWShredder (CoolWebsearch removal tool) and StartUpList ("way better than msconfig")
    • BlackViper.com
    • http://www.sysinfo.org/ (slow site) 
    • Security Task Manager
    • http://www.sysinternals.com
    • HijackThis
    • BHODemon
  • Pop-up Blockers

Pop-up blocking software is increasingly unnecessary because other tools are including their own pop-up blockers.  Mozilla/Firefox for instance have built in pop-up blockers.  Internet Explorer has a pop-up blocker added with Windows XP SP2.  The Google Toolbar (recommended in the "Networking Software" recommendations) has a pop-up stopper as well.  Still, you might fight some utility in the products available from the PanicWare website.  Versions of their Pop-Up Stopper FREE Edition served this author quite well for over a year, but as I said above, it no longer seems as essential as in the pastthe basic functionality has been supplanted by features in other software.
Microsoft Office updates are a combination of security fixes, bug fixes and new features.  Though not emphasized as much as Windows Updates, the security fixes for Office are of similar importance.  Unfortunately, using the online updating system usually requires an installation CD that matches your product (for instance, "Office XP Pro" disks are not acceptable for updating "Office XP Standard".)  Many people, for a variety of reasons, don't have their original installation CD(s).  If you do not have an acceptable installation CD available then the online product update scan can still be used to determine what updates are applicable.  Then you can usually download full updates and apply them manually without the installation media.  (Browse for the downloads that match your product -- most are in self-extracting executable format.)  

  • Clock keepers

  • Multi-desktop software

Other resources



Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

Required software and configuration for Windows PCs at BNL

BNL-specific requirements and configuration for networked Windows computers:

  • A file and real-time virus scanner with up-to-date virus patterns/definitions is REQUIRED!  (***Cyber-Security requirement***)

      Information about the BNL-supported products from TrendMicro is available from the BNL ITD group: TrendMicro at BNL.   It is critical that any anti-virus product receive regular updates (daily or even more often), which is sometimes difficult for mobile machines on a variety of networks.   Four similar products are available to try to meet the demands of our diverse environment:

    Windows desktops that reside on the BNL internal networks are best served by TrendMicro's basic OfficeScan product.   It has a master server inside the BNL firewall from which it receives updates and to which it reports infections.  Every Windows desktop system at BNL should be using this product, with very few exceptions.  You can
    click here to go to the online install the OfficeScan product.  (You'll need administrator privileges on your system for the installation.)

    Laptop users with wireless networking are encouraged to use a newer OfficeScan version that has a firewall module and is able to recieve virus pattern updates from multiple sources -- so it can roam around on- and off-site and usually still reach an update server.  This OfficeScan version is also more capable of cleaning up some trojans and malware than the desktop version.   To install it in the standard way, you must already be on the BNL external wireless network and go here.   Repeat: you must be on the "BNLexternal" wireless network to use that link.

    BNL employees' personal home computers are permitted to use the PC-cillin product, which gets its updates from servers that are outside the BNL firewall (and it does not report infections to anybody at BNL).  PC-cillin includes a firewall module (OfficeScan does not) and PC-cillin has more (but quite limited) spy-ware and ad-ware detection capabilities.

    If you are running a Windows *Server* OS (if you are unsure, then you almost certainly are not!), then there is yet another option, for which you will need to contact ITD (help desk at x5522 or Jim McManus directly at x4107).

    or those readers to whom none of the above apply, which is to say, computers not owned or used primarily at BNL or by BNL employees, I recommend (though can offer no significant assistance with) the following three free anti-virus products about which we (Wayne / Jerome) have read or heard good things:

    1. AVG Anti-Virus     - JL tried for 3 months, worked great but had conflict with fingerprint driver (thought to be a malicious script when activated)
    2. COMODO Free        - JL tried this for years and it works just fine and appears to be a great product considering the cost (none :-) ). The free version is for home users only so NOT to be installed on a BNL system for sure (usually the case of most Free AV).
    3. Microsoft Sec. E   - Microsoft Security Essentials is new on the market but starts doing a good job and supports Windows 7, Vista and XP

      Other anti-virus resources available include online scanners, such as HouseCall from TrendMicro and Symantec's Security Check.   Most major anti-virus vendors have something similar.   Relying on these online scanners as you primary defense is unwise.   In addition to the inconvenience of manually performing these scans, you really need a product monitoring your system at all times to prevent infections in the first place, rather than trying to clean up afterwards.   But since no two products catch and/or clean the same set of problems, occaisionally using a second vendor's product can be useful.

     

  • Windows Critical Updates/SUS (***Cyber-Security requirement***)

      Windows systems must be regularly patched with "critical" updates.  Unfortunately, the BNL firewall and proxy configurations can interfere with the Windows Automatic Update feature in Windows 2000/XP (though you can still use Windows Updates in Internet Explorer if you have the proxies configured correctly, see below for proxy info).  To help with this situation, BNL ITD has set up a Software Update Services server to locally host critical updates.  To use this service (which places a notification icon in the System Tray when updates are available), please click here for more information and installation instructions.  (It is quite easy, but you must have administrative privileges.)   You can manually apply Windows updates (critical and otherwise) using Internet Explorer --  go to the Tools menu and click on "Windows Updates", at which point it is straightforward.  Note that in many cases, the machine must be rebooted to complete the update process.
  • Logon Banner (**Cyber-Security requirement**)

      As required by the DOE, please install a logon banner for BNL-owned or BNL-based computers.  (This includes other OSes as well -- essentially anything that you can log into is required to post a banner if technically possible.)  Click here for more information about logon banners at BNL. To install the banner:  Windows NT/2000/XP click here (must be an administrator to insert the registry changes).  Window 95/98 click here instead.
  • MAC Registration (**Cyber-Security requirement**)

  All networked devices on the BNL internal networks are required to be registered.   (NB--- Please do not attempt to register your machine while using STAR's cygnusb wireless access points.)   More specifically, each network interface is to be registered -- one computer might have multiple network interfaces, each of which requires a separate registration.   That's because the registration is keyed on a specific string assigned to each network interface by the manufacturer that is supposed to be unique in the world.   It is known as a "MAC", "ethernet" or "hardware" address and each network interface has one. (Ie. You must create a separate registration entry for each network card you use on a system.)   For more information, or to update your registration information, click here.  This requirement applies to things beyond typical PCs, such as remote network power supplies, VME processors and other networked equipment.   If you have such equipment that you cannot register (typically because it doesn't run any sort of web browser), then please contact ITD (x5522) or Wayne Betts for assistance in registering the system.   While not necessary, if you have the capability to verify that the MAC you are registering is in fact yours (Windows hint:  "ipconfig /all" or Linux hint:  "ifconfig"), please do so.   Glitches in the system occaisionally fail to properly keep track of the realtime IP-to-MAC mapping, and you, the adaptable human, can perhaps avert the unfortunate situation of misregistration.
  • Proxy servers

    As per 2017/11, please use direct connection to the network while at BNL.
  • Security Scanning

  The BNL networks are routinely scanned for vulnerabilities by ITD, auditors and even sometimes malicious intruders.  The most prevalent scan is done using Nessus, which looks for common network services and many known vulnerabilities.  Any user with a web browser can initiate a new scan of his host machine and look at the most recent scan results for his IP address by going to http://scanner.bnl.gov/.   (NB. When it requests an email address to send the results, you must use an address ending in bnl.gov, or it will reject you.)   The results can be daunting to interpret, so please ask for assistance if you are unsure how to interpret or correct any results.   Some results are "false positives" or uncorrectable but necessary, in which case they can be marked as such in the database.

 


Please send comments, corrections and suggestions to Wayne Betts: wbetts {at} bnl.gov

Facility Access

A selection of tips on how to log to the RCF
facility. We hope to augment those pages and add
information as user request or need.

Getting a computer account in STAR

  1. Introduction
  2. Getting an account and performing work at BNL

 

Introduction

First of all, if you are a new user, WELCOME to the RHIC/STAR collaboration and experiment. STAR is located at Brookhaven National Laboratory and is one of the premier particle detectors in the world.

As a (new) STAR user, you will need to be granted access to our BNL Tier0 computing facility in order to have access to the offline and online infrastructure and resources. This includes accessing BNL from remote or directly while visiting us on site. Access includes access to data, experiment, mailing lists, desktop computer for visitors to name only those. As a National Facility under the Department of Energy (DOE) regulations, a few steps are required for this to happen. Please, follow them precisely and make sure you understand their relevance.

Note:

The DOE requires proper credentials for anyone accessing a computing "resource" and expect such individual to keep credentials up-to-date i.e. in good standing. It is YOUR responsibility to keep valid credentials with Brookhaven National Laboratory's offices. Credentials include: being a valid and active STAR member, having a valid and active guest/user ID and appointment, having and keeping proper trainings. Any missing component would cause an immediate closure of access to computing resources.

In many cases, we rely on account name matching the one created at the RCF (for example, Hypernews or Drupal accounts need exact match to be approved) - this is enforced so we can accurately rely on the work already done by the RCF personnel and only base our automation on "RCF account exist and is active". The RCF personnel work with the user's office and other agencies to verify your credentials.


If you were a STAR user before and seek to re-activate your account, this page also has information for you.

 

Getting an account and performing work at BNL

Note that along the process of requesting either an appointment or a computing account implies a check from the facility and user office personnel of your good standing with RHIC/STAR as the affiliated experiment. Therefore, we urge you to follow the steps as described below


ALL USERS - Ensure/Verify you are affiliated to STAR in our records

Whenever you join a group affiliated with STAR, please
  • Ask your council representative that he/she sends your information to the collaboration's record keeping person (at this point in time, this person is Liz Mogavero).
    Note: Your council representative IS the one responsible for keeping the list of authors and active members at all times. We will not (and cannot) consider requests coming from other STAR members.
     
  • Pro-actively check the presence of your name and record in our Phone Book.
    Note: If you are not in our Phone Book, you are simply NOT a STAR user as far as we know as our PhoneBook is the central repository of active STAR members as defined by the STAR council representatives. 
     

New users in STAR

  1. Request a Guest appointment
    You must be sure you have a valid guest appointment with the BNL User Office.

    Note
    1: Requesting a Guest ID requires a procedure called “Foreign Visit and Assignment”. This procedure involves steps such as background checks with Counter Intelligence and approval from the Department of State. The procedure could take up to 60 days from the time it is started (sensitive countries may take 90 days).
    Note2 : If you have done this already and are a valid Guest, please go to to this section.

    • Go to the Guest Registration Form and complete the registration as instructed.
      • Purpose of Visit: likely "Research" but if you come for other purposes, chose as appropriate ("CRADA" or "Interview" may apply for example) 
      • Experiment/Facility:  "Physics Dept (RHIC/AGS)"
      • Facility Code: "RHIC"
      • Type of Research: "STAR"
      • Type of Access Requested: likely "Open Research" if you stated your visit purpose as "Research"
      • Subject Code for this Visit/Assignment: likely "General Physics"
         
  2. Be patient and wait for further instructions and the approval.
    • ONLY AFTER THE FIRST STEPS will you be able to proceed with the rest of the instructions below.
    • We will assume that you, from now on, have a valid Guest appointment and hold a Guest/BNL ID.
       
  3. Ensure you have the required and mandatory training
    You MUST take the Cyber Security training and course GE-CYBERSEC. This training is mandatory and access to the facility computing resources will NOT be granted without it.
    You are also requested to read the Personal User Agreement which describes your responsibilities, the reasonable use and scope of personal use of computing equipments. In recent years, the BNL User Office have requested for the form to be signed and returned for their records. Please, do not skip any of those steps.
     
  4. Request an RCF account
    To request a new RCF account, start here. The fields are explained on this instruction page.
    Note: There is a "Contact information" field which is aimed to be filled using an existing RHIC member (holder of a valid account and appointment) who can vouch for you. Put your council representative or team lead name there OR (in case of interview / CRADA etc...) the name of your contact and host at BNL. DO NOT use your own name for this field. DO NOT use the name a person who is NOT yet a STAR Member.
     
  5. Additional steps are described below.

Previously a STAR user

If you were a STAR user before and consulting those pages, it may mean that either
  1. you cannot remember how to login and need access but you are in good standing (all training valid, your BNL appointment is valid)
  2. you have let your training expire but you are a valid BNL guest (your appointment with BNL has NOT expired)
  3. your BNL appointment is about to expire or has expired not long ago
  4. you are a RHIC user (from another experiment), and now coming to STAR
The instructions follows:
  • First of all, please make sure you are in the STAR PhoneBook as indicated here.
    If you were a member of another experiment before, you will be joining STAR as either a member of an existing institution or joining as a new institution. All membership handling are the responsibilities of the STAR council (approval of new institutions) or your council representative. In both cases, we MUST find your name in our PhoneBook records.
     
  • Instructions for the several use cases above
    1. If you are in good standing but cannot remember your login information at the RCF facility, please see Account re-activation
       
    2. You have let your BNL training expire - likely, you have not renewed or taken GE-CYBERSEC training available from the training page (please locate in the list at the bottom the course named GE-CYBERSEC). Within 24 hours of the training being taken/renewed again, the privilege to access the BNL computing resources using your RCF account will be re-established (the process will be automatic).
       
    3. Your appointment is about to expire, or has expired not long ago - you will need to go to the Extension requests .
      The Guest Central interface will help identifying your status and appointment expiration. This form could be used by users already having a BNL Guest ID. If you have let your appointment expire for a long time however, this form may let you know (or not show your old BNL badge/guest ID at all). In such a case, you should consider yourself as a "New user" and follow the first set of instructions above.
      For an appointment renewal, the starting point will be the Guest Extension Form.
       
    4. If you were a user before and now coming to STAR, you will need to follow
    5. Additional steps are described below.

Additional steps for everyone

  1. Generate and upload your SSH keys to ensure secure login
    You may now read SSH Keys and login to the SDCC and following information in this section.
     
  2. Drupal access
    1. Log in to RCF node to verify your account username/password working
    2. Download 2-Factor Authentication app to your mobile device (application ranges from Google or Microsoft Authenticator, Duo Mobile, Authy, FreeOTP, Aegis, ...)
    3. Contact Dmitry Arkhipkin or Jerome Lauret on MatterMost (https://chat.sdcc.bnl.gov, choose "BNL Login" with your RCF username/password) to obtain the 2-Factor Authentication QR code.
    4. Use your RCF username/password + 2-Factor Authentication code (read from the app on your mobile device) to log in to drupal.
You may also be interested in
  • Web Access
  • You do not have access to view this node your ssh keys will need to be uploaded to a different interface if you have a need to login to the online setup - this is needed mostly by experts (not by all users in STAR).
  • Software Infrastructure for general information
  • Video Conferencing ... and the related comments at the bottom of some of those pages (viewable only if you are authenticated to Drupal).
All of those links are referenced on Software & Computing, the main page for Software and Computing ...
Wishing you a great time in STAR.

 

Account re-activation

The instructions here are for users who have an account at the RCF but have unfortunately let their BNL appointment expire or do not know how to access their (old) account.

Account expired or is disabled

First of all, please be sure you understand the requirements and rationals explained in Getting a computer account in STAR.
As soon as your appointment with BNL ends or expires, all access to BNL computing resources are closed / suspended and before re-establishing it, you MUST renew your appointment first. In such case, we will not provide you with any access which may include access to Drupal (personal account) and mailing lists.

The simplest way to proceed is to

  • Check that you do have GE-CYBERSEC training. You can do this by checking your training records.
    • If you do not, please take this training NOW as any future request will be denied until this training is complete
  • Send an Email to RT-RACF-UserAccounts@bnl.gov requesting re-activation of your account. Specify the account name (Unix account name, not your name) if you remember it. If you don't your full name may do. The RCF team will check your status (Cyber training, appointment status) and
    • if any is not valid, you will be notified and further actions will be needed.
    • If all is fine, they will re-activate your account after verifying with us that you are a valid STAR user. Please consult Getting a computer account in STAR for what this means ...

If your appointment has expired, you will need to renew it. Please, follow the instructions available here.

Chicken and Egg issue? Forgot your password but did not upload SSH keys

If your account is valid, so is your appointment  but you have not logged in the facility for a while and hence, are unable to upload your SSH keys (as described in SSH Keys and login to the SDCC and related documents) this may be for you.

You cannot access the upload page unless you have a valid password as the access to the RCF requires a double authentication scheme (Kerberos password + SSH key). In case you have forgotten your password, you have first to send an Email to the RCF at RT-RACF-UserAccounts@bnl.gov asking to reset your password, then thereafter go to the SSH key upload interface and proceed.

Drupal access

This page describes how you can obtain the access to the STAR drupal pages. Please understand that your Drupal access is now tight to a valid SDCC login - no SDCC account, no access to Drupal. This is because we integrated Drupal login to the common infrastructure (the login is Kerberos based). Here are the steps to gain access then:

  1. Get a computer account in STAR, make sure you have a valid guest appointment and valid cyber training, then request a RCF account  

    https://drupal.star.bnl.gov/STAR/comp/sofi/facility-access/general-access

  2. Generate SSH keys and upload them to the SDCC - this has formally little to do with access to Drupal but will allow you to log to the facility and verify you know your Kerberos password (it should have been sent to you via EMail during the account creating with instructions on how to change it)
    https://drupal.star.bnl.gov/STAR/comp/sofi/facility-access/ssh-keys
    Now Log in to RCF node- again, this is to ensure you have the proper

    1. ssh xxx@ssh.rhic.bnl.gov    (xxx is your username on RCF, enter the passphrase for your SSH key)
    2. kinit (enter your SDCC kerberos password - this is the one you will use for accessing Drupal)
  3. Download 2-Factor Authentication app to your mobile device (application ranges from Google or Microsoft Authenticator, Duo Mobile, Authy, FreeOTP, Aegis, ...)
  4. The FIRST time you log to Drupal, you will actually NOT need a second factor (leave the "code" box blank) but MUST generate it right away (the second login will require it).
    1. To generate it, use the "(re)Create 2FA login" in the left hand-side menu, leave all default option (you need a time based OTP), import the QRCode displayed.
    2. Once imported, a 6 Digits code should appear in your app - this is he "code" you will need to enter in future in the third field named "code". Note the code changes every 30 seconds.
  5. IF you have logged out  without generating a QRCode or forgot to import and have no "code", then you will need to contact Dmitry Arkhipkin or Jerome Lauret on MatterMost to obtain your 2FA QR-code (both can generate it for you after the fact). SDCC chat is available at https://chat.sdcc.bnl.gov. Use "BNL Login" and your SDCC account user/Kerberos password to login. Do as discussed above (import the QRCode, make sure an entry appears, test your login right away, ...)
     
  6. You are set. Drupal login will now ask for your SDCC username and Kerberos password and a 6-digit code you read from the 2-Factor Authentication app.
 

SSH Keys and login to the SDCC

How to generate keys for about every platform ... and actually be able to log to the SDCC

General

What you find below is especially useful for those of you that work on several machines and platforms in and out of BNL and need to use ssh key pairs to get into SDCC.

  • If you use Linux only or Windows only everywhere, all you need is follow the instructions on the SDCC web site and you are all set (see especially their Unix SSH Key generation page).
  • Otherwise this page is for you.

The findings on this web page are a combined effort of Jérôme Lauret, Jim Thomas, and Thomas Ullrich. All typos and mistakes on this page are my doing. I am also not going to discuss the wisdom of having to move private keys around - all I want to do is get things done.

The whole problem arises from the fact that there are 3 different formats to store ssh key-pairs and all are not compatible:

  • ssh.com: Secure Shell is the company that invented the (now public) ssh protocol. They provide the (so far) best ssh version for Windows which is far nicer than PuTTY. Especially the File Browser provided is so much nicer than the scp command interface. It is free for academic/university sites.
  • PuTTY: a free ssh tool for Windows.
  • OpenSSH: runs on all Linux boxes and via cygwin on Windows.

Despite all claims, OpenSSH cannot export private keys into ssh.com format, nor can it import ssh.com private keys. Public keys seem to work but this is not what we want. So here is how it goes:

[A] Windows: follow one of the instructions below

  1. PuTTY (Windows)
    1. Download puttygen.exe from the PuTTY download page. You only need it once, but it might be good to keep it in case you need to regenerate your keys.
    2. Start the program puttygen.exe
      • Under parameters pick SSH-2 (RSA) and 1024 for the size of the key in bits.
      • Then press the Generate button. You will be asked to move your mouse over the blank area.
      • Enter a passphrase in the referring fields. The passphrase is needed as it will correspond to a password. Make a mental note of it as keys will not be usable without it.
      • I recommend to save the "key fingerprint" too since you will need it at the SDCC web site when uploading your public key. Just save it in a plain text file.
        Note: You can always generate it later from Linux with ssh-keygen -l -f <key_file> but since you will need access to a Linux system to do this, it is important you keep a copy of this now so you could proceed with the rest of the instructions. The picture below shows where the important fields are
    3. Saving keys
      • Press Save Public Key. To not confuse all the keys you are going to generate I strongly recommend to call it rsa_putty.pub.
      • Next press Save Private Key. Type rsa_putty as a name when prompted. PuTTY will automatically name it rsa_putty.ppk. That's your private key.
        Don't quit puttygen yet. Now comes the important stuff.
      • In the menu bar pick Conversions->Export OpenSSH key. When prompted give a name that indicated that this is the private key for OpenSSH (Linux). I used rsa_openssh. There is no public key stored only the private. We will generate the public one from the private one later.
      • In the menu bar pick Conversions->Export ssh.com key. When prompted give a name that indicated that this is the private key for ssh.com. I used rsa_sshcom. Again, there is no public key stored only the private. We will generate the public one from the private one later.
    4. All done. Now you have essentially 4 files: public and private keys for putty and private keys for ssh.com and OpenSSH.
  2. Getting ssh.com to work (Windows):
    1. Here I assume that you have SSHSecureShell (client) installed, that is the ssh.com version. Open a DOS (or cygwin) shell. We now need to generate a public key from the private key we got from puttygen. Best is to change into the directory where your private key is stored and type: ssh-keygen2 -D rsa_sshcom . Note that the command has a '2' at the end. This will generate a file called rsa_sshcom.pub containing the public key. Now you have your key pair.
    2. Launch SSH and pick from the menu bar Edit->Settings.
      Click on GlobalSettings/UserAuthentication/Keys and press the Import button. Point to your public key rsa_sshcom.pub. The private key will be automatically loaded too. That's it. Press OK and quit SSH. We are not quite ready yet. We still have to generate and upload the OpenSSH key to SDCC.
  3. Getting keys to work with OpenSSH/Linux:
    1. Copy the private key rsa_openssh to a Linux box (cygwin on Windows works of course too).
    2. Set the permissions such that only you can read the private key file:
      % chomod 600 rsa_openssh
    3. Generate the public key with:
      % ssh-keygen -y -f rsa_openssh > rsa_openssh.pub
    4. Now you have the key pair.
    5. To install the key pair on a Linux box copy rsa_openssh and rsa_openssh.pub to your ~/.ssh directory.
      Important: the keys ideally will be named id_rsa and id_rsa.pub, otherwise extra steps/options will be required to work with them.  So, you are recommended to also do
      % mv rsa_openssh ~/.ssh/id_rsa  

      % mv rsa_openssh.pub ~/.ssh/id_rsa.pub

      All done.  Note that there is no need to put your key files on every machine to which you are going to connect.  In fact, you should keep your private key file in as few places as possible -- just the source machine(s) from which you will initiate SSH connections.  Your public key file is indeed safe to share with the public, so you need not be so careful with it and in fact will have to provide it to remote systems (such in the next section) in order to use your keys at all.
       

[B] Uploading the public key to SDCC:

  1. https://web.racf.bnl.gov/Facility/SshKeys/UploadSshKey.php
  2. Make sure you upload the OpenSSH public key. Everything else won't work.
    You need to provide the key fingerprint which you hopefully saved from the instructions above.  In case of OpenSSH based keys, you can re-generate the fingerprint with
    % ssh-keygen -Emd5 -l -f <key_file>

Note that forcing MD5 hash is important (default hash is SHA256 the RACF interface will not take). All done.
If you followed all instructions you now have 3 key pairs (files). This covers essentially all SSH implementations there are. Where ever you go, whatever machine and system you deal with, one key pair will work. Keep them all in a very save place.

 

[C] Done.What's next?

Uploading your keys to the SDCC and STAR SSH-key management interfaces

You need to upload your SSH keys only once. But after your first upload, please wait a while (30 mnts) before connecting to the SDCC SSH Gatekeepers. Basic connection instructions, use:

% ssh -AX xxx@sssh.sdcc.bnl.gov
% rterm

The rterm command will open an X-terminal on a valid STAR interactive node. If you do NOT have an X11 server running on your computer, you could use the -i options of rterm for interactive (non X-term based) session.

If you intend to logon to our online enclave, please check the instructions on You do not have access to view this node to request an account on the STAR SSH gateways and Linux pool (and upload your keys to the STAR Key SSH Management system).  Note that you cannot upload your keys anywhere without a Kerberos password (both the SDCC and STAR's interface will require a real account kerberos password to log in). Logging in to the Online enclave involves the following ssh connection:

% ssh -AX xxx@cssh.sdcc.bnl.gov
% ssh -AX xxx@stargw.starp.bnl.gov

A first thing to see is that SDCC gatekeeper is here "cssh" as the network is spearated into a "campus" side (cssh) and a ScienceZone side (sssh). For convenience, we have asked Cyber security to allow connections from "sssh" to our online enclave as well (so if you use sssh all the time, it will work).

For the requested an account online ... note that users do not request access to the individual stargw machines directly.  Instead, a shared user database is kept on onlcs.starp.bnl.gov - approval for access to onlcs grants access to the stargw machines and the Online Linux Pool.  Such access is typically requested on the user's behalf when the user requests access to the online resources following the instructions at You do not have access to view this node, though users may also initiate the request themselves. 

Logging in to the stargw machines is most conveniently done Using the SSH Agent, and is generally done through the SDCC's SSSH gateways.  This additional step of starting an agent would be removed whenever we will be able to directly access the STAR SSH GW (as per 2009, this is not yet possible due to technical details).

See also

To learn more, see:

 

Caveats, issues, special cases and possible problems

Shortcut links

 

SSH side effects

Please note that if you remote account name is different from your RCF account name, you will need to use

% ssh -X username@rssh.rhic.bnl.gov

specifying explicitly username rather as the form

% ssh -X rssh.rhic.bnl.gov

will assume a username defaulting to your local machine (remote from the BNL ssh-daemon stand point) user name where you issue the ssh command. This has been a source of confusion for a few users. The first form by the way is preferred as always work and removes all ambiguities.

X11 Forwarding: -X or -Y ??

-X is used to automatically set the display environment to a secure channel (also called untrusted X11 forwarding) . In other words, it enables X11 forwarding without having to grant remote applications the right to manipulate your Xserver parameters. If you want ssh client to always act like with X11 forwarding, have the following line added in your /etc/ssh/ssh_config (or any /etc/ssh*/ssh*_config ).

ForwardX11 yes

-Y enables trusted X11 forwarding. So, what does trusted mean? It means that the X-client will be allowed to gain full access to your Xserver, including changing X11 properties (i.e. attributes and values which alters the look and feel of opened X windows or things such as mouse controls and position info, keyboard input reading and so on).  Starting with OpenSSH 3.8, you will need to set

ForwardX11Trusted yes 

in the client configuration  to allow remote nodes full access to your Xserver as it is NOT enabled by default.

When to use trusted, when to use untrusted

Recent OpenSSH version supports both untrusted (-X) and trusted (-Y) X11 Forwarding. As hinted above, the difference is what level of permissions the client application has on the Xserver running on the client machine.  Untrusted (-X) X11 Forwarding is more secure, but unfortunately several applications (especially older X-based applications) do not support running with less privileges and will eventually die and/or crash your entire Xserver session.

Dilema? A rule of thumb is that while using trusted (-Y) X11 Forwarding will have less applications problems for the near future, try first the most secured untrusted (-X) way and see what happens. If remote X applications fail with a errorssimilar to the below:

X Error of failed request: BadAtom (invalid Atom parameter)
  Major opcode of failed request: 18 (X_ChangeProperty)
  Atom id in failed request: 0x114
  Serial number of failed request: 370
  Current serial number in output stream: 372

you will have to use the trusted (-Y) connection.

Per client / server setup?

Instead of a system global configuration which will require your system administrator's assistance, you may create a config file in your user’s home directory (client side) under the .ssh directory with the following line $HOME/.ssh/config

ForwardX11Trusted yes 

But it gets better as the config file allows per host or per-domain configuration. For example, the below is valid

Host *.edu
	ForwardX11 no
	User jlauret

Host *.starp.bnl.gov
	ForwardX11 yes
    	Cipher blowfish
	User jeromel

Host orion.star.bnl.gov
     ForwardAgent yes
     Cipher 3des
     ForwardX11Trusted yes

Host what.is.this
    User exampleoptions
    ServerAliveInternal=900
    Port 666
    Compression yes
    PasswordAuthentication no
    KeepAlive yes
    ForwardAgent yes
    ForwardX11 yes
    RhostsAuthentication no
    RhostsRSAAuthentication no
    RSAAuthentication yes
    TISAuthentication no
    PasswordAuthentication no
    FallBackToRsh no
    UseRsh no

As a side note, 3des is more secure thank blowfish but also 3x slower. If speed and security is important, use at least aes cypher.

Kerberos hand-shake, How to.

OK, now you are logged to the facility gatekeeper but any sub-sequent login would ask for your password again (and this would defeat security). But you can cure this problem by, on the gatekeeper, issue the following command (we assume $user is your user name)

% kinit -5 -d -l 7d $user

-l 7d is used to provide a long life K5 ticket (7 days long credentials). Note that you should afterward be granted an AFS token automatically upon login to the worker nodes on the facility. From the gatekeeper, the command

% rterm

would open a terminal from the least loaded node on the cluster where you are allowed to log.

Generic (group) accounts

Due to policy regulations, group or generic accounts login cannot be allowed at the facility unless the login is traceable to an individual. The way to log in is therefore to

  • Log to the gatekeeper using SSH keys under your PERSONAL account as described at SSH Keys and login to the SDCC
  • kinit -5 -4 -l 7d $gaccount
  • In case of wide use generic account, one more jump to a "special" node will be necessary. For starreco and starlib for example, this additional gatekeeper node is rcas6003. From there, login to the rest of the facility could be done using rterm as usual (at least in STAR)

Special nodes

This section is about standing on one foot, tapping on to of your head and chanting a mantra unless the moon is full (in such case, the procedure involves parsley and sacrificial offerings). OK, we are in the realm of the very very special tricks for very very special nodes:

  • The rmine nodes CANNOT be connected to anymore. However, one can use rsec00.rhic.bnl.gov has a gatekeeper, using your desktop keys and then jump from there to the rmine nodes.
    Scope: Subject to special authorization.
  • The test node aplay1.usatlas.bnl.gov cannot be accessed using a Kerberos trick. Since there are two HOPs from your machine to aplay1, you need to use the ssh-agent. See instructions on the Using the SSH Agent help page.

K5 Caveats

  • If you log to gatekeeper GK1 for your personal account, you will need to chose another gatekeeepr GK2 for your group account login. This will allow not interference of Kerberos credentials.
  • Whenever you log in a gatekeeper and you know you had previously obtained Kerberos credentials on this gatekeeper, you should ensure the destruction of previous credential to avoid premature lifetime expiration. In other words, -l 7d will NOT give you a 7 days lifetime K5 ticket on a gatekeeper where previous credentials exists. To destroy previous credentials, be sure
    1. you do not have (still) opened windows using the credential. Check this by issuing a klist and observe the listing. Valid credentials used in opened session would look like this
      Valid starting     Expires            Service principal
      12/26/06 10:59:28  12/31/06 10:59:28  krbtgt/RHIC.BNL.GOV@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/26/06 10:59:30  12/31/06 10:59:28  host/rcas6005.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/26/06 11:11:48  12/31/06 10:59:28  host/rplay43.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/26/06 17:51:05  12/31/06 10:59:28  host/stargrid02.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/26/06 18:34:03  12/31/06 10:59:28  host/stargrid01.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/26/06 18:34:22  12/31/06 10:59:28  host/stargrid03.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25
      12/28/06 17:53:29  12/31/06 10:59:28  host/rcas6011.rcf.bnl.gov@RHIC.BNL.GOV
             renew until 01/02/07 10:59:25 
      
    2. If nothing appears to be relevant or existing, it is safe to issue the kdestroy command to wipe out all old credentials and then re-initiate a kinit.

 

 

Using the SSH Agent

General

The ssh-agent is a program you may use together with OpenSSH or similar ssh programs. The ssh-agent provides a secure way of storing the passphrase of the private key.

One advantage and common use of the agent is to use the agent forwarding. Agent forwarding allows you to open ssh sessions without having to repeatedly type your passphrase as you make multiple SSH hops. Below, we provide instructions on starting the agent, loading your keys and how to use key forwarding.

Instructions

Starting the agent

The ssh-agent is started as follow.

% ssh-agent

Note however that the agent will immediately display information such as the one below

% ssh-agent
SSH_AUTH_SOCK=/tmp/ssh-fxDmNwelBA/agent.5884; export SSH_AUTH_SOCK;
SSH_AGENT_PID=3520; export SSH_AGENT_PID;
echo Agent pid 3520;

 
It may not be immediately obvious to you but you actually MUST type those commands on the command line for the next steps to be effective.

Here is what I usually do: redirect the message to a file and source it from the shell like this:

% ssh-agent >agent.sh 
% source agent.sh

The commands above will create a script containing the necessary shell commands, then the source command will load the information into your shell.  This assumes you are using sh. For csh, you need use the setenv shell command to define both SSH_AUTH_SOCK and SSH_AGENT_PID. A simpler approach may however be to use

% ssh-agent csh

The command above will start a new shell, in which the necessary environment variables will be defined in the newly started shell (no sourcing needed). 

Yet another method to start an agent and set the environment variables in tcsh or bash (and probably other shells) is this:
 

% eval `ssh-agent`


Now that you've started an agent and set the environment variables to use it, the next step is to load your SSH key.

 

Loading a key

The agent alone is not very useful until you've actually put keys into it. All your agent key management is handled by the ssh-add command. If you run it without arguments, it will add any of the 'standard' keys $HOME/.ssh/identity, $HOME/.ssh/id_rsa, and $HOME/.ssh/id_dsa.

To be sure the agent has not loaded any id yet, you may use the -l option with ssh-add.  Here's what you should see if you have not loaded a key:

% ssh-add -l
The agent has no identities.

 

To load your key, simply type

% ssh-add
Enter passphrase for /home/jlauret/.ssh/id_rsa:
Identity added: /home/jlauret/.ssh/id_rsa (/home/jlauret/.ssh/id_rsa)

 

To very if all is fine, you may use again the ssh-add command with the -l option. The result should be different now and similar to the below (if not, something went wrong).

% ssh-add -l
1024 34:a0:3f:56:6d:a2:02:d1:c5:23:2e:a0:27:16:3d:e5 /home/jlauret/.ssh/id_rsa (RSA)

 

Is so, all is fine.

Agent forwarding

Two conditions need to be present for agent forwarding to function:

  • The server need to be set to accept forwards (enabled by default)
  • You need to use the ssh client with the -A option

Usage is simply

 

% ssh -A user@remotehost

 

And that is all. For every hop, you need to use the -A option to have the key forwarded throughout the chain of ssh logins. Ideally, you may want to use -AX (where "X" enabled X11 agent forwarding).

Agent security concern

The ssh-agent creates a unix domain socket, and then listens for connections from /usr/bin/ssh on this socket. It relies on simple unix permissions to prevent access to this socket, which means that any keys you put into your agent are available to anyone who can connect to this socket. BE AWARE that root especially has acess to any file hence any sockets and as a consequence, may acquire access to your remote system whenever you use an agent.

Manpages indicates you may use the -c of ssh-add and this indeed adds one more level of safety to the agent mechanism (the agent will aks for the passphrase confirmation at each new session). However, if root has its mind on stealing a session, you are set for a lost battle from the start so do not feel over-confident of this option.

Addittional information

Help pages below links to the OpenSSH implementation of the ssh client/server and other ssh related documentation from our site.

 

SSH connection stability

IF
  • Your SSH connections are closed from home
  • You get disconnected from any nodes without any reasons?
  • ... and you are a PuTTY user
  • ... or an Uglix SSH client user
This page is for you. If you are another user, use different clients and so on, this page may still be informative and help you stabalize your connection (the same principles apply).

PuTTY users

PuTTY to connect to gateway (from a home connection), you have to

  • set a session, be sure to enable SSH

  • go to the 'Connection' menu and have the following options box checked

    • Disable Nagle's algorithm (TCP_NODELAY option)

    • Enable TCP keepalives (SO_KEEPALIVE option)

  • Furthermore, in 'Connection' -> 'SSH' -> 'Tunnels' enable the option

    • Enable X11 forwarding

    • Enable MIT-Magic-Cookie-1

  • Save the session

Documentation on those features (explanation for the interested) are added at the end of this document.


SSH Users

SSH users and owner of their system could first of all be sure to manipulate the SSH client configuration file and be sure settings are turned on by default. The client configuration is likely located as /etc/ssh_config or /usr/local/etc/ssh_config depending on where you have ssh installed.

But if you do NOT have access to the configuration file, the client can nonetheless pass on options from the command line. Those options would have the same name as they would appear in the config file.

Especially, KEEP_ALIVE is controlled via the SSH configuration option TCPKeepAlive.

% ssh -o TCPKeepAlive=yes

You will note in the next section that a spoofing issue exists with keep alive (I know it works well, but please consider the ServerAliveCountMax mechanism) so, you may use instead

% ssh -o TCPKeepAlive=no -o ServerAliveInterval=15

Note that the value 15 in our example is purely empirical. There are NO magic values and you need to test your connection and detect when (after what time) you get kicked out and disconnected and set the parameters from your client accordingly. Let's explain the default first and come back to this and a rule of thumb.

There are two relevant parameters (in addition of TCPKeepAlive):


ServerAliveInterval

Sets a timeout interval in seconds after which if no data has been received from the server, ssh will send a message through the encrypted channel to request a response from the server. The default is 0, indicating that these messages will not be sent to the server.

This option applies to protocol version 2 only.


ServerAliveCountMax

Sets the number of server alive messages (see above) which may be sent without ssh receiving any messages back from the server. If this threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.

The default value is 3. If, for example, ServerAliveInterval (above) is set to 15, and ServerAliveCountMax is left at the default, if the server becomes unresponsive ssh will disconnect after approximately 45 seconds.


In our example

% ssh -o TCPKeepAlive=no -o ServerAliveInterval=15

The recipe should be: if you get disconnected after N seconds, play with the above and be sure to set a

time of ServerAliveInterval*ServerAliveCountMax <= 0.8*N, N being the timeout. Since ServerAliveCountMax is typically not modified, in our example we assume the default value of 3 and therefore a a 3x15 = 45 seconds (and we guessed a disconnect every minute or so). If you set the value too low, the client will send to much "chatting" to the server and there will be a traffic impact.


Appendix

Nagle's algorithm

This was written based on this article.

RPC implementations on TCP should disable Nagle. This reduces average RPC request latency on TCP, and makes network trace tools work a little nicer.

Determines whether Nagle's algorithm is to be used. The Nagle's algorithm tries to conserve bandwidth by minimizing the number of segments that are sent. When applications wish to decrease network latency and increase performance, they can disable Nagle's algorithm (that is enable TCP_NODELAY). Data will be sent earlier, at the cost of an increase in bandwidth consumption.


KeepAlive

The KEEPALIVE option of the TCP/IP Protocol ensures that connections are kept alive even while they are idle. When a connection to a client is inactive for a period of time (the timeout period), the operating system sends KEEPALIVE packets at regular intervals. On most systems, the default timeout period is two hours (7,200,000 ms).

If the network hardware or software drops connections that have been idle for less than the two hour default, the Windows Client session will fail. KEEPALIVE timeouts are configured at the operating system level for all connections that have KEEPALIVE enabled.

If the network hardware or software (including firewalls) have a idle limit of one hour, then the KEEPALIVE timeout must be less than one hour. To rectify this situation TCP/IP KEEPALIVE settings can be lowered to fit inside the firewall limits. The implementation of TCP KEEPALIVE may vary from vendor to vendor. The original definition is quite old and described in RFC 1122.


MIT Magic cookie

To avoid unauthorized connections to your X display, the command xauth for encrypted X connections is widely used. When you login, a .Xauthority file is created in your home directory ($HOME). Even SSH initiate the creation of a magic cookie and without it, no display could be opened. Note that since the .Xauthority file IS the file containing the MIT Magic cookie, if you ever run out of disk quota or the file system is full, this file CANNOT be created or updated (even from the sshd impersonating the user) and consequently, no X connections can be opened.

The .Xauthority file sometimes contains information from older sessions, but this is not important, as a new key is created at every login session. The Xauthority is simple and powerful, and eliminates many of the security problems with X.




FileCatalog

The STAR FileCatalog is an a set of tools and API providing users access to the MeataData, File and Replica information pertaining to all data produced by the RHIC/STAR experiment.  The STAR FileCatalog in other words provides users access to meta-data, file and replica information through a unified schema-agnostic API. The user never needs to know the details of the relation between elements (or keywords) but rather, is provided with a flexible yet powerful query API allowing them to request any combination of 'keywords' based on sets of conditions composed of sequences of keyword operation values combinations. The user manual provides a list of keywords.

The STAR FIleCatalog also provides multi-site support through the same API. In other words, the same set of tools and programmatic interface allows to register, update, maintain a global catalog for the experiment and serve as a core component to the Data Management system. To date, the STAR FileCatalog holds information on 22 Million files and 52 Million active replicas.

 

The history & version information7

Manual

XML(s) examples

Examples and other documentation

 

A few examples will be left here to guide users and installer.

Data dictionary

This dictionary was created on 2012/03/12.

CollisionTypes

Field Type Null Default Comments
collisionTypeID smallint(6) No    
firstParticle varchar(10) No    
secondParticle varchar(10) No    
collisionEnergy float No 0  
collisionTypeIDate timestamp No CURRENT_TIMESTAMP  
collisionTypeCreator smallint(6) No 1  
collisionTypeCount int(11) Yes NULL  
collisionTypeComment text Yes NULL  

Creators

Field Type Null Default Comments
creatorID bigint(20) No    
creatorName varchar(15) Yes unknown  
creatorIDate timestamp No CURRENT_TIMESTAMP  
creatorCount int(11) Yes NULL  
creatorComment varchar(512) Yes NULL  

DetectorConfigurations

Field Type Null Default Comments
detectorConfigurationID int(11) No    
detectorConfigurationName varchar(50) Yes NULL  
dTPC tinyint(4) Yes NULL  
dSVT tinyint(4) Yes NULL  
dTOF tinyint(4) Yes NULL  
dEMC tinyint(4) Yes NULL  
dEEMC tinyint(4) Yes NULL  
dFPD tinyint(4) Yes NULL  
dFTPC tinyint(4) Yes NULL  
dPMD tinyint(4) Yes NULL  
dRICH tinyint(4) Yes NULL  
dSSD tinyint(4) Yes NULL  
dBBC tinyint(4) Yes NULL  
dBSMD tinyint(4) Yes NULL  
dESMD tinyint(4) Yes NULL  
dZDC tinyint(4) Yes NULL  
dCTB tinyint(4) Yes NULL  
dTPX tinyint(4) Yes NULL  
dFGT tinyint(4) Yes NULL  

DetectorStates

Field Type Null Default Comments
detectorStateID int(11) No    
sTPC tinyint(4) Yes NULL  
sSVT tinyint(4) Yes NULL  
sTOF tinyint(4) Yes NULL  
sEMC tinyint(4) Yes NULL  
sEEMC tinyint(4) Yes NULL  
sFPD tinyint(4) Yes NULL  
sFTPC tinyint(4) Yes NULL  
sPMD tinyint(4) Yes NULL  
sRICH tinyint(4) Yes NULL  
sSSD tinyint(4) Yes NULL  
sBBC tinyint(4) Yes NULL  
sBSMD tinyint(4) Yes NULL  
sESMD tinyint(4) Yes NULL  
sZDC tinyint(4) Yes NULL  
sCTB tinyint(4) Yes NULL  
sTPX tinyint(4) Yes NULL  
sFGT tinyint(4) Yes NULL  

EventGenerators

Field Type Null Default Comments
eventGeneratorID smallint(6) No    
eventGeneratorName varchar(30) No    
eventGeneratorVersion varchar(10) Yes 0  
eventGeneratorParams varchar(200) Yes NULL  
eventGeneratorIDate timestamp No CURRENT_TIMESTAMP  
eventGeneratorCreator smallint(6) No 1  
eventGeneratorCount int(11) Yes NULL  
eventGeneratorComment varchar(512) Yes NULL  

FileData

Field Type Null Default Comments
fileDataID bigint(20) No    
runParamID int(11) No 0  
fileName varchar(255) No    
baseName varchar(255) No   Name without extension
sName1 varchar(255) No   Will be used for name+runNumber
sName2 varchar(255) No   Will be used for name before runNumber
productionConditionID mediumint(9) Yes NULL  
numEntries mediumint(9) Yes 0  
md5sum varchar(32) Yes 0  
fileTypeID smallint(6) No 0  
fileSeq smallint(6) Yes NULL  
fileStream smallint(6) Yes 0  
fileDataIDate timestamp No CURRENT_TIMESTAMP  
fileDataCreator smallint(6) No 1  
fileDataCount int(11) Yes NULL  
fileDataComment text Yes NULL  

FileLocations

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocationsID

Field Type Null Default Comments
fileLocationID bigint(20) No    

FileLocations_0

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_1

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_2

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileLocations_3

Field Type Null Default Comments
fileLocationID bigint(20) No    
fileDataID bigint(20) No 0  
filePathID bigint(20) No 0  
storageTypeID mediumint(9) No 0  
createTime timestamp No CURRENT_TIMESTAMP  
insertTime timestamp No 0000-00-00 00:00:00  
owner varchar(15) Yes NULL  
fsize bigint(20) Yes NULL  
storageSiteID smallint(6) No 0  
protection varchar(15) Yes NULL  
hostID mediumint(9) No 1  
availability tinyint(4) No 1  
persistent tinyint(4) No 0  
sanity tinyint(4) No 1  

FileParents

Field Type Null Default Comments
parentFileID bigint(20) No 0  
childFileID bigint(20) No 0  

FilePaths

Field Type Null Default Comments
filePathID bigint(6) No    
filePathName varchar(255) No    
filePathIDate timestamp No CURRENT_TIMESTAMP  
filePathCreator smallint(6) No 1  
filePathCount int(11) Yes NULL  
filePathComment varchar(512) Yes NULL  

FileTypes

Field Type Null Default Comments
fileTypeID smallint(6) No    
fileTypeName varchar(30) No    
fileTypeExtension varchar(15) No    
fileTypeIDate timestamp No CURRENT_TIMESTAMP  
fileTypeCreator smallint(6) No 1  
fileTypeCount int(11) Yes NULL  
fileTypeComment varchar(512) Yes NULL  

Hosts

Field Type Null Default Comments
hostID smallint(6) No    
hostName varchar(30) No localhost  
hostIDate timestamp No CURRENT_TIMESTAMP  
hostCreator smallint(6) No 1  
hostCount int(11) Yes NULL  
hostComment varchar(512) Yes NULL  

ProductionConditions

Field Type Null Default Comments
productionConditionID smallint(6) No    
productionTag varchar(10) No    
libraryVersion varchar(10) No    
productionConditionIDate timestamp No CURRENT_TIMESTAMP  
productionConditionCreator smallint(6) No 1  
productionConditionCount int(11) Yes NULL  
productionConditionComment varchar(512) Yes NULL  

RunParams

Field Type Null Default Comments
runParamID int(11) No    
runNumber bigint(20) No 0  
dataTakingStart timestamp No 0000-00-00 00:00:00  
dataTakingEnd timestamp No 0000-00-00 00:00:00  
dataTakingDay smallint(6) Yes 0  
dataTakingYear smallint(6) Yes 0  
simulationParamsID int(11) Yes NULL  
runTypeID smallint(6) No 0  
triggerSetupID smallint(6) No 0  
detectorConfigurationID mediumint(9) No 0  
detectorStateID mediumint(9) No 0  
collisionTypeID smallint(6) No 0  
magFieldScale varchar(50) No    
magFieldValue float Yes NULL  
runParamIDate timestamp No CURRENT_TIMESTAMP  
runParamCreator smallint(6) No 1  
runParamCount int(11) Yes NULL  
runParamComment varchar(512) Yes NULL  

RunTypes

Field Type Null Default Comments
runTypeID smallint(6) No    
runTypeName varchar(255) No    
runTypeIDate timestamp No CURRENT_TIMESTAMP  
runTypeCreator smallint(6) No 1  
runTypeCount int(11) Yes NULL  
runTypeComment varchar(512) Yes NULL  

SimulationParams

Field Type Null Default Comments
simulationParamsID int(11) No    
eventGeneratorID smallint(6) No 0  
simulationParamIDate timestamp No CURRENT_TIMESTAMP  
simulationParamCreator smallint(6) No 1  
simulationParamCount int(11) Yes NULL  
simulationParamComment varchar(512) Yes NULL  

StorageSites

Field Type Null Default Comments
storageSiteID smallint(6) No    
storageSiteName varchar(30) No    
storageSiteLocation varchar(50) Yes NULL  
storageSiteIDate timestamp No CURRENT_TIMESTAMP  
storageSiteCreator smallint(6) No 1  
storageSiteCount int(11) Yes NULL  
storageSiteComment varchar(512) Yes NULL  

StorageTypes

Field Type Null Default Comments
storageTypeID mediumint(9) No    
storageTypeName varchar(6) No    
storageTypeIDate timestamp No CURRENT_TIMESTAMP  
storageTypeCreator smallint(6) No 1  
storageTypeCount int(11) Yes NULL  
storageTypeComment varchar(512) Yes NULL  

TriggerCompositions

Field Type Null Default Comments
fileDataID bigint(20) No 0  
triggerWordID mediumint(9) No 0  
triggerCount mediumint(9) Yes 0  

TriggerSetups

Field Type Null Default Comments
triggerSetupID smallint(6) No    
triggerSetupName varchar(50) No    
triggerSetupComposition varchar(255) No    
triggerSetupIDate timestamp No CURRENT_TIMESTAMP  
triggerSetupCreator smallint(6) No 1  
triggerSetupCount int(11) Yes NULL  
triggerSetupComment varchar(512) Yes NULL  

TriggerWords

Field Type Null Default Comments
triggerWordID mediumint(9) No    
triggerWordName varchar(50) No    
triggerWordVersion varchar(6) No V0.0  
triggerWordBits varchar(6) No    
triggerWordIDate timestamp No CURRENT_TIMESTAMP  
triggerWordCreator smallint(6) No 1  
triggerWordCount int(11) Yes NULL  
triggerWordComment varchar(512) Yes NULL  

Tables creation and attributes

#use FileCatalog;

#
# All IDs are named after their respective table. This MUST
# remain like this.
#  eventGeneratorID        -> eventGenerator+ID       in 'EventGenerators'
#  detectorConfigurationID ->detectorConfiguration+ID in 'DetectorConfigurations'
#
# etc...
#

DROP TABLE IF EXISTS EventGenerators;
CREATE TABLE EventGenerators
(
  eventGeneratorID      SMALLINT     NOT NULL    AUTO_INCREMENT,
  eventGeneratorName    VARCHAR(30)        NOT NULL,
  eventGeneratorVersion VARCHAR(10)     NOT NULL,
  eventGeneratorParams  VARCHAR(200),

  eventGeneratorIDate   TIMESTAMP       NOT NULL,
  eventGeneratorCreator CHAR(15)        DEFAULT 'unknown' NOT NULL,
  eventGeneratorCount   INT,
  eventGeneratorComment TEXT,
  UNIQUE        EG_EventGeneratorUnique (eventGeneratorName, eventGeneratorVersion, eventGeneratorParams),
  PRIMARY KEY (eventGeneratorID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS DetectorConfigurations; CREATE TABLE DetectorConfigurations
(
  detectorConfigurationID               INT             NOT NULL        AUTO_INCREMENT,
  detectorConfigurationName             VARCHAR(50)        NULL           UNIQUE,
  dTPC                                  TINYINT,
  dSVT                                  TINYINT,
  dTOF                                  TINYINT,
  dEMC                                  TINYINT,
  dEEMC                                 TINYINT,
  dFPD                                  TINYINT,
  dFTPC                                 TINYINT,
  dPMD                                  TINYINT,
  dRICH                                 TINYINT,
  dSSD                                  TINYINT,
  dBBC                                  TINYINT,
  dBSMD                                 TINYINT,
  dESMD                                 TINYINT,
  PRIMARY KEY (detectorConfigurationID)
) TYPE=MyISAM;


# Trigger related tables
DROP TABLE IF EXISTS TriggerSetups; CREATE TABLE TriggerSetups
(
   triggerSetupID               SMALLINT     NOT NULL    AUTO_INCREMENT,
   triggerSetupName             VARCHAR(50)        NOT NULL       UNIQUE,
   triggerSetupComposition      VARCHAR(255) NOT NULL,

   triggerSetupIDate            TIMESTAMP       NOT NULL,
   triggerSetupCreator          CHAR(15)       DEFAULT 'unknown' NOT NULL,
   triggerSetupCount            INT,
   triggerSetupComment          TEXT,
   PRIMARY KEY                  (triggerSetupID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS TriggerCompositions; CREATE TABLE TriggerCompositions
(
  fileDataID                    BIGINT          NOT NULL,
  triggerWordID                 INT             NOT NULL,
  triggerCount                  MEDIUMINT       DEFAULT 0,
  PRIMARY KEY                   (fileDataID, triggerWordID)
) TYPE=MyISAM;



DROP TABLE IF EXISTS TriggerWords;
CREATE TABLE TriggerWords (
  triggerWordID         mediumint(9)   NOT NULL auto_increment,
  triggerWordName       varchar(50)  NOT NULL default '',
  triggerWordVersion    varchar(6)        NOT NULL default 'V0.0',
  triggerWordBits       varchar(6)   NOT NULL default '',
  triggerWordIDate      timestamp(14)       NOT NULL,
  triggerWordCreator    varchar(15)       NOT NULL default 'unknown',
  triggerWordCount      int(11)     default NULL,
  triggerWordComment    text,
  PRIMARY KEY           (triggerWordID),
  UNIQUE KEY TW_TriggerCharacteristic (triggerWordName,triggerWordVersion,triggerWordBits)
) TYPE=MyISAM;




DROP TABLE IF EXISTS CollisionTypes; CREATE TABLE CollisionTypes
(
  collisionTypeID SMALLINT NOT NULL AUTO_INCREMENT,
  firstParticle VARCHAR(10) NOT NULL,
  secondParticle VARCHAR(10) NOT NULL,
  collisionEnergy FLOAT NOT NULL,
  PRIMARY KEY (collisionTypeID)
) TYPE=MyISAM;


#
# A few dictionary tables
#
DROP TABLE IF EXISTS ProductionConditions; CREATE TABLE ProductionConditions
(
  productionConditionID         SMALLINT       NOT NULL      AUTO_INCREMENT,
  productionTag                 VARCHAR(10)   NOT NULL,
  libraryVersion                VARCHAR(10)   NOT NULL,

  productionConditionIDate      TIMESTAMP       NOT NULL,
  productionConditionCreator    CHAR(15)        DEFAULT 'unknown' NOT NULL,
  productionConditionCount      INT,
  productionConditionComments   TEXT,
  PRIMARY KEY                   (productionConditionID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS StorageSites; CREATE TABLE StorageSites
(
  storageSiteID                 SMALLINT      NOT NULL     AUTO_INCREMENT,
  storageSiteName               VARCHAR(30)  NOT NULL,
  storageSiteLocation           VARCHAR(50),

  storageSiteIDate              TIMESTAMP       NOT NULL,
  storageSiteCreator            CHAR(15)       DEFAULT 'unknown' NOT NULL,
  storageSiteCount              INT,
  storageSiteComment            TEXT,
  PRIMARY KEY                   (storageSiteID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FileTypes; CREATE TABLE FileTypes
(
  fileTypeID                    SMALLINT NOT NULL        AUTO_INCREMENT,
  fileTypeName                  VARCHAR(30)    NOT NULL   UNIQUE,
  fileTypeExtension             VARCHAR(15)        NOT NULL,

  fileTypeIDate                 TIMESTAMP       NOT NULL,
  fileTypeCreator               CHAR(15) DEFAULT 'unknown' NOT NULL,
  fileTypeCount                 INT,
  fileTypeComment               TEXT,
  PRIMARY KEY                   (fileTypeID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FilePaths; CREATE TABLE FilePaths
(
  filePathID                    BIGINT         NOT NULL         AUTO_INCREMENT,
  filePathName                  VARCHAR(255)   NOT NULL         UNIQUE,

  filePathIDate                 TIMESTAMP       NOT NULL,
  filePathCreator               CHAR(15) DEFAULT 'unknown' NOT NULL,
  filePathCount                 INT,
  filePathComment               TEXT,
  PRIMARY KEY                   (filePathID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS Hosts; CREATE TABLE Hosts
(
  hostID                        SMALLINT       NOT NULL         AUTO_INCREMENT,
  hostName                      VARCHAR(30)    NOT NULL DEFAULT 'localhost' UNIQUE,

  hostIDate                     TIMESTAMP       NOT NULL,
  hostCreator                   CHAR(15)     DEFAULT 'unknown' NOT NULL,
  hostCount                     INT,
  hostComment                   TEXT,
  PRIMARY KEY                   (hostID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS RunTypes; CREATE TABLE RunTypes
(
  runTypeID                     SMALLINT  NOT NULL AUTO_INCREMENT,
  runTypeName                   VARCHAR(255)    NOT NULL   UNIQUE,

  runTypeIDate                  TIMESTAMP       NOT NULL,
  runTypeCreator                CHAR(15)  DEFAULT 'unknown' NOT NULL,
  runTypeCount                  INT,
  runTypeComment                TEXT,
  PRIMARY KEY                   (runTypeID)
) TYPE=MyISAM;


DROP TABLE IF EXISTS StorageTypes; CREATE TABLE StorageTypes
(
  storageTypeID                 MEDIUMINT       NOT NULL    AUTO_INCREMENT,
  storageTypeName               VARCHAR(6)   NOT NULL  UNIQUE,

  storageTypeIDate              TIMESTAMP       NOT NULL,
  storageTypeCreator            CHAR(15)       DEFAULT 'unknown' NOT NULL,
  storageTypeCount              INT,
  storageTypeComment            TEXT,
  PRIMARY KEY                   (storageTypeID)
) TYPE=MyISAM;





DROP TABLE IF EXISTS SimulationParams; CREATE TABLE SimulationParams
(
  simulationParamsID            INT             NOT NULL     AUTO_INCREMENT,
  eventGeneratorID              SMALLINT    NOT NULL,
  detectorConfigurationID       INT             NOT NULL,
  simulationParamComments       TEXT,
  PRIMARY KEY                   (simulationParamsID),
  INDEX         SP_EventGeneratorIndex          (eventGeneratorID),
  INDEX         SP_DetectorConfigurationIndex   (detectorConfigurationID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS RunParams;
CREATE TABLE RunParams
(
  runParamID                  INT        NOT NULL AUTO_INCREMENT,
  runNumber                   BIGINT     NOT NULL UNIQUE,
  dataTakingStart             TIMESTAMP,
  dataTakingEnd               TIMESTAMP,
  simulationParamsID          INT       NULL,
  runTypeID                   SMALLINT     NOT NULL,
  triggerSetupID              SMALLINT      NOT NULL,
  detectorConfigurationID     INT            NOT NULL,
  collisionTypeID             SMALLINT             NOT NULL,
  magFieldScale               VARCHAR(50)    NOT NULL,
  magFieldValue               FLOAT,
  runComments                 TEXT,
  PRIMARY KEY                          (runParamID),
  INDEX RP_RunNumberIndex              (runNumber),
  INDEX RP_DataTakingStartIndex        (dataTakingStart),
  INDEX RP_DataTakingEndIndex          (dataTakingEnd),
  INDEX RP_MagFieldScaleIndex          (magFieldScale),
  INDEX RP_MagFieldValueIndex          (magFieldValue),
  INDEX RP_SimulationParamsIndex       (simulationParamsID),
  INDEX RP_RunTypeIndex                (runTypeID),
  INDEX RP_TriggerSetupIndex           (triggerSetupID),
  INDEX RP_DetectorConfigurationIndex  (detectorConfigurationID),
  INDEX RP_CollisionTypeIndex          (collisionTypeID)
) TYPE=MyISAM;

DROP TABLE IF EXISTS FileData; CREATE TABLE FileData
(
  fileDataID                    BIGINT          NOT NULL AUTO_INCREMENT,
  runParamID                    INT             NOT NULL,
  fileName                      VARCHAR(255)       NOT NULL,
  baseName                      VARCHAR(255)       NOT NULL COMMENT 'Name without extension',
  sName1                        VARCHAR(255) NOT NULL COMMENT 'Will be used for name+runNumber',
  sName2                        VARCHAR(255) NOT NULL COMMENT 'Will be used for name before runNumber',
  productionConditionID         INT             NULL,
  numEntries                    MEDIUMINT,
  md5sum                        CHAR(32)     DEFAULT 0,
  fileTypeID                    SMALLINT NOT NULL,
  fileSeq                       SMALLINT,
  fileStream                    SMALLINT,
  fileDataComments              TEXT,
  PRIMARY KEY                   (fileDataID),
  INDEX         FD_FileNameIndex                (fileName(40)),
  INDEX         FD_BaseNameIndex                (baseName),
  INDEX         FD_SName1Index                  (sName1),
  INDEX         FS_SName2Index                  (sName2),
  INDEX         FD_RunParamsIndex               (runParamID),
  INDEX         FD_ProductionConditionIndex     (productionConditionID),
  INDEX         FD_FileTypeIndex                (fileTypeID),
  INDEX         FD_FileSeqIndex                 (fileSeq),
  UNIQUE        FD_FileDataUnique               (runParamID, fileName, productionConditionID, fileTypeID, fileSeq)
) TYPE=MyISAM;



# FileParents
DROP TABLE IF EXISTS FileParents; CREATE TABLE FileParents
(
  parentFileID                  BIGINT          NOT NULL,
  childFileID                   BIGINT          NOT NULL,
  PRIMARY KEY                   (parentFileID, childFileID)
) TYPE=MyISAM;

# FileLocations
DROP TABLE IF EXISTS FileLocations; CREATE TABLE FileLocations
(
  fileLocationID                BIGINT          NOT NULL      AUTO_INCREMENT,
  fileDataID                    BIGINT          NOT NULL,
  filePathID                    BIGINT          NOT NULL,
  storageTypeID                 MEDIUMINT       NOT NULL,
  createTime                    TIMESTAMP,
  insertTime                    TIMESTAMP       NOT NULL,
  owner                         VARCHAR(30),
  fsize                         BIGINT,
  storageSiteID                 SMALLINT      NOT NULL,
  protection                    VARCHAR(15),
  hostID                        BIGINT          NOT NULL DEFAULT 1,
  availability                  TINYINT         NOT NULL DEFAULT 1,
  persistent                    TINYINT         NOT NULL DEFAULT 0,
  sanity                        TINYINT         NOT NULL DEFAULT 1,
  PRIMARY KEY                   (fileLocationID),
  INDEX         FL_FilePathIndex                (filePathID),
  INDEX         FL_FileDataIndex                (fileDataID),
  INDEX         FL_StorageTypeIndex             (storageTypeID),
  INDEX         FL_StorageSiteIndex             (storageSiteID),
  INDEX         FL_HostIndex                    (hostID),
  UNIQUE        FL_FileLocationUnique           (fileDataID, storageTypeID, filePathID, storageSiteID, hostID)
) TYPE=MyISAM;

XML configuration

 

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE SCATALOG [
   <!ELEMENT SCATALOG (SITE*)>
       <!ATTLIST SCATALOG VERSION CDATA #REQUIRED>
   <!ELEMENT SITE (SERVER+)>
       <!ATTLIST SITE name (BNL | LBL) #REQUIRED>
       <!ATTLIST SITE description CDATA #IMPLIED>
       <!ATTLIST SITE URI CDATA #IMPLIED>
   <!ELEMENT SERVER (HOST+)>
       <!ATTLIST SERVER SCOPE (Master | Admin | User) #REQUIRED>
   <!ELEMENT HOST (ACCESS+)>
       <!ATTLIST HOST NAME CDATA #REQUIRED>
       <!ATTLIST HOST DBTYPE CDATA #IMPLIED>
       <!ATTLIST HOST DBNAME CDATA #REQUIRED>
       <!ATTLIST HOST PORT CDATA #IMPLIED>
   <!ELEMENT ACCESS EMPTY>
       <!ATTLIST ACCESS USER CDATA #IMPLIED>
       <!ATTLIST ACCESS PASS CDATA #IMPLIED>
]>



<SCATALOG VERSION="1.0.1">
        <SITE name="BNL">
                <SERVER SCOPE="Master">
                        <HOST NAME="mafata.wherever.net" DBNAME="Catalog_XXX" PORT="1234">
                                <ACCESS USER="Moi" PASS="HelloWorld"/>
                        </HOST>
                        <HOST NAME="mafata.wherever.net" DBNAME="Catalog_YYY" PORT="1235">
                                <ACCESS USER="Moi" PASS="HelloWorld"/>
                        </HOST>
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog" PORT="">
                                <ACCESS USER="FC_master" PASS="AllAccess"/>
                        </HOST>
                </SERVER>
                <SERVER SCOPE="Admin">
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog_BNL" PORT="">
                                <ACCESS USER="FC_admin" PASS="ExamplePassword"/>
                        </HOST>
                </SERVER>
                <SERVER SCOPE="User">
                        <HOST NAME="duvall.star.bnl.gov" DBNAME="FileCatalog_BNL" PORT="">
                                <ACCESS USER="FC_user" PASS="FCatalog"/>
                        </HOST>
                </SERVER>
        </SITE>
</SCATALOG>

Migration and notes from V01.265 to V01.275

This document is intended for FileCatalog managers only who have previously deployed an earlier version of API and older database table layout. It is NOT intended for users.

Reasoning for this upgrade and core of the upgrade

One of the major problem with the preceding database layout started to show itself when we reached 4 Million entries (for some reason, we seem to have magic numbers). A dire restriction was the presence of the field 'path' and 'nodename' in the FileLocations table. This table became unnecessarily large (of the order of GB) and sorting and queries would become slow and IO demanding (regardless of our careful indexing). The main action was to move both field to separate tables. This change requires a two step modification :

  1. reshape of the database (leaving the old field), deployment of the database API in cross mode support
  2. run the normalization scripts filling the new table and fields, deployment of the final API and drop of the obsolete columns (+ index rebuild)

The steps are more carefully described below ...

Step by step migration instructions

Has to be made in several steps for safety a least interruption of service (although a pain to the manager). Note that you can do that much faster by cutting the Master/slave relationship, disabling all daemons auto-updating the database, proceed with table reshape and normalization script execution, drop and rebuild index, deploy the point-of-no-return API and restore Master/slave relation).

This upgrade is best if you have perl 5.8 or upper. Note that this transition will be the LAST one using perl 5.6 (get ready for a perl upgrade on your cluster).

We will assume you know how to connect to your database from an account able to manipulate and create any tables in the FileCatalog database.

Steps in Phase I

  1. (0) Create the following tables
      DROP TABLE IF EXISTS FilePaths; CREATE TABLE FilePaths
      (
        filePathID                    BIGINT         NOT NULL         AUTO_INCREMENT,
        filePathName                  VARCHAR(255)   NOT NULL         UNIQUE,
        filePathCount                 INT,
        PRIMARY KEY                   (filePathID)
      ) TYPE=MyISAM;
    
      DROP TABLE IF EXISTS Hosts; CREATE TABLE Hosts 
     (
        hostID      smallint(6) NOT NULL auto_increment,
        hostName    varchar(30) NOT NULL default 'localhost',
        hostIDate   timestamp(14) NOT NULL,
        hostCreator varchar(15) NOT NULL default 'unknown',
        hostCount   int(11) default NULL,
        hostComment text,
        PRIMARY KEY (hostID),
        UNIQUE KEY  hostName (hostName)
      ) TYPE=MyISAM;
    
    
  2. Modify some table and recreate one
         
         ALTER TABLE `FileLocations` ADD `filePathID` bigint(20) NOT NULL default '0' AFTER `fileDataID`;
         ALTER TABLE `FileLocations` ADD `hostID` bigint(20) NOT NULL default '1' AFTER `protection`;
         UPDATE TABLE `FileLocations` SET hostID=0;
    
         # note that I did that one from the Web interface (TBC)
         INSERT INTO Hosts VALUES(0,'localhost',NOW()+0,'',0,'Any unspecified node'); 
    
         ALTER TABLE `FileLocations` ADD INDEX ( `filePathID` )  
    
         ALTER TABLE `FilePaths` ADD `filePathIDate` TIMESTAMP NOT NULL AFTER `filePathName` ;
         ALTER TABLE `FilePaths` ADD `filePathCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `filePathIDate` ;
         ALTER TABLE `FilePaths` ADD `filePathComment` TEXT AFTER `filePathCount`;
    
         ALTER TABLE `StorageSites` ADD  `storageSiteIDate` TIMESTAMP NOT NULL AFTER `storageSiteLocation` ;
         ALTER TABLE `StorageSites` ADD  `storageSiteCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `storageSiteIDate` ;
         ALTER TABLE `StorageSites` DROP `storageComment`;
         ALTER TABLE `StorageSites` ADD  `storageSiteComment` TEXT AFTER `storageSiteCount`;
    
         ALTER TABLE `StorageTypes` ADD `storageTypeIDate` TIMESTAMP NOT NULL AFTER `storageTypeName` ;
         ALTER TABLE `StorageTypes` ADD `storageTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `storageTypeIDate` ;
    
    
         ALTER TABLE `FileTypes` ADD `fileTypeIDate` TIMESTAMP NOT NULL AFTER `fileTypeExtension` ;
         ALTER TABLE `FileTypes` ADD `fileTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `fileTypeIDate` ;
         ALTER TABLE `FileTypes` ADD `fileTypeComment` TEXT AFTER `fileTypeCount`;
    
    
         ALTER TABLE `TriggerSetups` ADD `triggerSetupIDate` TIMESTAMP NOT NULL AFTER `triggerSetupComposition` ;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `triggerSetupIDate`;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupCount`   INT AFTER `triggerSetupCreator`;
         ALTER TABLE `TriggerSetups` ADD `triggerSetupComment` TEXT  AFTER `triggerSetupCount`;
    
         ALTER TABLE `EventGenerators` ADD `eventGeneratorIDate` TIMESTAMP NOT NULL AFTER `eventGeneratorParams` ;
         ALTER TABLE `EventGenerators` ADD `eventGeneratorCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `eventGeneratorIDate` ;
         ALTER TABLE `EventGenerators` ADD `eventGeneratorCount`   INT AFTER `eventGeneratorCreator`;
    
         ALTER TABLE `RunTypes` ADD `runTypeIDate` TIMESTAMP NOT NULL AFTER `runTypeName` ;
         ALTER TABLE `RunTypes` ADD `runTypeCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `runTypeIDate` ;
    
         ALTER TABLE `ProductionConditions` DROP `productionComments`; 
         ALTER TABLE `ProductionConditions` ADD  `productionConditionIDate`   TIMESTAMP NOT NULL AFTER `libraryVersion`;
         ALTER TABLE `ProductionConditions` ADD  `productionConditionCreator` CHAR( 15 ) DEFAULT 'unknown' NOT NULL AFTER `productionConditionIDate`;
         ALTER TABLE `ProductionConditions` ADD  `productionConditionComment` TEXT AFTER `productionConditionCount`;
    
    
    
         #
         # This table was not shaped as a dictionary so needs to be re-created
         # Hopefully, was not filled prior (but will be this year)
         #
         DROP TABLE IF EXISTS TriggerWords; CREATE TABLE TriggerWords
         (
            triggerWordID           MEDIUMINT       NOT NULL        AUTO_INCREMENT,
            triggerWordName         VARCHAR(50)     NOT NULL,
            triggerWordVersion      CHAR(6)         NOT NULL DEFAULT "V0.0",
            triggerWordBits         CHAR(6)         NOT NULL,  
            triggerWordIDate        TIMESTAMP       NOT NULL,
            triggerWordCreator      CHAR(15)        DEFAULT 'unknown' NOT NULL,
            triggerWordCount        INT,
            triggerWordComment      TEXT,
            UNIQUE   TW_TriggerCharacteristic (triggerWordName, triggerWordVersion, triggerWordBits),
            PRIMARY KEY             (triggerWordID)
         ) TYPE=MyISAM;
  3. Deploy the new API CVS version 1.62 of FileCatalog.pm

  4. Run the following utility scripts

    util/path_convert.pl
    util/host_convert.pl

    Note that those scripts use a new method $fC->connect_as("Admin"); which assumes that the Master Catalog will be accessed using the XML connection description. Also, it should be obvious that

    use lib "/WhereverYourModulAPIisInstalled"; should be replaced by the appropriate path for your site (or test area). Finally, it uses API CVS version 1.62 which supports Xpath and Xnode transitional keywords allowing us to transfer the information from one field to one table.

  5. Check that Hosts table was filled properly and automatically with Creator/IDate
  6. Paranoia step : Re-run the scripts mentioned 2 steps ago

    At this stage and ideally, nothing should happen (as you have already modified the records).
    A few tips prior from doing that
    • % fC_cleanup.pl -modif node=localhost -cond node='' -doit
      would hopefully do nothing but if you have messed something up in the past, hostName would be NULL and the above would be necessary.
    • After a full update, the following queries should return NOTHING
      % get_file_list.pl -keys flid -cond rfpid=0 -all -alls -as Admin
      % get_file_list.pl -keys flid -cond rhid=0 -all -alls -as Admin

      Those are equivalent to the SQL statements
      >SELECT FileLocations.fileLocationID FROM FileLocations WHERE FileLocations.filePathID = 0 LIMIT 0, 100
      >SELECT FileLocations.fileLocationID FROM FileLocations WHERE FileLocations.hostID = 0 LIMIT 0, 100 
    If it does return anything, contact me for further investigation and database repairs. As a side note, the -as keyword was introduced recently and you should update your get_file_list.pl script if not available.
  7. Make a backup copy of the database for security (optional but safer) Backup can be done by easer a dump of mysql or more trivially, a cp -r of the database directory.
  8. Leave it running for a few days (should be fine) for confidence consolidation ;-)

You are ready for phase II. Hang on tight now ...

Steps in Phase II

Those steps are no VERY intrusive and potentially destructive. Be careful from here on ...

  1. Stop all daemons, be sure that during the rest of the operations, NO command attempts to manipulate the database. If you want to shield your users from the upgrade, stop all Master/slave relations.
  2. Connect to the master FileCatalog as administrator for that database and execute the following SQL commands
      > ALTER TABLE `FileLocations` ADD INDEX FL_HostIndex (hostID);
      > ALTER TABLE `FileLocations` DROP INDEX `FL_FileLocationUnique`, ADD UNIQUE (fileDataID, storageTypeID, filePathID, storageSiteID, hostID);
    
      # drop the columns not in use anymore / should also get rid of the associated
      # indexes.
      > ALTER TABLE `FileLocations` DROP COLUMN nodeName;
      > ALTER TABLE `FileLocations` DROP COLUMN filePath;
    
      # "rename" index / was created with a name difference to avoid clash for transition
      # now renamed for consistency
      > ALTER TABLE `FileLocations` DROP INDEX `filePathID`, ADD INDEX  FL_FilePathIndex (filePathID);
  3. OK, you should be done. Deploy either CVS version 1.63 which correspond to the FileCatalog API version V01.275 and above ... (by the way, get_file_list.pl -V gives the API version).


 

A few notes

  • The new API is XML connection aware via a non-mandatory module named XML::Simple . You should install that module but there are some limitations if you are using perl 5.6 i.e., you MUST use the schema with ONLY one choice per category (Admin, Master or User).
  • Your scripts will likely need to change if your Database Master and Slave are not on the same node (i.e. the administration account for the FileCatalog can be used only on the database Master and the regular user account on the Slave). There are a few forms of this such as the one below
    # Get connection fills the blanks while reading from XML
    # However, USER/PASSWORD presence are re-checked
    #$fC->debug_on();
    ($USER,$PASSWD,$PORT,$HOST,$DB) = $fC->get_connection("Admin");
    $port = $PORT if ( defined($PORT) );
    $host = $HOST if ( defined($HOST) );
    $db   = $DB   if ( defined($DB) );
    
    
    if ( defined($USER) ){   $user = $USER;}
    else {                   $user = "FC_admin";}
    
    if ( defined($PASSWD) ){ $passwd = $PASSWD;}
    else {                   print "Password for $user : ";
                             chomp($passwd = );}
    
    #
    # Now connect using a fully specified user/passwd/port/host/db
    #
    $fC->connect($user,$passwd,$port,$host,$db);

    or counting on the full definition in the XML file

    $fC    = FileCatalog->new();
    $fC->connect_as("Admin");
  • Note a small future convenience when XML is ON. connect_as() does not only select as who you want to connect to but where as well. In fact, the proper syntax is intent=SITE::User (for example BNL::Admin is valid as well as LBL::User). This is only partly supported however.
  • The new version of the API automatically add information in dictionary tables. Especially, the account under which a new dictionary value was inserted (Creator) and the insertion date (IDate) are filled automatically. A side effect being that the new API is NOT compatible with previous database table layout (no backward support will be attempted).

Migration and notes from V01.275 to V01.280

This document is intended for FileCatalog managers only who have previously deployed an earlier version of API and older database table layout. It is NOT intended for users.

Reasoning for this upgrade and core of the upgrade

This upgrade is a minor one, making support for two more detector sub-systems. The new API supports this modification. You need to alter the table DetectorConfigurations and add two columns. API are always forward compatible in that regard so it is completely safe to alter the table and deploy the API later.

ALTER TABLE `DetectorConfigurations` ADD dBSMD TINYINT;
ALTER TABLE `DetectorConfigurations` ADD dESMD TINYINT;
UPDATE `DetectorConfigurations` SET dBSMD=0;
UPDATE `DetectorConfigurations` SET dESMD=0;

And deploy the API V01.280 or later. You are done.

HPSS services

HPSS Performance study

Introduction

HPSS is software that manages petabytes of data on disk and robotic tape libraries. We will discuss in this section our observations as per the efficiency of accessing files in STAR as per a snapshot of the 2006 situation. It is clear that IO opptimizations has several components amongst which:
  • Access pattern optimization (request ordering, ...)
  • Optimization based on tape drive and technology capabilities
  • Miscellaneous technology considerations
    (cards, interface, firmware and driver, RAID, disks, ...)
  • HPSS disk cache optimizations
  • COS and/or PVR optimization
However, several trend have already been the object of past research and we will point to some of those and compare to our situation rather than debating the obvious. Wewill try to keep a focus on measurements in our environment.

Tape drive and technology capabilities

A starting point would be to discuss the capabilities of the technologies involved and their maximum performance and limitations. In STAR, two technologies remain as per 1006/10:
  • the 994B drives
  • the LTO-3 drives

Access pattern optimization (request ordering, ...)

A first simple and immediate consideration is to minimize tape mount and dismount operations, causing latencies and therefore performance drops. Since we use the DataCarousel for most restore operations, let's summarize its features.

The DataCarousel

The DataCarousel (DC) is an HPSS front end which main purpose is to coordinate requests from many un-correlated client's requests. Its main assumption is that all requests are asynchronous that is, you make a request from one client and it is satisfied “later” (as soon as possible). In other words, the DC aggregates all requests from all clients (many users could be considered as separate clients) and re-order them according policies, and possibly aggregating multiple requests for the same source into one request to the mass storage. The DC system itself is composed of a light client program (script), a plug-and-play policy based server architecture component (script) and a permanent process (compiled code) interfacing with the mass storage using HPSS API calls (this components is known as the “Oakridge Batch” although it current code content has little to do with the original idea from the Oakridge National Laboratory). Client and server interacts via a database component isolating client and server completely from each other (but sharing the same API , a perl module).

Policies may throttle the amount of data by group (quota, bandwidth percentage per user, etc ... i.e. queue request fairshare) but also perform tape access optimization such as grouping requests by tape ID (for equivalent share, all requests from the same tape are grouped together regardless of the time at which this request was performed or position in the request queue). The policy could be anything one can come up with based on the information either historical usage or current pending requests and characteristics of those requests (this could include file type, user, class of service, file size, ...). The DC then submits bundle of requests to the daemon component ; each request is a bundle of N file and known as a “job”. The DC submits K of those jobs before stopping and observing the mass storage behavior: if the jobs go through, more are submitted otherwise, either the server stops or proceed with a recovery procedure and consistency checks (as it will assume that no reaction and no unit of work being performed is a sign of MSS failure). In other words, the DC will also be error resilient and recover from intrinsic HPSS failures (being monitored). Whenever the files are moved from tape to cache in the MSS, a call back to the DC server is made and captive account connection is initiated to pull the file out of mass storage cache to more permanent storage.

Optimizations

While the policy is clearly a source of optimization (as far as the user is concerned), from a DataCarousel “post policy” perspective, N*K files are being requested at minimum at every point in time. In reality, more jobs are being submitted so the consumption of the “overflow”of job are used to monitor if the MSS is alive. The N*K files represents a total amount of files which should match the number of threads allowed by the daemon. The current setting are K=50, N=15 with an overflow allowed up to 25). The daemon itself has the possibility to treat requests simultaneously according to a“depth”. Those calls to HPSS are however only advisory. The depth is set at being 30 deep for the DST COS and 20 deep for the Raw COS. The deepest the request queue will be, more files will be requested simultaneously but this means that the daemon will also have to start more threads as previously noted. Those parameters have been showed to influence the performance to some extent (within 10%) with however a large impact on response time: the larger the request stack, the “less instantaneous” the response from a user's perspective (since the request queue length is longer).

The daemon has the ability of organizing X requests into a sorted list of tape ID and number of requests per tape. There are a few strategies allowing to alter the performance. We chose to enable “start with the tape with the largest number of files requested”. In addition, and since our queue depth is rather small comparing to the ideal number of files (K) per job, we order the files requested by the user by tape ID. Both optimizations are in place and lead to a 20% improvement within a realistic usage (bulk restore, Xrootd, other user activities).

Remaining optimizations

  • Optimization based on tapeID would need to be better quantified (graph, average restore rate) for several class of files and usage. TBD.

  • The tape ID program is a first implementation returning partial information. Especially, the MSS failures are not currently handled, leading to setting the tape ID to -1 (since there are now ways to recognize whether or not it is an error or a file missing in HPSS or even a file in the MSS MetaData server but located on a bad tape). Work in progress.

  • The queue depth parameters should be studied and adjusted according to the K and N values. However, this would need to respect the machine / hardware capabilities. The beefier the machine would be, the better but this is likely a fine tuning. This needs to be done with great care as the hardware is also shared by multiple experiments. Ideally the compiled daemon should auto-adjust to the DC API settings (and respect command line parameters for queue depth). TBD.

  • Currently, the daemon number of threads used for handling the HPSS API calls and to handle the call backs are sharing the same pool. This diminishes the number of threads available to communication with the Mass Storage and therefore, causes performance fluctuations (call back threads could get “stuck” or come in “waves” - we observed cosine behavior perhaps related to this issue). TBD.


Optimizations based on drive and technology capabilities

File size effects on IO performance

In this paper (CERN/IT 2005), the author measured the IO performance as a function of file size and number of files requested per tape. The figure of relevance is added here for illustration.
HPSS IO efficiency per file size and per
This graph has been extracted for an optimal 30 MB/s capable drive (9940B like). Both file size and number of files per cartriges have been evaluated. The conclusions are immediate and confirms the advertized behavior observed by all HPSS deployment (see references below). Smal file size is detrimental to HPSS IO performance and this size highly depends on the tape technology.

In STAR, we use the 9940B (read only as per 2006) and LT0-3 drives (read and write /all new files would go to LTO-3). The finding would not be altered but we have little marging of flexibility as per the "old" tape drive.

Below, we show the average file size per file type in STAR as a 2006 snapthot.

Average (bytes) Average (MB) File Type
943240627 899 MC_fzd
666227650 635 MC_reco_geant
561162588 535 emb_reco_event
487783881 465 online_daq
334945320 319 daq_reco_laser
326388157 311 MC_reco_dst
310350118 295 emb_reco_dst
298583617 284 daq_reco_event
246230932 234 daq_reco_dst
241519002 230 MC_reco_event
162678332 155 MC_reco_root_save
93111610 88 daq_reco_MuDst
52090140 49 MC_reco_MuDst
17495114 16 MC_reco_minimc
14982825 14 daq_reco_emcEvent
14812257 14 emb_reco_geant
12115661 11 scaler
884333 0 daq_reco_hist

Note that the average size for an event file is 284MB while for a MicroDST, the size average is 84 MB so a ratio of 3. The number of files per catriges is at best 1.2 files per cartriges with peaks at 10 or more. THis is mostly due to a request profile dominated by Xrootd "random" pattern and a few user's requests. According to the previous study, the IO efficiency should be around 8% for 84 MB file and reaching perhaps 20-25% efficiency for a 284 MB average file class. Cosndiering we have not studied the drive access pattern beyond a simple scaling (i.e. we will ignore to first order the fcat we have many drives at our disposal), we should see a performance change change from 8 to 20 so an improvement of x2.5-3 shall the file size argument stand.

In order to observe the IO performance when small or big files are being requested, we requested event files to the DataCarousel and produced the below two graphs for the dates ranging between 2006/10/26 and 2006/10/29. The first graph represents the IO "before" the massive submission of event file dominated requests, the second the graph "after" the event. The graphs are preliminary (work in progress).
DC IOPerf 20061027 (time before)
DC IOPerf 20061029 (time after)
We observe an average transfer rate at best saturating at 15 MB/sec for MuDST dominated files and an average close to 50 MB/sec for event files. The ratio is ~ 3 which remains consistent with the results on HPSS IO efficiency per file size and per and our initial rough estimate.

Note: It is interresting to note that a significant mix of very small files (below 12 MB average) would bring the performance to a sub 1% efficiency. The net result for 9 drives (as we have in STAR) would be an aggregate performance no better than 3MB/sec for a 9940B x 9 drive configuration . We observe periods with such poor performance. The second observation is that even with MuDST dominated files only, we would not be able to exceed in speed 70% of the performance of one drive so at best, 21 MB/sec (this correspond to our current "best hour"). The results are coherent to first order.

MSS failures and cascading effects

A poor MSS IO efficiency is one thing, but stability is another. Under poor performance situations, failures are critical to minimize. We have already stated that the DC is error resilient. However, during failure periods, the request queue accumulates requests and whenever the mass storage comes back, all requests are suddenly released, opening the flood gate of IO ... which are not much of a flood than a drip. As a consequence, user's requests or bulk transfer would not suffer much but modes requiring immediate response (such as Xrootd) would become largely impacted. In fact, shall the downtime be long enough, it is likely that all requests occuring while the errors started will fail but subsequent accumulated requests will also cause further delays and spurious Xrootd failures - Xrootd will timeout if the DC has not satisfied its request for 3600 secondes i.e. 1 hour per file. The following graph shows an error sequence:
HPSS Error types 2006 09 W36
While there are errors at the early stage of this graph, reminissent of previous failures, problems during the focused time period starts around 9 AM with a MetaData lookup failure (cannot get lock after 5 retries). Subsequentely, there is an immediate appearance of files failing to be restored for more than an hour and this continues up to around 10:30 to 11:00 at ehich point, more errors occur from the Mass Storage system (a mix of MetaData lookup failure and massive authentication failures). The authentication failures are related to the DCE component failure, causing periodic problems. In our case, we immediately the light blue band continuing for up to 15:00 (3 PM) followed but yet again a massive meta data failure. All of those cacading failures would, for a period of no less that 6 hours long, affect users requesting files from Xrootd which would needto be restored from MSS.

From all failures, the relative proprtion for that day is displayed below
HPSS Error relative proportion, 2006-09
Only one error in this graph (a DataCarousel connection failue) cna be fixed from a STAR stand point, allother occurances being a facility issue to resolve.

Miscellaneous technology considerations

All considerations in this sections are beyond our control and a facility work and optmization.

HPSS disk cache optimizations

This section seems rather academic considering the previous sections improvement perspectives.

COS and PVR optimizations

In this section, we will discuss optimizing based on file size, perhaps isolated by PVR or COS. This will be possible in future run but would lead to a massive repackaging of files and data for the past years.


Appendix

Further reading:



HSI


This is an highlight of the HSI features. Please, visit the HSI Home Page for more information.

HSI is a friendly interface for users of the High Performance Storage System (HPSS). It is intended to provide a familiar Unix-style environment for working within the HPSS environment, while automatically taking advantage of the power of HPSS (e.g. for high speed parallel file transfers) without requiring any special user interaction, where possible.

HSI requires one of two authentication methods (see the HSI User Guide for more information):
  • Kerberos (the preferred method)
  • DCE keytab (using a keytab file generated for you by the HPSS system administrators)
HSI's features include:
  • Familiar Unix-style command interface, with commands such as "LS", "CD", etc.
  • Interactive, batch, or "one-liner" execution modes
  • Ability to interactively pipe data into or out of HPSS, using filters such as "TAR"
  • Recursive option is available for most commands; including the ability to copy an entire directory tree to or from HPSS with a single simple command
  • Conditional put and get operations, including ability to update based on file timestamps
  • Automatically uses HPSS parallel I/O features for file transfer operations
  • Multi-threaded I/O within a single process space
  • Command aliases and abbreviations
  • 10 working directories
  • Ability to read command input from a file, and write log or command output to a file.
  • Non-DCE version runs on most major Unix-based platforms
  • Non-DCE version provides the ability to connect to multiple HPSS systems and perform 3rd-party copies between the systems, using a "virtual drive" path notation.

HTAR

To use htar within the HPSS environment, users are required to have the valid Kerberos credentials.

The following is the man page of how to use htar.

   
                     NAME
                          htar - HPSS tar utility
     
   
                     PURPOSE
                          Manipulates HPSS-resident tar-format archives.


                     SYNOPSIS
                          htar  -{c|t|x|X}  -f Archive [-?]  [-B] [-E]  [-L  inputlist] [-h]  [-m] [-o]
                                 [-d  debuglevel] [-p] [-v]  [-V] [-w]
                                 [-I  {IndexFile | .suffix}] [-Y  [Archive COS ID][:Index File COS ID]]
                                 [-S  Bufsize] [-T  Max Threads] [Filespec | Directory ...]

                     DESCRIPTION
                          htar  is a utility which manipulates HPSS-resident archives
                          by writing files to,  or retrieving files from the High
                          Performance Storage System (HPSS).  Files written to HPSS
                          are in the POSIX 1003.1 "tar" format, and may be retrieved
                          from HPSS, or read by native tar programs.

                          For those unfamiliar with HPSS, an introduction can be found
                          on the web at
                                 http://www.sdsc.edu/hpss

                          The local files used by the htar command are represented by
                          the Filespec parameter. If the Filespec parameter refers to
                          a directory, then that directory, and, recursively, all
                          files and directories within it, are referenced as well.

                          Unlike the standard Unix "tar" command, there is no default
                          archive device; the "-f Archive" flag is required.

                     Archive and Member files
                          Throughout the htar documentation, the term "archive file"
                          is used to refer to the tar-format  file, which is named by
                          the "-f filename" command line option. The term "member
                          file" is used to refer to individual files contained within
                          the archive file.

                     WHY USE HTAR
                          htar has been optimized for creation of archive files
                          directly in HPSS, without having to go through the
                          intermediate step of first creating the archive file on
                          local disk storage, and then copying the archive file to
                          HPSS via some other process such as ftp or hsi. The program
                          uses multiple threads and a sophisticated buffering scheme
                          in order to package member files into in-memory buffers,
                          while making use of the high-speed network striping
                          capabilities of HPSS.

                          In most cases, it will be signficantly  faster to use htar
                          to create a tar file in HPSS than to either create a local
                          tar file and then copy it to HPSS, or to use tar piped into
                          ftp (or hsi) to create the tar file directly in HPSS.

                          In addition, htar creates a separate index file, (see next
                          section) which contains the names and locations of all of
                          the  member files in the archive (tar) file.  Individual
                          files and directories in the archive can be randomly
                          retrieved without having to read through the archive file.
                          Because the index file is usually smaller than the archive
                          file, it is possible that the index file may reside in HPSS
                          disk cache  even though the archive file has been moved
                          offline to tape; since htar uses the index file for listing
                          operations, it may be possible to list the contents of the
                          archive file without having to incur the time delays of
                          reading the archive file back onto disk cache from tape.

                          It is also possible to create an index file for a tar file
                          that was not originally created by htar.

                     HTAR Index File
                          As part of the process of creating an archive file on HPSS,
                          htar also creates an index file, which is a directory of the
                          files contained in the archive. The Index File includes the
                          position of member files within the archive, so that files
                          and/or directories can be randomly retrieved from the
                          archive without having to read through it sequentially.  The
                          index file is usually significantly smaller in size than the
                          archive file, and may often reside in HPSS disk cache even
                          though the archive file resides on tape. All htar operations
                          make use of an index file.

                          It is also possible to create an index file for an archive
                          file that was not created by htar, by using the "Build
                          Index" [-X] function (see below).

                          By default, the index filename is created by adding ".idx"
                          as a suffix to the Archive name specified by the -f
                          parameter.  A different suffix or index filename may be
                          specified by the "-I " option, as described below.

                          By default, the Index File is assumed to reside in the same
                          directory as the Archive File.  This can be changed by
                          specifying a relative or absolute pathname via the -I
                          option. The Index file's relative pathname is relative to
                          the Archive File directory unless an absolute pathname is
                          specified.

                     HTAR Consistency File
                          HTAR writes an extra file as the last member file of each
                          Archive, with a name similar to:

                                  /tmp/HTAR_CF_CHK_64474_982644481

                          This file is used to verify the consistency of the Archive
                          File and the Index File.  Unless the file is explicitly
                          specified, HTAR does not extract this file from the Archive
                          when the -x action is selected.  The file is listed,
                          however, when the -t action is selected.

                     Tar File Restrictions
                          When specifying path names that are greater than 100
                          characters for a file (POSIX 1003.1 USTAR) format, remember
                          that the path name is composed of a prefix bufferFR, a /
                          (slash), and a name buffer.

                          The prefix buffer can be a maximum of 155 bytes and the name
                          buffer can hold a maximum of 100 bytes. Since some
                          implementations of TAR require the prefix and name buffers
                          to terminate with a null (' ') character, htar enforces the
                          restriction that the effective prefix buffer length is 154
                          characters (+ trailing zero byte), and the name buffer
                          length is 99 bytes (+ trailing zero byte). If the path name
                          cannot be split into these two parts by a slash, it cannot
                          be archived. This limitation is due to the structure of the
                          tar archive headers, and must be maintained for compliance
                          with standards and backwards compatibility. In addition, the
                          length of a destination for a hard or symbolic link ( the
                          'link name') cannot exceed 100 bytes (99 characters + zero-
                          byte terminator).

                     HPSS Default Directories
                          The default directory for the Archive file is the HPSS home
                          directory for the DCE user.  An absolute or relative HPSS
                          path can optionally be specified for either the Archive file
                          or the Index file. By default, the Index file is created in
                          the same HPSS directory as the Archive file.

                     Use of Absolute Pathnames
                          Although htar does not restrict the use of absolute
                          pathnames (pathnames that begin with a leading "/") when the
                          archive is created, it will remove the leading / when files
                          are extracted from the archive.  All extracted files use
                          pathnames that are relative to the current working
                          directory.

                     HTAR USAGE
                          Two groups of flags exist for the htar command; "action"
                          flags and "optional" flags. Action flags specify the
                          operation to be performed by the htar command, and are
                          specified by one of the following:

                          -c, -t, -x, -X

                          One action flag must be selected in order for the htar
                          command to perform any useful function.

                     File specification (Filespec)
                          A file specification has one of the following forms:

                                  WildcardPath
                                     or
                                  Pathname
                                     or
                                  Filename

                          WildcardPath is a path specification that includes standard
                          filename pattern-matching characters, as specified for the
                          shell that is being used to invoke htar.  The pattern-
                          matching characters are expanded by the shell and passed to
                          htar as command line arguments.

                     Action Flags
                          Action flags defined for htar are as follows:

                          -c   Creates a new HPSS-resident archive, and writes the
                               local files specified by one or more File parameters
                               into the archive. Warning: any pre-existing archive file
                               will be overwritten without prompting. This behavior
                               mimics that of the AIX tar utility.

                          -t   Lists the files in the order in which they appear in
                               the HPSS- resident archive.   Listable output is
                               written to standard output; all other output is written
                               to standard error.

                          -x   Extracts the files specified by one or more File
                               parameters from the HPSS-resident archive. If the File
                               parameter refers to a directory, the htar command
                               recursively extracts that directory and all of its
                               subdirectories from the archive.

                               If the File parameter is not specified, htar extracts
                               all of the files from the archive. If an archive
                               contains  multiple copies of the same file, the last
                               copy extracted overwrites  all previously extracted
                               copies. If the file being extracted does not already
                               exist on the system, it is created. If you have the
                               proper permissions, then htar command restores all
                               files and directories with the same owner and group IDs
                               as they have on the HPSS tar file. If you  do not have
                               the proper permissions, then files and directories are
                               restored with your owner and group IDs.

                          -X   builds a new index file by reading the entire tar file.
                               This operation is used either to reconstruct an index
                               for tar files whose Index File is unavailable (e.g.,
                               accidentally deleted), or for tar files that were not
                               originally created by htar.

                     Options
                          -?   Displays htar's verbose help

                          -B   Displays block numbers as part of the listing (-t
                               option). This is normally used only for debugging.

                          -d debuglevel
                               Sets debug level (0 - N) for htar. 0 disables debug, 1
                               - n enable progressively higher levels of debug output.
                               5 is the highest level; anything > 5 is silently mapped
                               to 5.  0 is the default debug level.

                          -E   If present, specifies that a local file should be used
                               for the file specified by the "-f Archive" option.  If
                               not specified, then the archive file will reside in
                               HPSS.

                          -f Archive
                               Uses Archive as the name of archive to be read or
                               written. Note: This is a required parameter for htar,
                               unlike the standard tar utility, which uses a built-in
                               default name.

                               If the Archive variable specified is - (minus sign),
                               the tar command writes to standard output or reads from
                               standard input. If you write to standard output, the -I
                               option is mandatory, in order to specify an Index File,
                               which is copied to HPSS if the Archive file is
                               successfully written to standard output. [Note: this
                               behavior is deferred - reading from or writing to pipes
                               is not supported in the initial version of htar].

                          -h   Forces the htar command to follow symbolic links as if
                               they were normal files or directories. Normally, the
                               tar command does not follow symbolic links.

                          -I index_name
                               Specifies the index file name or suffix.  If the first
                               character of the index_name is a period, then
                               index_name is appended to the Archive name, e.g. "-f
                               the_htar -I .xdnx" would create an index file called
                               "the_htar.xndx".  If the first character is not a
                               period, then index_name is treated as a relative
                               pathname for the index file (relative to the Archive
                               file directory) if the pathname does not start with
                               "/", or an absolute pathname otherwise.

                               The default directory for the Index file is the same as
                               for the Archive file.  If a relative Index file
                               pathname is specifed, then it is appended to the
                               directory path for the Archive file.  For example, if
                               the Archive file resides in HPSS in the directory
                               "projects/prj/files.tar", then an Index file
                               specification of "-I projects/prj/files.old.idx" would
                               fail, because htar would look for the file in the
                               directory "projects/prj/projects/prj".  The correct
                               specification in this case is "-I files.old.idx".

                          -L InputList
                               Writes the files and directories listed in the
                               "InputList" file to the archive. Directories named in
                               the InputList file are not treated recursively. For
                               directory names contained in the InputList file, the
                               tar command writes only the directory entry to the
                               archive, not the files and subdirectories rooted in the
                               directory.  Note that "home directory" notation ("~")
                               is not expanded for pathnames contained in the
                               InputList file, nor are wildcard characters, such as
                               "*" and "?".

                          -m   Uses the time of extraction as the modification time.
                               The default is to preserve the modification time of the
                               files. Note that the modification time of directories
                               is not guaranteed to be preserved, since the operating
                               system may change the timestamp as the directory
                               contents are changed by extracting other files and/or
                               directories.  htar will explicitly set the timestamp on
                               directories that it extracts from the Archive, but not
                               on intermediate directories that are created during the
                               process of extracting files.

                          -o   Provides backwards compatibility with older versions
                               (non-AIX) of the tar command. When this flag is used
                               for reading, it causes the extracted file to take on
                               the User and Group ID (UID and GID) of the user running
                               the program, rather than those on the archive.  This is
                               the default behavior for the ordinary user. If htar is
                               being run as root, use of this option causes files to
                               be owned by root rather than the original user.

                          -p   Says to restore fields to their original modes,
                               ignoring the present umask. The setuid, setgid, and
                               tacky bit permissions are also restored to the user
                               with root user authority.

                          -S bufsize
                               Specifies the buffer size to use when reading or
                               writing the HPSS tar file.  The buffer size can be
                               specified as a value, or as kilobytes by appending any
                               of  "k","K","kb", or "KB" to the value.  It can also be
                               specified as megabytes by appending any of  "m" or "M"
                               or "mb" or "MB" to the value, for example, 23mb.

                          -T max_threads
                               Specifies the maximum number of threads to use when
                               copying local member files to the Archive file.  The
                               default is defined when htar is built; the release
                               value is 20.  The maximum number of threads actually
                               used is dependent upon the local file sizes, and the
                               size of the I/O buffers.  A good approximation is
                               usually

                                  buffer size/average file size

                               If the -v or -V option is specified, then the maximum
                               number of local file threads  used while writing the
                               Archive file to HPSS is displayed when the transfer is
                               complete.

                          -V   "Slightly verbose" mode. If selected, file transfer
                               progress will be displayed in interactive mode. This
                               option should normally not be selected if verbose (-v)
                               mode is enabled, as the outputs for the two different
                               options are generated by separate threads, and may be
                               intermixed on the output.

                          -v   "Verbose" mode. For each file processed, displays a
                               one-character operation flag, and lists the name of
                               each file. The flag values displayed are:
                                   "a"  - file was added to the archive
                                   "x"  - file was extracted from the archive
                                   "i"  - index file entry was created (Build Index
                               operation)

                          -w   Displays the action to be taken, followed by the file
                               name, and then waits for user confirmation. If the
                               response is affirmative, the action is performed. If
                               the response is not affirmative, the file is ignored.

                          -Y auto | [Archive CosID][:IndexCosID]
                               Specifies the HPSS Class of Service ID to use when
                               creating a new Archive and/or Index file. If the
                               keyword auto is specified, then the HPSS hints
                               mechanism is used to select the archive COS, based upon
                               the file size.  If -Y cosID  is specified, then cosID
                               is the numeric COS ID to be used for the Archive File.

                               If -Y :IndexCosID is specified, then IndexCosID is the
                               numeric COS ID to be  used for the Index File.  If both
                               COS IDs are specified, the entire parameter must be
                               specified as a single string with no embedded spaces,
                               e.g. "-Y 40:30".

                     HTAR Memory Restrictions
                          When writing to an HPSS archive, the htar command uses a
                          temporary file (normally in /tmp) and maintains in memory a
                          table of files; you receive an error message if htar cannot
                          create the temporary file, or if there is not enough memory
                          available to hold the internal tables.

                     HTAR Environment
                          HTAR should be compiled and run within a non-DCE HPSS environment.

                     Miscellaneous Notes:
                          1. The maximum size of a single Member file within the
                          Archive is approximately 8 GB, due to restrictions in the
                          format of the tar header.  HTAR does not impose any
                          restriction on the total size of the Archive File when it is
                          written to HPSS; however, space quotas or other system
                          restrictions may limit the size of the Archive File when it
                          is written to a local file (-E option).

                          2.  HTAR will optionally write to a local file; however, it
                          will not write to any file type except "regular files".  In
                          particular, it is not suitable for writing to magnetic tape.
                          To write to a magnetic tape device, use the "tar" or "cpio"
                          utility.

                     Exit Status
                          This command returns the following exit values:

                          0       Successful completion.

                          >0      An error occurred.

                     Examples
                          1.   To write the file1 and file2 files to a new archive
                               called "files.tar" in the current HPSS home directory,
                               enter:

                                      htar -cf files.tar file1 file2

                          2.   To extract all files from the project1/src directory in
                               the Archive file called proj1.tar, and use the time of
                               extraction as the modification time,  enter:

                                     htar -xm -f proj1.tar project1/src

                          3.   To display the names of the files in the out.tar
                               archive file within the HPSS home directory, enter:

                                     htar -tvf out.tar

                     Related Information
                          For file archivers: the cat command, dd command, pax
                          command.  For HPSS file transfer programs: pftp, nft, hsi

                          File Systems Overview for System Management in AIX Version 4
                          System Management Guide: Operating System and Devices
                          explains file system types, management, structure, and
                          maintenance.

                          Directory Overview in AIX Version 4 Files Reference explains
                          working with directories and path names.

                          Files Overview in AIX Version 4 System User's Guide:
                          Operating System and Devices provides information on working
                          with files.

                          HPSS web site at http://www.sdsc.edu/hpss

                     Bugs and Limitations:
                          - There is no way to specify relative Index file pathnames
                          that are not rooted in the Archive file directory without
                          specifying an absolute path.

                          - The initial implementation of HTAR does not provide the
                          ability to append, update or remove files.  These features,
                          and others, are planned enhancements for future versions.

Home directories and other areas backups

Home directories

If you accidently erase a file in your home directoy at RFC, you can restore it using a two week backup that you can access directly. Two weeks worth of backups are kept as snapshots. The way it works is that as day pass, live backups are being made on the file system itself hence preserving your files in-place.

For example, your username is 123, your home directory is /star/u/123 and you erased a file /star/u/123/somedir/importantfile.txt and now realise that was a mistake. Don't panic. This is not the end of thw world as snapshot backups exist.

Simply look under /star/u/.snapshot

The directory names are odered by the date and time of backup. Pick a date when the file existed and under there is a copy of your home directory from that day. From here you can restore the file, i.e,

% cp /star/u/.snapshot/20yy-mm-dd_hhxx-mmxx.Daily_Backups_STAR-FS05/123/somedir/importantfile.txt 
/star/u/123/somedir/importantfile.txt

See also starsofi #7363.

AFS areas

Each doc_proected/ AFS areas also have a .backup volume which keeps recently deleted files in that directory until a real AFS based backup is made (then the content is deleted and you will need to ask the RCF to restore your files).  Finding it is tricky though because there is one such directory per volume. The best is to backward search for that directory. For example, let's suppose you are working in /afs/rhic.bnl.gov/star/doc_protected/www/bulkcorr/. If you bacward search for a .backup directory, you will find one as /afs/rhic.bnl.gov/star/doc_protected/www/bulkcorr/../.backup/ and this is where the files for this AFS volume will go upon deletion.

Other areas

Other areas are typically not backed-up.

 

Hypernews

Most Hypernews forum will have to be retired - please consult the list of mailing lists at this link to be sure you need HN at all.
While our Web serve ris down, many Computing related discussions are now happening on Mattermost Chat (later, will be Mail based by popular demand). Please log there using the 'BNL login' option (providing a facility wide unified login) and use your RACF/SDCC kerberos credentials to get in. If you are a STAR user, you will automatically be moved to the "STAR Team".

Please, read the Hypernews in STAR section before registering a new account (you may otherwise miss a few STAR specificities and constraints).

General Information

HyperNews is a cross between the hypermedia of the WWW and Usenet News. Readers can browse postings written by other people and reply to those messages. A forum (also called a base article) holds a tree of these messages, displayed as an indented outline that shows how the messages are related (i.e. all replies to a message are listed under it and indented).

Users can become members of HyperNews or subscribe to a forum in order to get Email whenever a message is posted, so they don't have to check if anything new has been added. A recipient can then send a reply email back to HyperNews, rather than finding a browser to write a reply, and HyperNews then places the message in the appropriate forum.

Hypernews in STAR

In STAR, there are a few specificities with Hypernews as listed below. 

  • Your Hypernews account should match your BNL/RCF account by name. This account must be part of the STAR group. For example, if you have a RCF STAR account named 'abc', you should create an Hypernews account named 'abc'.  Any other account will be removed automatically. Note that if you have any other RCF unix account but not a STAR account, the result will be the same (you will not be able to register to STAR's Hypernews). This is done so automation of account approval can be achieved while complying with the DOE requirement mentioned in Getting a computer account in STAR. If you are a STAR user in good standing, the automation especially allows for immediate use of your account without further approval process.
  • You should NOT use the same password for an Hypernews account as your RCF account. Hypernews has a weak authentication method and while physical access to the machine is needed to crack it, a focus on keeping the password different from the interactive login password(s) is important. In general, Web-based password authentication should  not be the same than interactive account passwords. 
  • Hypernews does not accept Email attachments. This includes Emails containing a mix of text and html - they will be rejected by the system. Please, be aware that whenever you send "formatted" Emails (bold character, font changes etc...), your Email client does NOTHING ELSE than sending the content in two parts: one part is plain text, the second an attached HTML. Hence, Hypernews will NOT process formatting (but will take your Email anyhow).
  • Hypernews posting DO NOT need to be done from the Web interface (this is true for ANY Hypernews systems); you can send an Email directly to the list address. However, posting must have a subject. Subject-less posting will be rejected. Also, we have a spam filter in place and it is noteworthy to mention that to date, we had no accidental rejections of valid Emails.
  • As per 2012/06, all STAR Hypernews fora were made protected. In other words, and in addition of your Hypernews personal account, you MUST use the 'protected' password to access the Web interface.

Startup links

Here are a few startup links and tips, including where to start for a new Hypernews account.

  • If you DO NOT have a STAR account, consult Getting a computer account in STAR first BUT you will STILL need the additional below information:
    • You will need the famous “protected” area password. If you do not understand what this means, you are probably not a STAR collaborator ... Otherwise, you can get this information from your PAC, PWGC, OPS manager, council rep, etc ...
    • For your RCF user account name, you will need to chose a User ID other than “protected”. Hopefully, this will be the case.
  • After you get a RCF Unix account
    • create an Hypernews account  (as indicated in the Hypernews in STAR section) starting from here.
    • IMPORTANT NOTE: please wait at least one hour after you get  confirmation from the RCF before creating your Hypernews account as there is a delay in information propagation of accounts to the Hypernews system.
  • To connect to the Web based Hypernews interface, please login to the system first. This will allow for your session to be authenticated properly and postings to be identified as you. As per 2006, any anonymous posting will be rejected.
  • You can then proceed to either
    • The forum list with all Hypernews forum displayed in descending order of 'last posted'
    • You can Edit your membership to change your personal information (this is a typical link which DO require for you to login first)
    • Use the Hypernews Search engine to search / locate a particular message. This is slow and painful (we have too many message) but is te only way you can search the huge 10 years worth of Email archive.
  • Note again that as soon as you have located your forum of interest and it's address, you can send Emails directly to that list forum and/or answer previous post by using your mail client 'Reply'.

 

Tips related to message delivery to Hypernews

If you have problems sending EMail to Hypernews, please understand and verify the following before asking for help:

  • Hypernews will silently discard EMails detected as spam. This is good news for our Hypernews subscribers but be aware that Spam filtering is a tricky business and some legitimate Email may be rejected unwillingly.
    • The first and foremost reason for rejection is the use of internet service provider (ISP) EMail servers to send Emails to Hypernews. Several ISP are blacklisted as they do not protect their service against anonymous Emails. Using such ISP will have as unfortunate consequence to have your Email rejected. Be sure to use your lab or university as provider or a trusted ISP.
    • The other reason is font encoding - DO NOT use special font encoding while sending Email - Korean (EUC-KR) or Chinese (GB2312, GB18030, Big5, ...) especially gets a high mark from the Spam filter and get you closet o the rejection rating threshold. A few unfortunate words here and there and ...
  • Hypernews in STAR DOES NOT accept attachments: your Email will be silently rejected if any appears.
    • Send instead a note of where your document resides for consulting.
    • DO NOT send messages as HTML - HTML EMail actually send plain text and HTML as an attachment ... and your post will be rejected. Typically, your mail client gives you the possibility to send plain-text for entire domain matching. Hypernews is covered by the www.star.bnl.gov domain.
      • MAC Users using Mac OSX Mail client, please consult this "How to Send a Message in Plain Text" (also explaining why MIME may be dangerous). Alternatively you may want to use Mozilla/Thunderbird as a client.
      • To set this up in Thunderbird so as follow
        Select the _Tools_ menu 
           Select _Options_ a window opens. Select the tab [Composition] -then-> [General]
            click on <send option> in the new opened pannel, then select the [Plain text domain] tab
              click [add] and add star.bnl.gov
              click OK
    • Sending EMail from BNL's Exchage server will result in MIME attachements and hence, cause a rejection of your posting. Two possible solutions offers themselves
      • Use the Hypernews Web interafce to send messages (after making sure you are logged in, click on the bottom reference to go to the message and [Add message]
      • Use a tool like Thunderbird with the SMTP outgoing server set to use the RCF server. Instructions are available here.
         
  • The folowing  restrictions  apply:
    • Use only one Hypernews forum in the To: field (and do not use CC: to another HN forum) - HN will not know where to post if you use multiple fora and the result will be un-predictable (depending on syntax used for the To: field and mailer, the post will end up in one of the specified fora or be discarded entirely)
    • You will NOT be able to forward a post from one forum to another - HN will know and send the message again to the original forum. This is because the information HN keeps for archiving your posts and threading them is part of the message header and not based on where you send the message (header includes Newsgroups, X-HN-Forum, X-HN-Re and X-HN-Loop). Your options could be to strip the header or cut-and-paste the original message into a new one.
    • Multiple recipient on the Email "To:" field will not post your EMail. Strictly speaking, this is a shortcoming in the parsing of the header as defined in RFC2822 (the RFC allows for a list, STAR HN implementation disallow mass posting).
  • One frequent source of issues and unrelated to any of te above (and very STAR infrastructure specific):
    • Always use address book entries using the alias of the form list [at] www.star.bnl.gov and NOT the node specific address (connery, orion, etc...). Especially, older users should remove from their address book any address not specifying the alias.
  • If you need to test sending Email to the system, please do not spam an existing active list - instead, use our test fora: startest or tesp.
    • Remember that Hypernews is a centralized system, if your Email passes and is deleiverred to the test forum, it should be to any other lists
    • Both fora are near identical - testp is nowadays used for testing new code and features so for a casual Email check, you may prefer startest.

 

Installing the STAR software stack

The pages here are under constructions. They aim to help remote sites to install the STAR software stack.

You should read first Setting up your computing environment before going through the documents provided herein as we refer to this page very often. Please, pay particular attention to the list of environment variables defined by the group login script and their meanings in STAR. Be aware of the assumptions as per the software locations (all will be referred by the environment variables listed there) as well as the need to use a custom (provided) set of .cshrc and .login file (you may have to modify them if you install the STAR software locally). Setting up your computing environment  is however NOT written as a software installation instruction and should not be read as such.

Please, follow the instructions in order they appear below

  1. Check first the availability of the CERN libraries as this may be a show stopper. If there is no CERN libraries for your OS version and/or the available libraries are not validated for your OS, you will NOT be able to get the STAR software working on your site.
  2. Your FIRST STEP is to install the Group login scripts. Although not all will be defined, the login should be successful after this step.
  3. The next step is then to install Additional software components
    However, your OS should also have installed a few base system wide RPMs.
    Lists are available on the OS Upgrade page as well as specific issues with some OS. Read it carefully.
  4. Then, install the ROOT library Building ROOT in STAR
  5. Finally, you are ready for a STAR library installation STAR codes

Sparse notes are also in Post installation and special instructions for administrators at OS Upgrade.

 

Group login scripts

Installing

The STAR general group login scripts are necessary to define the STAR environment. They reside in $CVSROOT within the group/ sub-tree. Template files for users .cshrc and .login support also exists within this tree in a sub-directory group/templates. To install properly on a local cluster, there are two possibilities:

  • if you have access to AFS, you should simply
        % mkdir  /usr/local/star # this is only an example
        % cd /usr/local/star     # this directory needs to be readable by a STAR group
        % cvs checkout group     # this assumes CVSROOT is defined 
    This will bring a copy of all you need locally in /usr/local/star/group
  • If you do not have access to AFS from your remote site, get a copy of the entire BNL $GROUP_DIR tree and unpack in a common place (like /usr/local/star above). A copy resides in the AFS tree mentioned Additional software components.

Note that wherever you install the login scripts, they need to be readable by a STAR members (you can do this by allowing a Unix group all STAR users will belong to read access to the tree or by making sure the scripts are all users accessible).

Also, as soon as you get a local copy of the group/templates/ files, EDIT BOTH the cshrc and login files and change on top the definition of GROUP_DIR to it matches your site GROUP script location (/usr/local/group in our example).

To enable a user to use the STAR environment, simply copy the template cshrc and login scripts as indicated in Setting up your computing environment.

Special scripts

Part of our login is optional and the scripts mentioned here will NOT be part of our CVS repository but, if exists, will be executed.

  • site_pre_setup.csh - this script, if exists in $GROUP_DIR, will be executed before the execution of the STAR standard login. Its purpose is to define variables indicating non-standard location for your packages. For those variables which may be redefined, please consult Setting up your computing environment for all the variables (in blue) which may be redefined prior to login.
  • site_post_setup.csh - this script, if exists in $GROUP_DIR, will be executed after the STAR standard login. Its purpose is to define local variables nor related to STAR's environment. Such variables may be for example the definition of a proxy (http_proxy, ftp_proxy, https_proxy), a NNTP server or a default WWW home directory (WW_HOME). Do not try to redefine STAR login's defined variables using this script.

Testing this phase

Testing this phase is as simple as creating a test account and verifying that the login does succeed. Whenever you start with a blank site, the login MUST succeed and lead to viable environment ($PATH especially should be minimally correct). A typical login example would be at this stage something like

Setting up WWW_HOME  = http://www.star.bnl.gov/

         ----- STAR Group Login from /usr/local/star/group/ -----

Setting up STAR_ROOT = /usr/local/star
Setting up STAR_PATH = /usr/local/star/packages
Setting up OPTSTAR   = /usr/local/star/opt/star
WARNING : XOPTSTAR points to /dev/null (no AFS area for it)
Setting up STAF      = /usr/local/star/packages/StAF/pro
Setting up STAF_LIB  = /usr/local/star/packages/StAF/pro/.cos46_gcc346/lib
Setting up STAF_BIN  = /usr/local/star/packages/StAF/pro/.cos46_gcc346/bin
Setting up STAR      = /usr/local/star/packages/pro
Setting up STAR_LIB  = /usr/local/star/packages/pro/.cos46_gcc346/lib
Setting up STAR_BIN  = /usr/local/star/packages/pro/.cos46_gcc346/bin
Setting up STAR_PAMS = /usr/local/star/packages/pro/pams
Setting up STAR_DATA = /usr/local/star/data
Setting up CVSROOT   = /usr/local/star/packages/repository
Setting up ROOT_LEVEL= 5.12.00
Setting up SCRATCH   = /tmp/jeromel
CERNLIB version pro has been initiated with CERN_ROOT=/cernlib/pro
STAR setup on star.phys.pusan.ac.kr by Tue Mar 12 06:43:47 KST 2002  has been completed
LD_LIBRARY_PATH = .cos46_gcc346/lib:/usr/local/star/ROOT/5.12.00/.cos46_gcc346/rootdeb/lib:ROOT:/usr/lib/qt-3.3/lib

 

Suggestions

STAR group

You may want to to create a rhstar group on your local cluster matching GID 31012. This will make AFS integration easier as the group names in AFS will then translate to rhstar (it will however not grant you any special access obviously since AFS is Kerberos authentication based and not Unix UID based).
To do this, and after checking that /etc/group do not contain any mapping for gid 31012, you could (Linux):

% groupadd -g 31012 rhstar

Test account

It may be practical for testing the STAR environment to create a test account on your local cluster. The starlib account is an account  used in STAR for software installation. You may want to create such account as follow (Linux):

% useradd -d /home/starlib -g rhstar -s /bin/tcsh  starlib

 This will allow for easier integration. Any account name will do (but testing is important and we will have a section on this later).

 

 

Additional software components

Scope & audience

Described in Setting up your computing environment, OPTSTAR is the environment variable pointing to an area which will supplement the operating system installation of libraries and program. This area is fundamental to the STAR software installation as it will contain needed libraries, approved software component version, shared files, configuration and so on.

The following path should contain all software components as sources for you to install a fresh copy on your cluster:
    /afs/rhic.bnl.gov/star/common

Note that this path should allow anyuser to read so there is no need for an AFS token. The note below are sparse and ONLY indicate special instructions you need to follow if any. In the absence of special instructions, the "standard" instructions are to be followed. None of the explanations below are aimed to target a regular user but aimed to target system administrator or software infrastructure personnel.

System wide RPMs

Some RPMs from your OS distribution may be found at BNL under the directory /afs/rhic.bnl.gov/rcfsl/X.Y/*/ where X.Y is the major and minor version for your Scientific Linux version respectively. You should have a look and install. If you do not have AFS, you should log to the RCF and transfer whatever is appropriate.

In other words, we may have re-packaged some packages and/or created additional ones for compatibility purposes. An example of this for SL5.3 is flex32libs-2.5.4a-41.fc6.i386.rpm located in /afs/rhic.bnl.gov/rcfsl/5.3/rcf-custom/ which supports the 32 bits compatbility package for flex on a kernel with dual 32/64 bits support.

STAR Specific

The directory tree /afs/rhic.bnl.gov/star/common contains packages installed on our farm in addition of the default distribution software packages coming with the operating system. At BNL, all packages referred here are installed in the AFS tree

	/opt/star -> /afs/rhic.bnl.gov/@sys/opt/star/

Be aware of the intent explained in Setting up your computing environment as per the difference between $XOPTSTAR and OPTSTAR.

OPTSTAR will either

  • at BNL or to a remote site: be used to indicate and access the local software BUT may be supported through a soft-link to the same AFS area as showed above whereas @sys will expand to the operating system of interest (see Setting up your computing environment as well for a support matrix)
  • at a remote site, will point to a LOCAL (that is, non-networked) installation of the software components. This space could be anywhere on your local cluster but obviously, will have to be shared and visible from all nodes in your cluster.

XOPTSTAR

The emergence of $XOPTSTAR started from 2003 to provide better support for software installation to remote institutions. Many packages add path information to their configuration (like the infamous .la files) and previously installed in $OPTSTAR, remote sites had problems loading libraries for a path reason. Hence, and unless specified otherwise below, $XOPTSTAR will be used preferably at BNL for installation the software so remote access to the AFS repository and copy will be made maximally transparent.

In 2005, we added an additional tree level reflecting the possibility of multiple compilers and possible mismatch between fs sysname setups and operating system versions. Hence, you may see path like OPTSTAR=/opt/star/sl44_gcc346 but this structure is a detail and if the additional layer does not exist for your site, later login will nonetheless succeed. This additional level is defined by the STAR login environment $STAR_HOST_SYS. In the next section, we explained how to set this up from a "blank" site (i.e. a site without the STAR environment and software installed).

On remote sites where you decide to install the software components locally, you should use $OPTSTAR in the configure or make statements.

Basic starting point

From a blank node on remote site, be sure to have $OPTSTAR defined. You can do this by hand for example like this

% setenv OPTSTAR /usr/local

or

% mkdir -p /opt/star
% setenv OPTSTAR /opt/star

are two possibilities. The second, being the default location of the software layer, will be automatically recognized by the STAR group login scripts. From this point, a few pre-requisites are

  • you have to have a system with "a" compiler - we support gcc but also icc on Linux
  • you should have the STAR group login scripts at hand (it could be from AFS). The STAR login scripts will NOT redefine $OPTSTAR if already defined.

Execute the STAR login. This will define $STAR_HOST_SYS appropriately. Then

% cd $OPTSTAR
% mkdir $STAR_HOST_SYS
% stardev
 

the definition of $OPTSTAR will change to the version dependent structure, adding $STAR_HOST_SYS to the path definition (the simple presence of the layer made the login script redefine it).

 

Changing platform or compiler

32 bits versus 64 bits

If you want to support native 64 bits on 64 bits, do not forget to pass/force -m64 -fPIC to the compiler and -m64 to the linker. If however you want to build a cross platform (64 bit/32 bit kernels compatible) executables and libraries, you will on the contrary need to force -m32 (use -fPIC). Even if you build the packages from a 32 bit kernel node, be aware that many applications and package save a setup including compilation flags (which will have to be using -m32 if you want a cross platform package). There are many places below were I do not specify this.

Often, using CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" LDFLAGS="-m32" would do the trick for a force to 32 bits mode (similarly for -m64). You need to use such option for libraries and packages even if you assemble on a 32 bits kernel node as otherwise, the package may build extension later not compatible as cross-platform support.

Other GCC versions

As for the 32 bits versus 64 bits, often adding something like CC=`which gcc` and CXX=`which g++` to either the configure or make command would do the trick. If not, you will need to modify the Makefile accordingly. You may also define the environment variable CC for consistency.

Summary

If ylu do have a 64 bits kernel and intend to compile both 32 bits and 64 bits, you should define the envrionment variable as shown below.  The variables will make configure (and some Makefile) pick the proper flags and make your life much easier - follow the specific instructions for the packages noted in those instructions for specific tricks. Note as well that even if you do have a 32 bits kernel only, you are encouraged to use the -m32 compilation option (this will make further integration with dual 32/64 bits support smoother as some of the packages configurations include compiler path and options).

32 bits

% setenv CFLAGS   "-m32 -fPIC"
% setenv CXXFLAGS "-m32 -fPIC"
% setenv FFLAGS   "-m32 -fPIC"
% setenv FCFLAGS  "-m32 -fPIC"
% setenv LDFLAGS  "-m32"
% setenv CC  `which gcc`     # only if you use a different compiler than the system default
% setenv CXX `which g++`     # only if you use a different compiler than the system default

and/or pass to Makefile and/or configure the arguments CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" LDFLAGS="-m32" CC=`which gcc` CXX=`which g++` (will not hurt to use it in addition of the environment variables)

64 bits

% setenv CFLAGS   "-m64 -fPIC"
% setenv CXXFLAGS "-m64 -fPIC"
% setenv FFLAGS   "-m64 -fPIC"
% setenv FCFLAGS  "-m64 -fPIC"
% setenv LDFLAGS  "-m64"
% setenv CC  `which gcc`     # only if you use a different compiler than the system default
% setenv CXX `which g++`     # only if you use a different compiler than the system default

and/or pass to Makefile and/or configure the arguments CFLAGS="-m64 -fPIC" CXXFLAGS="-m64 -fPIC" LDFLAGS="-m64" CC=`which gcc` CXX=`which g++` (will not hurt to use it in addition of the environment variables)

 

Software repository directory - starting a build

In the instructions below, greyed instructions are historical instructions and/or package version which no longer reflects the current official STAR supported platform. However, if you try to install the STAR software under older OS, refer carefully to the instructions and package versions.

perl

The STAR envrionment and login scripts heavily rely on perl for string manipulation, compilation management and a bunch of utility scripts. Assembling it from the start is essential. You may rely on your system-wide installed perl version BUT if so, note that the minimum version indicated below IS required.

In our software repository path, you will find a perl/ sub-directory containing all packages and modules.

The package and minimal version are below
		perl-5.6.1.tar.gz      -- Moved to 5.8.0 starting from RH8
perl-5.8.0.tar.gz -- Solaris and True64 upgraded 2003
perl-5.8.4.tar.gz -- 2004, Scientific Linux
perl-5.8.9.tar.gz -- SL5+ perl-5.10.1.tar.gz -- SL6+

When building perl

  • Use all default arguments BUT when you are asked for compilation / linker args, add -m32 or -m64 depending on the platform support you are building. Those questions are (example for the 32 bits version):
    • Any additional cc flags? []  -fPIC -m32
    • Any additional ld flags (NOT including libraries)? [] -m32
    • Any special flags to pass to cc -c to compile shared library modules? []  -fPIC -m32
    • Any special flags to pass to cc to create a dynamically loaded library? [-shared -O2] -shared -O2 -m32
    • If you build a 32 bits support on a 64 bit node, you may also answer "no" below but the defalt naswer SHOULD appear as no if you properly passed -m32 as indicated above.
      Try to use maximal 64-bit support, if available? [y] n  <--- you probably did ot pass -m32
      Try to use maximal 64-bit support, if available? [n]    <--- just press return, all is fine
  • when asked for the default prefix for the installation, give the value of $OPTSTAR as answer (or a base path starting with the value of $OPTSTAR wherever appropriate). Questions include
    • Installation prefix to use? (~name ok) [/usr/local]
  • If the build warn you at first that the directory does not exists but proceed - to questions like "Use that name anyway?" answer Yes


After installing perl itself, you will need to install the STAR required module.

The modules are installed using a bundle script (install_perlmods located in /afs/rhic.bnl.gov/star/common/bin/). It needs some work to get it generalized but the idea is that it contains the dependencies and installation order . To install, you can do the following (we assume install_perlmods is in the path for simplicity and clarity):
 

  1. first chose a work place where you would unpack the needed modules. Let's assume this is /home/xxx/myworkplace
  2. Check things out by running install_perlmods with arguments 0 as follow
    % install_perlmods 0 /home/xxx/myworkplace
    It will tell you the list of modules you need to unpack. If they are already unpacked and /home/xxx/myworkplace contains all needed package directories, skip to step 4.
  3. You can unpack manually OR use the command
    % install_perlmods 1 /home/xxx/myworkplace
    to do this automatically. Note that you could have skipped step 2 and do that from the start (if confident enough).
  4. The steps above should have created a file named /home/xxx/myworkplace/perlm-install-XXX.csh where XXX is the OS you are working on. Note that the same install directory may therefore be used for ALL platform on your cluster. However, versionning is not (yet) supported.
    Execute this script after checking its content. It will run (hopefully smoothly) the perl Makerfile.PL and make / make install commands. Note that you could have also used
    % install_perlmods 2 /home/xxx/myworkplace
    and skip step 2 & 3. In this mode, it unpacks and proceeds with compilation. To do only if you have absolute blind faith in the process (I don't and have written those scripts ;-)   ).

Very old note [this used to happen with older perl version]: if typing make, you get the following message

make: *** No rule to make target `<command-line>', needed by `miniperlmain.o'.  Stop.

then you bumped into an old gcc/perl build issue (tending to come back periodically depending on message formats of gcc) and can resolve this by a using any perl version available and running the commands:

% make depend
% perl -i~ -nle 'print unless /<(built-in|command.line)>/' makefile x2p/makefile

This will suppress from the makefile the offending lines and will get you back on your feet.
 

After you install perl, and your setup is local (in /opt/star) you may want to do the following

% cd /opt/star
% ln -s $STAR_HOST_SYS/* .
%
% # prepare for later directories packages will create
% ln -s $STAR_HOST_SYS/share .
% ln -s $STAR_HOST_SYS/include .
% ln -s $STAR_HOST_SYS/info .
% ln -s $STAR_HOST_SYS/etc .
% ln -s $STAR_HOST_SYS/libexec .
% ln -s $STAR_HOST_SYS/qt .
% ln -s $STAR_HOST_SYS/jed .
%

While some of those directories will not yet exist, this will create a base set of directories (without the additional compiler / OS version) supporting future upgrades via the "default" set of directories. In other words, any future upgrade of compilers for example leading to a different  $STAR_HOST_SYS will still lead as well to a functional environment as far as compatibility exists. Whenever compatibility will be broken, you will need of course to re-create a new $STAR_HOST_SYS tree.
At this stage, you should install as much of the libraries in $OPTSTAR and re-address the perl modules later as some depends on installed libraries necessary for the STAR environment to be functional.

 

Others/ [PLEASE READ, SOME PACKAGE MAY HAVE EXCEPTION NOTES]

        Needed on Other platform (but there on Linux). Unless specified 
        otherwise, the packages were build with the default values.
                make-3.80
                tar-1.13
                flex-2.5.4   
                xpm-3.4k
                libpng-1.0.9

                mysql-3.23.43 on Solaris
                mysql-3.23.55 starting from True64 days (should be tried as
                              an upgraded version of teh API)
                              BEWARE mysql-4.0.17 was tried and is flawed.
                              We also use native distribution MySQL
                mysql-4.1.22  *** IMPORTANT *** Actually this was an upgrade 
                              on SL4.4 (not necessary but the default 4.1.20 
                              has some bugs) 

                <gcc-2.95.2>
                dejagnu-1.4.1	 
                gdb-5.2
                texinfo-4.3
                emacs-20.7 

                findutils-4.1
                fileutils-4.1
                cvs-1.11       -- perl is needed before hand as it folds
                               it in generated scripts
                grep-2.5.1a    Started on Solaris 5.9 in 2005 as ROOT would complain 
                               about too old version of egrep 


This may be needed if not installed on your system. It is part of a needed
autoconf/automake deployment.
                m4-1.4.1		
                autoconf-2.53  
                automake-1.6.3
		
Linux only
                valgrind-2.2.0
valgrind-3.2.3 (was for SL 4.4 until 2009)
valgrind-3.4.1 SL4.4 General/ The installed packages/sources for diverse software layers. The order of installation was ImageMagick-5.4.3-9 On RedHat 8+, not needed for SL/RHE but see below ImageMagick-6.5.3-10 Used on SL5 as default revision is "old" (6.2.8) - TBC slang-1.4.5 On RedHat 8+, ATTENTION: not needed for SL/RHE, install RPM lynx2-8-2 lynx2-8-5 Starting from SL/RHE xv-3.10a-STAR Note the post-fix STAR (includes patch and 32/64 bits support Makefile) nedit-5.2-src ATTENTION: No need on SL/RHE (installed by default) [+] pythia5 pythia6 text2c icalc dejagnu-1.4.1 Optional / Dropped from SL3.0.5
gdb-5.1.91 For RH lower versions - Not RedHat , 8+
gdb-6.2 (patched) Done for SL3 only (do not install on others)
gsl-1.13 Started from SL5 and back ported to SL4 gsl-1.16 Update for SL6 chtext jed-0.99-16 jed-0.99.18 Used from SL5+ jed-0.99.19 Used in SL6/gcc 4.8.2 (no change in instructions) qt-x11-free-3.1.0
qt-x11-free-3.3.1 Starting with SL/RHE
[+] qt-x11-opensource-4.4.3 Deployed from i386_sl4 and i386_sl305 (after dropping SL3.0.2), SL5 qt-everywhere-opensource-src-4.8.5 Deployed from SL6 onward qt-everywhere-opensource-src-4.8.7 Deployed on SL6/gcc 4.8.2 (latest 4.8.x release) doxygen-1.3.5
doxygen-1.3.7 Starting with SL/RHE
doxygen-1.5.9 Use this for SL5+ - this package has a dependence in qt Installed native on SL6 Python 2.7.1 Started from SL4.4 and onward, provides pyROOT support Python 2.7.5 Started from SL6 onward, provides pyROOT support pyparsing V1.5.5 SL5 Note: "python setup.py install" to install pyparsing V1.5.7 SL6 Note: "python setup.py install" to install setuptools 0.6c11 SL5 Note: sh the .egg file to install setuptools 0.9.8 SL6 Note: "python setup.py install" to install MySQL-python-1.2.3 MySQL 14.x client libs compatible virtualenv 1.9 SL6 Note: "python setup.py install" to install Cython-0.24 SL6 Note: "python setup.py build ; python setup.py install" pyflakes / pygments {TODO} libxml2 Was used only for RH8.0, installed as part of SL later [+] libtool-1.5.8 This was used for OS not having libtool, Use latest version.
libtool-2.4 Needed under SL5 64 bits kernel (32 bits code will not assemble otherwise). This was re-packaged with a patch. Coin-3.1.1 Coin 3D and related packages Coin-3.1.3 ... was used for SL6/gcc 4.8.2 + patch (use the package named Coin-3.1.3-star) simage-1.7a SmallChange-1.0.0a SoQt-1.5.0a astyle_1.15.3 Started from SL3/RHE upon user request
astyle_1.19 SL4.4 and above
astyle_1.23 SL5 and above astyle_2.03 SL6 and above unixODBC-2.2.6 (depend on Qt) Was experimental Linux only for now.
unixODBC-2.3.0 SL5+, needed if you intend to use DataManagement tools MyODBC-3.51.06 Was Experimental on Linux at first, ignore this version
MyODBC-3.51.12 Version for SL4.4 (needed for mysql 4.1 and above)
mysql-connector-odbc-3.51.12 <-- Experimental package change - new name starting from 51.12. BEWARE.
mysql-connector-odbc-5.x SL5+. As above, only if you intend to use Data Management tools boost Experimental and introduced in 2010 but not used then boost_1_54_0 SL6+ needed log4cxx 0.9.7 This should be general, initial version
log4cxx 0.10.0 Started at SL5 - this is now from Apache apr-1.3.5 and depend on the Apache Portable Runtime (apr) package apr-util-1.3.7 which need to be installed BEFORE log4cxx and in the order expat-1.95.7 showed valkyrie-1.4.0 Added to SL3 as a GUI companion to valgrind (requires Qt3) Not installed in SL5 for now (Qt4 only) so ignore fastjet-2.4.4 Started from STAR Library version SL11e, essentially for analysis fastjet-3.0.6 SL6 onward unuran-1.8.1 Requested and installed from SL6+ LHAPDF-6.1.6 Added after SL6.4, gcc 4.8.2 In case you have problems emacs-24.3 Installed under SL6 as the default version had font troubles vim-7.4 Update made under SL6.4, please prefer RPM if possible Not necessary (installed anyway) chksum pine4.64 Added at SL4.4 as removed from base install at BNL Retired xemacs-21.5.15 << Linux only -- This was temporary and removed Other directories are WorkStation/ contains packages such as OpenAFS or OpenOffice Linux WebServer/ mostly perl modules needed for our WebServer Linux/ Linux specific utilities (does not fit in General) or packages tested under Linux only. Some notes about packages : Most of them are pretty straight forward to install (like ./configure make ; make install (changing the base path /usr/local to $OPTSTAR). With configure, this is done using either ./configure --prefix=$OPTSTAR ./configure --prefix=$XOPTSTAR Specific notes follows and include packages which are NOT yet official but tried out. - Beware that the Msql-Mysql-modules perl module requires a hack I have not quite understood yet how to make automatic (the advertized --blabla do not seem to work) on platforms supporting the client in OPTSTAR INC = ... -I$(XOPTSTAR)/include/mysql ... H_FILES = $(XOPTSTAR)/include/mysql/mysql.h OTHERLDFLAGS = -L$(XOPTSTAR)/lib/mysql LDLOADLIBS = -lmysqlclient -lm -lz - GD-2+ Do NOT select support for animated GIF. This will fail on standard SL distributions (default gd lib has no support for that).


ImageMagick

Really easy to install (usual configure / make / make install) but however, the PerlMagick part should be installed separatly (the usual perl module way i.e. cd to the directory, perl Makefile.PL and make / make install). I used the distribution's module. Therefore, that perl-module is not in perl/Installed/ as the other perl-modules. The copy of PerlMagick to /bin/ by default will fail so you may want to additionally do

% make install-info
% make install-data-html

which comes later depending on version.
 

lynx

- lynx2-8-2 / lynx2-8-5 
  Note: First, I tried lynx2-8-4 and the make file / configure
        is a real disaster. For 2-8-2/2-8-5, follow the notes 
        below

  General :
  %  ./configure --prefix=$XOPTSTAR {--with-screen=slang}

  Do not forget to
  % make install-help
  % make install-doc

 caveat 1 -- Linux (lynx 2-8-2 only, fixed at 2-8-5)

  $OPTSTAR/lib/lynx.cfg was modified as follow
96,97c96,97
< #HELPFILE:http://www.crl.com/~subir/lynx/lynx_help/lynx_help_main.html
< HELPFILE:file://localhost/opt/star/lib/lynx_help/lynx_help_main.html
---
>
HELPFILE:http://www.crl.com/~subir/lynx/lynx_help/lynx_help_main.html > #HELPFILE:file://localhost/PATH_TO/lynx_help/lynx_help_main.html

   For using curses (needed under Linux, otherwise, the screen looks funny), 
   one has to do a few manipulation by hand i.e. 
   . start with ./configure --prefix=$XOPTSTAR --with-screen=slang
   . edit the makefile and add -DUSE_SLANG to SITE_DEFS
   . change CPPFLAGS from /usr/local/slang to $OPTSTAR/include [when slang is local]
     Version 2-8-5 has this issue fixed.
   . Change LIBS -lslang to -L$OPTSTAR/lib -lslang
   . You are ready now
   There is probably an easier way but as usual, I gave up after close
   to 15mnts reading, as much struggle and complete flop at the end ..

 caveta 2 -- Solaris/True64 : 
   We did not build with slang but native (slang screws colors up)

 

text2c, chksum, chtext, icalc

Those packages can be assembled simply by using the following command:

% make clean && make install PREFIX=$OPTSTAR

To build a 32 bits versions of the executable under a 64 bits kernel, use

  • text2c:             % make CC=`which gcc` CFLAGS="-lfl -m32"
  • icalc:                % make CC=`which gcc` CFLAGS="-lm -m32"
  • chksum:           % make CC=`which gcc` CFLAGS="-m32 -trigraphs"
  • chtext:             % make CC=`which gcc` CFLAGS="-lfl -m32"

 

xv-3.10a

This package distributed already patched and in principle, only a few 'make' commands should suffice. Note

  • xv is licensed so the usage as to remain stricly for your users' amusement only. If you use this package for doing any work, you are violating the law. Please, read the license agreement at http://www.trilon.com/xv/pricing.html 

Normal build

Now,  you should be ready to build the main program (I am not sure why some depencies fail on some platform and did not bother to fix).

% cd tiff/
% make clean && make
% cd ../jpeg
% make clean && make
% cd ..
% rm -f *.o  && make
% make -f Makefile.gcc64 install BINDIR=$OPTSTAR/bin

For 32 bits compilation under a 64 bits kernel

% cd tiff/
% make clean && make CC=`which gcc` COPTS="-O -m32"
% cd ../jpeg
% make clean && make CC=`which gcc` CFLAGS="-O -I. -m32" LDFLAGS="-m32"
% cd ..
% rm -f *.o   && make -f Makefile.gcc32
% make -f Makefile.gcc32 install BINDIR=$OPTSTAR/bin

Makefile.gcc32 and Makefile.gcc64 are both provided for commodity.

Building from scratch (good luck)

However, if you need to re-generate the makefile (may be needed for new architectures), use

% xmkmf 

Then, the patches is as follow

% sed "s|/usr/local|$OPTSTAR|" MakeFile >Makefile.new
% mv Makefile.new Makefile

and in xv.h, line 119 becomes

# if !defined(__NetBSD__) && ! defined(__USE_BSD) 

After xmkmf, you will need to

% make depend

before typing make. This may generate some warnings. Ignore then.

However, I had to fix diverse caveats depending on situations ...

Caveat 1 - no tiff library found

Go into the tiff/ directory and do

% cd tiff  % make -f Makefile.gcc   % cd ..

to generate the mkg3states (itself creating the g3states.h file) as it did not work.

Caveat 2 - tiff and gcc 4.3.2 in tiff/

With gcc 4.3.2 I created an additional .h file named local_types.h and force the definition of a few of the u_* types but using define statements (I know, it is bad). The content of that file is as follows

#ifndef _LOCAL_TYPES_
#define _LOCAL_TYPES_

#if !defined(u_long)
# define u_long unsigned long
#endif
#if !defined(u_char)
# define u_char unsigned char
#endif
#if !defined(u_short)
# define u_short unsigned short
#endif
#if !defined(u_int)
# define u_int unsigned int
#endif

#endif

and it needs to be included in tiff/tif_fax3.h and tiff/tiffiop.h .

Caveat 3 -- no jpeg library?

 In case you have a warning about jpeg such as No rule to make target `libjpeg.a', do the following as well:

% cd jpeg
% ./configure
% make
% cd ..

 

Nedit

There is no install provided. I did

% make linux
% cp source/nc source/nedit $OPTSTAR/bin/
% cp doc/nc.man $OPTSTAR/man/man1/nc.1
% cp doc/nedit.man $OPTSTAR/man/man1/nedit.1

Other targets

% make dec
% make solaris

If you need to build for another compiler or another platform, you may want to copy one of the provided makefile and modify them to create a new target. For example, if you have a 64 bits kernel but want to build a 32 bits nedit (for consistency or otherwise), you could do this:

% cp makefiles/Makefile.linux makefiles/Makefile.linux32

then edit and add -m32 to bothe CFLAGS and LIBS. This will add a target "platform" linux32 for a make linux32 command (tested this and worked fine). The STAR provided package added (in case) both a linux64 and a linux32 reshaped makefile to ensure easy install for all kernels (gcc compiler should be recent and accept the -m flag).

 

Pythia libraries

The unpacking is "raw". So, go in a working directory where the .tar.gz are, and do the following (for linux)

% test -d Pythia && rm -fr Pythia ; mkdir Pythia && cd Pythia && tar -xzf ../pythia5.tar.gz 
% ./makePythia.linux 
% mv libPythia.so $OPTSTAR/lib/ 
% cd .. 
% 
% test -d Pythia6 && rm -fr Pythia6 ; mkdir Pythia6 && cd Pythia6 && tar -xzf ../pythia6.tar.gz 
% test -e main.c && rm -f main.c 
% ./makePythia6.linux 
% mv libPythia6.so $OPTSTAR/lib 
% 

Substitute linux with solaris for Solaris platform. On Solaris, Pythia6 requires gcc to build/link.

On SL5, 64 bits note

Depending on whether you compile a native 64 bit library support or a cross-platform 32/64, you will need to handle it differently.

For a 64 bits platform, I had to edit the makePythia.linux and  -fPIC to the options for a so the binaries main.c . I did not provide a patched package mainly because v5 is not really needed in STAR. For pythia6 caveat: On SL5, 64 bits, use makePythia6.linuxx8664 file. You will need to chmod +x first as it was not executable in my version.

On 64 bit platform to actually build a cross-platform version, I had instead to use the normal build but make sure to add -m32 to compilation and linker options and -fPIC to compilation option.

 

True64

% chmod +x ./makePythia.alpha && ./makePythia.alpha Pythia6
% chmod +x ./makePythia6.alpha && ./makePythia6.alpha 

The following script was used to split the code which was too big

 #!/usr/bin/env perl
 $filin = $ARGV[0];
 open(FI,$filin);
 $idx = $i = 0;
 while( defined($line = <FI>) ){
    chomp($line); $i++;

    if ($i >= 500 && $line =~ /subroutine/){
	$i = 0;
	$idx++;
    }

    if ($i == 0){
	close(FO);
	open(FO,">$idx.$filin");
	print "Opening $idx.$filin\n";
    }
    print FO "$line\n";
 }
 close(FO);
 close(FI);

 

Qt 4

Starts the same than Qt3 i.e. assuming that SRC=/afs/rhic.bnl.gov/star/common/General/Sources/ and $x and $y stands for major and minor versions of Qt. There are multiple flavors of the package name (it was called qt-x11-free* then qt-x11-opensource* and with more recent package qt-everywhere-opensource-src*). For the sake of instructions, I provide a generic example with the most recent naming (please adapt as your case requires). WHEREVER is a location of your choice (not the final target directory).

% cd $WHEREVER
% tar -xzf $SRC/qt-everywhere-opensource-src-4.$x.$y.tar.gz
% cd qt-everywhere-opensource-src-4.$x.$y
% ./configure --prefix=$XOPTSTAR/qt4.$x -qt-sql-mysql -no-exceptions -no-glib -no-rpath 

To build a 32/64 bits version on a 64 bits OS or forcing a 32 bits exec (shared mode) on a 32 bits OS, use a configure target like the below

% ./configure  -platform linux-g++-32 -mysql_config $OPTSTAR/bin/mysql_config [...] 
% ./configure  -platform linux-g++-64 [...]

Note that the above assumes you have a proper $OPTSTAR/bin/mysql_config. ON a mixed 64/32 bits node, the default in /usr/bin/mysql_config will return the linked library as the /usr/lib64/mysql path and not the /usr/lib/mysql and hence, Qt make will fail finding the dependencies necessary to link with -m32. The trick we had was to copy mysql_config and replace lib64 by lib and voila!.


Compiling

  % make
  % make install

  % cd $OPTSTAR
  % ln -s qt4.$x ./qt4

For compiling with a different compiler, note that the variables referenced in this section will be respected by configure. You HAVE TO do this as the project files and other configuration files from Qt will include information on the compiler (inconsistency may arrise otherwise).

Misc notes

  • If you use the same directory tree for compiling the 64 bits and the 32 bits version, please note that 'make clean' will not do the proper job. You will need to use a more systematic % find . -name '*.o' -exec rm -f {} \;  command before running  ./configure again.
  • We added mysql support in Qt4 and Qt can now be compiled in a separate directory and installed properly (at last!). If the mysql support gives you trouble on a 64 bit OS attempting to build a 32 bit image, be sure you have used as indicated above the -mysql_config /usr/lib/mysql/mysql_config option as otherwise, the default mysql_config will be picked from /usr/bin and that version will refer to the 64 bits  libraries (the link will then fail). 
  • For SL44, we created a qt3 distribution as then, both 3 and 4 existed. Otherwise, the ./qt4 link as indicated above is sufficient for SL5 and above.
  • On some systems (SL3.0.2 for sure), I also used
    • -no-openssl   as there were include problems with ssl.h and krb5.h
    • -qt-libtiff   as the default system included header did not agree with Qt code
    • -platform linux-icc  could be used for icc based compiler

 

Qt 3

Horribly packaged, the easiest is to unpack in $OPTSTAR, cd to qt-x11-free-3.X.X (where X.X stands for the current sub-version deployed on our node), run the configure script, make the package, then make clean. Then, link

  % cd $OPTSTAR && ln -s qt-x11-free-3.X.X qt

Later release can be build that way with changing the soft-link without removing the preceeding version entirely. Before building, do the following (if you had previous version of Qt installed). This is not necessary if you install the package the first time around. Please, close windows after compilation to ensure STAR path sanity.

  % cd $OPTSTAR/qt
% setenv QTDIR `pwd`
% setenv LD_LIBRARY_PATH `pwd`/lib:$LD_LIBRARY_PATH
% setenv PATH `pwd`/bin:$PATH

To configure the package, then use one of:

  • Linux gcc: ./configure --prefix=`pwd` -no-xft -thread
  • Linux icc:  ./configure --prefix=`pwd` -no-xft -thread -platform linux-icc
  • True64 :   ./configure --prefix=`pwd` -no-xft -thread
  • Solaris:     ./configure --prefix=`pwd` -no-xft

In case of thread, the regular version is build first then the threaded version (so far, they have different names and no Soft links).

You may also want to edit  $QTDIR/mkspecs/default/qmake.conf and replace the line

QMAKE_RPATH		= -Wl,-rpath,

by

QMAKE_RPATH		= 

By doing so, you would disable the rpath shared library loading and rely on LD_LIBRARY_PATH only for loading your Qt related libraries. This has the advantages that you may copy the Qt3 libraries along your project and isolate onto a specific machine without the need to see the original installation directory.

 

unixODBC

% ./configure --prefix=$XOPTSTAR [CC=icc CXX=icc]
% make clean       # in case you are re-using the same directory for multiple platform 
% make 
% make install

Use the environment variables noted in this section and all will go well.

Note on versions earlier than 2.3.0 (including 2.2.14 previously suggested)

The problem desribed below DOES NOT exist if you use 32 bits kernel OS and is specific to 64 bits kernel with 32 bits support.

For a 32 bits compilation under a 64 bits kernel, please use % cp -f $OPTSTAR/bin/libtool .  after the ./configure and before the make (see this section for an explaination of why). unixODBC versions 2.3.0 does not have this problem.



MyODBC

Older version

Came with sources and one could compile "easily" (and register manually).

- MyODBC
Linux % ./configure --prefix=$XOPTSTAR --with-unixODBC=$XOPTSTAR [CC=icc CXX=icc]
Others % ./configure --prefix=$XOPTSTAR --with-unixODBC=$XOPTSTAR --with-mysql-libs=$XOPTSTAR/lib/mysql
--with-mysql-includes=$XOPTSTAR/include/mysql --with-mysql-path=$XOPTSTAR

Note : Because of an unknown issue, I had to use --disable-gui on True64
as it would complain about not finding the X include ... GUI is
not important for ODBC client anyway but whenever time allows ...

Deploy instructions at
http://www.mysql.com/products/myodbc/faq_toc.html

Version 5.x of the connector

Get the proper package, currently named mysql-connector-odbc-5.x.y-linux-glibc2.3-x86-32bit or  mysql-connector-odbc-5.x.y-linux-glibc2.3-x86-64bit. the package are available from the MySQL Web site. The install will need to be manual i.e.

% cp -p bin/myodbc-installer $OPTSTAR/bin/
% cp -p lib/*  $OPTSTAR/lib/
% rehash

To register the driver, use the folowing command

% myodbc-installer -d -a -n "MySQL ODBC 5.1 Driver" -t "DRIVER=$OPTSTAR/lib/libmyodbc5.so;SETUP=$OPTSTAR/lib/libmyodbc3S.so"
% myodbc-installer -d -a -n "MySQL" -t "DRIVER=$OPTSTAR/lib/libmyodbc5.so;SETUP=$OPTSTAR/lib/libmyodbc3S.so"

this will add a few lines in $OPTSTAR/etc/odbcinst.ini . The  myodbc-installer -d -l does not seem to be listing what you installed though (but the proper lines will be added to the configuration).

 

doxygen

Installation would benefit from some smoothing + note the space between the --prefix and OPTSTAR (non standard option set for configure).

Use one of

% ./configure --prefix $OPTSTAR                       # for general compilation
% ./configure --platform linux-32 --prefix $OPTSTAR   # Linux, gcc 32 bits - this option was added in the STAR package
% ./configure --platform linux-64 --prefix $OPTSTAR   # Linux, gcc 64 bits - this option was fixed in the STAR package

then

% make
% make install

as usual but also

% make docs

which will fail d ue to missing eps2pdf program. Will create however the HTML files you will need to copy somewhere.

% cp -r html $WhereverTheyShouldGo

and as example

% cp -r html /afs/rhic.bnl.gov/star/doc/www/comp/sofi/doxygen


Note: The linux-32 and linux-64 platform were packaged in the archive provided for STAR (linux-32 does not exists in the original doxygen distribution while linux-64 is not consistent with -m64 compilation option).

 

Additional Graphics libraries

Starting from SL5, we also deployed the following: coin, simage, SmallChange, SoQt. Those needs to be installed before Qt4 but after doxygen. All options are specified below to install those packages. Please, substitute -m32 by -m64 for a 64 bits native support. After the configure, the usual make and make install is expected.

The problem desribed below DOES NOT exist if you use 32 bits kernel OS and is specific to 64 bits kernel with 32 bits support.

For the 32 bits version compilation under a 64 bits kernel and for ALL sub-packages below, please be sure you have the STAR version of libtool installed and use the command
% cp -f $OPTSTAR/bin/libtool .
after the ./configure to replace the generated local libtool script. This will correct a link problem which will occur at link time (see the libtool help for more information).

 

Coin:

% ./configure --enable-debug --disable-dependency-tracking --enable-optimization=yes \
--prefix=$XOPTSTAR CFLAGS="-m32 -fPIC -fpermissive" CXXFLAGS="-m32 -fPIC -fpermissive" LDFLAGS="-m32 -L/usr/lib" \
--x-libraries=/usr/lib

or, for or the 64 bits version

% ./configure --enable-debug --disable-dependency-tracking --enable-optimization=yes \
--prefix=$XOPTSTAR CFLAGS="-m64 -fPIC -fpermissive" CXXFLAGS="-m64 -fPIC -fpermissive" LDFLAGS="-m64" 

 

simage (needs Qt installed and QTDIR defined prior):

% ./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \ 
--enable-optimization=yes --enable-qimage CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" \ 
LDFLAGS="-m32" FFLAGS="-m32 -fPIC" --x-libraries=/usr/lib

or, for the 64 bits version

% ./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \ 
--enable-optimization=yes --enable-qimage CFLAGS="-m64 -fPIC" CXXFLAGS="-m64 -fPIC" \ 
LDFLAGS="-m64" FFLAGS="-m64 -fPIC" 

SmallChange:

% ./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \
--enable-optimization=yes CFLAGS="-m32 -fPIC -fpermissive" CXXFLAGS="-m32 -fPIC -fpermissive" \
LDFLAGS="-m32" FFLAGS="-m32 -fPIC"

or, for the 64 bits version

% ./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \
--enable-optimization=yes CFLAGS="-m64 -fPIC -fpermissive" CXXFLAGS="-m64 -fPIC -fpermissive" \
LDFLAGS="-m64" FFLAGS="-m64 -fPIC"

SoQt:

./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \ 
--enable-optimization=yes --with-qt=true --with-coin CFLAGS="-m32 -fPIC  -fpermissive" CXXFLAGS="-m32 -fPIC  -fpermissive" \
LDFLAGS="-m32" FFLAGS="-m32 -fPIC"

or, for the 64 bits version

./configure --prefix=$XOPTSTAR --enable-threadsafe --enable-debug --disable-dependency-tracking \ 
--enable-optimization=yes --with-qt=true --with-coin CFLAGS="-m64 -fPIC  -fpermissive" CXXFLAGS="-m64 -fPIC -fpermissive" \
LDFLAGS="-m64" FFLAGS="-m64 -fPIC"

 

 

flex

Flex is usually not needed but some OS have pre-GNU flex not adequate so I would recommend to deploy flex-2.5.4  anyway (the latest version since Linux 2001). Do not install under Linux if you have flex already on your system as rpm.

Attention: Under SL5 64 bits, be sure you have flex32libs-2.5.4a-41.fc6 installed as documented on Scientific Linux 5.3 from 4.4. Linkage of 32 bits executable would otherwise dramatically fail.

 

- Xpm (Solaris)
  % xmkmf
  % make Makefiles
  % make includes
  % make 
  I ran the install command by hand changing the path (cut and paste)
  Had to 


  % cd lib
  % installbsd -c -m 0644 libXpm.so $OPTSTAR/lib
  % installbsd -c -m 0644 libXpm.a $OPTSTAR/lib
  % cd ..
  % cd sxpm/
  % installbsd -c sxpm $OPTSTAR/bin
  % cd ../cxpm/
  % installbsd -c cxpm $OPTSTAR/bin
  %
  
  Onsolaris, the .a was not there, add to
  % cd lib && ar -q libXpm.a *.o && cp libXpm.a $OPTSTAR/lib
  % cd ..

  Additionally needed
  % if ( ! -e $OPTSTAR/include) mkdir $OPTSTAR/include 
  % cp lib/xpm.h $OPTSTAR/include/

  

- libpng 
  ** Solaris **
  % cat scripts/makefile.solaris | sed "s/-Wall //" > scripts/makefile.solaris2
  % cat scripts/makefile.solaris2 | sed "s/gcc/cc/" > scripts/makefile.solaris3
  % cat scripts/makefile.solaris3 | sed "s/-O3/-O/" > scripts/makefile.solaris2
  % cat scripts/makefile.solaris2 | sed "s/-fPIC/-KPIC/" > scripts/makefile.solaris3
  % 
  % make -f scripts/makefile.solaris3

  will eventually fail related to libucb. No worry, this can be sorted
  out (http://www.unixguide.net/sun/solaris2faq.shtml) by including
  /usr/ucblib in the -L list
  % cc -o pngtest -I/usr/local/include -O pngtest.o -L. -R. -L/usr/local/lib \
    -L/usr/ucblib -R/usr/local/lib -lpng -lz -lm
  % make -f scripts/makefile.solaris3 install prefix=$OPTSTAR


  ** True64 **
  Copy the make file but most likely, a change like
ZLIBINC = $(OPTSTAR)/include
ZLIBLIB = $(OPTSTAR)/lib

  in the makefile is neeed.
 
  pngconf.h and png.h needed for installation and either .a or .a + .so

cp pngconf.h png.h $OPTSTAR/include/
cp libpng.* $OPTSTAR/lib



- mysql client (Solaris)
 % ./configure --prefix=$XOPTSTAR --without-server {--enable-thread-safe-client}
 (very smooth)
 The latest option is needed to create the libmysqlclient_r needed by some
 applications. While this so is build with early version of MySQL, version
 4.1+ requires the configure option explicitly.


- dejagnu-1.4.1	[Solaris specific]
the install program was not found.
% cd doc/ && cp ./runtest.1 $OPTSTAR/man/man1/runtest.1
% chmod 644 $OPTSTAR/man/man1/runtest.1

Jed

The basic principles is as usual

% ./configure --prefix=$OPTSTAR
% make
% make xjed
% make install

However, on some platform (but this was not seen as a problem on SL/RHE), you may need to apply the following tweak before typing make. Edit the configure script and add $OPTSTAR (possibly /opt/star) to it as follow.

JD_Search_Dirs="$JD_Search_Dirs \
                $includedir,$libdir \
                /opt/star/include,/opt/star/lib \
                /usr/local/include,/usr/local/lib \
                /usr/include,/usr/lib \
                /usr/include/slang,/usr/lib \
                /usr/include/slang,/usr/lib/slang" 

32 / 64 bit issue?

The problem desribed below DOES NOT exist if you use 32 bits kernel OS and is specific to 64 bits kernel with 32 bits support.

The variables described here will make configure pick up the right comiler and compiler options. On our initial system, the 32 bits compilation under the 64 bits kernel Makefile tried to do something along the line of -L/usr/X11R6/lib64 -lX11 but did not find X11 libs (since the path is not adequate). To correct for this problem, edit src/Makefile and replace XLIBDIR = -L/usr/lib64 by XLIBDIR = -L/usr/lib . You MUST have the 32 bits compatibility libraries installed on your 64 bits kernel for this to work.

AIX

I had to make some hack on AIX (well, who wants to run on AIX in the first place right ?? but since AIX do not have any emacs etc ... jed is great) as follow

  • make a copy of unistd.h and comment the sleep() prototype
  • modify file.c to include the local version (replace <> by "")
  • modify main.c to include sys/io.h (and not io.h) and comment out direct.h

Voila (works like a charm, don't ask).

 
emacs

Version 24.3

In the below options, I recommend the with-x-toolkit=motif as the default GTK will lead to many warnings depending on the user's X11 server version and supported features. Motif may create an "old look and feel" but will work. However, you may have a local fix for GTK (by installing all required modules and dependencies) and not need to go to the Motif UI.

% ./configure --with-x-toolkit=motif --prefix=$OPTSTAR

For the 32 bits version supporting 64/32 bits, use the below

% ./configure --with-crt-dir=/usr/lib --with-x-toolkit=motif --prefix=$OPTSTAR CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" LDFLAGS="-m32"

Then the usual 'make' and 'make install'.

Below are old instructions you should ignore

- emacs
  Was repacked with leim package (instead of keeping both separatly)
  in addition of having a patch in src/s/sol2.h for solaris as follow
 #define HAVE_VFORK 1
 #endif
 
+/* Newer versions of Solaris have bcopy etc. as functions, with
+   prototypes in strings.h.  They lose if the defines from usg5-4.h
+   are visible, which happens when X headers are included.  */
+#ifdef HAVE_BCOPY
+#undef bcopy
+#undef bzero
+#undef bcmp
+#ifndef NOT_C_CODE
+#include <strings.h>
+#endif
+#endif
+

  Nothing to do differently here. This is just a note to keep track
  of changes found from community mailing lists.

  % ./configure --prefix=$OPTSTAR --without-gcc
  

- Xemacs (Solaris)
  % ./configure --without-gcc --prefix=$OPTSTAR
  Other solution, forcing Xpm 
  % ./configure --without-gcc --prefix=$OPTSTAR --with-xpm --site-prefixes=$OPTSTAR

  Possible code problem :
  /* #include <X11/xpm.h> */
  #include <xpm.h> 

- gcc-2.95 On Solaris was used as a base compiler
  % ./configure --prefix=$OPTSTAR
  % make bootstrap

  o Additional gcc on Linux
  Had to do it in multiple passes (you do not need to do the first pass
  elsewhere ; this is just because we started without a valid node).

  A gcc version < 2.95.2 had to be used. I used a 6.1 node to assemble
  it and install in a specific AFS tree (cross version)
  % cd /opt/star && ln -s /afs/rhic/i386_linux24/opt/star/alt .
  Move to the gcc source directory
  % ./configure --prefix=/opt/star/alt
  % make bootstrap
  % make install
  install may fail in AFS land. Edit gcc/Makefile and remove "p" option
  to the tar options TAROUTOPTS .
  
  For it work under 7.2, go on a 7.2 node and
  % cp /opt/star/alt/include/g++-3/streambuf.h /opt/star/alt/include/g++-3/streambuf.h-init
  % cp -f /usr/include/g++-3/streambuf.h /opt/star/alt/include/g++-3/streambuf.h
  ... don't ask ...


  o On Solaris, no problems
  % ./configure --prefix=/opt/star/alt
  etc ...

- Compress-Zlib-1.12 --> zlib-1.1.4
  If installed in $OPTSTAR,
  % setenv ZLIB_LIB $OPTSTAR/lib
  % setenv ZLIB_INCLUDE $OPTSTAR/include
  

- findutil
  Needed a patch in lib/fnmatch.h for True64
  as follow :
  + addition of defined(__GNUC__) on line 49
  + do a chmod +rw lib/fnmatch.h  first

#if !defined (_POSIX_C_SOURCE) || _POSIX_C_SOURCE < 2 || defined (_GNU_SOURCE) || defined(__GNUC__)






* CLHEP1.8                      *** Experimental only ***
printVersion.cc needs a correction #include <string> to <string.h>
for True64 which is a bit strict in terms of compilation.

On Solaris, 2 caveats
o gcc was used (claim that CC is used but do not have the include)
o install failed running a "kdir" command instead of mkdir so do a
% make install MKDIR='mkdir -p'

Using icc was not tried and this package when then removed.
- mysqlcc ./configure --prefix=$OPTSTAR --with-mysql-include=$OPTSTAR/include/mysql --with-mysql-lib=$OPTSTAR/lib/mysql The excutable do not install itself so, one needs to % cp -f mysqlcc $OPTSTAR/bin/

 

libtool

First, please note that the package distributed for STAR contains a patch for support of the 32 / 64 bits environment. If you intend to download from the original site, please apply the patch below as indicated. If you do not use our distributed package and attempt to assemble a 32 bits library under a 64 bits kernel, we found cases where the default libtool will fail.

Why the replacement of libtool? Sometimes, "a" version of libtool is added along software packages indicated in this help. However, those do not consider the 32 bits / 64 bits mix and often, their use lead to the wrong linkage (typical problem is that a 32 bits executable or shared library is linked against the 64 bits stdc++ versions, creating a clash).

This problem does not existswhen you assemble a 64 bits code under a 64 bits kernel or assemble a 32 bits codes under a 32 bits kernel.

In all cases, to compile and assemble, use a command line like the below:

% ./configure --prefix=$XOPTSTAR CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" \
FFLAGS="-m32 -fPIC" FCFLAGS="-m32 -fPIC" LDFLAGS="-m32"                        # 32 bits version
% ./configure --prefix=$XOPTSTAR CFLAGS="-m64 -fPIC" CXXFLAGS="-m64 -fPIC" \
FFLAGS="-m64 -fPIC" FCFLAGS="-m64 -fPIC" LDFLAGS="-m64"                        # 64 bits version
% make
% make install

Patches

libtool 2.4

The file ./libltdl/config/ltmain.sh needs the following patch

< 
< 	# JL patch 2010 -->
< 	if [ -z "$m32test" ]; then
< 	    #echo "Defining m32test"
< 	    m32test=$($ECHO "${LTCFLAGS}" | $GREP m32)
<    fi	
< 	if [ "$m32test" != "" ] ; then
< 	  dependency_libs=`$ECHO " $dependency_libs" | $SED 's% \([^ $]*\).ltframework% -framework \1%g' | $SED 's|lib64|lib|g'`
< 	else
< 	  dependency_libs=`$ECHO " $dependency_libs" | $SED 's% \([^ $]*\).ltframework% -framework \1%g'`
< 	fi
< 	# <-- end JL patch
< 
---
> 	dependency_libs=`$ECHO " $dependency_libs" | $SED 's% \([^ $]*\).ltframework% -framework \1%g'`

 

 

gdb (patch)

in gdb/linux-nat.c

         /*
fprintf_filtered (gdb_stdout,
"Detaching after fork from child process %d.\n",
child_pid);
*/

and go (no, I will not explain).

 

astyle

Version 2.03

% cd astyle_2.03/src
% make -f ../build/gcc/Makefile CXX="g++ -m32 -fPIC"   # for the 64 bits version, use the same command
                                                       # with   CXX="g++ -m64 -fPIC" 
% cp bin/astyle $XOPTSTAR/bin/
% test -d $XOPTSTAR/share/doc/astyle || mkdir -p $XOPTSTAR/share/doc/astyle
% cp ../doc/*.* $XOPTSTAR/share/doc/astyle

The target

% make -f ../build/gcc/Makefile clean

also works fine and is needed between versions.


Version 1.23

Directory structure changes but easier to make the package so use instead

% cd astyle_1.23/src/
% make -f ../buildgcc/Makefile  CXX="$CXX $CFLAGS"
% cp ../bin/astyle $OPTSTAR/bin/
% cd .. 

Note that the compressed command above assumes you have define dthe envrionment variables as described in this section. Between OS (32 / 64 bits) you may need to % rm -f obj/* as the make system will not reocgnize the change between kernels (you alternately may make -f ../buildgcc/Makefile clean but a rm will be faster :-) ).

Documentation

A crummy man page was added (will make it better later if really important). It was generted as follow and provided for convenience in the packages for STAR (do not overwrite because I will not tell you what to do to make the file a good pod):

% cd doc/
% lynx -dump astyle.html >astyle.pod 

[... some massage beyond the scope of this help - use what I provided ...]

% pod2man astyle.pod >astyle.man 
% cp astyle.man $OPTSTAR/man/man1/astyle.1 

 

Versions < 1.23

Find where the code really unpacks. There are no configure for this package.

% cd astyle_1.15.3 ! or 
% cd astyle/src
% make
% cp astyle $OPTSTAR/bin/

Version 1.15.3

The package comes as a zip archive. Be aware that unpacking extracts files in the current directory. So, the package was remade for convenience. Although written in C++, this executable will perform as expected under icc environment. On SL4 and for versions, gcc 3.4.3, add -fpermissive to the Makefile CPPFLAGS.

 

valgrind

MUST be installed using $XOPTSTAR because there is an explicit reference to the install path. Copying to a local /opt/star would therefore not work. For icc, use the regular command as this is a self-contained program without C++ crap and can be copied from gcc/icc directory. The command is

% ./configure --prefix=$XOPTSTAR  

Note: valgrind version >= 3.4 may ignore additional compiler options (but will respect the CC and CXX variables) as it will assemble both 32 bits and 64 bits version on a dual architecture platform. You could force a 32 build only by adding the command line options --enable-only32bit.

Caveats for earlier revisions below:

Version 2.2

A few hacks were made on the package, a go-and-learn experience as problems appeared
 

coregrind/vg_include.h
123c123
< #define VG_N_RWLOCKS 5000
---
> #define VG_N_RWLOCKS 500
coregrind/vg_libpthread.vs
195a196
> __pthread_clock_gettime; __pthread_clock_settime;

to solve problems encountered with large programs and pthread.

 

APR

The problem desribed below DOES NOT exist if you use 32 bits kernel OS and is specific to 64 bits kernel with 32 bits support.

For a 32 bits compilation under a 64 bits kernel, please use % cp -f $OPTSTAR/bin/libtool .  after the ./configure and before the make (see this section for an explaination of why) for both the apr and expat packages.

apr is an (almost) straight forward installation:                

% ./configure --prefix=$OPTSTAR

apr-util needs to have one more argument i.e. 

% ./configure --prefix=$OPTSTAR --with-apr=$OPTSTAR

The configure script will respect the environment variables described in this section and, provided you have defined them properly for the intended target (32 or 64 bits executable), the resulting Mkaefile will be properly generated without further modifications needed.

Note however that the package distributed in STAR has one hack to the fconfigure script a follows (apply if you download from anoher source than STAR's distributed packages):

% diff configure.orig configure
4255c4255,4256
<     CFLAGS="-g -O2"
---
>     # CFLAGS="-g -O2"
>     CFLAGS="-g -m32"
4261c4262,4263
<     CFLAGS="-O2"
---
>     # CFLAGS="-O2"
>     CFLAGS="-m32"

This will allow another way to assemble the package (without having to define the env variables) but you will need to substitute -m32 by -m64 as appropriate.

 

expat package is similar

% ./configure --prefix=$OPTSTAR

 

log4cxx

log4cxx 0.10.x

This distribution is part of the Apache project and requires APR library (see above).

The package was taken nearly as-is apart from the following patches:

  • inputstreamreader.cpp, socketoutputstream.cpp, console.cxx    - requires adding #include <string.h> on top as memmove, mencpy are no longer implicit but declared in string.h

After installing APR and using the patches as indicated, use

% ./configure --prefix=$XOPTSTAR --with-apr=$XOPTSTAR CFLAGS="-m32 -fPIC" CXXFLAGS="-fno-inline -g -m32" LDFLAGS="-m32"
or
% ./configure --prefix=$XOPTSTAR --with-apr=$XOPTSTAR CFLAGS="-m64 -fPIC" CXXFLAGS="-fno-inline -g -m64" LDFLAGS="-m64"

% cp -f $OPTSTAR/bin/libtool . 
% make
% make install

Please do NOT forget to use % cp -f $OPTSTAR/bin/libtool .  after the ./configure and before the make (see this section for an explaination of why). This assummes you installed libtool as instructed.

Finally, there is one patch needed if you download the package from other sources than where STAR provides the packages. The patch relates to a problem with atomic operations handling.

Index: src/main/cpp/objectimpl.cpp
===================================================================
--- src/main/cpp/objectimpl.cpp (revision 654826)
+++ src/main/cpp/objectimpl.cpp (working copy)
@@ -36,12 +36,12 @@

void ObjectImpl::addRef() const
{
- apr_atomic_inc32( & ref );
+ ref++;
}

void ObjectImpl::releaseRef() const
{
- if ( apr_atomic_dec32( & ref ) == 0 )
+ if ( --ref == 0 )
{
delete this;
}

 

log4cxx 0.9.5

There is a bug on Linux so, start with commenting all lines related to HAVE_LINUX_ATOMIC_OPERATIONS in configure.in before the below. Finally, two code had to be patched are now repacked

  • filewatchdog.cpp line 21: comment #ifdef WIN32 and the associated #endif
  • cocketimpl.cpp line 41: add #include <errno.h>

For ODBC support, one needs

% setenv CPPFLAGS "-I$XOPTSTAR/include"
% setenv LDFLAGS  "-L$XOPTSTAR/lib"

log4cxx 0.9.7

Also need to do the below or it will not even find the libs at configure.

% setenv CPPFLAGS "-I$XOPTSTAR/include"
% setenv LDFLAGS  "-L$XOPTSTAR/lib"


On Scientific Linux 4.4 aka Linux 2.6 replace as follows

#AC_CHECK_FUNCS(gettimeofday ftime setenv)
AC_CHECK_FUNCS(setenv)

Linux 7.3 distributions note

On Version 7.3 of Linux, this is hard to install. You will need to upgrade m4, autoconf to at the least the versions specified for "other platforms".  It won't compile easily with gcc 2.96 though. But it can using

% ./configure --prefix=$OPTSTAR CC=/usr/bin/gcc3 CXX=/usr/bin/g++3

if you have all gcc 3 ready.

Finally, if you install log4cxx from a new Linux version (especially one having a different version of autoconf tools), you better start from a fresh directory and not attempt to use the 'clean' target (it will fail).

Summay then:

  Linux gcc  (general instructions, all log4cxx) 

% ./autogen.sh 
% ./configure --prefix=$OPTSTAR [--with-ODBC=unixODBC]

  Linux icc

% setup icc 
% ./configure --prefix=$XOPTSTAR CC=icc CXX=icc [--with-ODBC=unixODBC]

   If icc is the second target, you should use 'make clean' before the configure.

  Solaris
 
   Does not configure (need the autoconf tools)

  True64
   Not tried yet

 


libxml2


Platform: so far needed to update RH 8.0 only, add to propagate to other platform in 2006 due to a component dependence issue.

% ./configure --without-python --prefix=$XOPTSTAR


pine


Re-added at BNL since SL4.4 because it was removed from the base installation, this may not be needed for your site (install from RPM, it exists).

Scientific Linux (don't get fooled by the targets)

% ./build lrh 
% cp bin/pine bin/pico $OPTSTAR/bin/

On a mixed of 32 / 64 bits architecture and/or with alternate  gcc versions, the command examples below could be used:

% ./build lrh CC=`which gcc` SSLDIR=/usr/include/openssl/ EXTRALDFLAGS="-m32" EXTRACFLAGS="-m32 -fPIC"
[or]
% ./build lrh CC=`which gcc` SSLDIR=/usr/include/openssl/ EXTRALDFLAGS="-m64" EXTRACFLAGS="-m64 -fPIC"

 

GSL - GNU Scientific Library

The install is straight forward with the usual configure but on 64 bit machine, you will need to add the CCFLAGS and LDFLAGS as showed below

% ./configure --prefix=$OPTSTAR                                    ! default bits 
% ./configure --prefix=$OPTSTAR CFLAGS="-m32 -fPIC" LDFLAGS="-m32" ! 32 bits
% ./configure --prefix=$OPTSTAR CFLAGS="-m64 -fPIC" LDFLAGS="-m64" ! 64 bits
% make
% make install

Python

The below were fine with verison 2.7.1

% setenv BASECFLAGS "-m32 -fPIC"
% setenv CXXFLAGS "-m32 -fPIC"
% setenv LDFLAGS "-m32"
%  ./configure --prefix=$XOPTSTAR

On a mixed architecture, I had to modify the generated pyconfig.h as the use of VA_LIST_IS_ARRAY would get pythong to crash.

1069c1069
< //#define VA_LIST_IS_ARRAY 1
---
> #define VA_LIST_IS_ARRAY 1



For the 64 bits version, please substitute -m32 with -m64  as follows

% setenv BASECFLAGS "-m64 -fPIC"
% setenv CXXFLAGS "-m64 -fPIC"
% setenv LDFLAGS "-m64"
%  ./configure --prefix=$XOPTSTAR

Note: The default compilation (without using the environment variable setting) may succeed but binding with ROOT and other package will fail and require -fPIC and additionally, it is best to have in all configurations -m32/-m64 specified explicitely.

 

MySQL-python

Build is straight forward in principle i.e.

% python setup.py build 
% python setup.py install 

but

  • on a mixed 32/64 bits platform, the 32 bits will need to be persuaded by
    • editing site.cfg and setting mysql_config to the 32 bit version (likely /usr/lib/mysql/mysql_config ) - you can be sure of this as the return value for --cflags would have -m32
    • Add the following code in setup_posix.py
      65,68d64
      <     for i in range(len(extra_compile_args)):
      <         if extra_compile_args[i] == '-m32':
      <             extra_link_args += ['-m32']
      <
      
      I am sure there are other more ellegant ways but this works fine.
       
  • If your default MySQL distribution resides in $OPTSTAR, you will also need to edit the site.cfg script before running the setup script and modify the mysql_config variable accordingly. This will prevent running into symnbol loading conflicts later (usually happening because a third party product would, in this case, use one version of MySQL and STAR code another). 

 

fastjet

The basic compilation requires

./configure --prefix=$OPTSTAR CXXFLAGS="-m32 -fPIC -fno-inline" CFLAGS="-m32 -fPIC -fno-inline" LDFLAGS="-m32"

For the 64 bits, replace -m32 by -m64. -fno-inline is needed still to circuvnet a gcc bug with inlining.

 

boost

Certainly, the most helpful reference was this boost reference. But those are not immediate instructions. Here is what you will need to do:

% ./bootstrap.sh --prefix=$XOPTSTAR

In any cases, this will build a few 64 bits executables on a 32/64 bits machine but don't panic yet ... To build, use one of the below (as appropriate):

% ./bjam cflags="-m64 -fPIC" cxxflags="-m64 -fPIC" linkflags="-m64 -fPIC" address-model=64 threading=multi architecture=x86 stage
or
% ./bjam cflags="-m32 -fPIC" cxxflags="-m32 -fPIC" linkflags="-m32 -fPIC" address-model=32 threading=multi architecture=x86 stage

I am sure you already see the problem  - on AMD processors, you may have a different "arhitecture" so we cannot give you the exact instruction to use here. Possible architectures are x86, x86_amd64, x86_ia64, amd64 or ia64.

When you are done wth compiling, execute nearly the same command but instead of "stage" use

% ... install --prefix=$XOPTSTAR

and you will be hopefully done.

unuran

This package follows a typical install i.e.

% ./configure --prefix=$XOPTSTAR CXXFLAGS="-m32 -fPIC" CFLAGS="-m32 -fPIC" LDFLAGS="-m32"
or
% ./configure --prefix=$XOPTSTAR CXXFLAGS="-m64 -fPIC" CFLAGS="-m64 -fPIC" LDFLAGS="-m64"

% make
% make install

This will allow ROOT to build the TUnuran classes.

LHAPDF-6.1.6

This package is not straight forward to install. Use the usual initial setup i.e.
 

./configure --prefix=$XOPTSTAR CXXFLAGS="-m32 -fPIC" CFLAGS="-m32 -fPIC" LDFLAGS="-m32"

or

./configure --prefix=$XOPTSTAR CXXFLAGS="-m64 -fPIC" CFLAGS="-m64 -fPIC" LDFLAGS="-m64"

then the usual

% make

However, before make install, modify  lhapdf-config and add -m32 -fPIC (-m64 -fPIC for 64 bits platform) to cflags and -m32 (-m64) to the ldflags i.e.
 

40c40
< test -n "$tmp" && OUT="$OUT -m32 -fPIC -I${prefix}/include "
---
> test -n "$tmp" && OUT="$OUT -I${prefix}/include "
46c46
< test -n "$tmp" && OUT="$OUT -m32 -L${exec_prefix}/lib -lLHAPDF"
---
> test -n "$tmp" && OUT="$OUT -L${exec_prefix}/lib -lLHAPDF"

as the configure will not do that and hence, not generate a config script suitable for a a mix 32/64 bits.

 
vim

Configure is standard, use a minimal option set and features=big as below

% ./configure --enable-pythoninterp=yes --enable-perlinterp=yes --enable-cscope --with-features=big \
--prefix=$XOPTSTAR CFLAGS="-m64 -fPIC" CXXFLAGS="-m64 -fPIC" LDFLAGS="-m64"

or

./configure --enable-pythoninterp=yes --enable-perlinterp=yes --enable-cscope --with-features=big \--prefix=$XOPTSTAR CFLAGS="-m32 -fPIC" CXXFLAGS="-m32 -fPIC" LDFLAGS="-m32"

then the usual make and make install.





 

CERN libraries

The STAR simulation framework will require the CERN libraries to be installed. This will likely be the most problematic portion of the STAR software installation as there is little support for the CERNLib nowadays (so, you must rely on existing supported versions).

  • Information about the CERN libraries can be found here with the appropriate download area.
    If you do not find a version of the CERNlib for your platform, well ... you are out of luck.
    If you have the same OS as BNL and do not want to spend to much time, copy from BNL and move on.
     
  • If you experience problems with the 64 bit versions and this was not copied from BNL, consider one of the following:
    • 64bit CERN libs showed to work at first were taken from DESY colleagues from this link.
    • CERNLib are also available in rpm format for Fedora EL4/EL5 (SL4/SL5). See this link for more information.

Note however that (for example) a generic Linux distribution or an older Linux version based distribution may work for respectively a different flavor of Linux or a more recent of Linux.

 

Building ROOT in STAR

Building ROOT in STAR

How to build ROOT at BNL and other sites (support documentation)

Some of the above help is similar than what you will find in the PDSF page mentioned above.
Older help version could be found from the revision tab.

ATTENTION

  • BEWARE that the target for 64 bit machine is specific to your platform. We use linuxx8664gcc at BNL.
     
  • Several patches are available for this revision (see at the bottom as usual). All sources were repackages in Additional software components as root_v5.34.30_2-star.source.tar.gz  (this revision was last made on 2014/06/25). Taking the repacked tar ball is the preffered and recomended approach. Alternatively, you may download the root code from root.cern.ch (the source tar balls are available on their site) but extra steps/actions noted in orange below are needed.
     
  • This revision supports Qt4 in its latest incarnation i.e. qt-everywhere-opensource-src-4.8.7 (see Additional software components for more information) - qt 4.4 was left for use for SL5.3
     
  • We substantially dropped several OS support since the previous release and this still applies to 5.34.09 i.e.:
    • cygwin not tested at all
    • Mac OSX not tested
    • Solaris and True64 are no longer available to us as build platform and hence, we cannot conclude
    • Linux revisions prior to 5.3 (Boron) are no longer supported.
       
  • Only Pythia6 is supported in this version (and is default)

The build

The build is in several steps ; we will assume for this example that we are building root version 5.22.00 ; the % sign is used for the Unix prompt.

  1. Go to the $ROOT directory, create a directory named after the version i.e. 5.22.00,  and unpack the tar ball using

    % cd $ROOT
    % mkdir 5.34.30_2

    % cd 5.34.30_2

    % setenv ROOTB `pwd`

    One note : the ROOTB environment variable will be used throughout this help but is not used by the STAR environment. It only serves the purpose of finding back easily the base tree for the ROOT version you are trying to install.

    If you are a STAR user, you can also find a copy of the package in AFS at the location
    /afs/rhic/star/common/General/Sources/  (again, this is the reocmmended approach to avoid extra complications).
    % tar -xzf root_v5.34.30_2-star.source.tar.gz

    Remember to remove the archive tar.gz file after unpacking. It is not helpful to keep it around.
    ALL required files (including the STAR specific configurations) are part of our repackaged sources. Starting from ROOT 5.34, no other means are recommended (you may still download the original codes from the ROOT Web site but the STAR specific configs and patches will not be present).
     
  2. The structure is not ready ... yet ... You still have to create the OS dependent tree structure we use to allow concurrent platform support in addition of the two layers (with debugging, without debugging information). This is done in a one-does-it-all script.

    % $STAR/mgr/MakeRootDir.pl

    This command needs to be executed ONCE on EACH supported platform.
    IMPORTANT NOTE: $STAR environment variable needs to be defined here but it case it is not, the latest version available in /afs/rhic.bnl.gov/star/packages/dev/mgr is the safest to use (because it is the latest available). You need to access this script from the AFS path (or make a local copy) if you start building a site from scratch as well (ROOT needs to be installed before $STAR).
     
  3. The next step will give an example on how to build it under the linux platform. The above assumes that root is installed in AFS (to make it generic) and that .@sys allows to separate the OS/compiler flavors (which is the case, but not valid for off-site NFS resident tree structure where you will have to specifically use .$STAR_HOST_SYS instead of .@sys ).

    % cd $ROOTB/.$STAR_HOST_SYS/rootdeb
    % setenv ROOTSYS `pwd`
    % $STAR/mgr/fixrootmk
    -unlink
    % ./configure linux --build=debug ... remaining options ... +

    % cd $ROOTB/.$STAR_HOST_SYS/root
    % setenv ROOTSYS `pwd`
    %
    $STAR/mgr/fixrootmk -unlink
    % ./configure linux  ... remaining options ...
    +

    The above examples are both building the standard root distribution and option. This IS NOT what we are using in STAR and the above table shows the default options currently in use. You MUST of course add the --build=debug flag where appropriate as shown above while using that table (i.e. the main point of the two commands as shown is that the rootdeb tree is build with the configure command --build=debug while the optimized version is not).

    Notes:
    1. You MUST issue a similar command on ALL platform you are supporting before proceeding to the next step.
    2. You need to run ./configure ONCE only. In case of any further updates, you should NOT repeat this action.
     
  4. Execute the following script before compiling
    % $STAR/mgr/fixrootmk
    This will check the configuration file for special tags (in the old versions, it would check and fix CERNLIB for example).
    Note that the same command prior with -unlink moved the system.rootrc aside before the configure and make local copies of some of the files which should not be shared between multiple-platforms. At the end of compilation, we will ask you to execute the script again to re-install the system.rootrc and similar files (i.e. make use of the global one for all platforms).

  5. Now we are ready to compile. To do this, simply use the below sequence where you have configured the package.
    % gmake


    Notes:
    1. the gmake step may display messages like No such file or directory .
      Please, ignore since this is normal and will trigger the proper directory tree creation.
    2. All is compiled AND INSTALLED. There is NO NEED to use make install.
    3. For those compiling on AFS, if gmake (or make) keeps calling the reconfigure script over and over again, you have two solutions. Either 'sleep 1 && touch Makefile'  and type make again or use 'make --assume-new=Makefile'. The former will reset the Makefile timestamp to a second later of the last command executed and the later tell make to consider the timestamp of the Makefile file as if it has just been modified. You may also try a 'make clean' and try again.
     
  6. As noted in a previous step, at the end, finalize the build by doing
    % $STAR/mgr/fixrootmk -fix

    and at least once

    % cd $ROOT
    % test -e 5.34.30 && rm -f 5.34.09 && ln -s 5.34.09_1 ./5.34.09

    This last command link your current build tree to the official 5.34.09. If you had a previous patch level version, it will be replaced at that stage and the new build will become active. There should be NO need to rebuild root4star for incremental patch levels.
    You are now done.

 

Table of options

Platform/OS

32/64 bits

configure script options

Linux = linux 32 bits --enable-table --enable-qt --with-pythia6-libdir=$XOPTSTAR/lib --enable-roofit --enable-mathmore --with-mysql-libdir=/usr/lib/mysql --enable-unuran --enable-xrootd --with-thread-libdir=/lib --enable-vc --enable-cxx11
Linux = linuxx8664gcc 64 bits --enable-table --enable-qt --with-pythia6-libdir=$XOPTSTAR/lib --enable-roofit --enable-mathmore --with-mysql-libdir=/usr/lib64/mysql --enable-unuran --enable-xrootd --with-thread-libdir=/lib64 --enable-vc --enable-cxx11

 

Notes:

  1. You should use $OPTSTAR at remote sites having a local /opt/star and $XOPTSTAR if you provide support over AFS.
  2. If you build Python support, please refer to You do not have access to view this node for quick tests to see if it works
  3. To enable the build with QT4, you MUST define QTROOT to point to the QT4 directory prior to executing the configure script (the build will oherwise be silent on missing it)
  4. For SL5.3, this version of ROOT was built with the additional option --enable-builtin-pcre
  5. If you intend to build ROOT with another compiler revision, ad the options --with-cc=`which gcc` --with-cxx=`which g++`.
  6. --disable-xrootd should be used wherever you do not have Xrootd support (or do not need it)
  7. If you change the Python verison on your system, note that the PyROOT binding will need to be remade. However, this will not be as easy as typing make as specific version-keyed includes will be needed. You may however try something like
    % modify python2.4 python2.7 ./config/Makefile.config
    in the $ROOTSYS directory and type make afterward (this will work). Ultimately, you may also re-build in your private area and
    % cp -fp lib/libPyROOT.so $ROOTSYS/lib
    % cp -fp lib/ROOT.py* $ROOTSYS/lib

    i.e. no need to compile in-place.

 

List of updated files 

The below list is provided for convenience but you should send a note if you note ANY differences from this list and what was packaged for use by remote sites. In the below,  A=added, P=patched, U=updated:

     
P    root/cint/cint/inc/G__ci.h
P    root/math/vc/Module.mk
P    root/bindings/pyroot/Module.mk

 

Typical patched codes The following codes are tweaked

 

cint/cint/inc/G__ci.h
#define G__LONGLINE
#define G__ONELINE
#define G__MAXNAME
#define G__ONELINEDICT
Check if appropriate (like at least 1024, 512, 256, 8) Alter behaviors of CINT but generally, G__LONGBUF setting is fine (usually forced).

 

STAR codes

Basic


The STAR code and libraries follows a structure and policy described in Library release structure and policy. Changes in each version is described in Library release history.

Installing the core STAR software is (should be) as simple as getting a full set of code for a given library, unpacking it into $STAR_PATH (default is $STAR_ROOT/packages as described in Setting up your computing environment) and issuing the following commands (in our example, we use STAR_LEVEL=SL09b with revision 1 from that library).

% cd $STAR_PATH
% mkdir SL11d && cd SL11d
% cvs co -rSL11d asps mgr QtRoot StarVMC StRoot kumacs pams StarDb StDb OnlTools
% starver SL11d 
% cd $STAR
% cons


And wait ... until all is done. This will actually build the non-optimized version of our libraries.

Modifiers

To build the optimized version, use

% setenv NODEBUG yes

before you execute the starver command. If you need both, you will hence have to build twice per libraries.

To build using alternate compilers, you will need to run the setup command before running cons. For example, for the icc compiler you will need an appropriate version of $OPTSTAR and

% setup icc

and for an alternate version of gcc (and pending the fact you have the specific version installed), you will need to use something similar to

% setup gcc 4.5.1

Note that those syntax assumes specific path for gcc (installed in either /opt/gcc/$version or $OPTSTAR/alt/) while icc is expected to have an setup program located in $GROUP_DIR (as intelcc.csh) defining the paths.

Finally, on kernel supporting it, you can also switch to an alternate bits environment like this

% setup 64bits

and get the compilation proceed with the 64 bits support.

Tips

Excluding problematic directories

Sometimes, our libraries get packed with the "Pool" (user space) libraries and their support may vary. To be on the safe side, exclude several of them from compilation by setting the environment variable SKIP_DIRS before executing cons.

% setenv SKIP_DIRS "StEbyePool StHighptPool StAngleCorrMaker StSpinMaker StEbyeScaTagsMaker 
StEbye2ptMaker StDaqClfMaker StFtpcV0Maker StStrangePool GeoTestMaker"

Special levels

The levels pro, new and dev are special levels as described in Library release structure and policy. pro is especially relevant as if no level is specified, the STAR login will revert to whatever pro is set to be. You may then do something like the below (again, our example assumes the default library is SL07b - please adjust accordingly).

% cd $STAR_PATH
% test -e pro && rm -f pro
% ln -s SL11d ./pro

Your default STAR library is then set for your site.

Post installation

Soft-links for compatibility

In /opt/star (or equivalent), $STAR and $ROOT/$ROOT_LEVEL, run the script $STAR/mgr/CreateLinks. This will create a few compatibility links to support additional (tested and proven to work) OS / sysname version.


Other

The list below is not exhaustive. Note that most does not need to be done and we separate the action items in two categories

To be checked or modify

  • Consult the script site_pre_setup.csh and site_post_setup.csh . ANY site specific envrionment variables (such as STAR_ROOT, OPTSTAR and so on) can be set there. See the Setting up your computing environment used in STAR for more information on defaults (you should have read it before reaching this section though ;-)  ).
     
  • Please, modify your local template login files ${GROUP_DIR}/templates/ files and make sure the GROUP_DIR directory is properly reflected. Note that ANY other variables set for your site should be set in site_pre_setup.csh
     

Optional - for large site and Tier centers wanting network independence.

  • Database connection setup - a local database setup could be used and is expected to be located in ${STAR_PATH}/conf/dbLoadBalancerLocalConfig_${SITE}.xml . Modify this file (based on existing template) to point to the local database service.
    To setup a local database service, consult You do not have access to view this node for getting a local database service going.
    Note: Attention, if you run on the Cloud, those instructions may be useful (a service can be created per VM)
     
  • Setting up SUMS - only for site intending to submit jobs locally
    SUMS will need a local batch system and a local config (ask the scheduler Hypernews).
    We support LSF, PBS, Condor and SGE.
      
  • FileCatalog - not needed unless you are a Tier1 center, holder of files in our distributed data management system.
    A copy of the FileCatalog can be set to the local site and/or a multi-master setup can be done.
    To first order, your local FileCatalog replica would minimally allow querrying the FC without the load we see at BNL (but is not that useful). A multi-master would allow full implementation of our data-management system and scheme including file transfer between sites and multi-site transfers.
    But for the API to know about the local setup, you need to modify the unique local configuration file $STAR_PATH/conf/Catalog.xml and add your configuration in. The key is SITE name="XXXX" (add a new block for your site and please, pass on the information back to the Catalog maintainer). Make sute this value matches the value of $SITE set in group_env.csh .
     
  • [*** data management / multi-site transfer instructions needed ***]

Provision CVMFS and mount BNL/STAR repo

These instructions are provided for you to install the CVMFS client and mount the STAR CVMFS repo – star.sdcc.bnl.gov The STAR software has been installed here and may be used as a replacement for AFS.

 

1. Get CVMFS yum repository    

# wget -O /etc/yum.repos.d/cernvm.repo http://cvmrepo.web.cern.ch/cvmrepo/yum/cernvm.repo

1. Get CVMFS RPM GPG KEY

# wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-CernVM http://cvmrepo.web.cern.ch/cvmrepo/yum/RPM-GPG-KEY-CernVM

3. Install cvmfs (At this time the version we are installing is cvmfs-2.5.1-1.el7.x86_64)

# yum install cvmfs cvmfs-config-default

4. Run the command below to automate the creation of the cvmfs user and create an entry in autofs

# cvmfs_config setup

Note: At this time with these instructions (config files provided) and/or other undetermined factors, CVMFS pointing to the star.sdcc.bnl.gov repo via autofs is not stable. Further instructions have been provided (i.e hard mount)

5. Add the file /etc/cvmfs/default.local

# wget -O /etc/cvmfs/default.local http://www.star.bnl.gov/~mpoat/cvmfs/default.local

6. Add the file /etc/cvmfs/config.d/star.sdcc.bnl.gov.local

# wget -O /etc/cvmfs/config.d/star.sdcc.bnl.gov.local http://www.star.bnl.gov/~mpoat/cvmfs/star.sdcc.bnl.gov.local

7. Add the file /etc/cvmfs/domain.d/sdcc.bnl.gov.local

# wget -O /etc/cvmfs/domain.d/sdcc.bnl.gov.local http://www.star.bnl.gov/~mpoat/cvmfs/sdcc.bnl.gov.local 

8. PUBLIC KEY NAME: star.sdcc.bnl.gov.pub - Create the sub directory for the public key & place the key in this directory

# mkdir /etc/cvmfs/keys/sdcc.bnl.gov

Note: Currently we are not making the ‘public key’ publicly available to all users. The key will be provided by request only. For scalability of this new infrastructure reasons, we have a need to trace the CVMFS connections, make recommendation on how to expand larger resources wherever we see problems etc ... Contact: Jerome Lauret (jlauret@bnl.gov) or Michael Poat (mpoat@bnl.gov) for further information.


9. You may need to restart autofs after adding the keys in

# systemctl restart autofs.service

10. Check to see if you can access CVMFS

# ls /cvmfs/star.sdcc.bnl.gov
  

RCF Contributions

This page will list and make available a few STAR contributions we made to the
RCF. Feel free to use them as needed.

LDAP scalability

The following scripts are suggested to provide a LDAP scalability. The basic idea is that LDAP lookups are flaky in the version we are using (client 2.0.27) and as a consequence, some lookups (including /usr/bin/id) fails in mass. Additionally, in our LDAP version, caching do not seem to work either.

The scripts below "dump" the entry list to a local file so that network lookups are un-necessary while LDAP is still used to centralized the account information. They were written and deployed at PDSF and passed as-is to the RCF team for consideration.

create_files in /etc/cron.hourly

#!/bin/sh

#
# This script dumps the user and group data from ldap and creates standard
# UNIX passwd and group files. This are propogated b