Bridging EPICS and High-Level Services at STAR


Outline:

  1. Introduction: STAR, EPICS, MQ and MIRA/DCS
  2. R&D Topics
    1. Integration of EPICS, RTS and DAQ using MIRA
    2. Remote Access Capabilities (web+mobile)
      • Advanced Alarm Handler
      • Historical Data Browser
      • Experiment Dashboard
    3. Advanced Data Archiver: Pluggable Storage Adapters
      • SQL: MySQL
      • DOC: MongoDB
      • NoSQL: HyperTable
    4. Complex Event Processing
      • Esper Engine
      • WSO2 middleware
  3. Summary and Outlook
  4. Figures

1. General Concepts: STAR, EPICS, MQ and DCS/MIRA

STAR, the Solenoidal Tracker At RHIC (Relativistic Heavy Ion Collider), is a detector that tracks thousands of particles produced by ion collisions, searching for signatures of a state of matter called the quark-gluon plasma (QGP), a form that is thought to have existed just after the Big Bang, at the dawn of the universe. A primary goal of STAR is to bring about a better understanding of the universe in its earliest stages by making it possible for scientists to better understand the nature of the QGP. The STAR collaboration consists of over 500 scientists and engineers representing 60 institutions in 12 countries. As the size of the collaboration and the scope of its work continue to grow, so does the challenge of having the computing power and data processing resources to carry out that work efficiently.

STAR's detector control system (also referred to as Slow Controls) is based on the EPICS toolkit. EPICS is a set of open-source software tools, libraries and applications developed collaboratively and used worldwide to create distributed soft real-time control systems for scientific instruments such as particle accelerators, telescopes and other large scientific experiments. The STAR experiment started in 1999 with just one Time Projection Chamber and a few trigger detectors; today it is equipped with 18 subsystems. Initially, the STAR Slow Controls system had 40,000 control variables; it has since expanded to over 60,000 variables, and the list is still growing due to the RHIC II upgrade, the beam energy scan program, and a possible future upgrade to eRHIC. In the early days of the experiment, STAR had just 120 types of structures to migrate to the calibrations database; we now migrate over 3,000 types of structures annually. STAR’s Data Acquisition (DAQ) system, the component responsible for physics data taking, was upgraded three times, each time increasing the rates by one order of magnitude.

STAR’s Messaging Interface and Reliable Architecture framework (MIRA) was created in 2010 as an attempt to improve meta-data archiver operations. It relies on an advanced message-queuing middleware, which provides asynchronous, payload-agnostic messaging, and has adapters to get data from EPICS- and CDEV-based data sources. We selected AMQP as the messaging middleware standard for High-Level Services, and MQTT for low-level intra-service communications. This allowed us to design a loosely coupled, modular framework architecture, resulting in a scalable service suitable for a highly concurrent online environment. During the deployment validation phase in 2010, just three subsystems used the MIRA framework. By 2014, all eighteen subsystems were completely integrated, with over sixty collector services deployed and continuously monitored by the framework. The total number of messages passing through the system reached three billion per year, with rates varying from one hundred and fifty to over two thousand messages per second. MIRA provided STAR with a solution to handle the growing number of channels (x15) and data structures (x25), allowing smooth operation during Runs 10-15. In 2014 we extended MIRA with a stream-based Complex Event Processing capability, which successfully passed our tests. A few alarms implemented for Run 14 saved months of work for the core team and collaborators.
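
To illustrate what payload-agnostic messaging means in practice, the following is a minimal sketch and not MIRA code. It assumes a toy in-memory broker in place of a real AMQP or MQTT middleware; the class `InMemoryBroker`, the routing key `tpc.anode.voltage`, and the content-type string `application/x-float64` are hypothetical names chosen for illustration. The broker forwards opaque bytes by routing key, and only the consumer interprets them. Delivery here is synchronous for brevity, whereas a real broker would queue messages and deliver them asynchronously.

```python
from collections import defaultdict
import json
import struct


class InMemoryBroker:
    """Toy stand-in for a message broker (hypothetical, not an AMQP client)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # routing key -> list of callbacks

    def subscribe(self, routing_key, callback):
        self._subscribers[routing_key].append(callback)

    def publish(self, routing_key, body, content_type):
        # The broker never inspects `body`: it forwards bytes plus a header.
        for callback in self._subscribers[routing_key]:
            callback(body, content_type)


def decode(body, content_type):
    """Consumer-side decoding, selected by the content-type header."""
    if content_type == "application/json":
        return json.loads(body.decode("utf-8"))
    if content_type == "application/x-float64":  # hypothetical binary format
        return struct.unpack("<d", body)[0]
    raise ValueError(f"unsupported content type: {content_type}")


received = []
broker = InMemoryBroker()
broker.subscribe("tpc.anode.voltage",
                 lambda body, ctype: received.append(decode(body, ctype)))

# Two producers publish the same kind of reading in different encodings.
broker.publish("tpc.anode.voltage",
               json.dumps({"value": 1390.0}).encode("utf-8"),
               "application/json")
broker.publish("tpc.anode.voltage",
               struct.pack("<d", 1391.5),
               "application/x-float64")
```

Because the routing layer is independent of the payload encoding, a new data source or format can be added without changing the broker or the existing consumers, which is the loose coupling described above.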

The MIRA framework is still evolving. In the near future, we plan to add features commonly encountered in the Detector Control Systems domain, namely experiment workflow and hardware control, as well as many High-Level Services that extend and generalize the functionality of the underlying framework(s). This document focuses on the proposed R&D related to further development of the MIRA framework and related services.

2. R&D Topics

We expect STAR to double the number of channels in the next five years; hence, system scalability is our primary objective. To allow seamless migration from a wide variety of existing legacy hardware to modern detector control equipment, while keeping the existing EPICS-based Detector Control system fully operational to avoid interruption of service, we propose a gradual functionality upgrade of existing tools and services. In practice, this means extending existing services, such as the experiment's EPICS-based Alarm Handler, Meta-data Collectors, and RTS and DAQ components, and providing improved ways to orchestrate this growing set of services.


2.1 Integration of EPICS, RTS and DAQ using MIRA

The intent to integrate a wide variety of STAR Online services is driven by the growth of the STAR experiment's complexity and greatly increased data processing rates. We have identified the requirements for the software infrastructure needed to maintain STAR for the next decade. The upgrade team has collected the following key demands from collaboration users and detector experts: Scalable Architecture; Low-overhead Inter-operable Messaging Protocol; Payload-agnostic Messaging; Quality of Service; Improved Finite State Machine; and Real-time, web- and mobile-friendly Remote Access. While some of these features are already covered by the MIRA framework, others require R&D studies to accomplish our goals.

2.2 Remote Access Capabilities

With web- and mobile-based data access and visualization, MIRA, as the Detector Control framework, can provide data when and where the user needs it, and tailor that data to specific user requirements. This will lead to rapid sharing, knowledge creation, engaged users, and more opportunities for efficient scientific data processing. As operators become more fully engaged, they are apt to grow their involvement, contributing in unexpected ways that could ultimately decrease operational overhead.

Advanced Alarm Handler
Because STAR collaboration users and detector experts are geographically distributed, one of the most requested upgrade targets is an improved Alarm Handler. The goal is to provide a real-time web (and/or mobile) version of the existing alarm handler application, which is currently based on the EPICS MEDM interface. To accomplish this task, we need to investigate the following opportunities:
(a) bi-directional propagation of alarm messages between EPICS and MQ/MIRA
(b) a web/mobile interface resembling the existing MEDM-based Alarm Handler, keeping the overall look and feel to reduce the learning curve for detector operators and subsystem experts
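
As a rough illustration of item (a), the sketch below shows one possible translation layer between EPICS alarm updates and MQ messages. It is an assumption-laden example, not the proposed design. The severity names and codes follow the standard EPICS alarm severities; the routing-key layout `alarm.<severity>.<pv>`, the JSON field names, and the `:ACK` acknowledgement PV suffix are hypothetical. The functions only build and parse messages, so no EPICS or broker connection is required to run them.

```python
import json
from enum import IntEnum


class Severity(IntEnum):
    """Standard EPICS alarm severities and their numeric codes."""
    NO_ALARM = 0
    MINOR = 1
    MAJOR = 2
    INVALID = 3


def alarm_to_message(pv_name, severity, value, timestamp):
    """EPICS -> MQ: turn one alarm update into (routing_key, body_bytes).

    The routing-key layout and JSON field names are hypothetical.
    """
    sev = Severity(severity)
    routing_key = f"alarm.{sev.name.lower()}.{pv_name}"
    body = json.dumps({
        "pv": pv_name,
        "severity": int(sev),
        "value": value,
        "ts": timestamp,
    })
    return routing_key, body.encode("utf-8")


def ack_to_pv_write(body):
    """MQ -> EPICS: turn an operator acknowledgement into a (pv, value) write.

    The ':ACK' suffix is a hypothetical naming convention for illustration.
    """
    ack = json.loads(body.decode("utf-8"))
    return f"{ack['pv']}:ACK", 1


# Forward direction: an alarm raised on a (hypothetical) high-voltage channel.
key, body = alarm_to_message("TPC:HV1", 2, 1400.5, 1400000000.0)

# Reverse direction: an acknowledgement sent from a web or mobile client.
pv_write = ack_to_pv_write(json.dumps({"pv": "TPC:HV1"}).encode("utf-8"))
```

Encoding the severity in the routing key would let a web client subscribe only to the severities it cares about, while the reverse path keeps operator acknowledgements flowing back to the EPICS side.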

Estimated Efforts: 6 months / 1 FTE (2 x 0.5)

Historical Data Browser
Both EPICS and the MIRA framework have their own versions of Historical Archive interfaces. The EPICS Archiver provides detailed access to all meta-data collected by the experiment, and MIRA's interface is praised by users for its easy categorization and expanded set of plotting features. There is nevertheless a demand for a user-customizable interface, which could, and preferably should, include scripting features for advanced users.

Estimated Efforts: 4 months / 0.5 FTE

Experiment Dashboard
The unification of STAR's components creates the need for an Experiment Dashboard (EDash), which will provide a high-level summary of the activities happening in the STAR Online domain. EDash will serve as a single entry point for aggregated status summaries coming from the Slow Controls System, Data Acquisition System, Run-Time System and other systems integrated via MIRA messaging capabilities.
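
One simple way such aggregated summaries could be computed is a "worst status wins" reduction, sketched below. This is only an illustrative assumption about how EDash might aggregate data: the `Status` levels, their ordering (including treating UNKNOWN as the worst case), the `summarize` function, and the example system and subsystem names are all hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels, ordered from best to worst."""
    OK = 0
    WARNING = 1
    ERROR = 2
    UNKNOWN = 3


def summarize(latest):
    """Reduce the latest per-subsystem statuses to one summary per system.

    `latest` maps (system, subsystem) -> Status. Each system reports the
    worst status among its subsystems; the overall status is the worst
    status among all systems.
    """
    by_system = {}
    for (system, _subsystem), status in latest.items():
        by_system[system] = max(by_system.get(system, Status.OK), status)
    overall = max(by_system.values(), default=Status.UNKNOWN)
    return {
        "overall": overall.name,
        "systems": {name: by_system[name].name for name in sorted(by_system)},
    }


# Example input, as it might be assembled from status messages.
latest = {
    ("SlowControls", "TPC"): Status.OK,
    ("SlowControls", "BEMC"): Status.WARNING,
    ("DAQ", "EventBuilder"): Status.OK,
    ("RTS", "Trigger"): Status.ERROR,
}
summary = summarize(latest)
```

With this input, the overall status is ERROR because one RTS component reports an error, while the per-system entries show which system an operator should look at first.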

Estimated Efforts: 8 months / 0.5 FTE

Images


Figure 1. Component diagram for the MQ-based Alarm Handler. Green components: existing infrastructure. Pink components: to be developed in the course of this R&D. Blue components: existing EPICS services, serving as the data source for the test bed.