SN0491 : Fair-share scheduling algorithm for a tertiary storage system

Author(s): Pavel Jakl, Jérôme Lauret, Michal Šumbera
Date: May 13, 2009
File(s): fairshare_scheduling_chep2009.v3.0.pdf

Any experiment facing petabyte-scale problems needs a highly scalable mass storage system (MSS) to keep a permanent copy of its valuable data. Beyond the permanent storage aspects, the sheer amount of data makes keeping complete data sets available on live storage (centralized or aggregated space such as that provided by Scalla/Xrootd) cost-prohibitive, implying that dynamic population from the MSS to faster storage is needed. One of the most challenging aspects of dealing with an MSS is the robotic tape component. If a robotic system is used as the primary storage solution, the intrinsically long access times (latencies) can dramatically affect overall performance. To speed up the retrieval of such data, one could organize the requests according to criteria aimed at delivering maximal data throughput. However, such approaches are often orthogonal to fair resource allocation, and a trade-off between quality of service, responsiveness and throughput is necessary to achieve an optimal and practical implementation of a truly fair-share oriented file restore policy. Starting from an explanation of the key criterion of such a policy, we will present evaluations and comparisons of three different MSS file restoration algorithms that meet fair-share requirements, and discuss their respective merits. We will quantify their impact on a typical file restoration cycle for the RHIC/STAR experimental setup, within a development, analysis and production environment relying on a shared MSS service.

Submitted: CHEP2009
Status: Published
Ref: Pavel Jakl, J. Lauret and Michal Šumbera, 2010 J. Phys.: Conf. Ser. 219 052005

Keywords: MSS, HPSS, storage, tertiary, Xrootd, Monte-Carlo, fair-share, NP-complete, scheduling