SN0526: Building Efficient Data Planner for Peta-scale Science

Author(s): Michal Zerola, Jérôme Lauret, Roman Barták, Michal Šumbera
Date: Feb. 25, 2010
File(s): zerola_acat2010.v5.pdf
Abstract:

Unprecedented data challenges, both in terms of Peta-scale volume and concurrent distributed computing, have arisen with statistically driven experiments such as those of the high-energy and nuclear physics community. Distributed computing strategies, which rely heavily on data being at the proper place at the proper time, have further raised the demand for coordinated data movement on the road toward high performance. Whenever diverse usage patterns and priorities are involved, massive data processing will hardly be “fair” to users, nor will it use network bandwidth efficiently, unless planning and reasoning about data movement and placement are addressed. Although several sophisticated and efficient point-to-point data transfer tools exist, global planners and decision makers, answering questions such as “How do we bring the required dataset to the user?” or “From which sources should the replicated data be fetched?”, are for the most part lacking.
We present the status of our work on an automated data planning and scheduling system that ensures fairness and efficiency of data movement by minimizing the time needed to realize it (the data transfer itself is delegated to existing transfer tools). Its principal keystones are self-adaptation to network/service changes, optimal selection of transfer channels, bottleneck avoidance, and preservation of user fair-share. The planning mechanism relies on Constraint Programming and Mixed Integer Programming techniques, allowing real-world restrictions to be expressed as mathematical constraints. In this paper, we concentrate on clarifying the overall system from a software engineering point of view and present the general architecture and the interconnection between the centralized and distributed components of the system. While the framework is evolving toward implementing more constraints (such as CPU availability versus storage, for better planning of massive analysis and data production), the current implementation in use for STAR is limited to a multi-user, multi-site, and multi-source environment for data transfers; we present the implications and benefits of our approach as well as a practical use case based on requests with multiple candidate sources.
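
As a minimal sketch of the source-selection idea mentioned above (not the authors' implementation), the choice of one source per replicated file can be posed as a small Mixed Integer Program minimizing the overall completion time. The Python code below uses the open-source PuLP modeller; all file names, sizes, site names, and bandwidths are invented placeholders, and the real planner's further constraints (fair-share, time-varying network conditions) are omitted.

```python
# Hypothetical MIP sketch: pick one source per replicated file so that the
# slowest link finishes as early as possible (minimal makespan).
import pulp

files = {"f1": 40, "f2": 25, "f3": 60}                 # file -> size (GB), invented
replicas = {"f1": ["BNL", "LBNL"],                     # file -> candidate sources, invented
            "f2": ["BNL"],
            "f3": ["LBNL", "Prague"]}
bandwidth = {"BNL": 1.0, "LBNL": 0.5, "Prague": 0.25}  # GB/s per link to the user, invented

prob = pulp.LpProblem("source_selection", pulp.LpMinimize)

# x[f][s] = 1 iff file f is fetched from source s (binary decision variable).
x = {f: {s: pulp.LpVariable(f"x_{f}_{s}", cat="Binary") for s in replicas[f]}
     for f in files}
makespan = pulp.LpVariable("makespan", lowBound=0)

prob += makespan  # objective: minimal time to realize the whole data movement

for f in files:   # every file must be fetched from exactly one source
    prob += pulp.lpSum(x[f].values()) == 1

for s in bandwidth:  # a link finishes after all files routed over it are done
    prob += (pulp.lpSum(files[f] * x[f][s] for f in files if s in replicas[f])
             / bandwidth[s]) <= makespan

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for f in files:
    chosen = next(s for s in replicas[f] if x[f][s].value() > 0.5)
    print(f"{f}: fetch from {chosen}")
print("estimated makespan (s):", makespan.value())
```

A solver such as CBC returns the chosen source for each file together with the estimated makespan; the system described in the paper additionally enforces user fair-share and adapts the plan as network and service conditions change.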

Submitted: ACAT 2010 proceedings
Status: Pending review - PoS publishing

Keywords: Scheduling, Planning, Mixed Integer Programming, network, data transfer
Category: Computing