
Size-based disciplines for job scheduling in data-intensive scalable computing systems

Abstract : The past decade has seen the rise of data-intensive scalable computing (DISC) systems, such as Hadoop, and a consequent demand for scheduling policies that manage their resources so as to provide both quick response times and fairness. Schedulers for DISC systems usually focus on fairness without optimizing response times. The best practices for overcoming this problem involve manual, ad hoc control of the scheduling policy, which is error-prone and difficult to adapt to changes. In this thesis we focus on size-based scheduling for DISC systems. The main contribution of this work is the Hadoop Fair Sojourn Protocol (HFSP) scheduler, a size-based preemptive scheduler with aging; it provides fairness and achieves reduced response times thanks to its size-based nature. In DISC systems, job sizes are not known a priori; HFSP therefore includes a job size estimation module, which computes approximate job sizes and refines these estimates as jobs progress. We show that the impact of estimation errors on size-based policies is not significant, under conditions that are satisfied in a system such as Hadoop. Because of this, and because it is designed around the idea of working with estimated sizes, HFSP is largely tolerant of job size estimation errors. Our experimental results show that, in a real Hadoop deployment and with realistic workloads, HFSP performs better than the built-in scheduling policies, achieving both fairness and a small mean response time. Moreover, HFSP maintains its good performance even when the cluster is heavily loaded, by focusing resources on the few jobs with the smallest sizes. HFSP is a preemptive policy, and preemption in a DISC system can be implemented with different techniques. The approaches currently available in Hadoop have shortcomings that hurt system performance. Therefore, we have implemented a new preemption technique, called suspension, which uses operating system primitives to implement preemption in a way that guarantees low latency without penalizing low-priority jobs.
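As a rough illustration of why a size-based discipline can reduce mean response time compared with pure fair sharing, the sketch below compares the two on a toy workload. It is not HFSP: it assumes a single server of unit speed, all jobs arriving at time zero, exactly known sizes, and no aging or size estimation. The function names are hypothetical and chosen only for this example.

```python
def size_based_mean_response(sizes):
    """Mean response time when jobs run one at a time, smallest first."""
    clock = 0.0
    total = 0.0
    for size in sorted(sizes):
        clock += size          # the job runs alone until it completes
        total += clock         # its response time is its completion time
    return total / len(sizes)


def fair_share_mean_response(sizes):
    """Mean response time when all unfinished jobs share the server equally."""
    ordered = sorted(sizes)
    n = len(ordered)
    clock = 0.0
    total = 0.0
    served = 0.0               # work each unfinished job has received so far
    for i, size in enumerate(ordered):
        active = n - i         # jobs still sharing the server
        # Each active job progresses at rate 1/active, so finishing the
        # next-smallest job takes (size - served) * active time units.
        clock += (size - served) * active
        served = size
        total += clock
    return total / n


if __name__ == "__main__":
    jobs = [1, 2, 3]
    print(size_based_mean_response(jobs))
    print(fair_share_mean_response(jobs))
```

With the three jobs above, the smallest-first order completes them at times 1, 3 and 6, whereas equal sharing completes them at times 3, 5 and 6, so the last job finishes at the same moment in both cases but the smaller jobs finish much earlier under the size-based order.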
Contributor : ABES STAR
Submitted on : Monday, December 12, 2016 - 6:05:06 PM
Last modification on : Friday, July 31, 2020 - 10:44:09 AM
Long-term archiving on : Monday, March 27, 2017 - 11:46:22 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01415094, version 1


Mario Pastorelli. Size-based disciplines for job scheduling in data-intensive scalable computing systems. Distributed, Parallel, and Cluster Computing [cs.DC]. Télécom ParisTech, 2014. English. ⟨NNT : 2014ENST0048⟩. ⟨tel-01415094⟩


