Abstract
Virtual Machine (VM) consolidation in the cloud has received significant research interest. A large body of approaches to VM consolidation in data centers resorts to variants of the bin packing problem, which seek to minimize the number of deployed physical machines while meeting Service-Level Agreement (SLA) constraints. In this paper, we introduce the concept of workload delay as a Quality-of-Service (QoS) metric that directly captures the degradation a cloud user would experience when the SLA is violated. Our results, based on simulations driven by real-life traces, show that consolidating VMs based on the level of utilization gives little control over the resulting delay, a particularly significant drawback when running jobs with deadline requirements. In contrast, taking our proposed delay metric into account during consolidation allows the delay to be controlled much better.