Workload Management For Power Efficiency in Virtualized Data Centers
By Gargi Dasgupta, Amit Sharma, Akshat Verma, Anindya Neogi, Ravi Kothari
Communications of the ACM, July 2011, Vol. 54 No. 7, Pages 131-141
DOI: 10.1145/1965724.1965752
By most estimates, energy-related costs will become the single largest contributor to the overall cost of operating a data center. Ironically, several studies have shown that a typical server in a data center is seriously underutilized. For example, Bohrer et al.3 find the average server utilization to vary between 11% and 50% for workloads from sports, e-commerce, financial, and Internet proxy clusters. This underutilization is the consequence of provisioning a server for the infrequent though inevitable peaks in the workload. Power-aware dynamic application placement can simultaneously address underutilization of servers as well as the rising energy costs in a data center by migrating applications to better utilize servers and switching freed-up servers to a lower power state.
Though the concept of dynamic application placement is not new, two recent trends, virtualization and energy management technologies in modern servers, have made it practical to use widely in data centers. Virtualization has been the key enabler, while power minimization has been the key driver for energy-aware dynamic application placement.
Server virtualization technologies first appeared in the 1960s to enable timesharing of expensive hardware between multiple users. As hardware became cheaper, virtualization gradually lost its charm. However, since the late 1990s there has been renewed interest in server virtualization,17,18 and it is now regarded as a disruptive business model to drive significant cost reductions. Advances in system management allow the benefits of virtualization to be realized without any appreciable increase in system management costs.11
The benefits of virtualization include more efficient utilization of hardware (especially when each virtual machine, or VM, on a physical server reaches peak utilization at different points in time or when the applications in the individual VMs have complementary resource usage), as well as reduced floor space and facilities management costs. Additionally, virtualization software tends to hide the heterogeneity in server hardware and make applications more portable and resilient to hardware changes. Virtualization planning entails sizing and placing existing or new workloads as VMs on physical servers.
Figures 1a and 1b show the actual CPU utilization patterns for two clusters in a large data center, averaged over an entire month. Colors toward "white" represent high utilization and colors closer to "black" represent low utilization. It is clear that in Cluster A (Figure 1a), some of the machines are grossly underutilized throughout the day. Therefore, the workloads on these machines can be consolidated statically and the number of machines can be reduced.
Cluster B (Figure 1b) and some of the machines in Cluster A show long low-utilization periods during the day (though not for the entire day). This indicates that even though the number of machines cannot be reduced permanently, it is possible to consolidate the workloads on fewer machines using live migration and put the remaining servers in some low power state during those periods. The utilization pattern of Cluster B for the next month (Figure 1c) shows how utilization changes over time. For example, B7 and B8 are used moderately in the first month and heavily in the second, while B9 shows the opposite behavior.
Even if the average machine utilizations are not very low, the utilizations on different machines are sometimes complementary in nature, that is, when one machine in a cluster is at high utilization, another machine in the same or a different cluster may be at low utilization. This is illustrated in the set of CPU utilization plots for a pair of machines shown in Figure 1d. The dotted lines are the CPU utilizations captured in a time window for two servers and the solid line is the summation of the two dotted lines. Dynamic consolidation can happen either due to periods of low resource utilization of the packed applications (Figure 1e) or when applications demonstrate complementary resource behavior during a period, for example, when one application is I/O bound while another on the same server is CPU bound. A consolidation engine should be able to make use of all such opportunities to minimize energy consumption.
In this article, we simplify resource utilization of a workload to be captured only by CPU utilization. However in practice, multiple parameters, such as memory, disk, and network I/O bandwidth consumption, among others, must be considered.
A second trend that is driving the need for dynamic application placement is the growing awareness about energy consumption in data centers and the substantial carbon footprint of these data centers. The power density of data centers is typically around 100 Watts per square foot and growing at the rate of 15%-20% per year.3 Also, if a single server rack consumes 13KW of power, then 100 such fully populated server racks will consume 1.3MW of power. Roughly another 1.3MW is consumed by the cooling system to dissipate the heat generated by these racks. The cost of electricity per year for the total 2.6MW is an astounding $2.6 million. At a server level, the cost of energy consumed by a 300-Watt server comes to around $300 per year, excluding additional facilities costs. Thus, the inefficient use of servers in a data center leads to high energy costs, expensive cooling hardware, floor space, and an adverse impact on the environment.
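To make the arithmetic behind these figures concrete, the following sketch recomputes them. The electricity rate of roughly $0.11 per kWh is not stated above; it is an assumption inferred from the $2.6 million figure, as is the simplification that cooling draws about as much power as the IT load.

```python
# Back-of-the-envelope energy cost estimate for the figures above.
# Assumption: an electricity price of roughly $0.11/kWh, which is what the
# "$2.6 million per year for 2.6 MW" figure implies.
HOURS_PER_YEAR = 24 * 365          # 8,760 hours
PRICE_PER_KWH = 0.114              # USD, assumed average rate

racks = 100
power_per_rack_kw = 13.0                         # fully populated rack
it_load_kw = racks * power_per_rack_kw           # 1,300 kW = 1.3 MW
cooling_load_kw = it_load_kw                     # cooling assumed ~1:1 with IT load
total_kw = it_load_kw + cooling_load_kw          # 2.6 MW

annual_cost = total_kw * HOURS_PER_YEAR * PRICE_PER_KWH
print(f"Annual electricity cost: ${annual_cost / 1e6:.1f}M")  # ~ $2.6M

server_watts = 300
server_cost = (server_watts / 1000) * HOURS_PER_YEAR * PRICE_PER_KWH
print(f"Annual cost of a 300 W server: ${server_cost:.0f}")   # ~ $300
```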
The two major and complementary ways of solving this problem are building energy efficiency into the initial design of components and systems, and adaptively managing the power consumption of servers and systems in response to changing conditions in the workload.5,6,11,14,16 Examples of the former include circuit techniques such as disabling the clock signal to unused components, micro-architectural design changes,12 and support for low power states in processors, memory, and disks.8 At the system level, the latter approach requires intelligent workload management software to optimize power consumption at the server and cluster levels. Combining the software intelligence with hardware mechanisms is usually the most beneficial approach. For example, a workload manager can use virtualization to resize VM containers of applications or migrate VMs at runtime to consolidate the workload on an optimal set of physical servers. Servers unused over a period of time can be switched to the available low power states to save energy.
A tighter integration of server power management technologies with cooling solutions could create additional opportunities for consolidation. Coupling the cooling and other second order effects in the data center would perhaps bring operational costs down even further but that discussion is beyond the scope of this article.
Continuous reconfiguration in data centers due to dynamic consolidation makes performance and power modeling a challenging task. The challenges primarily arise as heterogeneous workloads, which may stress different resources (for example, CPU, memory, cache), move across heterogeneous systems. Further, since consolidation entails co-locating multiple workloads on a server, the impact of such co-location on power and performance needs to be modeled. The fundamental modeling challenges relevant for consolidation are:
Virtualization overhead
Co-location overhead
Workload normalization across heterogeneous systems
Power modeling of multiple co-located applications
Virtualization overhead modeling. Traditional VM monitors consist of a host layer, often called the hypervisor, which supports one or more guest VMs. The additional layer between an operating system and the hardware leads to a performance impact, which is often referred to as virtualization overhead. Models that relate application performance to available resources need to factor in this virtualization overhead while determining the resource usage of applications. One of the most important sources of this virtualization overhead is I/O processing. There are two different virtualization strategies employed by various vendors. Xen has a 'thin' hypervisor where all the device drivers are separated out into a component that sits on top of the hypervisor. This component is called dom0 and performs I/O processing on behalf of all other guest VMs. VMWare, on the other hand, has device drivers embedded in the hypervisor layer along with various other hardware-aware code optimization techniques. The common thread in both these architectures is that I/O processing is performed by the host layer as a proxy for each guest VM. This redirection causes extra processing cycles and leads to a CPU overhead. The overhead occurs because the host layer must multiplex requests from various guest VMs based on an internal map it maintains. Hence, the amount of network and disk operations performed by an application dictates the virtualization overhead imposed on the server, and additional resources need to be provisioned for it.
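As a rough illustration of how such provisioning might be expressed, the sketch below assumes a simple linear model in which each network and disk operation costs a fixed number of host CPU cycles. The function name, the linear form, and the coefficient values are assumptions for illustration; in practice they would be calibrated per hypervisor and platform by benchmarking the host layer (dom0 in Xen, the hypervisor in VMWare).

```python
# A minimal sketch, assuming a linear I/O-driven overhead model; the
# coefficients alpha and beta are placeholders to be calibrated by benchmarking.

def provisioned_cpu_share(vm_cpu_util, net_ops_per_sec, disk_ops_per_sec,
                          alpha=2e-5, beta=5e-5):
    """Return the CPU share to reserve for a VM, including the host-layer
    cycles spent proxying its network and disk I/O.

    vm_cpu_util      -- CPU fraction measured inside the guest (0..1)
    net_ops_per_sec  -- network packets/s issued by the guest
    disk_ops_per_sec -- disk operations/s issued by the guest
    alpha, beta      -- assumed host-CPU cost per network / disk operation
    """
    io_overhead = alpha * net_ops_per_sec + beta * disk_ops_per_sec
    return vm_cpu_util + io_overhead

# Example: a guest at 30% CPU doing 5,000 packets/s and 800 disk ops/s
print(provisioned_cpu_share(0.30, 5000, 800))   # ~0.44 of the host capacity
```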
Co-location overhead. Virtualization has the goal of consolidating multiple applications on the same physical server, with each application typically running in its own VM. This isolation is obtained by partitioning CPU and memory temporally or spatially, ensuring one application does not impact any other applications. However, this partitioning is limited to CPU and memory and does not extend to shared resources like processor caches. As a result, an application with high access rates may occupy the full cache, and other applications may not be able to use the cache, leading to performance degradation. Co-location of multiple VMs thus introduces cache contention between competing workloads and may impact performance significantly. Verma et al.15 study this co-location overhead and propose a working-set based partitioning scheme where applications with small working sets are not mixed with applications that have moderate or large working sets.
Applications with small working sets are those that do not store a lot of state, for example, an application server in a three-tiered Web application. On the other hand, the database tier of the application may work on a large number of tables and have a large working set. The HTTP server needs to maintain the state of the incoming requests and may have a moderate working set. A working set size-based partitioning policy is found to be successful in ensuring relative performance isolation between applications. Applications with small working sets can use the cache without any contention from large applications that may pollute the cache. Similarly, large applications can be co-located, as none of these applications are able to use processor caches; hence, any additional cache contention leads to no performance penalty. However, medium-sized applications need to either run stand-alone or they see a performance impact because of co-location. Co-location thus forces consolidation planning to take into account two important factors. First, it prevents applications with diverse working set sizes from being consolidated together, and second, it requires the characterization of the co-location overhead for applications with moderate working set sizes.
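A minimal sketch of this co-location rule follows. The byte thresholds that separate small, medium, and large working sets are illustrative assumptions; in practice the classes would be calibrated against the processor's cache hierarchy.

```python
# A minimal sketch of the working-set based co-location rule described above.
# The thresholds below are assumptions, not values from the cited study.

SMALL_WS = 1 << 20      # assume "small" ~ fits comfortably in L2 cache (1 MB)
LARGE_WS = 32 << 20     # assume "large" ~ well beyond last-level cache (32 MB)

def ws_class(working_set_bytes):
    if working_set_bytes <= SMALL_WS:
        return "small"
    if working_set_bytes >= LARGE_WS:
        return "large"
    return "medium"

def can_colocate(ws_a, ws_b):
    """Allow co-location only within the same working-set class, and never
    for medium applications, which were found to need isolation."""
    a, b = ws_class(ws_a), ws_class(ws_b)
    return a == b and a != "medium"

print(can_colocate(512 << 10, 256 << 10))   # small + small  -> True
print(can_colocate(64 << 20, 128 << 20))    # large + large  -> True
print(can_colocate(512 << 10, 8 << 20))     # small + medium -> False
```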
CPU workload normalization. Dynamic consolidation implies that a workload may move from one server to another. Since the server systems in a data center are inherently heterogeneous with respect to processor family and type, resource allocation for a workload requires normalization across servers. To take an example, an application may be moved from a Dell PowerEdge server with PIII 1.4GHz processors to an IBM HS-21 blade with Intel Xeon 3.16GHz processors. In order to perform an accurate consolidation, one needs to estimate the resource requirements of the application on the new platform. To accurately combine the processor utilization of systems based on different processors, industry benchmarks can be employed to normalize the CPU workload data.
Processor benchmarks such as SPEC CINT2000 are better suited than basic processor speeds (MHz), since benchmarks account for differences in processor architecture that affect performance significantly. Another popular choice is the RPE2 metric from IDEAS,1 commonly used for benchmarking many servers.
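The sketch below shows how such a benchmark rating might be used to translate a utilization figure from one server to another; the rating values are placeholders, not actual benchmark scores.

```python
# A minimal sketch of benchmark-based CPU normalization across heterogeneous servers.

def normalize_utilization(util_on_source, rating_source, rating_target):
    """Estimate the CPU utilization an application would produce on the target
    server, given its utilization on the source server and the benchmark
    ratings (for example SPEC CINT or RPE2) of both machines."""
    demand = util_on_source * rating_source   # demand in benchmark units
    return demand / rating_target             # fraction of target capacity

# Example: 60% busy on a low-rated source, moved to a server rated 2.5x higher
print(normalize_utilization(0.60, rating_source=1000, rating_target=2500))  # 0.24
```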
Dynamic power modeling. Power modeling has traditionally focused on modeling power as a function of CPU utilization. The approach has worked well for predicting the power drawn by a fixed application running on a given server. The continuously varying application placement on servers brings about a fundamental change in power modeling, as different resources of a server are used to varying degrees by different applications. It has been observed that CPU utilization is not a good indicator of power consumption if the applications placed on a server change. Hence, a power model for dynamically varying application mixes on a server needs to be cognizant of other system metrics like memory bandwidth, usage of various processing units, and I/O and network activity. However, some of these parameters are very difficult to ascertain from application-level parameters, and power modeling based on system parameters looks infeasible. This leaves two approaches for comparing multiple candidate configurations during dynamic consolidation: building a power model for applications with respect to application-level parameters like application throughput, or investigating properties in power models that may allow comparison between configurations without prediction.
Approaches in Ranganathan et al.13 have tried to use the former approach of modeling application behavior versus power consumption. On the other hand, Verma, Ahuja, and Neogi14,15 use a benchmark application to relatively order servers in terms of their power efficacy and then utilize these rankings to select servers during dynamic consolidation. The accuracy of the approach depends on the fidelity of the benchmark application to the actual application and is shown to be accurate for applications within the same class (for example, HPC workloads, Web serving workloads, and so on).
A Workload Management Tool for Data Center Efficiency
The goal of workload management in a data center is to bring down the operating costs of the data center while maximizing the revenue earned by the data center. We first present a brief overview of the way workload management has been traditionally done and follow it up with the design and implementation of a tool we have built.
Workload management tools and methodologies. Workload management in virtualized data centers has focused primarily on static one-time placement of applications or VMs on physical servers. The placement has been driven primarily by feasibility as opposed to efficiency, that is, the goal of workload management has been to come up with a sizing and placement that meets all the constraints, even at a high cost. Hence, sizing has often been done in a conservative manner, with each application being assigned resources close to its maximum resource utilization. Similarly, placement of applications has been a painstakingly manual exercise with the sole aim of ensuring that all constraints are met. Hence, early tools in workload management primarily focused on monitoring and reporting the current state of the data center to an administrator. The administrator would then use the reports to manually decide the appropriate workload management tasks. Some of the most popular capacity planning tools include the Cirba Data Center Intelligence, VMWare Capacity Planner, and Platespin PowerRecon.4 Similar tools exist for dynamic workload management as well.7,9
Recently, there has been some tooling support to automatically identify the optimal workload configuration for a data center. However, these tools are based on brute-force search and are oblivious to application-sensitive power models and co-location overheads.
The Emerald Workload Management tool developed at IBM Research, India, focuses on minimizing the overall cost of the data center, while meeting the required performance guarantees. The costs can include both the ownership (such as, cost of hardware, software, facility) as well as the operational costs (cost of maintenance, power, cooling).
Figure 2 shows the key components in the Emerald workload management flow. Inputs to the tool are in the form of workload traces, source and target server configurations along with their associated cost and power models.
Readers. These represent multiple adapters that read monitored data from the current infrastructure. In most cases the traces are represented as univariate time series. The traces are generated for each application in the source environment. If the source environment is nonvirtualized, then the data represents resource usage of the application on a native system. For virtualized environments, the data represents resource usage within a VM. A trace can consist of multiple parameters measured in the application/server context. Parameters can be of two types: resource usage parameters, for example, CPU, memory, I/O (disk, network), and application-level parameters like active memory working set size and memory bandwidth.
Smoothers. The raw traces are cleansed and filtered to reduce noise. The smoothers help fit the collected data to existing models that can help in forecasting their future values.
Normalizers. Target and source server specifications are often very different in terms of CPU processing capacity. Accurate normalization across server configurations is needed to estimate the fraction of the resource capacity any application will occupy on new target hardware. While data sizers estimate the resource requirements of the application, the data normalizers perform the task of normalizing the requirements across heterogeneous server configurations. We use the RPE2 metric1 to obtain a normalized capacity value across heterogeneous server types. For standalone servers, the RPE2 capacity of the host server multiplied by the maximum CPU utilization of the server in the period is used as an estimate of the resource requirements of the application. For virtualized servers, the size is computed based on the entitlement of the virtual server on its host server, the CPU utilization of the virtual server, and the RPE2 of the host server.
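A sketch of these two sizing rules follows. The standalone formula is as described above; for the virtualized case, multiplying the host RPE2 by the VM's entitlement and its utilization of that entitlement is one natural combination of the three quantities named, and the RPE2 ratings used here are placeholder values.

```python
# A minimal sketch of normalized size estimation in RPE2 units, under the
# assumptions stated in the lead-in.

def size_standalone(host_rpe2, max_cpu_util):
    """Standalone source server: host RPE2 x peak CPU utilization."""
    return host_rpe2 * max_cpu_util

def size_virtualized(host_rpe2, vm_entitlement, vm_cpu_util):
    """Virtualized source: host RPE2 scaled by the VM's entitlement (its share
    of the host) and by the VM's utilization of that entitlement."""
    return host_rpe2 * vm_entitlement * vm_cpu_util

# Example: a standalone server rated 2,000 RPE2 peaking at 45% CPU, and a
# VM entitled to 25% of a 4,000-RPE2 host, running at 70% of its entitlement.
print(size_standalone(2000, 0.45))           # 900.0 RPE2 units
print(size_virtualized(4000, 0.25, 0.70))    # 700.0 RPE2 units
```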
Size estimation and workload profiling. Size estimators perform the task of estimating the resource requirements, or size, of the application. They use monitored time series data to create appropriate VM sizes that can be used by the placement algorithms. The sizing is done such that the performance characteristics obtained from the VM satisfy the current SLA requirements. This may include multi-dimensional sizing based on CPU, memory, and I/O parameters.
Constraints. Business and technical constraints arising from either higher-level policies or the topology of the data center can be easily specified through the tool. These constraints are passed to the placement engine, which generates one or more placement configurations that minimize total costs while satisfying the constraints.
Placement engine. This constitutes a library of backend VM packing algorithms that find a near-optimal placement of VMs on servers based on different optimization criteria, such as power minimization, migration cost and power cost minimization, power budgeting, and so on. These algorithms are discussed in the section on power-aware placement.
ROI estimator. The final output from Emerald is in the form of ranked options that an administrator can adopt along with their associated confidence/risk factors. The ROI associated with each plan is also presented to the administrator in the form of generated reports.
Extensibility and customization. The design of Emerald ensures that each component is highly customizable and additional logic can be incorporated easily without changing the core interfaces. The tool internally maintains an extensible knowledge base of models (for example, cost, power, migration, and virtualization overhead models) for popular server configurations. Once the analysis is completed, Emerald also provides capabilities for incremental "what-if" analysis. These include specifying new constraints, defining new server models, modifying the consolidation ratio, and so on. This enables analysis of various tradeoffs during virtualization planning.
The primary goal of any virtualization-based workload consolidation initiative is to establish the best possible consolidation roadmap, which is often guided by a set of existing constraints. These could arise from technical limitations of the infrastructure or from business policies.4 While it is possible to succeed in small-scale initiatives without addressing all of these areas, large-scale initiatives require the careful consideration of business and technical constraints, and a systematic way to assess virtualization options and the return on investment (ROI) of their implementation on the target environment.
Technical constraints. Virtualization initiatives are subject to numerous technical constraints, and these are often dictated by the nature of the virtualization technology being contemplated. For example, hypervisor-based technologies like VMWare ESX impose relatively few constraints related to operating system configurations, as each VM will have its own OS image in the final configuration. In contrast, there are "containment-based" technologies that allow applications to see a common OS image, and analysis in these scenarios must therefore factor in OS compatibility. Other technical constraints may arise from network connectivity requirements, storage requirements, controllers, peripherals, UPS requirements, among others.
Business constraints. Common examples of business constraints include availability targets, application SLAs, DR policies, maintenance windows, compliance restrictions, and DR relationships. Other business constraints could be more subtle. For example, consider a data center that has utilized all its available raised floor space. This could be due to a very tight sizing of the raised floor area or the real estate that was available at the time. To address this particular data center's problem, the space savings will have to be given higher preference than power savings.
The placement of VMs on a set of servers for power management is a packing problem where we want to pack the VMs on the set of servers such that the total power drawn by the VMs is minimized. The placement is recomputed whenever there is a significant change in either the configuration of the data center (for example, addition of new workloads) or the workload patterns (for example, change in application traffic for some workloads).
A packing that minimizes power should have the following characteristics: use fewer servers to pack the VMs; select power-efficient servers before selecting servers that consume more power per unit of capacity packed; pack servers close to their capacity and avoid fragmentation; and minimize the number of migrations during any reconfiguration. The second property ensures the most power-efficient servers are selected. The fourth property ensures that the overhead of workload management is minimal; all other properties ensure the selected servers are utilized in the most efficient manner.
Emerald uses a two-phase algorithm, pMapper,14 that possesses the required characteristics. In the first phase, pMapper ranks the servers based on their cost and power efficiency. While packing an application, pMapper ensures that a higher ranked (better) server is considered before a lower ranked server. Further, to ensure that servers are packed with minimal fragmentation, Emerald uses a variant of the First Fit Decreasing bin-packing algorithm for the second phase. In this phase, all the VMs are sorted in decreasing order of their resource requirements. The placement of these VMs on the servers is simulated in this order to obtain the target utilization for each server. Finally, since the algorithm is aware of the previous placement, it attempts to reduce the overhead of reconfiguration by selecting only a minimal set of workloads for migration in order to achieve the target server utilization.
Elaborating further, all servers with current utilization greater than their newly computed target utilization are labeled as Donors. All servers with current utilization less than the target utilization are labeled as Receivers. Servers that are neither Donors nor Receivers retain their original VMs. All the Donor servers donate VMs until they reach close to their target utilization.
These donated VMs are then sorted based on size and placed on the Receivers such that each Receiver approaches its target utilization. This enhancement ensures that only the minimal number of VM migrations is required to reach the new target server configuration.
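To make the two phases concrete, the following condensed sketch ranks servers by an assumed power-efficiency metric (peak power per unit of normalized capacity) and then packs VMs first-fit-decreasing onto the ranked servers; the donor/receiver step is only summarized in a comment. The server names, numbers, and the 90% headroom are illustrative, and the sketch omits many details of the actual pMapper algorithm.

```python
# A condensed sketch of ranked first-fit-decreasing placement, under the
# assumptions stated in the lead-in.

def rank_servers(servers):
    """Phase 1: order servers so the most power-efficient come first.
    Here efficiency = peak power per unit of (normalized) capacity."""
    return sorted(servers, key=lambda s: s["peak_power"] / s["capacity"])

def place_ffd(vms, servers, headroom=0.9):
    """Phase 2: First Fit Decreasing. VMs are sorted by size (descending) and
    each is placed on the first ranked server with room left, packing servers
    close to capacity while avoiding fragmentation."""
    ranked = rank_servers(servers)
    used = {s["name"]: 0.0 for s in ranked}
    placement = {}
    for vm in sorted(vms, key=lambda v: v["size"], reverse=True):
        for s in ranked:
            if used[s["name"]] + vm["size"] <= headroom * s["capacity"]:
                used[s["name"]] += vm["size"]
                placement[vm["name"]] = s["name"]
                break
        else:
            placement[vm["name"]] = None    # no feasible server found
    # In the full algorithm, the simulated 'used' values become per-server
    # target utilizations; servers above target donate VMs and servers below
    # target receive them, so only a minimal set of VMs actually migrates.
    return placement, used

servers = [
    {"name": "HS21-1", "capacity": 1000, "peak_power": 300},
    {"name": "x3650-1", "capacity": 1000, "peak_power": 450},
]
vms = [{"name": "vm_a", "size": 500}, {"name": "vm_b", "size": 300},
       {"name": "vm_c", "size": 250}]
print(place_ffd(vms, servers))
```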
Apart from single-dimensional sizing, Emerald also deals with multidimensional usage (for example, CPU, memory, I/O). In this case, priority is given to the most critical resource for an application and the placement is optimized for that resource, while the requirements of the other resources are used as constraints to be satisfied by the desired allocation.
The two-phase history-aware algorithm in Emerald is thus able to achieve a very low-level of fragmentation while minimizing migration costs. Further, by assigning higher priority to more cost-effective and power-efficient servers, it is able to minimize the ownership cost as well as the power drawn by the data center.
The ROI analysis presents the static and dynamic costs pre- and post-consolidation. Static costs include one-time hardware and software inventory purchase. Key factors affecting static costs are purchase of new power-efficient hardware, virtualization licenses, and live migration licenses. In addition, these costs can be offset if old hardware can be disposed of for a resale value or if data center space can be reduced.
Dynamic costs involve costs of server power, data center cooling, dynamic migration, among others. Server power costs can be estimated from the power efficiency of the current set of operational servers. Cooling costs could be derived either as a fraction of the server power costs or from elaborate thermodynamics based cooling models for data centers. Dynamic migration analysis identifies the applications that may benefit from runtime placement changes and captures the trade-off involved in the parameters of response time, throughput, and so on.
With the renewed focus on virtualization and power management technologies, many data centers are in need of a detailed analysis of their infrastructure that can make comprehensive recommendations about power management, capacity planning and virtualization options for the data center. In our case studies, we will show how Emerald can assist data centers in the following ways:
To optimally manage customers' existing infrastructure
To help them plan for virtualization (on new infrastructure)
To additionally assist in reducing their operating costs through power saving options and VM migration
Consider a data center as shown in Figure 3, composed of 14 non-virtualized servers, divided into three clusters, Cluster0, Cluster1 and Cluster2, all executing part of the same Fulfillment business process. Each cluster has some computing resources connected to a common storage. Cluster0 comprises two HS-21 x-series blades and three x3650 servers, Cluster1 comprises five x3650 servers, and Cluster2 comprises one P6 JS-22 blade and three P5 servers. In terms of power efficiency, the JS-22 has the best (lowest) peak power/peak capacity ratio. The HS-21 power/capacity ratio is 10% above its P5 counterpart, and the x3650 servers have the worst power efficiency.
The business process is composed of multiple applications, Application0 through Application13. In the initial set-up, each application is running on a stand-alone server. There is a business constraint specifying that the computing and storage infrastructure for Application0, Application5, and Application10 cannot be shared. Hence, they need to continue executing in separate clusters. Also, Application5 can only run on Intel hardware, which excludes it from running on Cluster2. These constraints are either specified through the Emerald constraint specification framework or inferred from the topology.
The first study demonstrates how a workload manager plans for virtualization of the above infrastructure and consolidation of the computing resources. Figures 4a and 4b show the pre- and post-consolidation views of the CPU utilization and power distributions, respectively, of the servers in Cluster0. The pre-consolidation view shows that the average CPU utilization of Cluster0 is around 40% while the total power consumption is near-peak, with all the servers turned on. Each server also has sufficient spare capacity. The post-consolidation view shows the consolidation recommendations, where there are now only three servers active in Cluster0. The rest of the servers are recommended to be powered down, thereby stretching the average CPU utilization of the powered-on servers in Cluster0 to around 80% and reducing its power consumption by 30%. Similar consolidation recommendations are made for Cluster1 and Cluster2.
Figure 5 shows an individual machine, Server5 in Cluster1, that is recommended to host four VMs comprising Application4, Application5, Application6, and Application1, with a projected peak utilization of 80%. Figure 6 zooms in on two of these applications recommended for co-location, Application4 and Application5, on Server5. To understand the reason for their co-location we look at their workload time series graphs (that is, CPU utilization versus hour of day). As shown in Figure 6, the workloads for Application4 and Application5 have complementary behavior. For example, at time 02:00 hrs the workload peaks for Application4 but is low for Application5. At time 06:00 hrs, the reverse behavior is noted, with Application4 having low CPU utilization and Application5 having high CPU utilization. This complementary behavior makes the two workloads good candidates for co-location.
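A simple way to screen for such complementary pairs is to compare the peak of the combined trace against the sum of the individual peaks, as in the sketch below. The hourly traces and the 0.75 threshold are illustrative assumptions, not data from Figure 6.

```python
# A minimal check for complementary workload behavior: two workloads are good
# co-location candidates when the peak of their combined trace is well below
# the sum of their individual peaks.

def are_complementary(trace_a, trace_b, threshold=0.75):
    combined_peak = max(a + b for a, b in zip(trace_a, trace_b))
    individual_peaks = max(trace_a) + max(trace_b)
    return combined_peak <= threshold * individual_peaks

# Hourly CPU utilization (%), peaking at different times of day
app4 = [20, 30, 80, 60, 30, 20, 15, 10]
app5 = [15, 10, 10, 20, 40, 70, 75, 30]
print(are_complementary(app4, app5))   # True: combined peak 90 < 0.75 * 155
```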
This case study demonstrates how Emerald can assist in continuous cost savings by making recommendations for turning on hardware dynamic power-saving features such as voltage/frequency scaling in advanced kernels and for switching off idle servers during certain times of day or days of the week. Consider the example in Figure 7 showing workload traces for Application2 running on Server10 in Cluster2. The traces show significant variability during a 10-hour analysis period, with the lowest CPU utilization between 2 A.M. and 4 A.M. One could exploit this information and run the server at a lower power setting between 2 A.M. and 4 A.M., if dynamic power-saving features are supported and enabled at the hardware level. If Application2 is the only application running on the server, the application can be live migrated to another target and Server10 can be switched off. While live migration provides the capability to move virtualized applications within minutes from one host to another, it also requires the purchase of expensive licenses. Thus its use must be limited to environments that exhibit enough variability to benefit from it. On the other hand, Application7 demonstrates less variability in peaks and troughs over sustained periods. Since the overhead of changing power states of the server (such as a shutdown-startup cycle) or the cost of migrating the application at runtime could potentially offset its power savings, this application should not be identified for the dynamic saving options.
Figure 8 shows the ROI estimation of the savings and investments involved in this venture. In this case study of 14 servers, using consolidation techniques Emerald recommended a sizing of eight servers in total, thereby removing as many as six servers. The investments table (right) shows the one-time expenditures involved in the new recommended plan. These comprise virtualization licenses to support OS stacking of VMs, live migration licenses to support live migration of applications, as well as the labor cost to manage the virtual servers. Environments that do not exhibit high variability in workloads need not incur the cost of migration licenses. In this example, the total costs involved for virtualization (including live migration) and labor come to around $48K. Note that since existing power-efficient servers are being reused, there is no cost incurred on server purchases.
In comparison, the savings table (left) shows the power, cooling, and space costs saved each year. Sample rates for the parameters are shown at the top of the table. Note that the released hardware of six servers can be utilized for other purposes or disposed of for a resale value. In our example, the net savings in operational costs add up to about $89K/yr, indicating the ROI for this particular transformation would be around six months to one year. The average and maximum utilization of the servers are also reported, indicating that post-consolidation the systems are much better utilized, with a mean value of 31%.
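A back-of-the-envelope payback calculation on these figures appears below; the simple linear model gives roughly the lower end of the six-months-to-one-year range quoted above, since it ignores deployment time and any recurring costs.

```python
# Payback estimate for this case study: ~$48K one-time investment against
# ~$89K/year of operational savings, using a simple linear model.

one_time_investment = 48_000    # virtualization + live migration licenses, labor
annual_savings = 89_000         # power + cooling + space savings per year

payback_months = one_time_investment / annual_savings * 12
print(f"Payback period: ~{payback_months:.1f} months")   # ~6.5 months
```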
The workload manager also prefers newer power-efficient configurations over older power-hungry ones. Hence the power-efficient HS21 servers Server_0 and Server_1 in Cluster0 remain switched on for hosting VM configurations, whereas the power-hungry x3650s, such as Server_5 and Server_6, are switched off to maximize power benefits.
In this article, we reviewed how virtualization and energy management technologies can help in consolidation and reduce the infrastructure and operating costs of a data center. We gave an overview of power-aware dynamic application placement and outlined the challenges involved in system modeling for consolidation and live migration. Based on our experience we presented how Emerald addresses these challenges. There is little doubt that going forward, energy related costs will occupy an even more central discussion in the design and operation of data centers.
We anticipate middleware vendors to expose power management controls that can be used in concert with the technologies discussed in this article for even greater power efficiency.
3. Bohrer, P., Elnozahy, E.N., Keller, T., Kistler, M., Lefurgy, C., McDowell, C. and Rajamony, R. The case for power management in Web servers. In Proceedings of Power-Aware Computing (2002).
5. Cardosa, M., Korupolu, M.R. and Singh, A. Shares and utilities based-power consolidation in virtualized server environments. In Proceedings of IFIP/IEEE Integrated Network Management (2009).
6. Chase, J.S., Anderson, D.C., Thakar, P.N., Vahdat, A.M and Doyle, R.P. Managing energy and server resources in hosting centers. In Proceedings of ACM SOSP, (2001).
10. Kumar, S., Talwar, V., Kumar, V., Ranganathan, P. and Schwan, K. vManage: Loosely coupled platform and virtualization management in data centers. In Proceedings of ICAC (2009).
11. Nathuji, R., Schwan, K., Somani, A. and Joshi, Y. VPM tokens: Virtual machine-aware power budgeting in data centers. Cluster Computing (2009).
12. Ranganathan, P., Leech, P., Irwin, P. and Chase, J. Ensemble-level power management for dense blade servers. In Proceedings of International Symposium on Computer Architecture (2006).
13. Steinder, M., Whalley, I., Hanson, J.E. and Kephart, J.O. Coordinated management of power usage and runtime performance. In Proceedings of NOMS (2008).
14. Verma, A., Ahuja, P. and Neogi, A. pMapper: Power and migration cost aware application placement in virtualized servers. In Proceedings of ACM/IFIP/ Usenix Middleware (2008).
15. Verma, A., Ahuja, P. and Neogi, A. Power-aware dynamic placement of HPC applications. In Proceedings of ACM ICS (2008).
16. Verma, A., Dasgupta, G., Nayak, T.K., De, P. and Kothari, R. Server workload analysis for power minimization using consolidation. In Proceedings of Usenix ATC (2009).
Figure 1. (a-c) Heatmaps showing average intra-day variation in CPU utilization of two clusters in a single month. Each column corresponds to a physical server. The y-axis represents an average day in terms of 48 30-minute intervals. Each 30-minute sample for a server is averaged for the same time interval over the whole month. (d) Example showing a peak-trough pattern. (e) Example of low utilization VMs.
Figure 2. Core components of Emerald.
Figure 3. A sample data center scenario with three clusters.