Greening our Datacenter Team Recognized with Director's Achievement Award and UC Larry Sautter Award

The core team responsible for our multi-year initiative to improve the energy efficiency of our research and operational computing datacenter has received two awards in one month: a Director's Achievement Award and a Larry Sautter Award from the University of California. Congratulations to the team! The Pandas thank you.

Summary: Through a unique collaboration between Berkeley Lab’s energy efficiency researchers and IT Division, the team undertook a number of experimental and cutting-edge approaches to improving legacy data center energy efficiency. These measures included one of the earliest implementations of a complete wireless sensor network to monitor the impact of the changes, extensive changes to the facilities infrastructure in the room, testing of early device prototypes with industrial partners, virtualization and consolidation, and testing of modified controllable CRAC units, among many others. In November 2010, the team achieved an important milestone: a real-time view of power usage effectiveness (PUE) in the datacenter.

Project Description:
The team came together out of the combination of EETD’s expertise in energy efficiency in datacenters and IT’s pressing need to expand scientific computing resources in its datacenter to meet increasing demand. The goal of the project was to explore and validate solutions for EETD’s research while using those same solutions to improve operational efficiency in the data center, thereby allowing an increase in computational capability.
While the estimated PUE (total energy use divided by IT equipment energy use) of the datacenter suggested that it was relatively efficient in comparison to others that the EETD team had benchmarked, the team believed that significant improvements were possible. The team initially developed a computational fluid dynamics (CFD) model of airflow in the datacenter. The results from this model confirmed that airflow mixing in the datacenter contributed to its inefficiency. This suggested the need for a monitoring system that would allow the team to fully visualize and understand the scope of the problem, and that would give immediate feedback on the impact of changes to the datacenter as they were implemented. The team engaged Synapsense, which at the time was just beginning development of its real-time wireless monitoring application, to deploy a system that would permit detailed analysis of environmental conditions (humidity and temperature), along with air pressure and power monitoring, at hundreds of points within the datacenter. The team also worked with Synapsense to improve the product based on the experience gained in the datacenter. This work was conducted in phases over several years and continues today as the team explores new efficiency opportunities.
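As a minimal sketch of the PUE definition given above (total energy use divided by IT equipment energy use), the calculation can be written in a few lines of Python. The function name and the sample readings are hypothetical and chosen only for illustration; they are not measurements from the datacenter.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE: total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical readings over the same time window (illustrative only).
example = pue(total_facility_kwh=800.0, it_equipment_kwh=500.0)
print(f"PUE = {example:.2f}")
```

A value of 1.0 would mean that every unit of energy goes to the IT equipment; anything above that is overhead such as cooling and power distribution.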

Once the system was deployed, the team used the data to begin to change the airflow and make other operational adjustments in the datacenter. The team undertook a variety of fixes, some small, and some large:
Floor tile tuning to improve air pressure
Hot Aisle/Cold Aisle Isolation
Conversion of the overhead plenum to hot air return
Extension of CRAC returns to connect to overhead
Installation of curtains to further reduce hot aisle/cold aisle mixing
Installation of water-cooled doors based on non-chilled water (collaboration with the vendor to reduce energy use)
Piloting of fully enclosed racks
Use of higher ambient temperature setpoints to improve efficiency

Throughout the process, the team collaborated with industrial partners to pilot new technology while influencing the technology roadmap for these products in the marketplace. This trend continues today with testing of a prototype APC in-row cooler and with another project that may be the first-ever computer-controlled air conditioner fan and compressor control system, one that can dynamically adjust the cooling power of the Computer Room Air Conditioning (CRAC) units depending on conditions in the data center.

The culmination of this initial work came in November 2010, when LBL became one of the first organizations in the federal space, and one of only a handful of smaller data centers in the world, able to calculate and view its data center’s Power Usage Effectiveness (PUE) in real time. This critical metric, which relates the total power drawn by the data center, infrastructure included, to the power used by the computers themselves, helps staff manage the data center on a dynamic basis to best achieve environmental goals. One vendor partner visited in November 2010 to present awards for the role LBL staff played in this achievement and in the roadmap for their product (http://today.lbl.gov/2010/11/12/berkeley-lab-data-center-at-cutting-edge-of-efficiency/).

Beyond the extensive collaboration between IT’s facilities experts and the researchers, the High Performance Computing, Infrastructure, and Collaboration teams also helped to support these goals. During this time, IT consolidated and virtualized its business systems, further reducing their energy use and floorspace in the datacenter. The move to cloud-based systems for email and collaborative applications increased resiliency while further reducing the load on the datacenter. Finally, the HPC group continues to work with researchers to support demand-response testing, which allows scientific computing load to be shed during times of reduced energy availability or in response to data center and environmental conditions.

By any measure, the impact of this achievement has been felt far beyond LBL’s data center. Numerous publications and reports have been generated from the work, as well as kudos for LBL’s efforts from around the world.

In the datacenter itself, LBL went from a situation where new datacenter construction was going to be needed imminently to one in which we have headroom to continue to add scientific computing. Indeed, the room, which we believed to be at capacity in 2007, is currently running 50% more scientific computing yet has improved its efficiency (PUE) from 1.6 to 1.4 over that time. The room still has space to grow as we continue to make use of cutting-edge, energy-efficient cooling technologies to improve its performance.
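To put the PUE change in perspective, here is a rough sketch of what moving from 1.6 to 1.4 implies for overhead, using only the two PUE values quoted above. The helper name is hypothetical. The result describes overhead per unit of IT energy, not absolute energy use, which depends on an IT load not stated here.

```python
def overhead_per_it_unit(pue: float) -> float:
    """Non-IT energy (cooling, power distribution, etc.) per unit of IT energy."""
    return pue - 1.0


before = overhead_per_it_unit(1.6)  # PUE figure quoted for the starting point
after = overhead_per_it_unit(1.4)   # PUE figure quoted for the current state

# Relative reduction in overhead for each unit of IT energy delivered.
reduction = (before - after) / before
print(f"Overhead per unit of IT energy fell by about {reduction:.0%}")
```

In other words, each unit of energy delivered to the computers now carries roughly one third less infrastructure overhead than before.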

One good indicator of the quality of this achievement is the extent to which it has been studied by others. Dozens of individuals and groups from industry, academia, and government have toured our datacenter as a model for how to green datacenters that weren’t built with modern efficiency standards in mind. While the efforts of Google, Yahoo, and Facebook get most of the industry’s attention, most companies, agencies, and universities have one or more legacy datacenters to deal with and no resources to build a new one from scratch.

Overall, this project represents a rare confluence of achievements: it simultaneously enabled new research and demonstration projects in a scientific division related directly to data center efficiency, enabled science in virtually every other scientific area of the laboratory by allowing for significant expansion of scientific computing resources, and reduced operational costs by allowing for continued and expanded capacity in a resource that was believed to be exhausted. In bringing together an operational need with scientific research, the Laboratory has shown by example how energy can be saved in older data centers, and has demonstrated how a continuous improvement process can lead to ongoing energy savings.