Much has been written about how storage systems fail, but new research on Google's main storage infrastructure offers more answers about the overall availability of cloud-based storage services.
"Highly-available cloud storage is often implemented with complex, multi-tiered distributed systems built on top of clusters of commodity servers and disk drives," say Google researchers. As a result, "sophisticated management, load balancing, and recovery techniques are needed to achieve high performance and availability amidst an abundance of failure sources that include hardware, software, network connectivity, and power issues."
The Google researchers developed a series of statistical models for different design choices, including variable replication and data placement, and used them to compare availability across system parameters that were tested and observed in Google's fleet. The researchers conclude that transitory node failures account for most unavailability.
From HPC in the Cloud
Abstracts Copyright © 2011 Information Inc., Bethesda, Maryland, USA