In the aftermath of the "IT Doesn't Matter" debate [3], a consensus has emerged that if corporate IT assets such as hardware, software, and networks can be replicated by competitors, competitive advantage can come only from the information those assets create [11]. Meanwhile, a precipitous decline in data storage costs of 45% per annum on a cost-per-gigabyte basis [6] has fed firms' appetite for capturing information on multiple business transactions and relationships. Moreover, the much-anticipated adoption of RFID and recent regulatory reforms such as SEC Rule 17a-4, which requires securities trading firms to retain electronic communications records (email, VoIP, and IM) for at least three years, are likely to accelerate the pace of data accumulation even further. In information-rich sectors such as credit card lending, retail, and health care, data growth has already begun to outstrip the decline in hardware costs, prompting a net increase in storage spending [8]. With storage now consuming 12% to 15% of IT budgets, CIOs fear that further increases could erode strategic IT spending [4].
Faced with higher storage costs and burgeoning data growth, the concept of information life cycle management (ILM) has emerged to help managers understand their information needs and structure their storage spending to meet those needs. The underlying premise of ILM is that information follows a natural life cycle from capture through application and decline (see the table here). At each stage of this life cycle, the task for ILM is to identify the value of the information and determine how best to protect it from loss and corruption. In this way, storage is like an insurance policy whose cost mirrors the value of the underlying asset and the risk that this value will decline due to adverse events.
Unfortunately, the complex task of valuing information has forced firms to apply ILM using cost criteria alone. Thus, firms tend to spend more on fault-tolerant hardware and backup monitoring for data in regular use, the perception being that frequent use implies higher value. Data that is no longer in use, or that is perceived to have lower value, is archived onto inexpensive media such as magnetic tape or is deleted outright. Intel, for example, uses a 35-day email retention policy for its employees; after this period, email messages are automatically deleted regardless of their perceived value to the end user or the firm as a whole.
A problem with this cost-centric approach is its failure to consider risk. In the event of a systems or media failure, time to recovery and point of recovery (that is, the age of the last backup) may vary widely. For example, hot sites routinely provide synchronous mirroring of data in real time, but this entails a much greater level of investment than RAID devices or periodic backups to tape. Delayed recovery may not be an issue for low-value data such as payroll records or social calendars, but for critical data such as stop-loss orders in a brokerage firm, any delay could prove embarrassing and lead to severe financial penalties. From a legal viewpoint, there is also the possibility that courts will order electronic records to be provided to opposing counsel within a certain time frame; failure to comply with such orders can prove costly [12].
If risk is overlooked, firms have no way of knowing whether they are spending too much or too little on protecting their data. In an era of regulatory oversight and paranoia over data loss, firms are unlikely to be risk seekers. Storing high-value data on unreliable, albeit less expensive, media constitutes a risk that even the most reckless firms are unlikely to accept. For risk-averse firms, a desire to have all data recoverable in real time is impractical and cost prohibitive, especially if the volume of data is expected to rise sharply. Risk neutrality represents a compromise in which firms are neither exposed to inordinate levels of uninsured risk nor spending vast sums on storage solutions to eliminate risk entirely. Risk neutrality does not mean that firms are indifferent to risk; rather, it means that they are cost-effectively insured against all known risks.
In theory, risk neutrality assumes perfect knowledge of all adverse events that could put the value of information at risk. For example, firms must know the probability of calamitous events such as terrorist attacks or natural disasters, as well as the probability that storage systems and backup media will perform as expected. In practice, bounded rationality and less-than-perfect foresight mean that all risks will never be fully quantifiable [2]. The best a firm can do is review historical outage patterns and extrapolate to a level of storage spending that protects against as many future adverse events as possible. In reality, this task is rarely performed with any appreciable degree of accuracy, and so, adapting from a similar problem facing fund management firms, we outline a framework that balances the risk associated with data loss or corruption against the storage spending meant to prevent or contain such adverse events.
The task of measuring the risk to information value and deciding how additional storage spending can reduce that risk to more tolerable levels is not unlike the task facing fund managers within the financial services sector. Fund managers know that fund values will vary based on market conditions. Managers are willing to accept some losses, but only within predefined limits. To help establish these limits, fund managers apply value-at-risk (VaR), a measure that "summarizes the worst loss over a target horizon with a given level of confidence" [9]. For example, using historical data, managers may determine that, with 99% confidence, the worst percentage loss a fund is likely to suffer is 5%, or $5M on a $100M fund. If, on any given day, the fund value falls by more than 5%, managers may opt for a hedging strategy to guard against further losses. VaR, expressed in absolute or relative percentage terms, acts as a trigger for corrective action, but it also shows how much a firm can spend to protect itself if VaR is exceeded. For example, if a 6% decline in the value of a fund creates a $1M loss above what is expected at the 99% VaR level, fund managers know they can spend up to $1M to neutralize this loss. In practice, it is impossible to insure against all adverse market events, but setting a VaR level at 95% or above gives managers an opportunity to identify how the most severe market perturbations might damage their portfolio.
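To make the calculation concrete, the following minimal sketch computes one-day historical VaR at the 99% confidence level. It is illustrative only: the simulated returns stand in for historical market data, and all figures (fund size, return parameters) are hypothetical.

```python
import numpy as np

# Hypothetical daily returns for a $100M fund over 500 trading days.
# In practice these would come from historical market data.
rng = np.random.default_rng(seed=42)
daily_returns = rng.normal(loc=0.0005, scale=0.02, size=500)

fund_value = 100_000_000
confidence = 0.99

# Historical VaR: the loss at the (1 - confidence) percentile of returns.
worst_return = np.percentile(daily_returns, (1 - confidence) * 100)
var_relative = -worst_return              # e.g., 0.05 means a 5% worst-case loss
var_absolute = var_relative * fund_value  # dollar loss at 99% confidence

print(f"99% one-day VaR: {var_relative:.1%} (${var_absolute:,.0f})")
```

Any daily loss exceeding this threshold would trigger the corrective action described above.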
To see how VaR can be adapted to a storage environment and used to understand the risks facing a firm and the value of its information, consider Figure 1, which shows a typical distribution of storage-related risks. Recognizing that valuing adverse events is not an exact science (for instance, it can be difficult to assess accurately how much an hour of CRM downtime costs a firm), it is nevertheless possible to create a probability distribution of storage-related events and their cost to the firm from backup and restore logs, help desk tickets, and end-user surveys. As seen on the left side of the diagram, most events are of minor significance and have no lasting effects on the firm; accidental deletions and restoring earlier versions of files are typical examples. Meanwhile, other events can disrupt business activities, leading to losses in the form of missed sales, court-imposed fines, or expedited data recovery fees. For example, in 2005, Morgan Stanley's failure to report email messages as part of an investor lawsuit led to a jury award of $1.45 billion (reversed on appeal in early 2007) [12]. Lastly, a handful of events can be catastrophic if they directly threaten the survival of the firm. Events such as the terrorist attacks of September 11, 2001 and natural disasters have been known to destroy data centers (as happened in New Orleans following Hurricane Katrina), meaning that firms may have to bear significant additional cost in re-creating files from transaction-level data.
Using this distribution, firms can compute VaR at, for example, the 95% confidence level. For the time period covered by the data, each firm must then decide whether its level of storage investment is reasonable, given the worst-case scenario that VaR represents. A firm that previously tried to reduce storage spending by migrating data to less expensive media or by backing up less frequently may find that VaR has jumped as end users face longer recovery times and greater disruption. On the other hand, a firm may find that an earlier decision to increase spending by adopting more reliable technology or pursuing a strict backup regimen has contributed to a decline in VaR in the current period. In a financial services setting, the VaR on an investment can be altered through hedging strategies, but only at a cost to the firm [9]. In the case of information value, VaR can be manipulated through storage spending. As firms assess the risk to their information from adverse events such as hardware failure or data corruption, their goal must be to link current storage spending to VaR in order to decide whether spending is too low, in which case VaR is dangerously high, or excessive, in which case VaR is unnecessarily low.
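The same percentile logic carries over directly to storage. The sketch below uses hypothetical per-event loss figures standing in for amounts mined from backup and restore logs, help desk tickets, and end-user surveys, and computes a storage VaR at the 95% confidence level.

```python
import numpy as np

# Hypothetical per-event loss estimates ($) compiled from backup/restore
# logs, help desk tickets, and end-user surveys over one period.
event_losses = np.array([
    50, 120, 80, 200, 40,       # minor: accidental deletions, file restores
    15_000, 40_000, 25_000,     # disruptive: missed sales, recovery fees
    2_000_000,                  # severe: court-imposed fine
])

# Storage VaR at 95% confidence: the loss level that only 5% of recorded
# adverse events are expected to exceed.
storage_var_95 = np.percentile(event_losses, 95)
print(f"95% storage VaR: ${storage_var_95:,.0f}")
```

Tracked period over period, this figure shows whether earlier spending decisions have pushed worst-case losses up or down.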
Since many firms have felt compelled to pursue ILM on the basis of cost criteria alone, total cost of ownership (TCO) is a standard metric for evaluating storage environments. Fixed and variable costs are accumulated for a defined period of time and divided by throughput, number of users, data center capacity, or footprint to yield a measure of cost utilization. Although hardware accounts for less than 30% of TCO [1], vendors continue to market hardware as a way to lower TCO when, in reality, service and labor costs (the costs least affected by innovation) are the primary drivers of TCO. Chargeback systems routinely use TCO to assign storage costs to end users, so these costs can be identified with some degree of accuracy.
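As an illustration, the following sketch computes TCO and a few utilization measures from assumed cost figures. The numbers are invented, but chosen so that labor dominates and hardware stays under 30% of the total, consistent with [1].

```python
# Illustrative storage TCO calculation (all figures hypothetical).
fixed_costs = {"hardware": 400_000, "software": 150_000, "facilities": 100_000}
variable_costs = {"labor": 900_000, "backup_media": 80_000, "power": 70_000}

tco = sum(fixed_costs.values()) + sum(variable_costs.values())

# Normalize by a utilization base, e.g., managed capacity or user count.
managed_terabytes = 500
users = 4_000

print(f"TCO per TB:   ${tco / managed_terabytes:,.0f}")
print(f"TCO per user: ${tco / users:,.0f}")
print(f"Labor share:  {variable_costs['labor'] / tco:.0%}")  # dominant factor
```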
Relating storage TCO to VaR is complex. The critical task is to determine how VaR responds to a change in TCO. For firms to be fully insured against adverse events, the goal is to uncover the level of TCO at which a marginal increase in storage spending (due to greater use of skilled labor, more frequent backups, increased monitoring, or more fault-tolerant hardware) is matched exactly by a marginal decrease in VaR. As firms increase their storage spending, VaR declines, consistent with faster recovery times and a reduction in the level of risk associated with systems outages. Since storage costs tend to increase in large rather than small dollar increments, the result is a downward-sloping step function linking VaR to TCO, as shown in Figure 2. If TCO is low, reflecting an environment where storage costs have been cut excessively, an increase in TCO will help reduce VaR to more manageable levels. However, consistent with the law of diminishing marginal returns, at some point the marginal benefit from greater spending becomes negligible; beyond this point, spending is simply wasteful. The likelihood of unplanned events means that some risks will always remain, so, in absolute terms, VaR will approach a minimum (non-zero) threshold. Equally, TCO can never fall to zero because of certain fixed costs associated with embedded systems (for example, PC hard disks).
To create this curve in practice, firms begin by plotting VaR (taken from their risk data in Figure 1) against their existing TCO. This yields a single point on the curve. Next, firms engage in a series of "what-if" exercises, asking how VaR might have changed had storage spending been increased or decreased by a certain amount. This may seem hypothetical, but firms may already be doing this exercise when investigating how certain severe outages occurred and how they can be prevented in the future. For example, firms may discover that a certain percentage increase in storage spending would have prevented or limited the most financially punitive outages, essentially reducing VaR. However, to fully appreciate the inverse link between VaR and TCO, it is not enough to ask what happens to VaR if TCO is increased; it is also essential to ask how much higher VaR might be if TCO were reduced. This scenario may seem unusual, but a high TCO could mean that a firm has overinsured itself against very small risks.
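One way to operationalize these what-if exercises is sketched below. Given hypothetical (TCO, VaR) points estimated at increasing spending levels, the code walks the step function and stops where an extra storage dollar no longer removes at least a dollar of VaR, which is the equilibrium the text describes.

```python
# Hypothetical what-if points on the VaR-TCO step function, ordered by
# increasing storage spend: (annual TCO in $, estimated VaR in $).
curve = [
    (1_000_000, 9_000_000),
    (1_250_000, 5_500_000),
    (1_500_000, 3_200_000),
    (1_750_000, 2_900_000),
    (2_000_000, 2_850_000),
]

# Keep spending while each marginal storage dollar removes more than a
# dollar of VaR; stop once diminishing returns set in.
chosen = curve[0]
for prev, nxt in zip(curve, curve[1:]):
    delta_tco = nxt[0] - prev[0]   # marginal cost of the next step
    delta_var = prev[1] - nxt[1]   # marginal reduction in worst-case loss
    if delta_var <= delta_tco:     # further spending is wasteful
        break
    chosen = nxt

print(f"Spend ~${chosen[0]:,} in TCO; residual VaR ~${chosen[1]:,}")
```

With these figures, the firm stops at $1.75M of TCO: the final step costs $250,000 but removes only $50,000 of VaR.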
Even after a firm uncovers an optimal balance between VaR and TCO, that balance can be distorted by changes in information value. ILM recognizes that information can increase and decrease in value over time, often in dramatic fashion. For example, in the airline industry, the value of a passenger manifest falls to zero the instant a flight lands safely. Equally, in the pharmaceutical industry, the value of clinical trial data increases once a drug application moves to the next stage of FDA approval, even though the data itself is unchanged. Consequently, VaR will fluctuate based on where information is in its natural life cycle and how quickly it moves through that life cycle. Similarly, legislation and the threat of lawsuits have altered the value of archival data to the extent that penalties and fines can be imposed if data is lost or unavailable to investigators. Hence, archival data may retroactively increase in value if a firm receives a court order to hand over its data, as happened with Morgan Stanley, where the discovery of 1,600 undocumented backup tapes was seen as evidence of an attempted cover-up [12].
As a consequence of an increase in the value of information, the curve linking VaR and TCO will shift up and out, away from the origin. As seen in Figure 3, if a firm maintains the same level of TCO as before, making no changes to its storage architecture or practices, VaR will increase because of an increase in risk. The probability of a system outage may not have changed, but the cost associated with an outage will rise commensurate with the upward shift in information value. To re-establish equilibrium between VaR and TCO, a firm must either increase spending on its existing systems (for example, by expanding the frequency and scope of data backups or improving service and support) or transfer the information to a safer, more secure set of technologies.
When information falls in value as it nears the end of its useful life, the curve will shift down and in. If a firm maintains the same level of storage spending as before, it will have overinsured itself against relatively minor risks. The firm will still need to protect its data but not with the same level of spending as before. The simplest solution is to reduce TCO by transferring the data to less expensive media. This will allow VaR to increase to the point where VaR and TCO are again in equilibrium.
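A simple way to picture both shifts is to treat a change in information value as a multiplier on the loss attached to each adverse event, which rescales VaR at every TCO level. The sketch below uses hypothetical figures; the multiplier values are assumptions for illustration.

```python
# Hypothetical (TCO, VaR) points from a firm's current curve, in dollars.
curve = [(1_000_000, 9_000_000), (1_500_000, 3_200_000), (2_000_000, 2_850_000)]

def shift_curve(points, value_multiplier):
    """Rescale VaR at every TCO level to reflect a change in information value."""
    return [(tco, var * value_multiplier) for tco, var in points]

# Value doubles (e.g., clinical trial data advancing a stage of FDA approval):
# at unchanged TCO, VaR doubles, so spending must rise to restore equilibrium.
print(shift_curve(curve, 2.0))

# Value halves (e.g., a passenger manifest once the flight lands safely):
# the firm is now overinsured and can migrate the data to cheaper media.
print(shift_curve(curve, 0.5))
```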
Despite its intuitive appeal, ILM remains challenged by the complex nature of information value. At one extreme, all data is valuable when viewed through the eyes of end users who feel their data must be secured at all costs. Rather than haphazardly throwing money at an ever-increasing mountain of data (estimated to be growing by two exabytes annually, where an exabyte is 10^18 bytes, or roughly 400MB for each of the earth's five billion inhabitants [10]), an analysis of VaR can provide an objective overview of different instances when data was unavailable or when users were impacted by system failures.
ILM tries to match storage system capabilities with information value, but because information value resists measurement, erring on the side of increased storage spending has been treated as the lesser of two evils. VaR, meanwhile, can be derived with some degree of accuracy on the basis that it is easier, for instance, to estimate the cost of an hour of CRM downtime than to predict the value of a CRM application over its entire lifespan. Even if VaR and information value correlate, VaR is not a proxy for information value. It is true that some firms may invest in storage systems to improve information accessibility, accuracy, and relevance [5], ultimately seeking to boost sales or to enhance customer service and support [7], but few look to storage systems as a competitive differentiator when the underlying hardware is easily replicable.
Storage remains an unavoidable cost of doing business, and VaR recognizes that storage is meant to protect data from adverse events that could give rise to financially damaging business disruptions or worse [12]. In practice, differences in information value across applications such as payroll (low value), email (mid-level value), and CRM (high value) are managed using different hardware tiers; each tier offers a specific service level matched to the information's value. Thus, high-value data is assigned to a premium tier where TCO and service levels are high, while low-value data is assigned to a basic tier where TCO is lower.
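A tiering policy of this kind reduces to a simple lookup, as in the following sketch. The tier definitions, cost figures, and value thresholds are all hypothetical; a real policy would derive them from the firm's own VaR and TCO analysis.

```python
# Hypothetical storage tiers, each pairing a service level with a TCO profile.
TIERS = {
    "premium":  {"recovery": "synchronous mirroring",   "tco_per_tb": 15_000},
    "standard": {"recovery": "nightly backup to disk",  "tco_per_tb": 5_000},
    "basic":    {"recovery": "weekly backup to tape",   "tco_per_tb": 1_000},
}

def assign_tier(information_value):
    """Map an assumed 0-1 information-value score to a hardware tier."""
    if information_value >= 0.7:
        return "premium"   # e.g., CRM: high value, needs fast recovery
    if information_value >= 0.3:
        return "standard"  # e.g., email: mid-level value
    return "basic"         # e.g., payroll records: low value

print(assign_tier(0.9), assign_tier(0.5), assign_tier(0.1))
```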
To implement ILM is to determine a level of storage spending that fully insures the firm against the consequences of data loss, corruption, or inaccessibility. If storage spending is the premium on an information insurance policy, VaR represents the deductible on that policy. Falling hardware costs have created the false belief that a firm is implementing ILM simply by spending more on storage and saving all of its data. Without some consideration of VaR, however, firms have no appreciation of the risks they face if their data is lost or corrupted. Consequently, VaR can significantly improve the implementation of ILM.
Faced with rapidly expanding mountains of data and new government regulation, IT managers are using ILM to bring order to a storage domain that has previously been ignored because of its low strategic value. If data and information are essential to a firm's strategic positioning, storage must be seen in a new light. In this article, we introduced a framework that enhances ILM by clarifying the relationship between VaR and TCO, a relationship that is often in flux because of the dynamic nature of information value and firms' growing desire to capture data on multiple aspects of their business. By considering this framework and VaR in particular, we argue that IT managers can forge an effective and secure storage environment.
1. Allen, N. Don't waste your storage dollars: What you need to know. Gartner Group Research Report COM-13-1217, 2001.
2. Bernstein, P. Against the Gods: The Remarkable Story of Risk. John Wiley, New York, NY, 1998.
3. Carr, N. IT doesn't matter. Harvard Business Review 81, 5 (2003), 41–49.
4. CIO Insight. Is your IT budget being spent effectively? Feb. 2005, 67–75.
5. Davenport, T. Information Ecology. Oxford University Press, New York, NY, 1997.
6. Gilheany, S. The Decline of Magnetic Disk Storage Cost over the Next 25 Years. Berghell Associates, 2004; www.berghell.com/whitepapers/Storage%20Costs.pdf.
7. Glazer, R. Measuring the value of information: The information-intensive organization. IBM Systems Journal 32, 1 (1993), 99–110.
8. Goodwin, P. Enterprise SAN-attached storage: Market overview. Meta Group Report, 2003.
9. Jorion, P. Value at Risk: The New Benchmark for Managing Financial Risk, Second Edition. McGraw Hill, NY, 2001.
10. Lyman, P. and Varian, H. How Much Information? UC Berkeley, School of Information Management and Systems, 2003; www.sims.berkeley.edu/research/projects/how-much-info-2003/.
11. Mata, F., Fuerst, W., and Barney, J. Information technology and sustained competitive advantage: A resource-based analysis. MIS Quarterly 19, 4 (1995), 487–505.
12. Wall Street Journal. How Morgan Stanley botched a big case by fumbling emails. May 16, 2005.
Figure 1. Probability distribution of adverse storage events.
Table. Understanding information life cycle management (ILM).