Spurred by advances in computing technology and more sophisticated data analysis approaches, addressable advertising promises companies a greater ability to identify potential new customers and market products directly to them [5]. While this approach makes finding and selling to customers more effective, the collection and analysis of consumers' personal information from various electronic databases also raises significant privacy concerns. With improvements in technologies and methods, advertisers are increasingly able to analyze consumer information (such as where a consumer lives, what he or she buys, and even what he or she watches on television) and then to use that information to infer other demographic characteristics about the consumer. Demographics like age, income, and gender are highly valuable indicators of the types of products that might interest a consumer, and they allow companies to market more effectively and efficiently. Nevertheless, they also are highly personal and potentially sensitive attributes that a consumer might not wish to share.
Our research focuses on a specific area of addressable advertising: that is, the use of data mining techniques to generate demographic profiles of television viewers from their viewing behavior. In our research we have developed a data mining application called the Viewer Profiling Module (VPM), which is capable of creating accurate, automated viewer profiles by processing television viewing behavior data through various statistical models and producing predictions about individual viewer characteristics [9]. Notwithstanding its technical success, the capability of the technology to identify the personal characteristics of consumers is in potential conflict with the legitimate privacy rights of individuals. As underscored in a recent article discussing the implications of VPM, separating a technology capable of inferring personal attributes from the societal implications inherent in that capability is essentially impossible [2]. In the context of technology implementation and change management, the broader privacy and ethical issues become as important and difficult as the technical ones, if not more so.
With that in mind, this article discusses the privacy implications of building viewer or consumer profiles through data mining, within the context of VPM and in light of the reaction it already has prompted [2]. Our intent is not to answer all of the questions here, but rather to outline the issues and to propose a framework within which both academics and practitioners can further explore the issue of privacy. Our analysis centers on three essential issues involving technology and privacy: stakeholder perceptions regarding the fairness of the process used by a company for collecting and distributing personal information, including the level of choice provided to individuals regarding whether and to what extent they will provide access to their personal characteristics; stakeholder perceptions regarding the fairness of the outcomes of those processes, including the cost-benefit trade-offs inherent in the exchange of personal information for some real or perceived gain; and stakeholder perceptions regarding the accuracy of inferred personal information, particularly the differential perceptions of consumers and advertisers regarding the impact of accurate versus inaccurate profiles. The analytic capability of VPM adds a layer of complexity to these issues by increasing the ability of a company to gather personal (viewing) information in an exceptionally unobtrusive manner, and then to use that information to infer individual demographic characteristics.
The new threat to privacy begins with the basic technology of the Personal Video Recorder (PVR), which is able to directly monitor and report the viewing choices of individuals. To some extent, this is a reflection of the increased monitoring of individual behavior through a variety of data-gathering means, including point-of-sale devices, online ordering forms, and product registration requests. It also includes less salient technologies such as spyware, which is conceptually similar to the PVR in its PC-based monitoring and reporting capabilities [10]. As we note later, the main difference between the PVR and spyware is that while PVR companies attempt to communicate their data collection and usage procedures in clearly worded privacy policies, spyware often skirts ethical constraints by installing itself on a user's PC with little or no advance warning.
The other major privacy issue in turn involves treatment of the information after it is collected. TiVo, a leading PVR manufacturer and service provider, is selling the viewing behaviors of its customers to Nielsen Media Research, which in turn will use that information to enhance its own collection and analysis of television viewing [6]. The type of information collected can be exceptionally detailed. TiVo was able to determine precisely who watched which commercials during the Super Bowl, the amount of time viewers spent watching commercials, and the number of times a particular viewer might have paused, rewound, and rewatched a particular segment of the game or commercial. Not surprisingly, this simple ability to collect viewing data, combined with the increasing number of PVRs in homes, has raised concerns from privacy advocates [1, 3].
The monitoring-profiling capabilities of the PVR have a direct impact on each of the stakeholders involved in the creation, distribution, and consumption of television advertising. In the domain of PVR-based targeted advertising, the direct stakeholders include television viewers, PVR providers, service providers (for example, cable and satellite companies), content providers (for example, broadcast and cable networks), and advertisers. The impact on these stakeholders can be explored, in turn, in the context of the three privacy issues raised earlier: procedures used to collect and distribute information, the perceived outcomes of those procedures, and the accuracy of inferred personal information. We discuss each of these issues, in part using TiVo as an example when appropriate, and then summarize the issues within a proposed framework for further investigation.
Procedures for Collecting Viewing Choices. Policies and procedures used by companies in the collection and dissemination of private information have become commonplace, motivated in large part by the growing privacy concerns of consumers and privacy advocates. To the extent that privacy policies are intended both to inform and to reassure a company's customers, the development, implementation, and communication of those policies have become vital in ensuring the continued viability of what is broadly considered e-commerce. For these reasons, most e-commerce companies have established privacy policies and communicated those policies in public forums, most commonly on their Web sites [8].
TiVo can be considered an e-commerce company in the sense that it maintains interactive business relationships with consumers through electronic networks. As such, TiVo has in place a highly detailed, well-documented procedure that communicates its intent to keep subscribers' personal viewing choices private, and the technological steps it follows to enforce the policy [11]. Subscribers have options specifically concerning the collection and transmission of their viewing information. They can choose one of three options: opt in, where viewing data and identifiable information are transmitted from the receiver; opt out, where no viewing data is transmitted from the receiver; and opt neutral, where viewing data without identifiable information is transmitted from the receiver. If subscribers do not explicitly choose to opt in, all viewing information received by TiVo's servers is automatically separated from any information that could be used to match it to individual receivers or subscribers. Account information and anonymous viewing data then are stored in separate systems. To further ensure that anonymous viewing data cannot be associated with an individual subscriber, viewing data is randomly transferred to one of a number of servers when it is stored. The file transfer logs are turned off and timestamps are erased from the data every three hours. These measures serve no purpose except to guarantee that anonymous viewing data remains anonymous.
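The three subscriber options can be illustrated with a minimal sketch. This is not TiVo's implementation; the names `Preference`, `IDENTIFYING_FIELDS`, and `prepare_transmission`, along with the record fields, are hypothetical and chosen only to show how each option changes what leaves the receiver.

```python
from enum import Enum
from typing import Optional


class Preference(Enum):
    """The three subscriber options described in the text."""
    OPT_IN = "opt_in"
    OPT_OUT = "opt_out"
    OPT_NEUTRAL = "opt_neutral"


# Hypothetical field names; the source does not say which fields identify a subscriber.
IDENTIFYING_FIELDS = {"subscriber_id", "receiver_id"}


def prepare_transmission(record: dict, preference: Preference) -> Optional[dict]:
    """Return what would be sent for one viewing record, or None if nothing is sent."""
    if preference is Preference.OPT_OUT:
        # Opt out: no viewing data is transmitted at all.
        return None
    if preference is Preference.OPT_IN:
        # Opt in: viewing data and identifiable information are both transmitted.
        return dict(record)
    # Opt neutral: viewing data is transmitted with identifying fields removed.
    return {key: value for key, value in record.items() if key not in IDENTIFYING_FIELDS}


record = {"subscriber_id": "s1", "receiver_id": "r1", "program": "news", "action": "pause"}
for preference in Preference:
    print(preference.name, prepare_transmission(record, preference))
```

The sketch covers only the choice made at the receiver. The server-side measures described above (separate storage systems, random server assignment, and periodic timestamp erasure) are outside its scope.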
From an advertiser's or PVR provider's perspective, the formulation of acceptable privacy policies is in opposition to the company's ability to build individual viewer profiles. Because TiVo's privacy policies do not allow it to match a viewer with his or her viewing choices, those policies, as currently written, do not allow for customized advertising. Therefore, a fundamental question is how and whether a company can formulate privacy policies that allow for individual profiling and customized advertising, while also protecting the privacy rights of its customers.
Another key issue in the formulation and implementation of privacy procedures is the level of trust the subscriber and other stakeholders have in those procedures. While auditable policies and procedures are a staple of successful online companies, the steps a profiling company, in particular, must take to engender trust remain an open question. Procedures might have apparent validity, but if they are not followed correctly, they might not produce acceptable outcomes. Considering media reports of corporate malfeasance, it is not surprising that some stakeholders might not trust TiVo to behave in a manner consistent with its published policy.
Potential Outcomes from Collecting Viewing Choices. An assessment of outcomes essentially is a cost-benefit calculation. In this regard, it is difficult to assign a monetary value to privacy, primarily because different individuals tend to place different values on their privacy; those values are inherently subjective, and the level of privacy infringement is often difficult to ascertain in the short term. Nevertheless, a recent report from Forrester Research suggests that people are willing to give up some privacy in exchange for something of value, which can be represented as monetary compensation, product discounts, increased convenience, and other tangible or intangible benefits [7]. For example, companies routinely capture individual consumer purchasing transactions involving frequent shopper cards, but that has not stopped consumers from using the cards in increasing numbers. AC Nielsen reports the number of consumers participating in a frequent shopper program more than doubled between 1996 and 2003, and now 80% of all consumers participate in at least one program [4]. Assuming they recognize the privacy implications, users of such cards clearly believe the cost of losing some privacy is outweighed by the lower monetary cost of using the card.
To a prospective PVR customer, any forfeiture of privacy is an intangible cost. It is difficult not only to detect whether a privacy infringement is present (or to verify its absence) but also, as noted, to determine the magnitude of the privacy violation. There is, for example, the question of whether the loss of privacy due to explicitly provided personal information is more or less costly than the loss of privacy due to implicit monitoring of viewing behaviors. Similarly, does the lack of certainty in profiling an individual (or more precisely, the magnitude of the uncertainty) affect the magnitude of the privacy violation and therefore its perceived cost to the viewer?
Similarly, while the benefits of providing personal information can be stated tangibly and quantitatively (for example, as a specific monetary payment to the individual), the advertiser and service provider can also argue that the viewer receives certain intangible benefits. Intangible benefits might include receiving more accurate viewing recommendations, based on the individual's inferred characteristics, as well as advertising that is more relevant, and therefore more interesting, to the individual.
Nevertheless, the cost-benefit argument in this domain is made somewhat more difficult both by the salience of the privacy infringement (that is, the extent to which viewers are made aware their viewing behaviors are being monitored) and by the apparent value that people place on their viewing choices (a private behavior), relative to their buying choices (a public behavior). For example, the same Forrester survey indicates that television viewers are much less willing to provide personal viewing information in exchange for something less tangible [7]. Specifically, only a quarter of all viewers in that survey stated they would provide viewing information in exchange for simple program recommendations.
Accuracy of Inferred Personal Information. The introduction of data mining techniques that go beyond simple viewing choices to infer the fundamental demographic characteristics of viewers brings new issues into the privacy debate. The most salient issue is the introduction of the demographic/psychographic profile itself. Regardless of its relative sensitivity, a profile derived from one or more statistical models, while clearly better than random choice, nevertheless is inherently error-prone. Consequently, even when a profile of a viewer can be developed, it is by no means certain the profile of that individual will reflect reality. If it does, the viewer's true characteristics could be communicated to third parties. If it does not, a false impression of the person could be disseminated.
Service providers and advertisers have a financial stake in seeking accurate viewer profiles. As profiles become more accurate, targeted advertisements can be delivered to fewer households in order to achieve a particular delivery threshold, thus minimizing opportunity costs. For example, if an advertiser wishes to send ads to a targeted group of 100 male teenagers in affluent households, and if the profiling model is 70% accurate in classifying members of this group, the PVR or service provider would have to create a pool of 143 households that it identifies as affluent and containing a male teenager in order to satisfy the advertiser (that is, 100/0.70 ≈ 143, rounded up). Thus, the higher the error rate, the more households to which ads must be served in order to satisfy the advertiser.
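The arithmetic in this example generalizes to any target size and accuracy level. The following is a minimal sketch, assuming that "accuracy" means the fraction of households placed in the pool that truly belong to the target group; the function name `required_pool_size` is illustrative and does not come from VPM.

```python
import math


def required_pool_size(target_households: int, accuracy: float) -> int:
    """Smallest pool whose expected number of correctly classified
    households reaches the advertiser's target."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be greater than 0 and at most 1")
    # Round up: a fractional household cannot be served an ad.
    return math.ceil(target_households / accuracy)


# Pool sizes needed to reach 100 target households at several accuracy levels.
for accuracy in (0.5, 0.7, 0.9, 1.0):
    print(f"accuracy={accuracy:.0%}: pool of {required_pool_size(100, accuracy)} households")
```

At 70% accuracy the function returns 143, matching the example in the text; at 50% accuracy the required pool doubles to 200 households.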
On the other hand, a viewer might react negatively regardless of how accurate or inaccurate the profile might be, depending on issues such as the content of the profile and/or the (actual) characteristics of the individual. The viewer might also attempt to alter his or her profile in order to correct what is perceived as an inaccurate or unflattering inference. Anecdotal evidence suggests that individuals will attempt to modify information about themselves when that information is incorrect, even when they have no direct way of doing so [12]. The viewer's perception of the privacy violation and the provider's attempt to assuage or compensate the viewer are likely to impact the viewer's reaction to the advertiser and its product.
The issues discussed here suggest a general framework within which researchers, business people, and privacy advocates can explore the privacy implications of consumer profiling relative to the primary stakeholders mentioned earlier. Here, we list each of the three privacy issues and propose areas of investigation related to each issue, all within the context of consumer profiling.
Implications of policies and procedures. Areas of investigation would include:
Implications of procedural outcomes. Areas of investigation would include:
Implications of profiling accuracy. Areas of investigation would include:
A detailed exploration of the ways in which data mining technologies can be used to collect, analyze, and redistribute data is important not only because of the opportunities to enhance marketing efforts, but also because it sheds light on how consumers and society will react to the technologies, either positively or negatively. From a purely practical perspective, a negative reaction could cause consumers to turn away from the technology, the product, and the company, thus counteracting any marketing improvements delivered by the technology. In that regard, knowledge of consumer and societal perceptions of privacy infringements is as important as knowledge of individual consumer demographics and buying habits. With that knowledge, companies can take measures to anticipate and prevent violations, and to compensate consumers in an appropriate manner when a violation of privacy is considered "the cost of doing business."
1. Abreu, E. Is TiVo snooping on couch potatoes? PC World, 2001.
2. Bass, A. You are what you watch. CIO Magazine, 2004.
3. Charney, B. TiVo watchers uneasy after post-Super Bowl reports. CNET News, 2004.
4. Chesak, J. and Dippold, J. Frequent Doesn't Mean Loyal: Using Segmentation Marketing to Build Shopper Loyalty. AC Nielsen, 2004.
5. Cohen, D. Addressable TV: Myths and realities. ClickZ Marketing, 2004.
6. Nielsen Media Research. TiVo and Nielsen Media Research agree to market DVR usage information. Press release, 2004.
7. Overby, C. Forrester: Consumers want value. RFID Journal, 2004.
8. Ryker, R., Lafleur, E., Cox, C., and McManis, B. Online privacy policies: An assessment of the Fortune e-50. J. Computer Information Systems 42, 4 (2002), 15–20.
9. Spangler, W.E., Gal-Or, M., and May, J.H. Using data mining to profile television viewers in the digital TV era. Commun. ACM 46, 4 (Apr. 2003), 66–72.
10. Stafford, T.F. and Urbaczewski, A. Spyware: The ghost in the machine. Commun. AIS 14 (2004), 291–306.
11. TiVo. TiVo Privacy Policy, 2004.
12. Zaslow, J. If TiVo thinks you are gay, here's how to set it straight. Wall Street Journal (Nov. 26, 2002).
©2006 ACM 0001-0782/06/0500 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.