
Communications of the ACM

Personal information management

Evaluating Personal Information Management Behaviors and Tools


The nature of personal information management (PIM) implies personalized approaches to understanding people's PIM behaviors and evaluating the tools designed to support them. People create and access their personal information collections over long periods of time, executing a variety of information management tasks and exhibiting a range of behaviors that are often unique to their collections, tools, environments, preferences, and contexts. Walk down any office hallway and you'll see as many PIM variations as there are people in the offices. Some have stacks of paper covering their desks, filing cabinets, and bookcases; others have bare desktops with, perhaps, a single stack of neatly ordered documents and file folders; still others file their documents away in hanging folders and shelve their books alphabetically by author's last name, as if in a library. Ask these people to retrieve any of these items, and their behaviors are likely to be as different as their organizing strategies. How did they come to have such different behaviors in such similar circumstances? Are any two people's PIM practices alike? What can we learn from this variation in order to design future PIM software?

While we may all share some high-level PIM behavior (such as organizing and retrieving), at the operational level PIM appears to be as unique as we are. Thus, using one-size-fits-all evaluation methods and tools is likely to be a less than ideal strategy for studying something as seemingly idiosyncratic as PIM. Studying PIM behavior and evaluating support tools (such as email and search) present many methodological challenges; here, I explore some potentially useful evaluation approaches and directions, aiming to increase awareness of the inherent difficulty of studying PIM while encouraging researchers to be thoughtful, systematic, and innovative.

Several methodological challenges must be addressed when designing studies of PIM behaviors and tools. PIM is an ongoing activity often done in anticipation of future actions (such as refinding information objects) or expected uses (such as sharing information objects). Because basic PIM events often occur at unpredictable times, it is difficult to schedule them as part of an experimental protocol. PIM also encompasses a range of activities and tools; understanding just one of them provides only a partial picture of what users want to accomplish and how they might accomplish it. Ideally, people should be observed in their natural environments at home, at work, and in between as they engage in PIM behavior in real time, with researchers recording both the process and the consequences of that behavior [3]. These challenges also make it necessary to leverage laboratory studies of behaviors and tools to learn more about general PIM behavior [4]. Only through an iterative combination of such studies will we be able to understand PIM behavior, construct tools to support it, and perform valid and reliable evaluations.
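To make such real-time recording concrete, here is a minimal sketch, in Python, of the kind of in-situ event logger a field study might deploy. The event vocabulary, field names, and JSON-lines format are illustrative assumptions, not an established instrument.

```python
import json
import time
from pathlib import Path

# Illustrative event vocabulary; a real study would derive its own
# coding scheme from pilot observations.
PIM_EVENTS = {"create", "file", "search", "refind", "share", "delete"}

class PIMEventLogger:
    """Append-only, timestamped log of PIM events as they occur in situ."""

    def __init__(self, log_path: str, participant_id: str):
        self.log_path = Path(log_path)
        self.participant_id = participant_id

    def record(self, event: str, obj_id: str, **context):
        """Record one PIM event together with its real-time context."""
        if event not in PIM_EVENTS:
            raise ValueError(f"unknown event type: {event}")
        entry = {
            "participant": self.participant_id,
            "event": event,
            "object": obj_id,           # e.g., a file path or message id
            "timestamp": time.time(),   # when it happened, not when recalled
            "context": context,         # task, tool in use, location, ...
        }
        with self.log_path.open("a") as f:
            f.write(json.dumps(entry) + "\n")

# Example: a participant refiles a document while preparing a report.
logger = PIMEventLogger("p01_events.jsonl", participant_id="p01")
logger.record("file", "reports/q3-draft.doc", task="quarterly report",
              tool="file manager", location="office")
```

Logging events as they happen, rather than asking participants to recall them later, is one way to cope with the unpredictability noted above.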


Approaches to Studying PIM

A variety of approaches is available for studying PIM in real-life settings. Naturalistic approaches, including fieldwork [2, 6] and ethnography, are appropriate because they allow people to perform PIM behaviors in familiar environments with familiar tools and collections. Case studies, which focus attention on a particular user or a small number of users, are also valuable for studying PIM behavior; their findings can motivate further studies in which specific hypotheses are posed.

Longitudinal approaches allow us to capture data over an extended period and take measurements of behavior at fixed points in time; they can then be combined with naturalistic inquiries and case studies [1, 5]. One challenge in longitudinal approaches is determining an appropriate interval for taking measurements and the length of time for conducting the study. When you measure is as important as what you measure, and PIM behavior can vary based on what people are trying to accomplish at a given moment. For instance, external events (such as holidays) often interrupt or change a person's activities and behavior, possibly affecting the kinds of PIM they perform.

While the data we gather using these approaches can be rich, vast, and varied, such intensive and personalized approaches to data collection are not without limits. In-depth studies of small sets of participants over time are expensive, and results may not generalize to larger populations. Personalized approaches to evaluation involve other costs, too, some apparent during data analysis. Depending on the number of participants, we might end up customizing n different instruments, conducting n different statistical tests, or creating n different behavioral models. Results can be unruly and difficult to describe concisely. Thus, it is important to follow up these approaches with laboratory studies to refine, explore, and expand on the findings and contribute to the development of general theories of PIM behavior.

Since the early 1990s, the Text Retrieval Conference (TREC), sponsored by the National Institute of Standards and Technology (trec.nist.org), has produced sharable, reusable data sets, tasks, and metrics for those conducting research in information retrieval. It is worth considering how sharable, reusable collections might likewise facilitate PIM research. A shared collection for PIM research would necessarily differ from a TREC collection, since it makes little sense to assign PIM tasks to users or to ask them to conduct their PIM tasks on collections containing someone else's information objects, or collections about which they know little (or nothing).

Rather, a shared collection for PIM research might consist of information objects annotated in real time with metadata describing the original users, tasks, situations, and behaviors. While building good collections is difficult, labor-intensive, and time-consuming, doing so has the potential to accelerate progress in designing PIM technology. For instance, sharable collections allow for multiple modes of inquiry (such as comparing PIM techniques, examining alternative hypotheses, and replicating previous findings).
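What such annotation might look like in practice is an open design question. As a rough sketch, assuming one record per information object, the Python fragment below uses field names (original_user, task, situation, behaviors) invented for illustration; an actual shared collection would need a community-agreed schema.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotatedObject:
    """One information object in a hypothetical shared PIM collection,
    annotated in real time with the context of its original use."""
    object_id: str            # e.g., a path, URL, or message id
    content_type: str         # email, document, bookmark, ...
    original_user: str        # anonymized id of the person it belonged to
    task: str                 # what that user was trying to accomplish
    situation: str            # setting: at work, at home, traveling, ...
    behaviors: list = field(default_factory=list)  # observed PIM events

doc = AnnotatedObject(
    object_id="msg-4871",
    content_type="email",
    original_user="p01",
    task="schedule project meeting",
    situation="at work, mid-morning",
    behaviors=["read", "filed to 'projects'", "refound 3 days later"],
)
```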

Developing a common set of sharable tasks also has the potential to facilitate PIM research. Creating and using sharable tasks in human-computer interaction research may lead to more holistic, incremental, and generalizable findings [6]. Identifying tasks for PIM research is inherently difficult because PIM involves tasks that can be defined at many levels of specificity. For instance, "doing email" might be subdivided into at least four separate tasks: searching for a specific piece of email; managing and filing email; setting up and accessing an address book; and reading email. It is difficult to create well-defined tasks each time one is interested in conducting a new study of a particular user population or use environment. Moreover, without some similarity in tasks across studies, it is difficult to compare results and develop theories of PIM behavior.
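One way to pin down levels of specificity is to represent shared tasks as an explicit hierarchy. The sketch below encodes the "doing email" decomposition from the preceding paragraph; the structure and helper function are illustrative assumptions, not an established task taxonomy.

```python
# A hypothetical shared task hierarchy; keys are task names, values are
# subtasks. Only the "doing email" decomposition comes from the text.
SHARED_TASKS = {
    "doing email": {
        "searching for a specific piece of email": {},
        "managing and filing email": {},
        "setting up and accessing an address book": {},
        "reading email": {},
    },
}

def leaf_tasks(tasks, prefix=()):
    """Enumerate the most specific tasks, with their paths from the root."""
    for name, subtasks in tasks.items():
        path = prefix + (name,)
        if subtasks:
            yield from leaf_tasks(subtasks, path)
        else:
            yield path

for path in leaf_tasks(SHARED_TASKS):
    print(" > ".join(path))
```

A hierarchy like this would let two studies state precisely which level of a task they examined, making their results easier to compare.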

Developing valid and reliable metrics for the study of PIM behavior and evaluation of PIM tools is an important area that needs attention, since such development is often ad hoc. Rigorous scientific procedures—where both conceptual and operational definitions are provided and justified and where a variety of techniques is used to establish the validity and reliability of measures—are rarely used. Consequently, research and theory concerning PIM behavior and tools have been stymied, since it's difficult to accumulate, compare, and integrate results across studies.
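As one concrete illustration of an operational definition, the sketch below computes two hypothetical measures, refinding success rate and mean time to refind, over event logs like those sketched earlier. Both the definitions and the field names are assumptions made for illustration, not validated metrics.

```python
def refinding_metrics(events):
    """Compute two operational measures from a time-ordered list of
    logged events, each a dict with 'event', 'object', and 'timestamp'.

    - success_rate: fraction of searches followed by a refind of the
      same object
    - mean_time_to_refind: average seconds from search to refind
    """
    searches, times = 0, []
    for i, e in enumerate(events):
        if e["event"] != "search":
            continue
        searches += 1
        for later in events[i + 1:]:
            if later["event"] == "refind" and later["object"] == e["object"]:
                times.append(later["timestamp"] - e["timestamp"])
                break
    if searches == 0:
        return {"success_rate": None, "mean_time_to_refind": None}
    return {
        "success_rate": len(times) / searches,
        "mean_time_to_refind": sum(times) / len(times) if times else None,
    }
```

Making definitions explicit in this way is what allows their validity and reliability to be examined and their results to be compared across studies.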




To progress, PIM research must employ a variety of approaches and strive to support theoretical, experimental, and practical developments. As PIM gadgets proliferate, we must develop evaluation methods and metrics that produce valid, generalizable, sharable knowledge about how users go about the PIM activities and interactions of their daily lives.


References

1. Boardman, R. and Sasse, M. Stuff goes into the computer and doesn't come out: A cross-tool study of personal information management. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2004) (Vienna, Austria, Apr. 24–29). ACM Press, New York, 2004, 583–590.

2. Czerwinski, M., Horvitz, E., and Wilhite, S. A diary study of task switching and interruptions. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2004) (Vienna, Austria, Apr. 24–29). ACM Press, New York, 2004, 175–182.

3. Dumais, S., Cutrell, E., Cadiz, J., Jancke, G., Sarin, R., and Robbins, D. Stuff I've Seen: A system for personal information retrieval and re-use. In Proceedings of the 26th Annual ACM International Conference on Research and Development in Information Retrieval (Toronto, July 28–Aug. 1). ACM Press, New York, 2003, 72–79.

4. Jones, W. and Dumais, S. The spatial metaphor for user interfaces: Experimental tests of reference by location versus name. ACM Transactions on Information Systems 4, 1 (Jan. 1986), 42–63.

5. Kelly, D. and Belkin, N. Display time as implicit feedback: Understanding task effects. In Proceedings of the 27th Annual ACM International Conference on Research and Development in Information Retrieval (Sheffield, U.K., July 25–29). ACM Press, New York, 2004, 377–384.

6. Whittaker, S., Terveen, L., and Nardi, B. Let's stop pushing the envelope and start addressing it: A reference task agenda for HCI. Human-Computer Interaction 15, 2–3 (Sept. 2000), 75–106.


Author

Diane Kelly ([email protected]) is an assistant professor in the School of Information and Library Science at the University of North Carolina, Chapel Hill, NC.


©2006 ACM  0001-0782/06/0100  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.



 
