Why do end users need to program? Anyone can become overwhelmed trying to cope with information in a world flooded by information. In light of the ubiquity of computer networks, the information challenge for end users is no longer about accessing information but about processing it. The direct-manipulation paradigm, popularized in the 1980s, begins to break down when they are unable to directly manipulate the sources or their information, such as the location of their files on a haed drive or the emails in their in-boxes.
End-user programming is emerging as a crucial instrument in the daily information-processing struggle [8]. Such programming allows customized personal information processing. But few end users have the background, motivation, or time to use traditional programming approaches or the means to hire professional programmers to create programs for them. Simple forms of end-user programming include email filters to clean up email by directing specified emails into separate folders and spreadsheets to explore, say, the total cost of a new house.
Programming by example (PBE) is a powerful end-user-programming paradigm enabling users without formal training in programming to create sophisticated programs. PBE environments create programs for end users by observing and recording software behaviors as they manipulate information on the graphical user interface (GUI) level. For example, in Microsoft Word, PBE mechanisms allow users to build macros.
Here, we explore the problem of PBE reuse, or how users can reuse old program behavior. A user might find a PBE-generated program useful but may need to either generalize or modify it for a related but different task. Program reuse is a well-known software design problem. Reuse problems hamper the productivity of a wide range of software projectsfrom individual end-user programmers to large distributed teams of software developers. In the PBE context, reuse poses even more complex issues. While PBE initially shields end users from having to deal with programming issues, reuse forces them to leap cognitively between two levels of representation:
The PBE representation chasm mirrors the difficulty users have comprehending the mapping between the GUI level and the program level. In reuse, this chasm is especially problematic, because users are required to have at least a minimal understanding of the program-level representationsif they want to be able to adapt these representations to their needs. For end users with no previous exposure to programming in Visual Basic or even to programming in general, this transition may be too complex and leave them frustrated. Frustrated, they may give up on the idea of end-user programming and continue to solve their original problems manually over and over again for years.
Analogies are powerful cognitive mechanisms people use to construct new knowledge from knowledge they've already acquired and understand. When analogies are used with PBE, the result is a new end-user programming paradigm combining the elegance of PBE to create programs and the power of analogies to reuse existing programs. We call the combination of PBE and analogies "programming by analogous examples." Analogies are an effective representation level bridging the GUI and program levels.
Two detailed examples help us explore the reuse problem. In the first, we created a Word macro for repetitive reformatting in order to understand macro reuse. We chose this particular example not because the Word macro recording mechanism is the most sophisticated PBE system, but because Word has an enormous user base, and readers may be able to relate more readily to use and reuse issues in the Word context. In the second example, we built a SimCity-like simulation using the AgentSheets simulation authoring tool [11] from AgentSheets, Inc., of Boulder, Colo.; this tool provides an end-user programming approach considerably more accessible than Visual Basic. We've found that even when the program level is relatively accessible, analogies are still needed as a PBE reuse mechanism.
As PBE techniques become more widely adopted in both commercial and research systems, end users incrementally construct and edit new behavior and commands as needed. As in conventional programming approaches, the effective reuse of new behavior is limited by the underlying representations. These representations run the risk of presenting end-user programmers with the same copy-paste-modify and inherit-reuse problems that professional programmers have struggled with for years. What makes reuse even more problematic in a PBE situation is that the programs that need to be modified for reuse are machine generated, typically containing little information as to how they might be changed.
As in conventional programming approaches, the effective reuse of new behavior is limited by the underlying representations.
The reuse scenario in Figure 1 involves an end user trying to automatically reformat an address list. Word and other applications allow for modular modification through a macro mechanism. Users are given an interface to program macros by PBE through a menu command called "Record New Macro." Additional support is given by the macro-editing toolbar and menu functions. Once a macro is named and assigned to a toolbar icon or to a keyboard command, it is created simply by performing the desired taskrepetitive reformatting in our example. The system records keystrokes and compiles the command into Visual Basic. The command can then be invoked by anyone editing a document.
End users want to modify regularly and, hence, reuse a macro with additional, related actions, then assign the entire new set of changes to the same keyboard command. Mechanisms are built into Word for renaming the macro, thus copying the same behavior, but any modification to an existing macro must be done through the Visual Basic programming language. For instance, how would a user have to modify the macro to reformat not just the Fax:
but also the Address:
fields in Figure 1. To modify this macro through the "Record" function, a user would have to overwrite the old macro, go through the same actions that were recorded earlier, and then record the desired additional behavior.
To be able to reuse existing macros, making them useful in more general situations, end users have to drop from the GUI level (Figure 1, left) to the program level (Figure 1, right). According to Microsoft's own Word documentation: "Recorded macros are great when you want to perform exactly the same task every time you run the macro. But what if you want to automate a task in which actions vary with the situation, or depend on user input ... ? To create powerful automations, you should learn to program in Visual Basic for Applications."
There is a large cognitive chasm between the two levels. The GUI one provides only limited options for reuse. Even though advanced reuse requires making a transition to the program level, no intermediate steps are provided to take the user from the simple recording mechanism to the Visual Basic programming language. At the GUI level, users are limited in reuse to copying macros as black boxes into other documents or to making macros generally available to all documents. At the program level, code may be copied, pasted, and modified with the same difficulty as these commands are invoked in any other object-oriented program. At this point, the user has jumped from the comfortable environment of direct manipulation at the GUI level directly into the uncomfortable or downright frustrating program level.
The AgentSheets simulation authoring tool helps explain how analogies in a PBE system can bridge the chasm between the GUI level and the program level. In AgentSheets, end users can build SimCity-like simulations and export them as interactive Java applets. (For a detailed description of AgentSheets, see www.agentsheets.com/userforum-publications. html.)
AgentSheets combines PBE with graphical rewrite rules into an end-user programming paradigm. Graphical rewrite rules (GRRs), which are also used in such systems as Cocoa/KidSim and Creator, are powerful languages for expressing the concept of change in a visual representation. GRRs declaratively describe spatial transformations with a sequence of two or more dimensional situations containing objects. Situations can be interpreted with respect to objects contained in them and the spatial relationships among these objects. The differences among situations imply one or more actions capable of transforming one situation into another. In AgentSheets, any number of GRRs can be aggregated to create complex behaviors for agents; these agents and their behaviors combine and interact to simulate any scenario a user can imaginefrom heat diffusion to urban planning. We have found it likely that behaviors created in one simulation are valuable to users building other simulations (see Figure 2).
Compared to the GUI-program chasm in Word, the GUI-program chasm in AgentSheets is much less daunting for end users. The GRR representation at the program level closely matches the representation at the GUI level. For instance, the user-produced train and train-track icons at the GUI level are also found at the program level. This representational mapping helps end users comprehend programs. But despite the closer match, it is still necessary for any end-user-programmable application to have a mechanism for dealing with reuse.
In an application called Sustainopolis used by urban planners to explore public transportation issues, an end user would want to incorporate cars and streets. Noticing that trains and tracks are already programmed and realizing that cars move on streets much like trains move on tracks, the user would want to reuse the "move" behavior already written for the train object and attach it to the car object.
One approach would be for the user to start from scratch and simply demonstrate all the examples of car and street interactions. Unfortunately, the set of rules attached to the train is quite large. The problem is that even for a relatively simple behavior, such as making the train follow the track, rules have to be demonstrated to specify all the meaningful combinations of interactions between trains and tracks. Trains are capable of heading in four different directionsnorth, south, east, and west. A complete specification accounting for 15 different train-track piecesstraight; four orientations of curves; four different T intersections and one crossing; and four orientations of cul-de-sacsrequires 256 rules.
These rules specify behavior in various situations, including: What if the train drives into a cul-de-sac? and Which way should the train go if it has a choice of going left or right? If the tracks are arranged carefully in the simulation, a good number of rules can be eliminated, but the point remains that it must have taken quite some time and patience to demonstrate the complete set of rules. Given that this behavior exists already in our scenario, it would be preferable to find a way to avoid having to demonstrate the entire set of rules again to specify the conceptually similar interaction between cars and streets.
A different approach is to copy all the train's rules into the car and manually edit them by substituting corresponding icons. In most GRR-based systems, the user would then copy all the rules from trains and tracks to streets and cars, then simply substitute cars for trains and streets for tracks. But transferring rules from one application scenario to another is basically the same arduous, error-prone process a programmer goes through to reuse a program or a Word user goes through to reuse a macro. So, is the user's time being saved? Without understanding what the user intends, how can the system facilitate the reuse process? We have found that the mechanism allowing an end user to program allows reuse as well.
Programming by analogous examples supports the reuse of previously recorded example programs. Instead of creating car behavior from scratcheither through demonstration or through manual editingusers generate a complete set of rules by specifying an analogy.
It is the relationship between trains and tracks that the user wishes to reuse for cars and streets. In contrast to object-oriented programming in which inheritance is used to define an ontology of object classes, analogies are used to define an ontology of object relationships represented by common active verbs, such as "move" (see Figure 3).
The user defines analogies through the analogy dialogue box (see Figure 4). The result is that the behavior programmed through GRRs for trains moving on tracks is transferred to cars moving on streets. The verb "moves" is differentiated for trains and cars, so trains do not move on streets, nor do cars move on tracks, because no relationship is implied between trains and streets, or between cars and tracks.
Here, reuse is expressed in terms of existing objects, and abstractions are not required. The result of the analogy is a new set of GRRs attached to cars, allowing them to move on streets. All the rules generated by analogy are like ordinary GRRs in that the end user edits them to deal with exceptions to the analogy.
Fortunately, PBE systems often feature representations thatthrough the use of analogiesserve as a bridge between the GUI level and the program level. In the train-car analogy, the end user created new behavior out of existing behavior. This method employs reuse-enhanced programming in which an initial set of programs is created through PBE and then extended through analogies.
At what depth does an analogy mechanism need to "understand" a program to create an analogous program? Mere syntactic symbol substitution is insufficient for the transformation of nontrivial programs. A deep understanding of the program semantic is not only an extraordinarily difficult and unsolved AI problem, but developing analogies would probably require users to annotate programs extensively with semantic information.
In the analogy specified in Figure 3, the four icons are from a much larger set of icons representing cars and trains with different headings, as well as streets and tracks representing different topologies. A mechanism that would merely substitute the car and street icons for the train and track icons in order to create a set of analogous rewrite rules would not be very useful; it would match only a limited subset of all relevant rules. How does the analogy system establish the more general mapping among all these different but related items?
Analogy mechanisms cannot stop at syntactic substitution but need access to some minimal semantic information in order to establish more general correspondence. It is no trivial matter for a computer system to substitute correctly at all the necessary levels to render the resulting code usable without further editing by a human. More than 10 years ago, Clayton Lewis at the University of Colorado in Boulder proposed a concept he called "pupstitution" to decrease the brittleness inherent in straight copy-and-paste techniques [5].
In the AgentSheets substrate, "systematicity," or correspondence, in analogy is facilitated through structural and behavioral similarity between the base and target objects [9]. Analogies are made by specifying which relationship should be transferred. For example, cars and trains are objects that can be as simple or as complex as a user wants and can include a single applicable behavior or many behaviors. However, the user has to specify that cars move like trains to transfer this specific behavior. And for the transfer to occur with a high degree of systematicity (and effectiveness to the user), the street and track objects must be structurally similar.
AgentSheets assures this high-fidelity transfer with connectivity information. From the base icons created for the urban-planning icon gallery, the system applies syntactic transformations at the user's request, automatically creating meaningful variations to illustrate intersections and curves, as well as directional orientation [10]. Syntactic transformations create the visual variations necessary for placing the agents on the AgentSheets worksheet and saves the user hours of work on icon depictions. Icons are then annotated by the user with semantic information, such as connectivity (see Figure 5). Connectivity defines the input and output ports for each icon. For instance, a horizontal street segment icon has inputs/outputs to its left and right. Transforming the annotated track creates an entire family of track icons that act essentially as throughputs for objects moving on them.
The system uses an icon's semantic and syntactic connectivity information to support pupstitution in programming by analogous examples. By specifying the connectivity of an icon, then transforming it, the "move" GRR acquires a useful dimension of complexity. After the analogy is established, cars have a rule set that is isomorphic to the train rule set, that is, cars know how to move on all kinds of streets. The system uses the semantic information available to ensure high systematicity between the base and the target; thus, the correct mapping between tracks and streets is established without the end user's intervention.
The general insight we derived is that a little semantics is necessary for creating meaningful analogies. This means that end users have to provide some additional up-front information to annotate their designs with minimalist semantics. It also means that this kind of semantic annotation dramatically improves the reusability of behavior. For AgentSheets, users need to provide only semantic information to the base agents, specifying, say, the connectivity of a street agent they have already defined. AgentSheets can automatically transform agents representing flow conductors, including streets, wires, and pipessyntactically (by applying geometric transformations to the agent bitmap) and semantically (by applying topological transformations to the agent's connectivity).
What makes reuse probematic in a PBE situation is the programs that need to be modified for reuse are machine generated.
In object-oriented programming, in order to facilitate reuse, inheritance can serve the role of the reuse mechanism in PBE systems. An object subclass not only inherits the behavior of the superclass but can overwrite and extend that behavior. How would inheritance work in our trains and cars example? One easy way to get analogous behavior is to make "car" a type of, or subclass of, "train," and "street" a subclass of "track." While this approach produces the desired behavior (due to inheritance), it is ontologically unsound, and changing the object hierarchy in this way produces a misleading model based on weak design.
To correct this problem, system developers expect end users to be a bit more like programmers. Therefore, in the class hierarchy, cars become siblings of trains and streets siblings of tracks through two abstract superclasses"moving-object" and "movement-guiding-object."
For the user to get the trains and cars to move, a generalized GRR has to be expressed in terms of "moving object" and "movement-guiding-object." However, while this new class hierarchy may be ontologically sound, it also introduces a couple of problems:
In PBE systems, the representation end users see at the GUI level may differ greatly from the representation at the program level. In these systems, special representations are often used to ease the end user's transition between the two levels. These representational features enable programming by analogous examples, simplifying program reuse.
Our work with programming by analogous examples is at an early but promising stage. The combination of some semantic information with structural information has allowed the reuse of complex behaviors in the context of interactive simulations. The more similarity there is between behavior the user has created and wants to reuse and the target behavior, the better is the analogy and the less likely the need for editing. We seek to increase differentiation by end users trying to determine appropriate analogical matches. For instance, cars moving on streets and electricity moving through wires share similarities, but electricity can go in both directions at an intersection. Therefore, an analogy between these two classes of object would produce crash-prone cars. We'll soon also explore the use of programming by analogous examples in nonsimulation application domains.
1. Chee, Y. Applying Gentner's Theory of Analogy to the teaching of computer programming. Int. J. Man Mach. Stud. 38, 3 (Mar. 1993), 347368.
2. Cypher, A. and Smith, D. KidSim: End-user programming of simulations. In Proceedings of the 1995 Conference of Human Factors in Computing Systems (Denver, Colo., May 711). ACM Press, New York, 1995, pp. 351358.
3. Goldstone, R., Medin, D., and Gentner, D. Relational similarity and the nonindependence of features in similarity judgments. Cognit. Psych. 23 (1991), 222262.
4. Lange, B. Some strategies of reuse in an object-oriented programming environment. In Proceedings of CHI'89 (Houston, Tex.). ACM Press, 1989, pp. 6973.
5. Lewis, C. Some Learnability Results for Analogical Generalization. Tech. Report CU-CS-384-88, University of Colorado, Computer Science Dept., 1988.
6. Lieberman, H. An example-based environment for beginning programmers. In Artificial Intelligence and Education, R. Lawler and M. Yazdani, Eds. Ablex Publishing, Norwood, N.J., 1987, pp. 135151.
7. Medin, D., Goldstone, R., and Gentner, D. Respects for similarity. Psych. Rev. 100, 2 (1993), 254278.
8. Nardi, B. A Small Matter of Programming. MIT Press, Cambridge, Mass., 1993.
9. Perrone, C. and Repenning, A. Graphical rewrite rule analogies: Avoiding the inherit or copy & paste reuse dilemma. In Proceedings of the 1998 IEEE Symposium of Visual Languages, Nova Scotia, Canada, Sept. 14). IEEE Computer Society Press, Los Alamitos, Calif., 1998, pp. 4046.
10. Repenning, A. Bending the rules: Steps toward semantically enriched graphical rewrite rules. In Proceedings of Visual Languages (Darmstadt, Germany, Sept. 59). IEEE Computer Society Press, Los Alamitos, Calif., 1995, pp. 226233.
11. Repenning, A. and Sumner, T. Agentsheets: A medium for creating domain-oriented visual languages. IEEE Comput. 28, 3 (Mar. 1995), 1725.
12. Roschelle, J., DiGiano, C., Koutlis, M., and Repenning, A. Developing educational software components. IEEE Comput. 32, 9 (Sept. 1999), 5058.
This research is supported by the National Science Foundation under grants: DMI-9761360, RED 925-3425, Supplement to RED 925-3425, REC 9804930, REC-9631396, and CDA-940860.
Figure 1. Creating a PBE application to format all occurrences of the "Fax:" into "Fax Number:" The GUI level (left) shows a document edited with a find-and-replace dialogue box; the program level (right) shows a Visual Basic editor.
Figure 2. Programming a train to follow a track by example in AgentSheets. When the user moves a train on a track at the GUI level (left) AgentSheets records the movement (including context) and represents it at the program level (right) as a rule.
Figure 3. Transferring the relationship between objects.
Figure 5. Icons are transformed syntactically and semantically. Users define only the basic root icons, such as the horizontal street segment, that are then transformed into such icons as curves in terms of appearance and meaning. Semantic information is used to match up icons for analogies. Cul-de-sac streets and tracks were decorated further by the end user.
©2000 ACM 0002-0782/00/0300 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2000 ACM, Inc.
No entries found