CHI 97 Electronic Publications: Papers

Relationships Between Users' and Interfaces' Task Representations

Robert B. Terwilliger and Peter G. Polson
Institute of Cognitive Science
University of Colorado, Boulder, CO 80309-0345
telephone: (303) 492-5622
email: ppolson@clipr.colorado.edu

ABSTRACT

In a previous experiment, we demonstrated that some users seem to significantly transform the instructions for a graph creation task before they even begin to interact with the interface, and furthermore, that this can create considerable difficulty with an interface that does not require the transformation. In this paper, we describe a contrasting experiment, showing that subjects without pre-existing task transformations initially have considerable difficulty with an interface that requires them, but acquire the transformations relatively quickly. Kitajima and Polson's LICAI model explains these effects as resulting from the problem representation being elaborated with task-specific schemata during the instruction comprehension process.

Keywords

empirical studies, cognitive models.

INTRODUCTION

Developers are now designing applications for a generation of "discretionary users" [6, 7]. Such individuals are computer literate, and in fact may be highly skilled with certain applications. However, their general skill level is intermediate, and they are in the somewhat unfortunate position of using many different applications, or entire systems, on an irregular basis. We are interested in how the task representation implicit in the manual and interface affects the difficulties that these users experience when manipulating an unfamiliar interface.

For example, creating a graph is a task that many users do only infrequently. One aspect of this task is the assignment of variables to axes. In Cricket Graph Version 1, this is accomplished using a dialog box with two selection lists containing variable names: one for the X axis, and one for the Y axis. On the other hand, in Microsoft Excel Version 5, the user must indicate whether the "data series" are in rows or in columns, and then determine how many rows (or columns) are used for "Category (X) axis labels," and how many columns (or rows) are used for "Legend Text." It seems reasonable to assume that users' representations of the axes assignment task will significantly affect the difficulty of using these two interfaces.

The representation used by the solver is a long-standing issue in the study of problem solving [2]. On the one hand, solvers can use superficial properties to search for a solution. On the other hand, they can transform these properties into a different, more effective representation. In some situations, the inability to correctly transform the initial representation may prevent the discovery of a solution. For example, certain specialized schemata are necessary to solve arithmetic word problems [4]. The transformation, or lack of it, from an initial problem description to a different representation used for problem solving, lies at the heart of the task elaboration vs. instructions following distinction we are investigating [8].

Task Elaboration vs. Instructions Following

Consider the following situation: An experienced user sits in front of a computer, attempting to complete a task using an unfamiliar application. For each step in the process, they first read some instructions from a manual and then turn to the computer to perform the actions about which they have read. What are the cognitive processes involved in this situation? And, how should the manual and interface be designed so as to minimize the difficulty of the task? We can tell at least two stories.

In the task elaboration story, the user has a complex, pre-existing representation of the task to be accomplished; therefore, when they read the instructions, they transform them into this representation and then attempt to manipulate the interface. Because of this, the task is easiest when the interface matches their pre-existing representation, regardless of the way in which the instructions are written.

On the other hand, in the instructions following story, the user constructs their task goals from a superficial representation of the instructions. Loosely, they remember keywords and phrases verbatim, and look for menu items and other labels matching what they remember. Therefore, the task is easiest when the manual and interface use identical terminology.

The LICAI model developed by Kitajima and Polson [5] can provide a more detailed version of either story. In the current version of the model, the written instructions are automatically elaborated with task-domain schemata, if such schemata exist. Therefore, task elaboration occurs when users have pre-existing task-domain schemata, and instructions following occurs when they do not.

EXPERIMENT 1: PRE-EXISTING SCHEMATA

Is the Kitajima and Polson account correct? If so, then users should display instructions following behavior on some tasks, and task elaboration behavior on others. In other words, on some tasks, they should prefer to transform the instructions into a different representation, and on other tasks they should not. Instructions following behavior has been observed [1], and we have shown that users do prefer to elaborate the instructions under some conditions [8].

In that study, we examined the graph creation task previously discussed in [1, 5, 8]. In our version of the task, the user first reads a sentence of instructions, then creates a graph by releasing on a menu item and assigning variables to axes in a simplified version of the dialog box used in Cricket Graph. The question we asked was: Are users faster at this task when the labels in the dialog box match the wording of the instructions, or when they match their preferred representation of the task?

As suggested by Kitajima and Polson [5], we considered two versions of both the instructions and the dialog box. For example, if the variables in the data to be graphed were "foo" and "bar," then the XY instructions read "create a graph with foo on the X axis and bar on the Y axis," and the FN instructions read "create a graph of bar as a function of foo." Similarly, the axes assignment dialog box had two selection lists: In the XY version, the left selection list was labeled "X Axis:" and the right selection list was labeled "Y Axis:". In the FN version, the left list was labeled "Plot:" and the right list was labeled "As a Function of:".

We exposed different subjects to all combinations of instructions and interface (dialog box) type. We assumed that subjects would prefer to transform the FN instructions into the XY representation; therefore, if the task elaboration story were correct, subjects would perform the task faster with the XY interface, regardless of the instructions. On the other hand, if the instructions following story better described their behavior, then subjects would complete the task faster when the instructions matched the interface, regardless of its type.
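The two sets of predictions can be written down as a small truth table. The following Python sketch is purely illustrative: the function names are hypothetical, the binary easy/hard outcome is a simplification, and it encodes only the assumption stated above that subjects prefer the XY representation. It is not an implementation of the LICAI model.

```python
# Toy encoding of the two stories' predictions for the 2 x 2 design.
# Illustration only; this is not the LICAI model.

REPRESENTATIONS = ("XY", "FN")
PREFERRED = "XY"  # assumption taken from the text: subjects prefer XY


def easy_under_task_elaboration(instructions: str, interface: str) -> bool:
    """Instructions are first transformed into the preferred representation,
    so only the interface labels matter."""
    return interface == PREFERRED


def easy_under_instructions_following(instructions: str, interface: str) -> bool:
    """Goals are built from the wording of the instructions, so the task is
    easy only when instructions and interface use the same terms."""
    return instructions == interface


def predictions():
    """Return {(instructions, interface): (TE prediction, IF prediction)}."""
    return {
        (ins, itf): (
            easy_under_task_elaboration(ins, itf),
            easy_under_instructions_following(ins, itf),
        )
        for ins in REPRESENTATIONS
        for itf in REPRESENTATIONS
    }


if __name__ == "__main__":
    for (ins, itf), (te, inf) in predictions().items():
        print(
            f"{ins} instructions / {itf} interface: "
            f"task elaboration -> {'easy' if te else 'hard'}, "
            f"instructions following -> {'easy' if inf else 'hard'}"
        )
```

Under these assumptions the two stories make the same prediction for the XY instructions and differ only for the FN instructions, where task elaboration favors the XY interface and instructions following favors the FN interface.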

Method

This experiment was presented briefly in [8]. The basic design was a 2 X 2 factorial with both instructions and interface type as between-subjects variables. The apparatus and procedure were similar to those used by Franzke [1].

Subjects

Sixteen subjects were drawn from the introductory psychology subject pool at the University of Colorado and received course credit for their participation in the experiment. On average, they were 18.9 years old, had been using 3.1 different applications on the Macintosh for 3.0 years, and had made about 50 graphs before beginning the experiment. The data for four additional subjects were not used: two due to equipment failures, and two because they could not complete the task unaided.

Materials and Apparatus

All subjects performed their tasks on custom applications constructed using Visual Basic (Applications Edition) on top of Microsoft Excel Version 5.0. Four versions of the system were created, one for each combination of "XY" or "FN" instructions with "XY" or "FN" interface. Figure 1 shows a sample interface screen. Subjects read the detailed instructions for a task on one sheet of a workbook, then switched to a different sheet to perform the necessary actions. The time to complete each task was recorded automatically by calling from Visual Basic to an external C procedure and then invoking the Ticks routine in the operating system.

Figure 1. Sample interface screen for Experiment 1.

One significant feature of this system is its slowness. Being the first Visual Basic application written by the author, it was riddled with inefficiencies. At best, it was over five seconds from the time the subject finished reading the instructions to create the graph (step 2 in Figure 1) until the dialog box was displayed. However, the only actions required were to switch to the "Data" sheet, pull down the "Create" menu, release on the "Graph" item and wait for the dialog box to appear.

Procedure

Upon arriving at the location of the experiment, subjects were first given a questionnaire assessing their computer and graphing experience. They were then given think aloud instructions before performing three warm-up tasks. The first introduced them to unusual features of the experimental interfaces, the second had them sort a table of data, and the third had them do a number of simple formatting tasks. After completing the warmups, the subjects performed the graph creation task and were then allowed to leave.

Results

The total time to create the graph was recorded for each subject. The average times for each condition are shown in Table 1. An ANOVA with two between-subjects variables revealed that, on average, the task took significantly longer when the interface had the FN labels than when it had the XY labels (F(1,12) = 14.31, p = .0026, partial r² = .544). There were no other significant effects or interactions.

A set of planned comparisons revealed that the average times for the two versions of the interface were significantly different both for the FN instructions (F(1,12) = 6.70, p = .0237, partial r² = .358), and for the XY instructions (F(1,12) = 7.62, p = .0173, partial r² = .388). The times for the two versions of the instructions were not significantly different for either version of the interface.
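The reported effect sizes appear consistent with the standard relation between an F ratio and a partial proportion of variance, F x df_effect / (F x df_effect + df_error). The sketch below assumes that the reported partial r-squared values denote this quantity (the text does not define them), and the helper name is hypothetical. It recomputes each value from the F ratios and degrees of freedom given above.

```python
def partial_r2(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial proportion of variance for an effect, recovered from its F ratio."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (label, reported F, reported effect size); all tests use df = (1, 12).
reported = [
    ("interface main effect", 14.31, 0.544),
    ("interface effect, FN instructions", 6.70, 0.358),
    ("interface effect, XY instructions", 7.62, 0.388),
]

if __name__ == "__main__":
    for label, f_value, reported_value in reported:
        computed = partial_r2(f_value, 1, 12)
        print(f"{label}: computed {computed:.3f}, reported {reported_value:.3f}")
```

Because the F ratios are themselves rounded to two decimals, small differences in the third decimal place would not be surprising.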

Table 1. Average time in seconds to create graph by instructions and interface type. Entries are M, with SE in parentheses.

                    Interface
Instructions    XY               FN                Average
XY              80.31 (10.52)    117.58 (11.89)    98.95 (10.18)
FN              78.59 (6.43)     113.55 (8.45)     96.07 (8.23)
Average         79.45 (5.71)     115.57 (6.80)     97.51 (6.34)
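As a rough arithmetic check, the marginal averages in Table 1 can be recovered from the four cell means. The sketch below assumes equal numbers of subjects per cell (an assumption; the text reports only the total of sixteen), and the variable and function names are hypothetical.

```python
# Cell means from Table 1, keyed by (instructions, interface), in seconds.
cell_means = {
    ("XY", "XY"): 80.31,
    ("XY", "FN"): 117.58,
    ("FN", "XY"): 78.59,
    ("FN", "FN"): 113.55,
}


def marginal_mean(factor_index: int, level: str) -> float:
    """Unweighted mean over the cells at one level of a factor.

    factor_index 0 selects the instructions factor, 1 the interface factor.
    Assumes equal cell sizes, so the marginal is a plain average.
    """
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)


# Differences between marginal means (FN minus XY) for each factor.
interface_effect = marginal_mean(1, "FN") - marginal_mean(1, "XY")
instructions_effect = marginal_mean(0, "FN") - marginal_mean(0, "XY")

if __name__ == "__main__":
    print(f"interface effect: {interface_effect:.2f} s")
    print(f"instructions effect: {instructions_effect:.2f} s")
```

On these numbers, the difference between interfaces is roughly 36 seconds, while the difference between instruction types is under 3 seconds, which matches the pattern of significant and nonsignificant effects reported above.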

Discussion

The shorter completion times for conditions with the XY interface, regardless of the type of instructions, suggest that subjects were elaborating the instructions as suggested by Kitajima and Polson [5]. Further support is provided by the lack of any significant difference between conditions in which the instructions matched the interface, and those in which they didn't. At least in this situation, subjects did not seem to display instructions following behavior.

We can plausibly explain these phenomena in the framework of the LICAI model. First, assume that the majority of subjects had pre-existing schemata which transformed the FN instructions for the graph task into the XY representation. In other words, assume that they preferred the XY representation of the problem. Therefore, the XY instructions / XY interface condition was easy because the interface matched subjects' preferences, but the XY instructions / FN interface condition was difficult because the preferences and interface did not match.

On the other hand, when the subjects read the FN instructions, they automatically transformed them into the XY representation, then used this representation to construct task goals, which guided the rest of the process. In the FN / XY condition, the interface matched these goals, and the task was completed easily. However, in the FN / FN condition, the goals did not match the interface, and the subjects had to return to the original representation, construct new task goals, and repeat the rest of the process.

We must ask: Why couldn't the subjects simply retrieve the original representation of the instructions from memory? If this were possible, then they would also have performed well in the FN / FN condition. There are at least two possibilities: either the subjects were unable to hold both the original instructions and the goals in short-term memory simultaneously, or they were suppressing the original representation.

We believe that the former is more likely, as the subjects were under a reasonably heavy cognitive load. As mentioned previously, one significant property of the apparatus was its slowness. Subjects would have had to hold both the instructions and the transformed representation in memory for over five seconds while manipulating an unfamiliar interface. It seems reasonable that they were unable to do so. Anecdotally, we can report that most of the subjects used a verbal rehearsal strategy to retain the (original or transformed) task instructions until the dialog box appeared.

EXPERIMENT 2: NONEXISTENT SCHEMATA

Experiment 1 showed that many users do seem to possess task-domain schemata for the graph creation task. Specifically, at least in some situations, they seem to transform the FN version of the instructions into the XY representation. Of course, the above explanation relies on subjects' pre-existing schemata. Therefore, a good way to either verify or invalidate this explanation would be to run a second experiment using a task for which subjects have no pre-existing transformations. We chose a simple formatting task, designed to be as similar as possible to the graph task.

Subjects assigned fonts to the different variables in a data set using a dialog box similar to the one for the graph creation task. As before, there were two versions of both the dialog box and instructions. If "foo" and "bar" were the variables in the data set, then the BI version of the instructions read "Format the data with foo in bold and bar in italics," and the CE version read "Format the data with bar in corporate and foo in executive." Similarly, one version of the dialog box was labeled "Bold:" and "Italics:", and the other was labeled "Corporate:" and "Executive:".

Bold was equivalent to executive, and corporate was equivalent to italics. We assumed that subjects would have little experience with using two names for the same font in the same application, and would not have a pre-existing transformation from the CE to the BI representation. These assumptions turned out to be reasonable; however, we should note that the format task is not as contrived as it first may seem.

When the instructions match the interface, the font assignment task is quite common, although the particulars of the dialog box are unusual. While the mismatched versions of the task (for example, CE instructions / BI interface) may at first seem purely arbitrary, they actually correspond to some fairly common tasks involving styles. In general, we can say that a transformation between representations corresponds to the definition of a style. For example, the corporate style would be defined as italics and the executive style would be defined as bold.
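The idea that a transformation between representations amounts to a style definition can be made concrete with a small lookup table. The following is a minimal sketch under stated assumptions: the dictionary and function names are hypothetical, and the task is represented as a simple mapping from variable names to format names rather than as the dialog box used in the experiment.

```python
# Style definitions as described in the text: corporate is italics,
# executive is bold. The names below are illustrative only.
STYLE_DEFINITIONS = {"corporate": "italics", "executive": "bold"}


def to_bi(assignments: dict) -> dict:
    """Translate {variable: format name} into the BI representation.

    Format names that are already basic (bold, italics) pass through unchanged.
    """
    return {var: STYLE_DEFINITIONS.get(fmt, fmt) for var, fmt in assignments.items()}


# "Format the data with bar in corporate and foo in executive."
ce_task = {"bar": "corporate", "foo": "executive"}

# "Format the data with foo in bold and bar in italics."
bi_task = {"foo": "bold", "bar": "italics"}

if __name__ == "__main__":
    print(to_bi(ce_task))
```

In this representation, the CE and BI versions of the instructions describe the same assignment once the style definitions are applied, which is the transformation a subject must supply when the interface shows only the BI labels.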

When would one have to understand or use these definitions? Consider the following example. Microsoft Word interacts with a number of support applications such as an equation editor and a graph creation tool. These applications produce individual objects which are embedded in the larger document. Now, suppose that we are given the task of applying the corporate style to an entire document. Further, assume that the document contains both an embedded graph and an embedded equation.

As a first step, we might select the entire document and apply the style. Unfortunately, this would not complete the task: the fonts in the embedded objects would not have changed. To change them, we would first have to understand the definition of the style, because styles are not shared with the support applications. We would then have to open the embedded objects and individually change them based on the style definition. While this situation is more complex, at core it is quite similar to the formatting tasks which we studied in the second experiment.

More specifically, in the second experiment, we asked the following questions. First, will subjects show an initial preference for instructions following or task elaboration behavior on the formatting task? Second, how quickly will subjects adapt to an interface which requires the other form of behavior? And third, will experience with the other form of behavior affect subjects' performance on an interface which requires the behavior that they originally preferred?

Overview of the Experiment

In this study, subjects performed the format task a number of times using different versions of the instructions and interface. Subjects were divided into two groups. The task elaboration group saw a sequence of problems which required them to transform the CE instructions into the BI representation. On the other hand, the instructions following group saw a sequence of problems where this transformation was not required.

Both groups performed the same tasks and saw the same instructions; however, the two groups used different interfaces for some problems. For the task elaboration group, the instructions changed, but the interface always used the BI representation. On the other hand, for the instructions following group, the representations for the instructions and interface always matched. Both groups performed the BI / BI task, but the instructions following group performed the CE / CE task, while the task elaboration group performed the CE / BI task.

Subjects began by repeatedly performing the BI / BI task to become accustomed to the interface. Then, subjects' performance was measured the first time they performed a task where application of the transformation might be required: Subjects in the instructions following group performed the CE / CE task, while subjects in the task elaboration group performed the CE / BI task. We expected that the task would be relatively easy for the instructions following group because the interface would match their expectations. We predicted that subjects in the task elaboration group would have considerable difficulty because they would not initially have the required transformations. In other words, we expected the subjects to show a preference for instructions following.

After this first test task, subjects were repeatedly given experience with either transforming, or not transforming the CE instructions into the BI representation. The task elaboration group performed the CE / BI task, and the instructions following group performed the CE / CE task. Subjects' performance on each task was measured, so that we could determine how long it took the task elaboration group to acquire the requisite transformation. We expected that subjects would have a difficult time creating the transformation; therefore, the task elaboration group would take significantly longer than the instructions following group on the first few problems in the experience phase.

Subjects then repeatedly performed tasks which alternated between instructions type: On one problem subjects saw the BI instructions, and then on the next problem they saw the CE instructions. As before, the instructions following group always saw an interface which matched the instructions, while the task elaboration group always saw the BI interface. This phase was both to give the subjects more experience with instructions following or task elaboration, and to ensure that they were flexible in their behavior.

After this second experience phase, subjects' performance was measured on a task which required the other form of behavior. Specifically, the instructions following group was given the CE / BI task (which required the application of a transformation), and the task elaboration group was given the CE / CE task (which did not require a transformation). We predicted that the task elaboration group would complete this second test task more quickly than the instructions following group.

We expected that subjects in the instructions following group would not have the required transformation when they were given the CE / BI problem, and that they would have considerable difficulty acquiring it. Therefore, the problem would be difficult. On the other hand, we expected that when subjects in the task elaboration group saw the CE interface, they would have both the original and transformed representation of the instructions available in memory. Therefore, they simply had to create new task goals from the original representation of the instructions and continue with the rest of the process.

Realistic Analogs to the Experiment

At this point, let us briefly consider: How realistic is this situation? We would argue that it is much more realistic than it first may seem. Let us return to our previous example of applying a new style to a Microsoft Word document containing embedded equations and graphs. Now, let us expand the situation to a number of documents, some of which contain embedded objects and some of which do not, and a number of tasks, some defined in terms of applying a new style, and some defined in terms of basic formatting operations.

For both groups, the warm up phase (the BI / BI task) corresponds to performing a number of tasks defined in terms of basic formatting operations. For the instructions following group, the first test and experience phases (the CE / CE task) correspond to changing the style of a number of documents which do not have embedded objects.

On the other hand, for the task elaboration group, the first test and experience phases (the CE / BI task) correspond to applying a style to (changing the format of) a number of embedded objects in a single document. Remember, styles are not shared with embedded objects; therefore, to apply a style the user must translate the style into formatting operations and apply them to the objects. Specifically, if multiple graphs of different types are embedded in a Word document, the font of the axes labels might have to be changed by hand on each one.

For the instructions following group, the second experience phase (the CE / CE and BI / BI tasks) corresponds to performing a number of different tasks, some defined in terms of changing styles, and some defined in terms of basic formatting. For the task elaboration group, the second experience phase (the CE / BI and BI / BI tasks) also corresponds to a number of tasks, with some defined as applying styles to embedded objects, and some defined in terms of basic formatting operations.

For the instructions following group, the second test (the CE / BI task) corresponds to performing a style application task on an embedded object, and for the task elaboration group, the second test (the CE / CE task) corresponds to a simple style change task on a document without embedded objects. Thus, although our sequence of experimental tasks is both simplified and arbitrary, it does have strong analogs in the practical world of day-to-day word processing.

Method

The experiment was a 2 X 2 factorial, with group (task elaboration or instructions following) as a between-subjects variable and task (problem number) as a within-subjects variable. Both the apparatus and procedure were improved versions of those used in the first study. Table 2 presents an overview of the tasks performed by subjects in this experiment. Each entry in the table shows the instructions / interface seen by the corresponding group during a particular phase of the study.

The tasks can be divided into six phases. In the first warm-up phase, subjects performed a number of tasks designed to introduce them to the interface. In the second warm-up phase, subjects repeatedly performed the BI / BI task to ensure that they could navigate the menus and dialog box effectively. In the first test phase, both groups were given CE instructions for the first time, but the instructions following group saw a CE interface while the task elaboration group saw a BI interface.

Table 2. Tasks performed by subjects in second study by group and phase.

                 Group
Phase            Instructions Following    Task Elaboration
Warm-up I        Warm-up                   Warm-up
Warm-up II       BI/BI                     BI/BI
Test I           CE/CE                     CE/BI
Experience I     CE/CE                     CE/BI
Experience II    CE/CE, BI/BI              CE/BI, BI/BI
Test II          CE/BI                     CE/CE

In the first experience phase, the instructions following group repeatedly performed the CE / CE task, while the task elaboration group performed the CE / BI task. In the second experience phase, both groups alternated between CE and BI instructions. However, the instructions following group always saw an interface that matched the instructions, while the task elaboration group always saw the BI interface. Finally, in the second test phase, each group was given a problem which required the other type of behavior for the first time. The task elaboration group performed the CE / CE task, while the instructions following group performed the CE / BI task.
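The phase structure described above can be summarized as a small data structure. The sketch below is illustrative only: the names are hypothetical, each task is written as an (instructions, interface) pair, and the first warm-up phase is omitted because its tasks are not described in those terms.

```python
# Task schedule by group and phase, as (instructions, interface) pairs.
# Warm-up I is omitted: its tasks are not given as instructions/interface pairs.
SCHEDULE = {
    "instructions following": {
        "Warm-up II": [("BI", "BI")],
        "Test I": [("CE", "CE")],
        "Experience I": [("CE", "CE")],
        "Experience II": [("CE", "CE"), ("BI", "BI")],
        "Test II": [("CE", "BI")],
    },
    "task elaboration": {
        "Warm-up II": [("BI", "BI")],
        "Test I": [("CE", "BI")],
        "Experience I": [("CE", "BI")],
        "Experience II": [("CE", "BI"), ("BI", "BI")],
        "Test II": [("CE", "CE")],
    },
}


def needs_transformation(instructions: str, interface: str) -> bool:
    """A task requires a transformation when the interface labels
    do not match the representation used in the instructions."""
    return instructions != interface


def phases_requiring_transformation(group: str) -> list:
    """List the phases in which at least one task requires a transformation."""
    return [
        phase
        for phase, tasks in SCHEDULE[group].items()
        if any(needs_transformation(ins, itf) for ins, itf in tasks)
    ]


if __name__ == "__main__":
    for group in SCHEDULE:
        print(group, "->", phases_requiring_transformation(group))
```

Written this way, the design's asymmetry is easy to see: the instructions following group meets a mismatched interface only on the second test, whereas the task elaboration group meets one in every phase from the first test through the second experience phase.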

Subjects

Fourteen subjects were drawn from the introductory psychology subject pool at the University of Colorado and received course credit for their participation in the experiment. On average, they were 19.6 years old and had been using either the Macintosh or PCs, including Windows, for 4.0 years before beginning the experiment. The data from two other subjects were not used: one due to experimenter error, and the other because they could not complete the task unaided.

Materials and Apparatus

All subjects performed their tasks on a custom application constructed using Visual Basic (Applications Edition) on top of Microsoft Excel version 5.0. The system also supports the graph creation task. The time to complete each task was recorded automatically by calling to an external C procedure and then invoking the Ticks routine in the operating system. The sequence of tasks performed by each subject was controlled by a simple program, and a version was created for each group.

Figure 2 shows a sample interface screen. The system presents four windows. The top left window displays the data set for each task, the top right window was not used in this study, the bottom left window displays instructions, and the bottom right window displays hints. (Subjects could view the contents of the "Hints" window only after they had worked on a task for two minutes.)

The system displays dialog boxes in the center of the screen. The figure shows the BI version of the format assignment dialog box. As shown, it contains two selection lists, each containing the names of both variables in the current problem. Subjects can view only one window, or dialog box, at a time. When a window is selected, the previous window is blanked before the contents of the new window are displayed.

Figure 2. Sample Interface screen for Experiment 2.

In this experiment, subjects had to use only two menus, each containing only one item. The "Data" item on the "Format" menu calls up the format assignment dialog box. Subjects began a new task by pulling down the "Tasks" menu and releasing on the "Next Task" item. The system then displayed a message box stating the correctness of their solution to the current problem, and they were not allowed to proceed to the next task until the current task was completed correctly.

Procedure

Upon arriving at the location of the experiment, subjects were first given a questionnaire assessing their computer experience. They were then given think aloud instructions before performing the six phases of the experiment at their own pace. During this time, the experimenter took protocol notes describing their behavior and verbalizations while solving each problem. When they were finished, the experimenter informed the subjects of the purpose of the experiment, answered any questions to their satisfaction, and then allowed them to leave.

Data Collection

The total time to format each data set was recorded for each subject. The protocol notes for each subject were also used to classify their behavior on the two test tasks along two dimensions with two categories each. Unfortunately, it was not possible to definitively classify each subject on each dimension for each task. In such cases, the behavior was left unclassified.

First, subjects were classified on the behavior they first displayed when given the task instructions: instructions following (IF) or task elaboration (TE). Typical IF behavior involved verbal rehearsal of the instructions from the time they were read until the time the dialog box appeared. Typical TE behavior involved an explicit conversion to the BI representation once the instructions were read, followed by rehearsal of the converted representation.

Second, subjects were classified on how they first tried to convert the instructions to the BI representation, if they tried to do so. Subjects were rated as either trying to perform an explicit conversion (CV) or trying to find a solution by trial and error (TR). Typical conversion behavior involved either an explicit transformation once the instructions were read, or a pause, with attempts to recall the relationship from memory. On the other hand, typical trial and error behavior involved stating that guessing would be used, followed by a summoning of the dialog box.

Subjects were classified on this dimension for both test tasks, but with a slightly different meaning in each case. For subjects performing the CE / CE task, CV or TR behavior had to be displayed before summoning the dialog box. On the other hand, for the CE / BI task, subjects could display a categorizable behavior either before or after viewing the dialog box. For example, for subjects who were surprised by the dialog box, this was the method they tried once they realized the interface was not as expected.

Results and Discussion

The average completion times and the results of the protocol analysis for each group on both test tasks are shown in Table 3. An ANOVA with one between-subjects and one within-subjects variable revealed a significant interaction between group and problem (F(1,12) = 31.92, p = .0001, partial r² = .727). The subjects performed the CE / CE task much faster than the CE / BI task, regardless of its place in the experimental sequence. There were no other significant effects or interactions.

As predicted, on the first test, the task elaboration group took significantly longer to complete the CE / BI task than the instructions following group did to complete the CE / CE task (F(1,12) = 18.78, p = .0010, partial r² = .610). This is consistent with subjects not possessing pre-existing transformations for the format task. Therefore, the task elaboration group had to acquire these transformations on the first test task, and the problem became significantly more difficult.

Table 3. Data for Experiment 2 by group and task.

The protocol data support this explanation. All seven of the subjects in the task elaboration group displayed instructions following behavior, and they all tried to create the transformation by trial and error. Furthermore, all of the subjects in both groups displayed a very strong preference for instructions following behavior on the first test task. Typical behavior consisted of considerable confusion, and explicitly expressed expectations that the interface should be labeled using the CE terminology. In many cases, the subjects explored the entire menu structure before using trial and error methods to construct a transformation.

Also as predicted, on the second test, the task elaboration group completed the CE / CE task much faster than the instructions following group completed the CE / BI task (F(1,12) = 20.75, p = .0007, partial r² = .633). As with the task elaboration group on the first test, acquiring the CE to BI transformation increased the completion time for the instructions following group on this problem. The protocol data provide further support for this conclusion. Six of the seven subjects in the instructions following group displayed instructions following behavior, and five tried to create the conversion by trial and error. No subjects displayed task elaboration behavior, but one did try (unsuccessfully) to recall the transformation from memory.

It is significant that while the task elaboration group did possess the CE to BI transformation by the second test, they still completed the CE / CE task quite easily, whereas in the first experiment, subjects had considerable difficulty performing the equivalent FN / FN task. We believe that this is due to the effect of cognitive load.

In the first experiment, the cognitive load on subjects was high. The apparatus was slow and clumsy, and they never got a chance to become accustomed to it. Therefore, the original representation of the instructions could have been forced out of memory after the subjects applied the transformation. Anecdotally we can say that many subjects verbally rehearsed the transformed representation for the five seconds until the dialog box appeared. Then, when the interface was not as expected, they had to redisplay the instructions before creating new task goals.

On the other hand, at the end of the second experiment, the cognitive load on subjects was low. The apparatus was much improved, much faster, and the subjects were given considerable experience with it. Therefore, subjects should have been able to retrieve the original representation of the instructions from memory when needed, making the task much easier. The fast completion times for the second test task are consistent with this interpretation.

The task elaboration group acquired the CE to BI transformation more quickly than we had anticipated. ANOVAs were performed on the completion times for each task in the first experience phase using group as a between-subjects variable. They revealed that the completion time for the task elaboration group was significantly longer only for the first task in the sequence (F(1,12) = 7.67, p = .0170, partial r² = .390). On the second, and subsequent, tasks for which the CE to BI transformation was required, the completion times for the two groups were not significantly different.

The protocol data for the second test task reinforce this interpretation. Four out of seven subjects in the task elaboration group displayed very pronounced task elaboration behavior on the second test task. Typically, they would perform an explicit conversion to the BI representation once the instructions were read, then rehearse the converted representation until the dialog box was displayed. It seems that task-domain transformations can be acquired quite quickly when needed.
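
As an illustrative check (not part of the original analysis), the partial r-squared values reported in this section are consistent with the standard conversion from an F ratio, F * df1 / (F * df1 + df2), assuming the statistic denotes the proportion of variance accounted for by the effect. The following Python sketch recomputes it for the four F(1,12) tests reported above; the function name and the labels are our own and are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Effect size recovered from an F ratio: F*df1 / (F*df1 + df2).

    Assumption: the paper's "partial r-squared" is this quantity.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, reported effect size) for the F(1,12) tests in this section.
# Labels are illustrative names, not terms from the original analysis.
reported = [
    ("group x problem interaction", 31.92, 0.727),
    ("first test, between groups", 18.78, 0.610),
    ("second test, between groups", 20.75, 0.633),
    ("first experience task, between groups", 7.67, 0.390),
]

for label, f_value, reported_value in reported:
    computed = partial_eta_squared(f_value, df_effect=1, df_error=12)
    print(f"{label}: computed {computed:.3f}, reported {reported_value:.3f}")
```

Small differences in the third decimal place can arise because the F values themselves are rounded to two decimals.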

SUMMARY AND CONCLUSIONS

The results reported above suggest that the task transformations described by the LICAI model do influence users' performance on human-computer interaction tasks; that users have pre-existing transformations for some tasks, but not for others; that task transformations can be acquired fairly quickly when needed; and that the effects of these transformations differ with the cognitive load subjects are experiencing.

Experiment 1 suggests that many users do have pre-existing transformations for the graph creation task. Subjects completed the graph task faster with the XY interface, regardless of the instructions type. In other words, subjects displayed a strong preference for task elaboration behavior. On the other hand, Experiment 2 suggests that subjects do not initially have transformations for the format task. Unlike in Experiment 1, subjects completed the first test task much faster when the instructions matched the interface than when they did not. All of the subjects displayed a very strong preference for instructions following behavior, and no subject tried to perform an explicit conversion.

In Experiment 2, subjects quickly acquired the necessary transformations for the format task. The completion times for tasks requiring or not requiring transformations were equivalent after only one performance (the second transformation). Similarly, after only a few repetitions of tasks requiring the transformations, subjects were clearly applying them automatically on first reading the instructions.

The difficulty users have manipulating an interface that does not require a habitually applied transformation (in other words, one that uses a non-preferred representation that matches the instructions) depends on the cognitive load they are experiencing. Under high load, the transformation can force the original representation of the instructions out of memory. Therefore, when users see the unexpected interface, they need to redisplay the instructions, and the task becomes more difficult. On the other hand, under low load, the original instructions can be retrieved from memory, and the task is completed relatively easily.

The experiments reported here do not provide support for all aspects of the LICAI model; however, they do provide strong evidence for the existence of task-domain transformations as distinct entities. We believe it would be hard to explain the above results without using something equivalent. Furthermore, they suggest that ideas from research on the interaction of text comprehension and problem solving, upon which LICAI is based [3, 5], may prove useful in understanding human-computer interaction.

Implications for Designers

First of all, our results strongly suggest that designers need to know what, if any, pre-existing conceptions of a task the users of a system may have. If the terminology used in the manuals, menus, and dialog boxes does not match their expectations, users may perform poorly despite the overall elegance of a design.

Second, we can describe some of the situations that make changing between interfaces difficult. The worst case is an interface requiring a transformation that users do not possess. In other words, this is a situation in which the manual and interface use equivalent, but not identical, terminology with which users are not familiar.

Users also have difficulty, under high cognitive load, with an interface that does not require a habitually applied transformation. In other words, users may do extra work (apply the transformation) that just makes things harder for themselves (because the interface matches the manual) if they are not familiar with the system and are used to doing things a different way.

Third, we found that, at least under some conditions, users can quickly adapt to a "difficult" interface; therefore, these troublesome situations are most important when the interface will be used only infrequently. In other words, these effects are most important for discretionary users [6, 7]. One example is a user modifying an embedded object in a Word document with an infrequently used support application.

Despite possible limitations, we believe these experiments have analogs in the real world, and that the effects reported have practical consequences. All in all, we believe that we are on an interesting and fruitful track, and that future studies will replicate and further clarify this work.

ACKNOWLEDGMENTS

The authors gratefully acknowledge research support from the National Aeronautics and Space Administration under grant NCC 2-904. The opinions expressed in this paper are those of the authors and not necessarily those of NASA.

REFERENCES

1. Franzke, M. Turning research into practice: characteristics of display-based interaction, in Proc. CHI'95 Human Factors in Computing Systems, (1995), ACM, pp. 421-428.

2. Greeno, J.G. and H.A. Simon. Problem solving and reasoning, in Stevens' handbook of experimental psychology, R.C. Atkinson, et al., Editors. 1988, John Wiley & Sons: New York. pp. 589-672.

3. Kintsch, W. The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review. 95 (1988), pp. 163-182.

4. Kintsch, W. and J.G. Greeno. Understanding and solving word arithmetic problems. Psychological Review. 92 (1985), pp. 109-129.

5. Kitajima, M. and P.G. Polson. A comprehension-based model of exploration, in Proc. CHI 96 Human Factors in Computing Systems, (1996), ACM.

6. Mannes, S.M. and W. Kintsch. Routine computing tasks: Planning as understanding. Cognitive Science. 15 (1991), pp. 305-342.

7. Rosson, M.B. Classifying users. A hard look at some controversial issues, in CHI86 Human Factors in Computing Systems, (1986).

8. Santhanam, R. and S. Wiedenbeck. Neither novice nor expert: The discretionary user of software. Intl. J. Man-Machine Studies. 38 (1993), pp. 201-229.

9. Terwilliger, R.B. and P.G. Polson. Task elaboration or label following: An empirical study of representation in human-computer interaction, in Conf. Comp. - CHI 96 Human Factors in Computing Systems, (1996), ACM, pp. 201-202.

                                       Interface
Instructions   XY                      FN                       Average
XY             M = 80.31, SE = 10.52   M = 117.58, SE = 11.89   M = 98.95, SE = 10.18
FN             M = 78.59, SE = 6.43    M = 113.55, SE = 8.45    M = 96.07, SE = 8.23
Average        M = 79.45, SE = 5.71    M = 115.57, SE = 6.80    M = 97.51, SE = 6.34

                              Group
Phase           Instructions Following   Task Elaboration
Warm-up I       Warm-up                  Warm-up
Warm-up II      BI/BI                    BI/BI
Test I          CE/CE                    CE/BI
Experience I    CE/CE                    CE/BI
Experience II   CE/CE, BI/BI             CE/BI, BI/BI
Test II         CE/BI                    CE/CE