J. M. Coble
John Karat
Michael G. Kahn
Most complex software projects involve groups who develop the system and groups who use it. The project described in this paper involved a targeted group of USERS (a collection of physicians for whom the clinical workstation software is intended), a CUSTOMER (an organization of medical and other professionals, representing a 15-hospital health system in the midwestern United States, seeking to develop the system for the users), and a DEVELOPER (an organization located on the East Coast of the United States with software designers, programmers, and testers contracted to deliver the software system). The customer's primary goal is to provide a system that enables the users to deliver improved care to patients. The developers have so far had little direct interaction with the users; most interaction has been with the customer, which has taken responsibility for representing the users in this project.
The experience we will draw on here comes from our efforts to provide physicians with a comprehensive, longitudinal clinical information system as part of Project Spectrum. Project Spectrum is a joint technology consortium consisting of Washington University School of Medicine, BJC Health System, IBM, Kodak, and SBC Corporation [8]. The current phase (Phase 1) of this project is aimed at providing clinicians with the capability to view all readily available, clinically significant test results (e.g., radiology, laboratory, pathology) for their patients from the office, home, or hospital. The target users for this phase are clinical physicians in the field of general medicine or general surgery in either the academic or community environment.
Due to past (less than successful) experiences with introducing information systems for physicians into the BJC Health System, the customer believed and emphasized that the resulting Clinical Workstation (CW) software must truly meet the needs of the physicians in a highly usable manner. To ensure this outcome, the customer knew they needed to start with, and focus on, the physicians. As the process for determining the physicians' needs was being developed, the first author attended CHI '94. Based on what was learned at CHI '94, Contextual Inquiry (CI) was selected as the process for determining physicians' needs [9]. Members of the customer organization who would be involved in the process attended a tutorial on CI, and a modified version of CI was ultimately used to determine the physicians' needs and prioritize the requirements. This activity, which resulted in the development of a User Requirements document, took place in late 1994 (details are described in [4] and in the book [13]).
Following the user requirements definition process, a vendor was chosen by the customer to develop the CW. Functional Specification sessions were held between the developers and the customer to document the function that would be needed to satisfy the user requirements (mid 1995). Once completed, implementation of the functional specifications for the first phase of the project was divided into iterations (each with a projected 3-4 month duration), and the design process began (late 1995). Each iteration was viewed as consisting of a number of cycles in which users would have a chance to provide feedback on the function included in that iteration. Iteration 1 was completed with the delivery of a working prototype of the CW (which actually took 9 months and ended in mid 1996), and we are currently entering the second iteration.
The first author (Coble) is a usability specialist for the customer and the second author (Karat) is a usability specialist for the developer. The third author (Kahn) is a physician and the Project Executive for the customer.
In traditional software development practice, the understanding of users and tasks to be supported is generally assumed to be captured in a statement of requirements on which both customer and developer agree. Thus the development process begins by gathering user requirements. The user requirements, however they are represented, are subjected to a problem analysis activity which results in "functional" specification documents [10]. The problem analysis can be conducted either jointly by customer and developer or, as is generally the practice, by the developer alone. In either case, the resulting document, the functional specification, is agreed to by both parties. The assumption by both the customers and the developers is that the real user requirements are accurately represented in this document.
In our experience, misunderstandings about the true meaning of requirements based on the interpretation of functional specifications can give rise to considerable tension between customers and developers. Developers typically expect that the customer has agreed that their requirements are completely and exhaustively represented in the functional specifications document. When customers commission innovative applications, they are unsure of exactly what new technology might provide. They might justifiably feel that functional specifications form only part of the picture -- that functional specifications must necessarily evolve as customers and users learn more about what new technology can and cannot provide [6]. Ultimately, customers expect that user requirements will not become lost as development proceeds.
As mentioned above, an extensive contextual inquiry was carried out in late 1994. This effort included approximately 300 hours of observation and 1300 hours of data analysis occurring over three months. Ten physicians were observed in both inpatient and outpatient settings, representing both primary care and general surgery activities. From these observations, over 500 requirements were derived. These were reviewed in a series of group meetings with the users to validate and prioritize the requirements. The requirements were represented as a series of annotated scenarios in a User Requirements document of approximately 100 pages. An example of some requirements from this document is presented in Figure 1. The user requirements were communicated to developers using "day-in-the-life" presentations, walking through the requirements in a three-week functional specification process, and by making the document available to them.
Figure 1: Sample User Requirement
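Requirements of this kind lend themselves to being tracked as structured records, so that the physician-assigned priorities and phase assignments survive past the requirements-gathering stage. The sketch below shows one minimal way such a record might be kept; the field names and identifier scheme are our own illustration, not artifacts of the project.

```python
from dataclasses import dataclass

@dataclass
class UserRequirement:
    """One requirement derived from the contextual inquiry observations."""
    req_id: str          # identifier scheme is illustrative (e.g., "UR-001")
    scenario: str        # the annotated scenario the requirement came from
    statement: str       # the requirement text itself
    priority: int        # priority assigned in the physician review meetings
    phase: int | None = None  # development phase chosen to address it, if any

requirements = [
    UserRequirement(
        req_id="UR-001",
        scenario="Reviewing results on morning rounds",  # invented example
        statement="The user must be easily able to locate abnormal results",
        priority=1,
        phase=1,
    ),
]

# Select the subset of requirements addressed in the current phase.
phase1 = [r for r in requirements if r.phase == 1]
```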
The User Requirements document examined a broad range of physician requirements. The customer did not intend the system to meet all of them in Phase 1 -- the requirements were viewed as too extensive to meet in a single development effort. Instead, Phase 1 focused on a subset of the requirements while keeping in view the long-range objective of supporting all aspects of providing quality health care. Decisions about which requirements would be addressed in the first phase were worked out by the customer and developer together, taking into account the priority the physicians gave each requirement, the expected availability of clinical data, and the estimated development resources.
User requirements provide one view of what is needed, but they tend to be under-specified in meeting the developer's needs (i.e., they do not provide a detailed design for how something should be accomplished). For most systems it is the functional specifications which provide the detailed system development documentation, not user requirements. When the question is asked about whether or not a system "meets requirements", it is generally the functional specification document to which reference will be made. In many cases, user requirements documents are used only in early development activities and are put aside once functional specifications have been developed. The central difficulty is that a system that meets functional specifications may not meet user requirements if the user requirements are put aside, rather than having a continuing role in development.
We found this problematic. Functional specifications are an interpretation of user requirements, and generally represent a guess on the part of the system designers about what design will fulfill the user requirements. Such guesses are often thought of as needing to be "frozen" before "real development" can take place. From the developer's perspective, resource estimates for a project (the function, cost and schedule for completing the work) cannot be determined until they know exactly what to build. On the other hand, customers would like developers to be responsive to discoveries about the real meaning of requirements that occur during the time of development. Bridging the gap between these two perspectives -- enabling customers to understand that they might not know exactly what specifications are required to satisfy the requirements and developers to understand that user and customer satisfaction depends ultimately on user requirements that may not be represented in the functional specifications -- is a difficult exercise in partnership. It requires moving beyond viewing a system's functional specification as the final translation of user requirements to viewing it as an interpretation based on a necessarily incomplete understanding written at one point in time -- an interpretation that can and should be iteratively revisited. We think that the ongoing use of a user requirements document has helped us in working effectively with developers as they iterate on the software during development.
Working together as partners to represent user requirements, the usability specialists on this project have assisted the developers in shaping the design to meet user requirements. In the rest of this paper we focus on how we have made use of the user requirements and the challenges we faced in developing the clinical workstation software. We do not think of our User Requirements document as a "silver bullet" which will guarantee successful system development, but as a tool to help in ongoing conversations.
Representatives from the customer and the development team had each separately developed initial designs. A physician on the customer team (who has considerable computer experience, and whom we will refer to as the domain visionary) had designed a vision of the CW during the user requirements gathering work (late 1994). He made this vision concrete in the form of a paper prototype consisting of cutout pieces of the interface. He became quite skilled at demonstrating and describing this prototype during design discussions (we eventually videotaped him "walking through" his prototype, and we used the tape to communicate this vision to the development group). At the beginning of Iteration 1, the developers began mocking up their own visual designs (although the customer had already tried out designs, they wanted the developer to arrive at its own design ideas in the early stages). The prototypes evolved through many design discussions that took place over a period of 3 months in late 1995.
The first joint customer-developer design discussions were guided by having the domain visionary produce use cases to address issues in the developer's prototype. The domain visionary led the evaluation of the developer's evolving design by providing use cases to show where designs might or might not meet requirements, referring back to the User Requirements document and his own experience as a practicing physician. These discussions also involved looking beyond the current phase and iteration to additional requirements covered in the User Requirements document but not included in the current iteration. The domain visionary, supported by other members of the customer team, would address how the current design might evolve to meet future system requirements (e.g., though Iteration 1 only provides for results review, we want the design to easily accommodate information entry).
These discussions were a part of the development of a more precise functional specification for Iteration 1 -- in reality development of the system and its specification were occurring at the same time, guided by a consideration of the user requirements. Eventually, a general style evolved which featured a large central area for presenting focus information flanked by two side areas for presenting context information. This design was strongly influenced by the prototype developed by the domain visionary, and represented a modification of his vision to meet implementation constraints that were being explored in the developer's designs. Figure 2 gives an example display from an early paper prototype of the system.
As discussions moved toward apparent agreement on designs, efforts to develop and evaluate paper prototypes with future users also began. Overall, a series of four test-and-redesign cycles was conducted (two tests in January and two in June 1996). For each of these cycles, a version of the design (paper or computer-based prototype) was evaluated by representative physicians drawn from hospitals within the BJC Health System. Following each test, additional meetings between the customer and developers focused on the usability problems discovered in the testing.
The tasks chosen for usability testing were directly related to the user requirements that were to be supported by the functional specifications implemented in this iteration. For example, the user requirements:
"The user must be easily able to locate abnormal results" and "The user must be easily able to distinguish between normal and abnormal results"
were tested by a part of one task that asked the user to:
"Find all lab results completed in the last 24 hours and state aloud the abnormal results they contain."
![[Screen print from early Paper prototype]](jmc-fg2.jpg)
Figure 2: Sample display from early Paper Prototype
The developers, located on the East Coast, were encouraged by the customer to participate in the usability tests, conducted in the Midwest. However, the development team did not always view this as a high priority effort when faced with development resource and schedule constraints. Consequently, the goal of the analysis and summary phase of each of these tests has been primarily to communicate information to the development team. In the paper prototyping cycles we have focused more on qualitative analysis of "problems" experienced by our users. Our reports have been summaries of these problems augmented by a video clip summary which presents examples to illustrate the problems.
In these tests, guided by our understanding of the user requirements, we focused on a few key aspects of the design. For example, we were concerned about how far to carry the "patient chart" metaphor in the design of the electronic system. In this regard, we mocked up a version of the paper prototype which included "tabs" for selecting categories of information and another version in which categories were selected from a "table of contents." We learned about the users' preferences for one selection method over the other, but more importantly we learned the reasons the physicians preferred one or the other by asking them during the tests. Having the developers see and hear the users interacting with the prototype, either in person or through watching the video tape, helped in our discussions. For example, the requirement that the "user must be able to understand the system model to facilitate an efficient review of the chart" had been difficult to translate into any specific chart organization scheme. Seeing physicians interact with the prototype convinced both the customer and the developer that both tab and outline mechanisms would be useful. The current design is actually a combination of the two selection methods.
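To make the combined design concrete, the sketch below models a chart whose categories can be reached either through tabs or through an outline ("table of contents") view over the same underlying structure; the section names are illustrative, not the project's actual chart organization.

```python
# Sketch of a chart-navigation model offering both selection mechanisms the
# physicians responded to. One structure, two views over it.
CHART_SECTIONS = {
    "Laboratory": ["Chemistry", "Hematology"],
    "Radiology": ["Chest X-ray", "CT"],
    "Pathology": ["Surgical pathology"],
}

def tabs():
    """Top-level categories, as rendered on chart tabs."""
    return list(CHART_SECTIONS)

def outline():
    """The same structure flattened into a table-of-contents view."""
    return [(section, item)
            for section, items in CHART_SECTIONS.items()
            for item in items]
```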
We feel that our summary sessions were an adequate way of communicating user experience to developers, though not as effective as having them observe the tests. We feel that the developers who have observed the tests have been more willing to make design changes than those who have not. Measured by the willingness of the developers to modify the design, both paper prototype evaluation cycles were very successful. Almost all recommendations made on the basis of the tests were accepted by the developers, while a much smaller percentage of recommendations presented by the customer during general design discussions were acted on. Each set of "problems" identified in a test was jointly worked on to design a change that would address the usability problems. During the design discussions, we found that developers referred to the video clips of physicians to help decide which design changes to make.
The first cycle of computer-based testing was performed in a single day. This was followed by a half-day of analysis in which we worked with the developers to identify problems they might be able to fix quickly, followed by a half-day of actually working on code to fix the problems. The second cycle, the User Acceptance Test (UAT), was then conducted on the third day. The UAT followed the same basic procedure as our other tests, but we focused the analysis more on measures of usability objectives [12] (see Figure 3).
| Measuring Methods | Min. Level | Results |
|---|---|---|
| Time to complete task | -- | 3:21 |
| Percent of task complete without assistance | Avg. >= 90% | 96.00% |
| Number of problems | Avg. <= 5 | 1.20 |
| Amount of assistance requested | Avg. <= 5 | 0.00 |
| Satisfaction level | Avg. >= 4 | 5 |

Figure 3: Usability objectives and UAT results
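Checking measured results against such objectives is mechanical once the thresholds are recorded. The sketch below mirrors the minimum levels in Figure 3; the session data and field names are invented for illustration.

```python
# Minimum levels from Figure 3, expressed as predicates over session averages.
OBJECTIVES = {
    "pct_complete_unassisted": lambda avg: avg >= 90.0,  # Avg. >= 90%
    "problems": lambda avg: avg <= 5,                    # Avg. <= 5
    "assists_requested": lambda avg: avg <= 5,           # Avg. <= 5
    "satisfaction": lambda avg: avg >= 4,                # Avg. >= 4
}

def evaluate(sessions):
    """Average each measure over the test sessions and test its objective."""
    outcome = {}
    for measure, meets in OBJECTIVES.items():
        avg = sum(s[measure] for s in sessions) / len(sessions)
        outcome[measure] = (avg, meets(avg))
    return outcome

# Invented single-session example for illustration.
sessions = [{"pct_complete_unassisted": 96.0, "problems": 1,
             "assists_requested": 0, "satisfaction": 5}]
for measure, (avg, ok) in evaluate(sessions).items():
    print(f"{measure}: avg={avg} {'PASS' if ok else 'FAIL'}")
```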
While we collected data on a few specific usability metrics (e.g., number of problems, user satisfaction) during all usability tests, we emphasized their importance to the developers more specifically in the UAT. We stressed to the developers -- with some difficulty -- that we expected these metrics to be used to evaluate formally whether the design of the current iteration was acceptable to the user. For Iteration 1 we did not intend these to be difficult goals to achieve -- partly because we hoped to build up our own understanding of valid usability objectives for the system, and partly because we were more interested in indications of progress than in reaching absolute goals in the initial iteration. We found that while basing user acceptance on achieving measurable usability objectives makes a great deal of sense, it is proving difficult to go directly from user requirements to specific metrics. Rather than abandoning usability objectives, our approach is to use the early tests to build our understanding of them and to set initial metrics based on this better understanding.
The specific challenges we faced were:
(1) A general resistance to being held responsible for responding to usability problems. An interesting dilemma has arisen in the design process for this project -- one which both customer and developer have come to view as a joint effort. It was not until we were making plans for the UAT that the developer raised an issue about whether testing to behavioral objectives should be considered grounds for making an acceptance decision. In this project, the customer declares they are satisfied with an iteration after a UAT. If the system passes the UAT, payment is made to the developer. While the developers accept usability testing as a means for providing feedback into the design process, they argue that since the customer agreed to the functional specifications, the valid acceptance test is "meets specs," not "fulfills user requirements." In this context, we have had to ask what it means to perform an acceptance test for a design that both parties helped shape, but that the developer alone must expend resources to correct.
This has been an ongoing challenge for the role of iterative usability evaluation within the project. We have no general solution as yet. The completion of Iteration 1 of the system involved the usability test cycles described above. In all cases the developers have attended to the recommendations that we presented as a result of our user evaluations. Also, Iteration 1 of the system "passed" the usability objectives in the UAT, and the developers seem more likely to cooperate with future tests. However, there is no explicit commitment to continue to do so, and we are not certain what would happen if usability evaluations suggested that a major redesign was necessary.
Some of this confusion stems from the terminology used on this project. While the developer has a process which calls for a "user acceptance test" as an understood condition for the customer to declare acceptance, they seem to view this as a "purchaser acceptance test." Indeed, viewing acceptance as purchaser-driven rather than user-driven is the common view in software product development [10]. However, the combination of the UAT terminology and the customer's focus on the users' needs helped create an expectation by the customer that acceptance would be driven, at least partially, by user evaluations.
(2) Developers changing designs that have successfully been usability-tested. Everyone enjoys being creative, and it is an important part of maintaining a motivated development team to listen to new ideas. While designs are generally worked out with both customer and developer participation, the two are in different locations and the developers code the design. Some ideas may prove difficult to implement as designed. Sometimes a new way of doing things might occur to a developer. It is often difficult for them to understand what elements of the agreed-on design are really critical for overall usability, and what can be "improved upon" without negative consequences.
For example, aspects of the design that had "tested well" were sometimes subsequently changed by the developers in response to other considerations. For the developers, as long as a design change preserved the ability to access function, there was a tendency to view any change as inconsequential to "meeting requirements." To discourage changes to user-tested and approved designs, we have continually pointed back to usability test results and to the User Requirements document. This reference back to original user requirements, while not always successful in making a case with the developers, has generally been an argument which is not easily dismissed. Though it is sometimes frustrating to have features that were positively evaluated by users changed by the developers as the design progresses, the customer has allowed new design ideas to be tried as long as the developer accepts the user feedback as the ultimate criterion.
(3) An early platform change (from Visual C++ to JavaScript and Netscape) created an impression that wholesale design change was called for. On many levels this switch is thought to be a very positive change and strategically the right decision for the project. For example, the resulting system will be more portable than the customer originally thought possible. However, this change comes with some costs in flexibility of interface design because some interface elements supported in a desktop environment are not easily implemented in a network browser. Because of the lack of good user interface development tools and features, coupled with the learning curve the developers experienced, we have had to accept "good enough" solutions in the user interface which have an adverse effect on usability.
Sometimes such decisions can be driven from the developer without making the customer fully aware of the consequences (in this case by highlighting the advantages -- portability -- while downplaying the disadvantages -- more limited interface design possibilities). For a real partnership to exist, both customer and developer must be aware of the mutual impacts and work to accommodate them. As the user interface was re-implemented, design changes were made which the developers felt improved the CW despite the successful usability testing results from the previous design. Once again, the customer pointed to the successful usability testing results and encouraged the developers to stay with those original designs even though they were not sure how to implement them on the new platform. While this has caused additional work for the developers, the customer requests -- always motivated by user feedback -- have almost always been respected.
(4) Turnover of personnel in the development organization resulting in the need to educate the new developers in the users' needs. Over the 9 month course of Iteration 1 there were many personnel changes in the developer organization at both staff and management levels. With each new group of people the customer would attempt to convey a graphic understanding of the users and their work. This was accomplished through encouraging participation in usability tests, having the developers observe physicians in the hospitals or offices, and bringing a practicing physician, the domain visionary, to the developer's location for design sessions.
All of these methods have worked to some extent. The developers have participated in most usability tests. A few developers have followed physicians on hospital rounds and in the office. All developers have been in meetings with physicians who are part of the customer team. But these methods take time to execute and no one wants the development schedule to slip. This has been an ongoing struggle. It is difficult for the developers to take time away from development to visit the customer site and to observe usability evaluations. However, the developers who did this have uniformly reported it was a valuable experience for them.
One way in which this challenge might be addressed is by having the developers become familiar with the User Requirements document. While this document is available to them, most developers (staff and management) have not read or used it to any significant extent. The developer perspective is primarily that it is the functional specifications that they need to be aware of. Perhaps our most effective tool in providing a user perspective to the developers is through the video summaries that are a part of our test review sessions. As we document the details of the functional specifications for Iteration 2, we are adding the specific user requirements that the functional specification supports. We hope this will help us focus more on the relationship between the specifications and requirements.
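One lightweight way to keep that relationship visible is to annotate each functional-specification item with the requirement identifiers it supports and then check for requirements left untraced. The sketch below assumes the illustrative identifiers used earlier; the spec item names are invented.

```python
# Hypothetical annotation of functional-specification items with the user
# requirements they support; item names and IDs are illustrative.
SPEC_TO_REQS = {
    "FS-3.1 Results list view": ["UR-001", "UR-002"],
    "FS-3.2 Abnormal-result highlighting": ["UR-002"],
}

def untraced_requirements(spec_map, all_req_ids):
    """Requirements not supported by any functional-specification item."""
    traced = {req for reqs in spec_map.values() for req in reqs}
    return set(all_req_ids) - traced

print(untraced_requirements(SPEC_TO_REQS, {"UR-001", "UR-002", "UR-003"}))
# -> {'UR-003'}: a requirement no spec item claims to support
```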
Long range vision is important in this project. While we are not trying to meet all of the physicians' requirements in a single project, we would like to make design decisions that are informed by possible future directions. The User Requirements document provides much of this vision. The CI was conducted by observing physicians' work that went beyond what we expected to address initially in this project.
The User Requirements document has helped bridge the gaps between "user requirements"-centered customers and "functional specifications"-centered developers. This bridge is not complete. Barriers must be overcome that are beyond the power of a simple document.
One thing that we have found to be missing from our User Requirements document is the detailed information needed to set measurable behavioral objectives for the system. Contextual Inquiry does not give us timing benchmarks for the tasks covered. We could not expect this with the methodology we used, because the CI interviewer focuses on asking the user questions while the user performs the tasks. As a result, the performance criteria were under-specified. The performance criteria in the User Requirements document are currently stated using terms such as "easily" and "quickly." These are not useful in discussions with developers faced with realities of cost and schedule constraints. We have become alert to this need in the course of the project, and we now see that incorporating activities to determine accurate performance criteria is important. We are currently using information gathered during usability testing to fill in criteria as best we can.
Unfortunately, setting valid performance criteria on a feature or task before it has been usability-tested is very difficult. Because the users of the CW are demanding professionals, we would like to determine the point in the user's performance where frustration becomes a factor in determining non-acceptance. As we begin Iteration 2, the developers want to know what the acceptance criteria will be. In the absence of our ability to do this based on clear user requirements, the customer may set tighter performance metrics than are really necessary to ensure that the users' true acceptance levels are met. Then, if the performance time of users exceeds the pre-established metrics and users are not frustrated, the customer can relax the acceptance criterion metrics. If both developers and customers are not fully aware of the "real requirements", this overstating of the performance metric can hurt their relationship.
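The relaxation policy described above can be made concrete: start with a deliberately tight limit, then loosen it when sessions exceed the limit without users reporting frustration. The sketch below is a hypothetical illustration; the thresholds and fields are invented.

```python
def relax_time_limit(limit_s, sessions, min_satisfaction=4):
    """Loosen a deliberately tight time limit when users who exceeded it
    were nonetheless satisfied (i.e., frustration was not a factor)."""
    slow_but_satisfied = [s["time_s"] for s in sessions
                          if s["time_s"] > limit_s
                          and s["satisfaction"] >= min_satisfaction]
    if slow_but_satisfied:
        # Relax to the slowest time that still left the user satisfied.
        return max(slow_but_satisfied)
    return limit_s

# Invented sessions: the second exceeds a 200-second limit but the user
# remained satisfied, so the limit is relaxed to 240 seconds.
sessions = [{"time_s": 180, "satisfaction": 5},
            {"time_s": 240, "satisfaction": 4}]
print(relax_time_limit(200, sessions))  # -> 240
```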
In the end, though, the customer wants the acceptance of the CW to be determined by the users. If the CW is not acceptable, then it must be changed. In a hospital setting, the physicians have the ultimate authority for deciding whether or not to use a system made available to them. Customer acceptance means little if users do not accept the system. We have learned that having developers be aware of this will be an ongoing challenge in fielding a usable CW.
For a complex system such as a clinical workstation for use by physicians, assuming that requirements can simply be passed from customer to developer is not likely to result in a truly usable system. Instead, an iterative evolution in developer understanding and interpretation of user requirements is needed. In this project the evolution has been accomplished by having representatives of the customer and the developer work together in conducting iterative evaluations and providing feedback to both organizations on how the design is progressing. This has required an unusual relationship between the customer and the developer and much patience throughout the project by both parties.
For such an approach to work, there must be some level of trust between the customer and the developer [2]. Developers must come to view the gradual development of an understanding of the real requirements as something other than customers not knowing what they want. Customers must realize that changes in designs brought about by a reconsideration of specifications take development resources. As long as this can be carried forward from the solid foundation that we have found contextually-derived requirements can provide, both parties can be assured that they are acting to provide a system that will ultimately prove to meet user demands for quality.
We have been fortunate to have usability specialists available in both the developer and customer organizations to work on this. In cases where the specialists come from only one of the organizations, it is generally difficult to expect both sides to view the results of usability work as "unbiased" -- no matter how closely good usability practice is followed. We believe that this is a coming benefit of the spread of usability expertise -- that teams of specialists with somewhat different perspectives and experiences can find ways to work together in partnership to develop truly useful systems.
Has the project been successful? Specifically, has attention to the user requirements played a role in improving the usability of the system? In any complex development activity it is difficult to make specific success or failure attributions. For the current project we can say that to date users are quite satisfied with the CW they have tested. This is in contrast to an earlier prototype developed to meet the same requirements, which received very poor user evaluations (the current effort replaced that earlier failed design). We feel confident that this success is due to the ongoing commitment to the needed result, which led to the level of communication required to respond effectively to the real user requirements during development.
2. Bennett, J. L., & Karat, J. Facilitating effective HCI design meetings. In B. Adelson, S. Dumais, and J. Olson (Eds.) Human Factors in Computing Systems: CHI'94 Conference Proceedings (1994), ACM, New York, 198-204.
3. Carroll, J. M. Scenario-based design. Wiley, New York, 1995.
4. Coble, J. M., Maffitt, J. S., Orland, M. J., and Kahn, M. G. Contextual Inquiry: Discovering physicians' true needs. In Gardner, R. (Ed.) Proceedings of the Nineteenth Annual Symposium on Computer Applications in Medical Care (1995), Hanley & Belfus, Philadelphia, PA, 469-473.
5. Curtis, B., Krasner, H., & Iscoe, N. A field study of the software design process for large systems. Communications of the ACM, 31, 11 (1988), 1268-1287.
6. Danis, C., & Karat, J. Technology-driven design of speech recognition systems. In G. Olson and S. Schuon (Eds.) DIS'95: Symposium on designing interactive systems. (1995), ACM, New York, 17-24.
7. Floyd, C., Mehl, W., Reisin, F., Schmidt, G., & Wolf, G. Out of Scandinavia: Alternative approaches to software design and systems development. Human-Computer Interaction, 4 (1989), 253-350.
8. Fritz, K., and Kahn, M. Project Spectrum: An eighteen hospital enterprise-wide electronic medical record. In C. P. Waegemann (Ed.) Proceedings, Toward an Electronic Patient Record '95, Volume 2 (1995), Medical Records Institute, Newton, MA, 59-67.
9. Holtzblatt, K., & Jones, S. Contextual inquiry: A participatory technique for system design. In A. Namioka and D. Schuler (Eds.) Participatory design: Principles and practice. Erlbaum, Hillsdale, NJ, 1993.
10. Kehoe, R., & Jarvis, A. ISO 9000-3: A tool for software product and process improvement. Springer, New York, 1996.
11. Kyng, M. Making Representations Work. Communications of the ACM, 38, 9 (1995), 46-55.
12. Whiteside, J., Bennett, J., & Holtzblatt, K. Usability Engineering: Our experience and evolution. In M. Helander (Ed.) Handbook of human-computer interaction. (1988), Elsevier, Amsterdam, 791-817.
13. Wixon, D., Ramey, J., (eds.). Field Methods Casebook for Software Design. J. Wiley, New York, 1996.