Research Needs for Human Factors

Research Note 83-07

Committee on Human Factors
Commission on Behavioral and Social Sciences and Education
National Research Council

January 1983

Approved for public release; distribution unlimited.

This report, as submitted by the contractor, has been cleared for release to the Defense Technical Information Center (DTIC) to comply with regulatory requirements. It has been given no primary distribution other than to DTIC and will be available only through DTIC or other reference services such as the National Technical Information Service (NTIS). The views, opinions, and/or findings contained in this report are those of the author(s) and should not be construed as an official Department of the Army position, policy, or decision, unless so designated by other official documentation.

REPORT DOCUMENTATION PAGE

Report number: Research Note 83-07
Title: Research Needs for Human Factors
Author: Committee on Human Factors
Contract or grant number: N00014-81-C-0017
Performing organization: National Research Council, 2101 Constitution Avenue, N.W., Washington, D.C. 20418
Program element, project, task area, and work unit number: 2Q16110274F
Controlling office: Engineering Psychology Programs, Office of Naval Research, Arlington, Virginia 22217
Report date: January 1983
Number of pages: 243
Security classification: Unclassified
Distribution statement: Approved for public release; distribution unlimited
Key words: Human Factors Research; Expert Judgement; Human Engineering Research; Human Computer Interaction; Engineering Psychology Research; Supervisory Control Systems; Human Decision Making; Population Group Differences; Human Factors Methodology

Abstract: This report describes basic research needed to improve the scientific basis of applied human factors work. Six topical areas are covered: human decision making; eliciting information from experts; user-computer interaction; supervisory control systems; population group differences; and applied methods.

NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

This report has been reviewed by a group other than the authors according to procedures approved by a Report Review Committee consisting of members of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine.

The National Research Council was established by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and of advising the federal government.
The Council operates in accordance with general policies determined by the Academy under the authority of its congressional charter of 1863, which establishes the Academy as a private, nonprofit, self-governing membership corporation. The Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in the conduct of their services to the government, the public, and the scientific and engineering communities. It is administered jointly by both Academies and the Institute of Medicine. The National Academy of Engineering and the Institute of Medicine were established in 1964 and 1970, respectively, under the charter of the National Academy of Sciences.

The Committee on Human Factors is sponsored jointly by the Air Force Office of Scientific Research, the Army Research Institute for the Behavioral and Social Sciences, the Office of Naval Research, and the National Aeronautics and Space Administration.

This work relates to the Department of the Navy contract N00014-81-C-0017 issued by the Office of Naval Research, and no official endorsement should be inferred.

The United States government has at least a royalty-free, nonexclusive and irrevocable license throughout the world for government purposes to publish, translate, reproduce, deliver, perform, dispose of, and to authorize others so to do, all or any portion of this work.

COMMITTEE ON HUMAN FACTORS

Richard W. Pew (Chair), Information Sciences Division, Bolt Beranek & Newman, Cambridge, Mass.
Nancy S. Anderson, Department of Psychology, University of Maryland
Alphonse Chapanis, Department of Psychology, Johns Hopkins University
Baruch Fischhoff, Decision Research, a Branch of Perceptronics Inc., Eugene, Oregon
Irwin L. Goldstein, Department of Psychology, University of Maryland
K. H. Eberhard Kroemer, Industrial Engineering and Operations Research Department, Virginia Polytechnic Institute and State University
Herschel W. Leibowitz, Department of Psychology, Pennsylvania State University
J. C. R. Licklider, Department of Computer Science, Massachusetts Institute of Technology
Charles B. Perrow, Institution for Social and Policy Studies, Yale University (resigned October 1982)
Michael Posner, Department of Psychology, University of Oregon (resigned April 1982)
Thomas B. Sheridan, Department of Mechanical Engineering, Massachusetts Institute of Technology
Jerome E. Singer, Department of Medical Psychology, Uniformed Services University of the Health Sciences
J. E. Keith Smith, Human Performance Center, University of Michigan (resigned October 1982)

Robert T. Hennessy, Study Director
Karen English, Administrative Secretary
Jeanne Richards, Administrative Secretary

PREFACE

The Committee on Human Factors was established in October 1980 under the joint sponsorship of the Office of Naval Research (ONR), the Air Force Office of Scientific Research (AFOSR), and the Army Research Institute for the Behavioral and Social Sciences (ARI) to identify basic research needs of the military services in support of human factors engineering applications and to make recommendations for basic research that will improve the foundations of this discipline. The committee's first meeting was held in December 1980; in October 1981 the National Aeronautics and Space Administration (NASA) joined the sponsors of the committee; and several other government agencies have expressed interest in the committee's work.

Human factors issues arise in every domain in which humans interact with the products of a technological society. Consequently, the knowledge brought to bear in human factors applications must be drawn from a wide range of scientific and engineering disciplines.
Although no small group can be fully representative of all disciplines relevant to human factors, the expertise represented on the committee is quite broad. It includes specialists from the fields of engineering, biomechanics, psychology, cognitive science, and sociology as well as from human factors engineering. While other disciplines may be relevant, it is these that are expected to contribute most substantially to the basic data, theory, and methods needed to improve the scientific basis of human factors.

I wish to thank each member of the committee for their thoughtful contributions to this report. Individual members or small groups of members accepted primary responsibility for authoring each chapter. This authorship is acknowledged in the note at the beginning of each chapter. All committee members, whether they were authors or not, deliberated, reviewed, and contributed to improvements in the content of each chapter. I am especially grateful to them for their generous contribution of time, both in meetings and outside. Their efforts have contributed greatly to the quality of this report, which is a product of the full committee.

Special thanks are due to the study director, Robert T. Hennessy, who contributed both technically and administratively to every step in the report's development. In addition, he has taken the kind of initiatives that made it possible for me to chair the committee with minimum effort and maximum reward.

Martin A. Tolcott and Gerald S. Malecki of the Office of Naval Research, Alfred R. Fregly of the Air Force Office of Scientific Research, Robert M. Sasmor of the Army Research Institute, and Melvin D. Montemerlo of the National Aeronautics and Space Administration, representatives of the committee's sponsors, have also made important contributions.
Their support, encouragement, and identification of relevant issues have been most helpful.

I am grateful also to the participants in our workshop on applied methods: Stuart K. Card, David Meister, Donald L. Parks, Erich P. Prien, and John B. Shafer. Their broad understanding of applied methods and their cogent appraisal of the issues and needs in this area formed the basis for Chapter 7 of this report.

Several people were helpful to the committee in specific ways. At Wright-Patterson Air Force Base, Kenneth R. Boff organized a series of briefings by personnel from the Air Force Aerospace Medical Research Laboratory and the Air Force Human Resources Laboratory as well as tours of several of their research facilities. During the committee's visit to the Naval Training Equipment Center, Walter S. Chambers and Stanley C. Collyer arranged for presentations by members of the Human Factors Laboratory and briefed the committee on the research uses of the visual technology research simulator as well as demonstrating this device. I extend my appreciation to these individuals and organizations for their efforts on the committee's behalf.

Many other individuals also have contributed to the work of the committee and thereby to the contents of this report. A number of human factors professionals provided thoughtful and detailed responses to a survey on research issues. Others served as outside reviewers of particular chapters. Karen A. English and H. Jeanne Richards have served ably and conscientiously as administrative secretaries over the course of the committee's history. Christine L. McShane, editor for the Commission on Behavioral and Social Sciences and Education, through skill and perseverance greatly improved the style and clarity of this report.
To all these individuals I express my sincere thanks for their significant contributions.

The committee's work is ongoing. This is the first in what is expected to be a continuing series of reports on issues in human factors research. I invite the reader's comments and reactions to this and future reports.

Richard W. Pew, Chair
Committee on Human Factors

CONTENTS

1. Introduction and Overview
2. Human Decision Making
3. Eliciting Expert Judgment
4. Supervisory Control Systems
5. User-Computer Interaction
6. Population Group Differences
7. Applied Methods

INTRODUCTION AND OVERVIEW

The principal author of this chapter is Richard W. Pew.

In the last several years the public has become sensitized to the importance of equipment designed to accommodate its human users. In the course of events at the Three Mile Island nuclear power plant, many residents of Harrisburg were evacuated because of the accident precipitated by operators misinterpreting their instruments. Coal miners cover equipment lamps intended to illuminate the mine wall, because they object to the glare in their faces. The M-1, the most technologically sophisticated battle tank ever produced, is limited by the operating difficulties experienced by its crew. With computer terminals now pervasive in the workplace, more users are voicing their complaints about requirements to converse in arcane dialects of computer languages.

Each of these examples reflects a failure to consider the design of a system from the point of view of its potential users; thus it is not surprising that the public is demanding that more attention be paid to such considerations. These demands may be expressed in the decisions of jurors in court cases involving product liability, in the renewed emphasis on human factors in military and aeronautics laboratories, and in the increase in job opportunities for human factors professionals in the computer industry. In March 1982, over 1,000 people participated in a conference devoted to discussing how to make computers more user-oriented.

The historical roots of the human factors profession are in industrial engineering and in psychology. In the early 1900s Frederick W. Taylor coined the term scientific management, by which he meant the application of scientific principles in the design of the industrial workplace. Although overzealous "Taylorism" resulted in some early mismanagement, his work formed one of the building blocks for modern industrial engineering and operations research.

During the latter stages of World War II, psychologists, who had been involved in the selection and training of aircraft pilots, were called on to take a novel perspective. Instead of selecting pilots to meet the severe demands of the cockpit, they were asked to select the cockpit design best suited to the characteristics of pilots. This approach reduced accidents and allowed a larger population of potential pilots to be certified. Because flying pushes the human body to its physiological limits, the effects of physiological stress on performance became a further consideration.

After the war a small group of universities began training human factors specialists for research and development in the military services and the aerospace industry. In 1957 the Human Factors Society was formed with 90 founding members; by 1977 the membership had grown to 1,956; and in the last five years the organization has expanded by an additional 50 percent. In addition, various engineering societies have formed groups related to human factors.
The formation of this committee within the National Research Council in 1980 is the latest explicit recognition of the importance of human factors in today's technological society.

Human factors engineering can be defined as the application of scientific principles, methods, and data drawn from a variety of disciplines to the development of engineering systems in which people play a significant role. Successful application is measured by improved productivity, efficiency, safety, and acceptance of the resultant system design. The disciplines that may be applied to a particular problem include psychology, cognitive science, physiology, biomechanics, applied physical anthropology, and industrial and systems engineering. The systems range from the use of a simple tool by a consumer to multiperson sociotechnical systems. They typically include both technological and human components.

Human factors specialists from these and other disciplines are united by a singular perspective on the system design process: that design begins with an understanding of the user's role in overall system performance and that systems exist to serve their users, whether they are consumers, system operators, production workers, or maintenance crews. This user-oriented design philosophy acknowledges human variability as a design parameter. The resultant designs incorporate features that take advantage of unique human capabilities as well as build in safeguards to avoid or reduce the impact of unpredictable human error.

On the international scene this collection of activities has been called ergonomics, meaning the study of work. Its practitioners have placed somewhat more emphasis on biomechanics and the physiological costs of doing work than have human factors practitioners in the United States. Aside from this distinction, the two terms refer to the same collection of specialties.
While its foundations rest ultimately in the parent disciplines, human factors research focuses on the solution of system design problems involving more than one of these disciplines. Since World War II the major sources of funding for basic research underlying human factors work have been the National Aeronautics and Space Administration (NASA) and the military services. Since the passage of the Mansfield Amendment (Public Law 91-441, 1970) to the U.S. defense budget, which mandated a shift toward system development and away from basic research, the real dollar volume of research has not increased very much. What research there is has focused increasingly on short-term goals. As a result the basic knowledge needed to provide the underpinnings for human factors applications to new technology has not been generated. The need to reverse this trend is at least part of the reason that the military services and NASA have taken the initiative in sponsoring the work of this committee. This report reflects the committee's recommendations for needed research in terms of both long-term and short-term objectives.

This report does not attempt to cover the full scope of human factors engineering, even in relation to military and NASA needs. As the committee began discussing research needs, a wide range of possible topics was considered. Two of our meetings included tours and discussions of ongoing research in military laboratories. Committee members were encouraged to develop brief position papers on highlighted topics that were germane to their interests. The human factors community was surveyed through an article in the Bulletin of the Human Factors Society, and 116 responses were received; the survey results confirmed the importance of many of the topics already identified by the committee. Some topics were dropped, and some new papers were generated. Others were combined into coherent units; still others were deferred for further study or initiative.
The material in this report is the result of that process. Each chapter is designed to be a self-contained report of an important area in which research is needed. All the topics discussed here meet the following criteria: (1) each topic is germane to our military and NASA sponsors; (2) the topics are within the expertise of the committee; (3) each topic has been, in the opinion of the committee, incompletely addressed by previous or current military and civilian research efforts; and (4) the potential results of the recommended research will be important contributions to the scientific basis and practice of human factors.

The work of the committee is ongoing. In addition to the research areas presented in this report, work on a number of topics is in various stages of development: (1) organizational context in relation to design; (2) team performance; (3) simulation; (4) human performance modeling; (5) multicolor displays; (6) human factors education; and (7) accident reporting systems. We expect to address many of these as well as other topics in subsequent reports.

In the paragraphs that follow, the areas of research suggested by the committee are summarized together with some of our major recommendations. The chapters themselves provide a detailed elaboration of these topics.

HUMAN DECISION MAKING

A central issue in the understanding of human performance is human decision making. It has become even more important with the increased role of automation in complex modern systems ranging from military command, control, and communication systems to aircraft and process control systems.

There has been much support for research on decision making over the last 15 years, particularly by the Defense Advanced Research Projects Agency and the Office of Naval Research. This research has tended to focus on formal decision theoretic constructions, which, while analytically powerful, have proved to be insufficiently robust to reflect the strengths and weaknesses of human decision-making capabilities. The committee recommends further research, with an emphasis on moving into uncharted areas.

Surprisingly, despite the effort devoted to decision-making research in general, there is still a need for research on how to structure practical decision problems and on improving the realism of models that claim to relate to decision-making performance. We do not know how to represent decision situations that evolve dynamically, nor do we have a systematic framework from which to consider decision aiding.

Furthermore, we are coming to realize that many planning activities actually involve decision making that cannot be modeled by enumerating the possible states of the world and courses of action in a unitary decision matrix. Such decisions often evolve over time in bits and pieces with limited central direction. We need a deeper understanding of such diffuse decision processes in order to provide effective computer aids for this kind of decision.

While previous work has led to many decision-making aids and models, no criteria or methodologies have been suggested for evaluating their relative merits. Until such comparisons are made, practitioners will continue to advocate their own products without a basis for choice among them.

Finally, there is a persistent need for development of innovative ways of soliciting preference and relative value judgments from people, a problem that leads us directly to the second topic.

ELICITING EXPERT JUDGMENT

The application of expert judgment covers everything from medical evaluations to accident investigations. Although the subject matter ranges widely, it is our belief that there are generic, substantive research issues that should be addressed in a coherent program.
These problems recur in diverse contexts for which elicitation methods either do not exist or are inadequately standardized across applications to yield consistent results. The research issues include (1) creating a common frame of reference from which to assess judgments among a group of experts; (2) formulating questions for experts in a way that is compatible with their mental structures or cognitive representations of a problem; (3) eliciting judgments about the quality of information; (4) detecting and identifying reporting bias in judgments; and (5) minimizing the effects of memory loss and distortion on the reporting of past events.

SUPERVISORY CONTROL SYSTEMS

Supervisory control is a relatively new conceptualization of system function that is playing an increasingly important role in automated systems. In such systems, operators supervise the semiautomatic control of a dynamic process, such as a chemical plant or railway system. Typically the operators work in teams and control computers, which in turn mediate information flow among various automatic components. Other examples of supervisory control systems are modern aircraft, medical intensive care units, power plants, and distributed command and control systems such as may be found in military operations or in manufacturing by robots. Such systems deemphasize the importance of human sensory and motor capabilities and emphasize complex perceptual and cognitive skills. This perspective is relatively new to practicing system designers; work is beginning to be sponsored in these areas, but much further development is needed.

Supervisory control may be thought of as a generalization from earlier work on monitoring and controlling complex systems; in that sense the foundations for modeling and theory are established. The theory must be greatly elaborated and extended, however, to meet the analysis requirements of current and future systems.
As the human skills of thinking, reasoning, planning, and decision making become key, the models must be able to accommodate these human capacities and limitations. This is a choice opportunity to bring together work on control theory models and cognitive science representations.

Cognitive psychology is also advancing our understanding of the way in which resources are shared among various processes within the brain. This work has unexplored implications for understanding how to modify system design to change perceived workload, particularly in the complex tasks typical of supervisory control. Each of the military services has research programs focused on human workload analysis. In our opinion many of them are too application-oriented; they need a stronger focus on research to advance the knowledge base from which new application techniques will emerge.

Another key concern in supervisory control is the prediction and control of human error. Our understanding of this topic is in its infancy. We have no general theory of human error, although theories abound for human response time. Human reliability analysis has been in vogue for several years, but, as currently practiced, it simply uses the numerical aggregation of historical data on recorded human failure rates. It is weakest in just the situations in which it is most needed: when the activity involves complex diagnosis, situation assessment, and interaction with computers.

At the level of design, there are three major questions: how to design supervisory control tasks to accommodate human capabilities and limitations; how to organize and display the information needed to carry out these tasks; and how much control to delegate to the human versus the automatic parts of the system.
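The numerical aggregation that human reliability analysis is said to rely on in this section can be pictured with a small sketch. It is an illustration only, not a method taken from this report: the step names and counts are hypothetical, and the sketch assumes that each step fails independently of the others, which is exactly the kind of assumption that tends to break down in complex diagnosis and situation assessment.

```python
from math import prod


def error_rate(errors, opportunities):
    """Estimate a per-step error probability from recorded counts."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= errors <= opportunities:
        raise ValueError("errors must lie between 0 and opportunities")
    return errors / opportunities


def task_success_probability(step_error_probs):
    """Probability that every step succeeds, ASSUMING independent steps."""
    for p in step_error_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
    return prod(1.0 - p for p in step_error_probs)


if __name__ == "__main__":
    # Hypothetical historical records: step -> (errors, opportunities).
    records = {
        "read gauge": (3, 1000),
        "select valve": (1, 500),
        "log action": (10, 1000),
    }
    rates = [error_rate(e, n) for e, n in records.values()]
    print(f"Estimated task success: {task_success_probability(rates):.4f}")
```

The sketch makes the limitation concrete: the estimate depends entirely on recorded counts for discrete, separable steps, so it has nothing to say about activities for which no such counts exist.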
USER-COMPUTER INTERACTION

Since computers are already playing a major role in most new system developments, including supervisory control systems, the issue of facilitating their learning and use by both computer professionals and novices has been accorded a chapter of its own. At a March 1982 conference on user-computer interaction, more than 100 papers addressed a variety of topics related to hardware and software design. More than half of the 1,000 participants were system design specialists from industry and government. The committee believes that this level of interest foretells a heavy demand for scientific knowledge that has yet to be created. Although a number of industrial laboratories are supporting proprietary work, there is only one major funded collaborative effort between computer science and human factors specialists, that at Virginia Polytechnic Institute and State University (funded by the Office of Naval Research).

Most human factors research has been done in the area of computer hardware. Information is available on which to base design decisions concerning information display hardware and keyboards. Many alternative input devices, such as joy sticks, track balls, and light pens, have been studied in the context of specific applications. There is a need for further work on input devices that focuses on comparison among the full range of devices across a broad set of uses, including instruction, text processing, and graphics.

Automatic speech recognition and production have attracted much interest as the technology improves. Speech as an alternative to manual and visual modes of input and output needs systematic investigation. Fundamental work is necessary on the design of interactive speech dialogs that involve inherently sequential communication and potentially heavy memory demands on the listener.
As computer terminals are becoming pervasive in the white-collar workplace, concern is growing about the adverse effects on people from long-term use of terminals with cathode ray tubes (CRTs). A recent study by the National Institute for Occupational Safety and Health found no radiation hazards from CRTs but did find a substantial increase in worker complaints of fatigue and other health problems from sustained daily use. This study was not able to distinguish CRT design-based complaints from those relating to the task or other features of the workplace, and making that distinction is an urgent research need. In Europe, governments are now mandating standards for workplace designs. It will not be long before similar actions are taken in the United States, and the research must begin now to anticipate them.

In the area of software design, research needs are only beginning to be filled. Effective design of sophisticated software implies understanding of human knowledge systems and the ability to represent not only what a user knows but also how a user makes inferences from that information. There is a need for models of users' understanding of the system with which they are interacting, a problem that is important for supervisory control applications as well.

Perhaps the most neglected research area in computer system development is how to produce effective materials and reference information. While design principles developed for printed materials are useful for computer system documentation, there are documentation opportunities unique to interactive systems that we do not yet understand how to exploit effectively.

Finally, there is a need to understand in more detail the characteristics of the user population that make a difference in computer system design. We need research that suggests, in parametric terms, how changes in user characteristics should be reflected in system design changes.
The committee regards user-computer interaction as one of the most urgent topics on which to undertake research initiatives.

POPULATION GROUP DIFFERENCES

Through public sentiment as well as government legislation, our society has mandated the elimination of discrimination among population groups in the design of jobs and workplaces. In addition to racial discrimination, there is growing concern about discrimination on the basis of sex, age, and disability. We lack the research necessary to describe the nature and extent of performance differences among the various population groups about which discrimination is a concern. The committee believes it is in the national interest to undertake the research necessary to accommodate this relationship between population group differences and design.

It is not enough to consider population group differences per se. In some cases the effect of a group characteristic such as age on performance may depend on the value of some other variables, such as amount of training or level of interpersonal skills. It may be misleading to discover simply that performance deteriorates with age, when in fact training or experience may reverse that trend. Such interactions remain largely unexplored.

There is also a need to understand the way in which these differences in performance should influence workplace design or training procedures. We know how to write equipment specifications designed to fit 95 percent of a particular user population insofar as body dimensions are concerned, but for most other human performance characteristics we lack this knowledge.

APPLIED METHODS

Much human factors work is performed under constraints of money, time, and opportunity that preclude the use of the kind of experimental methods used in laboratory research.
From necessity, human factors practitioners have adopted or developed a variety of applied methods for acquiring or organizing information related to human characteristics that arise in the context of system design, development, and evaluation. Examples of these methods are task analysis, information flow analysis, collection and analysis of survey data, evaluation of physical mock-ups, and the structured walk-through.

In contrast to the methods of scientific research, which are maintained and disseminated in university curricula and textbooks, and by specialists who devote careers to improving and inventing experimental design procedures, applied methods in human factors work are described only briefly in technical project reports, which are difficult to access, and efforts to improve or invent methods occur largely in connection with a particular project.

There is a clear need to develop a compendium of standard descriptions of the most important applied methods. This compendium would be valuable for use in human factors curricula in colleges and universities and for continuing education tutorials for human factors practitioners. Currently most knowledge of applied methods is gained through on-the-job experience.

Documenting existing applied methods, however, will not fulfill the methodological needs for all current and future system design purposes. Advances in computer technology applied to automation, supervisory control systems, and computer systems themselves all have profound methodological implications for the analysis and description of the roles people play in these systems. Existing methods such as workload analysis, protocol analysis, and function allocation require research to modify and extend their use in new applications in which the emphasis is on cognitive functions of operators rather than on the perceptual-motor functions prominent in old systems.
Similarly, there is a need to develop new methods to provide information of the type and form necessary to resolve such issues as translating task requirements into personnel selection criteria, deriving training requirements from functional requirements, and describing or evaluating the effects of task or system functions on the affective responses of personnel.

All the basic research needs addressed in this report require experimental investigations to provide the theory, principles, and data to support human factors work in the design and evaluation of systems. The application of the knowledge derived from basic research, however, will occur largely through the use of applied methods. Documentation of existing methods and research to extend and initiate methods to meet future needs are as essential as the substantive research to improve both the scientific basis and the practical effectiveness of human factors work.

CONCLUSION

System design and the world of work are undergoing profound changes. In a period when automation is replacing the need for finely tuned perceptual-motor activities by skilled operators, human productivity is no longer easily assessed in terms of unit output. New systems place increased demands on the cognitive and decision-making aspects of human performance. The role of people in systems is shifting to those of monitoring and directing otherwise automatic processes in industrial production, transportation, military operations, and office work. These changes in human-machine relations both offer new opportunities and present new problems for system design. It is therefore timely and appropriate that the committee's first report of research needs in human factors emphasizes the importance of understanding fundamental cognitive processes and their role in interactive and supervisory control systems.
HUMAN DECISION MAKING

(The principal author of this chapter is Baruch Fischhoff.)

Work organizations, and those who staff them, rise and fall by their ability to make decisions. These may be major strategic decisions, such as the deployment of forces or inventories, or local tactical decisions, such as how to promote, motivate, and understand particular subordinates. To list the kinds of decisions that need to be made and the stakes that sometimes ride on them would be to repeat the obvious.

Decisions are made explicitly whenever one consciously combines beliefs and values in order to choose a course of action. They are made implicitly whenever one relies on a ritualized response (habit, tradition) to cope with a choice between options. Repetition of past decisions may result in suboptimal choices; however, it may also provide a ready escape from the difficulties and expense of explicit decision making.

The reasons decision making often seems (and is) so difficult are quite varied, as are the opportunities for interventions and the needs for human factors research to buttress those interventions. One problem is information overload: More things need to be considered than can be held and manipulated in one's head simultaneously. Coping with such computational problems is an ideal task for computers, and there are a variety of software packages available that in one way or another combine decision makers' beliefs and values in order to produce a recommendation.

Choosing between and using these decision aids forces one to face a second inherent difficulty of decision making: not knowing how to define (or structure) the decision problem and to assess one's own values, that is, how to make trade-offs between competing objectives.
Because analytic decision-making methods cannot operate without guidance on these issues, judgment is an inevitable part of the decision-making process, as is the need for judgment elicitation methods to complement the decision aid (see Chapter 3).

A third difficulty is knowing when to stop analyzing and start acting. Taking that step requires one to assess the quality of the decision-making process and reconcile any remaining conflicts between the recommendation it produces and that produced by one's own intuitions. To help one through this step, a decision aid must reveal its own limits in ways that are psychologically meaningful.

A fourth difficulty is that in many interesting decisions one knows too little to act confidently. When uncertainty is a fact of life, the role of good design is to ensure that the best use is made of all that is known.

The existence of these four problems is common knowledge. Their resolution is complicated by a fifth difficulty whose identification requires research: People's commonsense judgments are subject to robust and systematic biases. These biases make it difficult to rely on intuition as a criterion for the adequacy of decisions and the methods that produce them. Decision aids must accommodate these biases and may require supplementary training exercises lest their recommendations be adopted only when they affirm intuitions that are known to be faulty.

Given the multitude of decisions that are made, any research or design effort that made even a minute contribution to the quality of a minute proportion of all decisions would bear a large benefit in absolute terms. Proving that such a benefit had been derived would be as difficult as it is in most areas of human factors work. Whenever uncertainty is involved, better decisions will produce better outcomes only over the long run.
That makes it difficult to establish the validity of bona fide improvements and easy to fall prey to highly touted methods with good face validity, but little else. A sound research base is needed not only to develop better decision-making methods, but also to give users a fighting chance at being able to identify which methods are indeed better for their purposes.

BACKGROUND

Ad hoc advice to decision makers can be traced from antiquity to the Sunday supplements. Scientific study of decision making probably begins with the development of statistical or Bayesian decision theory by Borel, Ramsey, de Finetti, von Neumann, Morgenstern, Venn, Wald, and others. They showed how to characterize and interrelate the primitives of a general model of decision-making situations, highlighting its subjective elements. The development of scientific decision aids could be traced in the work of Edwards, Raiffa, Schlaifer, and others, who showed how complex real-world decision situations could be interpreted in terms of the general model.

Essential to this model is the notion that decision-making problems can be decomposed into components that can be assessed individually, then combined into a general recommendation that reflects the decision makers' best interest. Those components are typically described as options, beliefs, and values or alternatives, opinions, and preferences, or some equivalent triplet of terms. They are interrelated by an integration scheme called a decision rule or problem structure (e.g., Fischhoff, et al., 1981; Sage, 1981).

More generally, decision-making models typically envision four interrelated steps.

1. Identify all relevant courses of action among which the decision maker may choose. This choice among options (or alternatives) constitutes the act of decision; the deliberations that precede it are considered to be part of the decision-making process.

2.
Identify the consequences (advantages) that may arise as a result of choosing each option; assess their relative attractiveness. In this act the decision maker's values find their expression. Although these values are essentially personal, they may be clarified by techniques such as multiattribute utility analysis and informed by economic techniques that attempt to establish the market value of consequences.

3. Assess the likelihood of these consequences' being realized. These probabilities may be elicited by straightforward judgmental methods or with the aid of more sophisticated techniques, such as fault tree and event tree analysis. If the decision maker knows exactly what will happen given each course of action, it then becomes a case of decision making under conditions of certainty and this stage drops out.

4. Integrate all these considerations in order to identify what appears to be the best option. Making the best of what is or could be known at the time of the decision is the hallmark of good decision making. The decision maker is not to be held responsible if this action meets with misfortune and an undesired outcome is obtained.

These steps are both demanding and vague. Fulfilling them requires considerable attention to detail and may be accomplished in a variety of ways. Moreover, they may not even be followed sequentially, if insights gained at one step lead the decision maker to revise the analysis performed at a different step. This flexibility has produced a variety of models and methods of decision making whose interrelations are not always clearly specified.

The opportunity for routinizing and merchandising these decision-making procedures led to one of the academic and consulting growth industries of the 1970s. A wide variety of software packages and firms can now bring the fruits of these theoretical advances to practicing decision makers.
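The four steps described above can be sketched in code. The following is a minimal illustration only, not a description of any particular decision-analysis package. It assumes that the integration in step 4 is done by maximizing expected utility, which is one common decision rule; every name and number in it (Consequence, expected_utility, best_option, the two options and their probabilities and utilities) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Consequence:
    description: str
    probability: float  # step 3: judged likelihood, given the option (hypothetical)
    utility: float      # step 2: judged attractiveness on an arbitrary scale (hypothetical)


def expected_utility(consequences):
    """Probability-weighted sum of utilities for one option."""
    total_p = sum(c.probability for c in consequences)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities for an option must sum to 1")
    return sum(c.probability * c.utility for c in consequences)


def best_option(options):
    """Step 4: score every option and return the highest-scoring one."""
    scores = {name: expected_utility(cs) for name, cs in options.items()}
    return max(scores, key=scores.get), scores


# Step 1: the options under consideration (illustrative values only).
options = {
    "operate": [
        Consequence("full recovery", 0.8, 100.0),
        Consequence("no recovery", 0.2, 0.0),
    ],
    "wait": [
        Consequence("condition stable", 0.5, 70.0),
        Consequence("condition worsens", 0.5, 30.0),
    ],
}

choice, scores = best_option(options)
print(choice, scores)
```

Changing a single probability or utility and rerunning the sketch gives a rough sense of how strongly the recommendation depends on the judgmental inputs supplied in steps 2 and 3.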
Decision analysis, the most common name for these procedures, is part of the curriculum of most business schools. Although it has met considerable initial resistance from decision makers because of its novelty and because of the explicitness about values and beliefs that it requires, decision analysis seems to be gaining considerable acceptance (e.g., Bonczek, et al., 1981; Brown, et al., 1974; Raiffa, 1968). This acceptance seems, even now, to go beyond what could be justified on the basis of any empirical evidence of its efficacy. Figure 2-1 gives some examples of the contexts within which decision-aiding schemes relying on interactive computer systems have been operating and have been reported in the professional literature. Figure 2-2 is similar to the summary printout of one such scheme, which offers physicians on-line diagnoses of the causes of dyspepsia.

Accounting--helping to assess the financial viability of corporations.
Clinical diagnosis--helping physicians to decide whether to perform diagnostic procedures and how to interpret their results.
Counseling--helping people to choose careers or consider having children.
Energy--choosing where to site energy-producing facilities.
Meteorology--derivation of precipitation forecasts.
Military--deciding whether troops are in an adequate state of readiness; preplanning responses.
Petroleum geology--allocation of resources for oil exploration.
Pharmaceutics--helping in monitoring field reports in order to decide whether drugs need to be recalled.
Research and development--deciding how to allocate funds.

FIGURE 2-1 Examples of Operating Decision-Aiding Systems

ROTHERHAM AREA HEALTH AUTHORITY     UNIT NO. 1 456/89
MONTAGU HOSPITAL                    SURNAME: Smith
SYMPTOM PROCESSING PROJECT          FIRST NAMES: John
HISTORY SHEET                       CLINICIAN: Dr. Gardner

SYMPTOMS INPUT TO COMPUTER
Male                                Relief antacids
Age 60-69                           Night pain pres.
Site epigastric                     Nausea present
Radiation none                      Vomiting present
Duration 7m-1yr                     Meals: pain immed
Pattern episodic                    Haematemesis abs
Pain is moderate                    No indigestion
Progress worse                      Bowels OK
Aggd by food                        Micturition OK

COMPUTER PROBABILITIES BASED ON THESE SYMPTOMS (scale 0 to 100)
FUNCTIONAL          22
CHOLECYSTITIS        0
DUODENAL ULCER       2
GASTRIC ULCER       76
CA. STOMACH          0
None of these

If you judge any of the above probabilities to be in error please adjust them accordingly.
PROVISIONAL DIAGNOSIS if appropriate is--
Level of confidence in this diagnosis: very tentative 1 2 3 4 5 certain
The highest probability has been assigned to GASTRIC ULCER. If this or any other probability is not in accordance with your own judgement, please indicate reasons for your conclusions.

FIGURE 2-2 Summary Printout of a Medical Decision-Aiding Scheme
Source: D. C. Barber and J. Fox (1981).

Behavioral decision theory (e.g., Einhorn and Hogarth, 1981; Slovic, et al., 1979; Wallsten, 1980) has taken decision aiding out of the realm of mathematics and merchandising into the realm of behavioral research by recognizing the role of judgment in structuring problems and in eliciting their components. Researchers in this field have studied, in varying degrees of detail, the psychological processes underlying these judgments and the ways in which they can be improved through training, task restructuring, and decision-aid design. A particular focus has been on the identification and eradication of judgmental biases. The research described below is that which seems to be needed to help behavioral decision research fulfill this role.
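A printout such as Figure 2-2 reports a probability for each candidate diagnosis given the symptoms entered. The source does not say how those probabilities were computed. The sketch below assumes one simple approach, Bayes' rule with symptoms treated as conditionally independent given the diagnosis. The function name, the two diagnoses, and all numbers are hypothetical and were chosen only to make the arithmetic easy to follow; they are not taken from the figure.

```python
def posterior(priors, likelihoods, observed):
    """Return P(diagnosis | observed symptoms) under a naive Bayes assumption.

    priors:      {diagnosis: P(diagnosis)}
    likelihoods: {diagnosis: {symptom: P(symptom | diagnosis)}}
    observed:    list of symptoms reported as present
    """
    unnormalized = {}
    for diagnosis, prior in priors.items():
        p = prior
        for symptom in observed:
            p *= likelihoods[diagnosis][symptom]
        unnormalized[diagnosis] = p
    total = sum(unnormalized.values())
    # Normalize so the probabilities over all diagnoses sum to 1.
    return {d: p / total for d, p in unnormalized.items()}


# Hypothetical inputs, for illustration only.
priors = {"gastric ulcer": 0.25, "functional": 0.75}
likelihoods = {
    "gastric ulcer": {"pain after meals": 0.8, "vomiting": 0.6},
    "functional": {"pain after meals": 0.2, "vomiting": 0.2},
}

result = posterior(priors, likelihoods, ["pain after meals", "vomiting"])
print(result)
```

In this illustration the less common diagnosis ends up with the higher probability because both observed symptoms are much more likely under it. That is the kind of computation people find hard to do intuitively, and the clinician in Figure 2-2 is still invited to adjust the result.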
An important development in this research over the last decade has been its liberation from the mechanistic models of behavior inherited from economics and philosophy. The result has been more process-oriented theories, attempting to capture how people do make and would like to make decisions (e.g., Svenson, 1979). This change was prompted in part by the realization that mechanistic models offer little insight into central questions of applications, such as how action options are generated and when people are satisfied with the quality of their decisions. These developments are reflected in the research described below.

There may seem to be a natural enmity between those purveying techniques of decision analysis and those studying their behavioral underpinnings, with the latter revealing the limits of the procedures that the former are trying to sell. In general, however, there has been rather good cooperation between the two camps. Basic researchers have often chosen to study the problems that practitioners find most troublesome, and practitioners have often adopted basic researchers' suggestions for how to improve their craft. For example, in both commercial and government use, one can find software packages and decision-making procedures that have been redesigned in response to basic research. Established channels (e.g., conferences, paper distribution lists) exist for members of this community to communicate with one another. Many of the leading practitioners have doctoral-level training, usually in psychology, management science, operations research, or systems engineering, and maintain academic contacts. Indeed, the quantity of basic research has been reduced by the diversion of potential researchers to applied work, although its quality may have benefited from being better focused. Although problems remain, research in this area has a fairly good chance of being useful and of being used.
In addition, none of the research issues discussed in the following sections appears to pose any serious methodological difficulties. The conventional experimental methods of the behavioral sciences are suitable for performing the recommended investigations.

RESEARCH ON DECISION MAKING

Given the relatively good communication between decision-making researchers and practitioners, the primary focus of the recommendations that follow is the production of new research, as opposed to its dissemination. It seems reasonable to hope that the same communication networks that brought these applied problems to the attention of academics will carry their partial solutions back to the field. Research on decision making per se assumes that there are general lessons to be learned from studying the sorts of issues that recur in many decision problems and the responses typically made to them. In fact, the complexity of real decision problems is often so great as to prevent some lessons from being learned from direct study.

These recommendations are cast in terms of research needed to improve the use of computerized decision aids, referred to generically as decision analysis. These aids work in an interactive fashion, asking people to provide critical inputs (e.g., the set of actions that they are considering, the probability of those actions achieving various goals), combining those inputs into a recommendation of what action to take, and repeating the process until users feel that they have exhausted its possibilities. In order to be useful, an aid must: (a) deal with those aspects of decision making for which people require assistance, (b) ask for inputs in a language compatible with how people think intuitively about decision making, and (c) display its recommendations in a way that properly captures their implications and definitiveness.
Achieving these goals requires understanding of (a) how people assess the quality of human performance in decision-making tasks, (b) the nature of decision-making processes, and (c) how people assess the quality of decision-making processes, both those they perform and those performed for them.

The research described below is intended to contribute to all three of these aspects of systems design. It is also intended to facilitate the development of supplementary components of decision-support systems, such as exercises for improving judgment or for more creative option generation. In this light, research that contributes to hardware or software design should also be a useful adjunct to any formal or semiformal decision-making process in which judgment plays a role. Even the devotee of decision analysis often lacks the time or resources to do anything but an informal analysis.

Decision Structuring

Decision making is commonly characterized as involving the four interrelated steps described earlier. The first three of these give the problem its structure, by specifying the options, facts, and value issues to be considered as well as their interrelations. Prescriptive models of decision making elaborate on the way these steps should be taken. Most descriptive theories hypothesize some deviation of people's practice from a prescriptive model (Fischhoff, Goitein, and Shapira, 1981). These deviations should, in principle, guide the development of the prescriptive model. That is, they show how the prescriptive models fail to consider issues that people want to incorporate in their decisions. In practice, however, the flow of information is typically asymmetrical, with prescriptive models disproportionately setting the tone for descriptive research.

As a result, decision structuring is probably the least developed aspect of research into both prescriptive and descriptive aspects of decision making (von Winterfeldt, 1980).
Prescriptive models are typically developed from the pronouncements of economists and others regarding how people should (want to) run their lives or from ad hoc lists of relevant considerations. Descriptive models tend more or less to assume that these prescriptions are correct. Neither seems to have explored fully the range of possible problem representations that people use when left to their own devices.

Paying more attention to the diverse ways in which people do make decisions would enable decision aiders to offer their clients a more diverse set of alternative ways in which they might make decisions, along with some elaboration on the typical strengths and weaknesses of each method. Some research projects that might serve this end follow.

o Studies of dynamic structuring, allowing for iterations in the decision-making process, with each round responding to the insights gained from its predecessors (Humphreys and McFadden, 1980). Can people use such opportunities, or do they tend to stick to an initial representation? Are there initial structures that are less confining, which should be offered by the aids?

o Studies of goals other than narrow optimization. In economic models, the goal of decision making is assumed to be maximizing the utility of the immediate decision. Recently attention has turned to other goals, such as reducing the transaction costs from the act of making a decision, improving trust between the individuals involved in a decision, making do with limited decision-making expertise, imposing consistency over a set of decisions, or facilitating learning from experience. Theoretical studies are needed to clarify the consequences of adopting these goals (e.g., how badly do they sacrifice optimization); empirical studies are needed to see how often people actually want to accept them (particularly after they have been informed of the results of the theoretical studies).

o Option-generation studies.
Decision makers can only choose between the options they can think of. Each decision need not be a new test of their imaginations, particularly because research indicates that imagination often fails. Research can suggest better formulation procedures and generic options that can be built into decision analysis schemes (Gettys and Fisher, 1979).

o Many decision analysis schemes are sold as stand-alone systems, to be used by decision makers without the help of a professional decision analyst. The validity of these claims should be tested, particularly with regard to decision structuring, the area in which the largest errors can occur (Pitz, et al., 1980). Research could also show ways to improve the stand-alone capability (e.g., with better introductory training packets).

Measuring Preferences

Unless one is fortunate enough to find a dominating alternative, one that is better than all competitors in all respects, making decisions means making trade-offs. When one cannot have everything, it is necessary to determine the relative importance of different goals. Such balancing acts may be particularly difficult when the question is new and the goals that stand in conflict seem incommensurable (Fischhoff, et al., 1980). Dealing with hazardous technologies, for example, leads us daily to face questions such as whether the benefits of dyeing one's hair are worth a vague, minute increase in the chances of cancer many years hence.

Decision analysis schemes seem to complicate life by making these inherent conflicts apparent (McNeil, et al., 1978). They actually complicate it when they pose these questions in cumbersome, unfamiliar ways in order to elicit the information needed by their models--e.g., how great an increase in your probability of being alive in five years' time would exactly compensate for the .20 probability that you will not recover from the proposed surgery--and does this trade-off depend on other factors?
Such questions are difficult in part because their format is dictated by a formal theory or the programmer's convenience, rather than by the decision maker's way of thinking. They are also difficult because of the lack of research guiding their formulation. Research on the elicitation of values has lagged behind research on the elicitation of judgments of fact (Johnson and Huber, 1977). Although there are many highly sophisticated axiomatic schemes for posing value questions, few have been empirically validated for difficult, real-life issues. In practice, perhaps the most common assumption is that decision makers are able to articulate responses to any question that is stated in good English.

The projects described below may help solve problems that currently are (or should be) worrying practitioners. Some similar needs have been identified by the National Research Council's Panel on Survey-Based Measures of Subjective Phenomena (Turner and Martin, 198X).

o No opinion. In most behavioral decision research, as in most survey research, economics, and preference theory, people are typically assumed to know what they want. Careful questioning is all that is needed to reveal the decision maker's implicit trade-offs between whatever goals are being compared. Some response is often necessary for the analysis to continue. Knowing how to discover when decision makers have no opinions and how to cope with that situation would be of great value. Studies of "no opinion" in survey research (Schumann and Presser, 1979) would provide a useful base to draw on, although they often show that people have a disturbing ability to manufacture opinions on diverse (and even fictitious) topics.

o Interactive value measurement.
One possible response to situations in which decision makers' values are poorly articulated (or nonexistent) is for the decision aider to engage in a dialogue with the client, suggesting alternative ways of thinking about the problem and the implications of various possible resolutions. Although there are obvious opportunities for manipulating responses in such situations, research may show how they could be minimized; at any rate they may be rendered no worse than the manipulation inherent in not confronting the ambiguity in respondents' values. Of particular interest is the question of whether people are more frank about their values and less susceptible to outside pressures when interacting with a machine than with another human being. Again, some good leads could be found in the survey research literature, particularly in work dealing with the power and prevalence of interviewer effects.

o Specific topics. In order to interact constructively with their clients, should decision aiders be able to offer a comprehensive, balanced description of the perspectives that one could have on a problem? The provision of such perspectives may be enhanced by a combination of theoretical and empirical work on how people could and do think about particular issues (Jungermann, 1980). For example, to aid decision problems that involve extended time horizons, one would study how people think about good and bad outcomes that are distributed over time. One might discover that people have difficulty conceptualizing distant consequences and therefore tend to discount them unduly; such a tendency could be countered by the use of scenarios that reify hypothetical future experiences.
Medical counseling and the setting of safety standards are two other areas with specific problems that reduce the usefulness of decision technologies (e.g., the difficulty of imagining what it would be like to be paralyzed or on dialysis, unwillingness to place a value on human life).

o Simulating values. One obvious advantage of computerized systems is to work quickly through calculations using alternative values of different parameters. A possible didactic use would be to help people clarify what they want, by simulating the implications of different sets of preferences ("If those were your trade-offs, these would be your choices"), both on the problem in question and on sample problems. Work along this line was done at one time in the context of social judgment theory (Hammond, 1971). Completing it and making it accessible to the users of other decision aids would be useful.

o Framing. Recent research has demonstrated that formally equivalent ways of representing decision problems can elicit highly inconsistent preferences (Kahneman and Tversky, 1979; Tversky and Kahneman, 1981). Because most decision-aiding schemes have a typical manner of formulating preference questions, they may inadvertently be biasing the results they produce. This work should be continued, with an eye to characterizing and studying the ways in which decision analysis schemes habitually frame questions.

Evaluation

The decision maker looking for help may be swamped by offers. The range of available options may run from computerized decision analysis routines to super-soft decision therapies. Few of these schemes are supported by empirical validation studies; most are offered by individuals with a vested interest in their acceptance (Fischhoff, 1980).
A comprehensive evaluation program would help decision makers sort out the contenders for their attention and use those selected judiciously, with a full understanding of their strengths and limitations (Wardle and Wardle, 1978). Such a program might involve the following elements:

o Collecting and characterizing the set of existing decision aids with an eye to discerning common behavioral assumptions (e.g., regarding the real difficulties people have in making decisions, the ways in which they want to have problems structured, or the quality of the judgment inputs they can provide to decision-making models).

o Examining the assumptions identified above. This might include questions like: Can people separate judgments of fact from judgments of value? When decision makers are set to act in the name of an institution, can they assess its preferences, unencumbered by their own? Can people introspect usefully about beliefs that have guided their past decisions, free from the biasing effects of hindsight?

o Developing methods for evaluating the quality of decisions (such as are produced by different methods). For example, what weights should be placed on the quality of the decision process and on the quality of the outcome that arises? What level of successful outcomes should be expected in situations of varying difficulty? This work would be primarily theoretical (Fischer, 1976).

o Clarifying the method's degree of determinacy. To what extent do arbitrary changes (i.e., ones regarding which the method is silent) in mode of application affect the decisions that arise (Hogarth and Makridakis, 1981)?
Similarly, one would like some general guidance on the sensitivity of the procedure to changes in various aspects of the decision-making process, in order to concentrate efforts on the most important areas (e.g., problem structuring or value elicitation). Conversely, one wants to know how sensitive the method is to the particulars of each problem and user. That is, does it tend to render the same advice in all circumstances?

o Assessing the impact of different methods on "process" variables, such as the decision maker's alertness to new information that threatens the validity of the decision analysis or the degree of acceptance that a procedure generates for the recommendation it produces (Watson and Brown, 1978). Such questioning of assumptions has been the goal of much existing research, which should provide a firm data base for new work (although many questions, such as the first two of the three raised, have yet to be studied).

Improving Realism

The simplified models of the world that decision analysis software packages use to represent decision problems are in at least one key respect very similar to the models generated by flight or weapons simulators. Their usefulness is constrained by the fidelity of their representations to the critical features of the world they hope to model. Although there is much speculation about process effects, it points in inconsistent directions and is seldom substantiated by empirical studies (either in the laboratory or in operating organizations). Although these topics have been studied very little in this context, research could draw on whatever analogous studies have been conducted with other kinds of simulators. Some suggested research topics follow.

o Hot and cold cognition. Decision analysis schemes are cold and calculating, and they expect the decision maker to be so as well.
It is not clear how well their putative advantages survive when decision makers shift from "cold" to "hot" cognition. Such a shift occurs with emotional involvement, such as might happen when the stakes increase or the topic is arousing (Janis and Mann, 1977). The use of decision aids for medical patients pondering possible treatments assumes that decision quality will not deteriorate in such situations--or at least no more than it deteriorates without the aid. Another such shift involves time pressures, such as might arise in crisis decision making (Wright, 1974). Many proponents of decision analysis claim that time constraints actually enhance the usefulness of their tool, rather than threaten it, arguing that a quick-and-dirty analysis is often the most cost-effective way to use the technology. Evidence is needed regarding whether this is true, both when quickness is chosen and when it is imposed.

o Contingency planning. Many of the most important uses of decision aids are for the sake of contingency planning. The essence of such planning is anticipating future situations and prescribing the actions needed should they actually occur. In principle, preplanning responses should allow a more leisurely and thoughtful analysis with better utilization of experts and decision aids than would be possible if one waited until a situation demanding an immediate response developed. The success of such efforts depends on the planner's ability to imagine in advance how various contingencies will appear should they come about. If the actual contingency does not resemble its image, then the (preplanned) decisions based on that image will seem inappropriate. In such cases, the decision maker must decide on short notice whether to adhere to the plan (and assume that his or her immediate impression is faulty) or come up with a new plan on the spot (and assume that the event that was anticipated is not the event that occurred).
Although the stakes riding on contingency plans are often very large, we have little systematic knowledge about the correspondence between actual and planned contingencies. Research is needed on (1) when and why situations look (or feel) different when they occur than they did during planning and (2) what to do when plans made at an earlier time seem inappropriate.

o Overriding recommendations. The moment of truth for the decision aid comes when the decision maker must decide whether to follow its recommendations or override them. Analogous moments face the users of most other human-machine systems, suggesting that the study of overriding would have broad implications. The research questions are: When do people even think about overriding? How valid are the cues that lead them to do so? How much better than the aid are their intuitive judgments? Does protracted reliance on decision aids increase or decrease intuitive decision-making ability? Existing research on the acceptance of computerized diagnoses in medicine, clinical psychology, and meteorology would provide a good basis for this research.

o Better displays. Decision analysts have shown considerable ingenuity in translating formal decision theory into terms that may be understood by less sophisticated decision makers. More work needs to be done in this area, particularly if decision aids are to have stand-alone capacity. Research efforts to date have hardly begun to tap the potential of recent work in computer graphics for developing superior displays (e.g., to facilitate interpretation of how robust a recommendation is by showing its susceptibility to change with variation in the values of the input parameters). A particular problem is that both questions and recommendations typically appear without any indication of their rationale. As a result, decision makers may have little feeling for where the questioning is leading or how robust the concluding recommendations will be (or how they can be explained to others).

The features that the models capture are a mixture of those that are easy to capture and those that designers intuitively feel are important to include. Each of the four topics just described in this section is a factor that may affect the realism of decision aids and, if so, should be considered in their design and utilization. Collaborative efforts might increase both the overall acceptance of decision analysis and the realism of its recommendations when it is used.

Aiding Diffuse Decisions

Common to most decision-making models is the assumption that decisions are made by an identifiable individual at an identifiable point in time. Clearly, however, this idealization often is not realized in practice: there may be many parties to a decision; some decisions just evolve over time (or at least are made to seem that way); other decisions are made by people who do not think of themselves as decision makers (e.g., supervisors monitoring and directing the behavior of subordinates or systems); some decisions are made by people who are not officially recognizable as decision makers (e.g., aides to a senior official). Rather different forms of research are needed to improve decision making in each setting; a number of them are outlined below.

o Multiperson decisions. Decision theory methods are typically designed to explore and aggregate the beliefs and preferences of an individual. One approach to dealing with multiple decision makers is a computational scheme for aggregating their beliefs and preferences prior to using them in a common decision model (Rohrbaugh, 1979). Theoretical work has suggested a variety of analytical aggregation schemes.
Although this work should continue, it could be usefully complemented by empirical studies (using simulations and experimentation) of how greatly the results of these various schemes differ and how well they are accepted by users. Another approach is to have the parties aggregate their perspectives through some structured interaction (Sachman, 1975; Steiner, 1972). This approach, well worked by students of the risky shift and of the Delphi methods, might benefit from research using computerized systems that allow participants (perhaps at different sites) to go through many rounds of interactions with varying communication channels and protocols. For example, will decisions be reached more quickly and adopted more enthusiastically if the parties can observe visual images of one another, not just printed summary statements?

o Evolving decisions. Insofar as decisions represent choices between alternative courses of action, any decision may be expressed as a statement of action ("I [or we] will do X"). Such translation of a complex decision process to its procedural implications can have drawbacks. One is that the underlying rationale of an action is lost, making it difficult to understand why things are done the way they are, how to respond to new contingencies, and when it is time to rethink the whole decision. A second potential drawback is that those decisions that still have to be made are not addressed directly, leaving crucial steps to guesswork (e.g., an operator may be told something to the effect of "Figure out what is going on and then follow steps S1 to Sn"). A third possibility is that procedures may have internal inconsistencies or be at cross-purposes, and people either do not realize it or they realize it but do not quite know what is wrong. Systems that add rules over time may be particularly prone to this problem (the social security system is an example).
Some combination of artificial intelligence, decision modeling, and experimental work might help people to diagnose the logic of the systems that they deal with and that they are called on to redesign (Corbin, 1980; Klein and Weitzenfeld, 1978).

o Unwitting decision makers. Just as any decision may be thought of as an action, so may each action be thought of as a decision. Most students of decision making would probably agree with the hypothesis that people would be better off if they realized the decisions implicit in their actions, and structured them as such. For example, a supervisor contemplating the shutdown of a plant because of a malfunction would make wiser choices with even a rudimentary decision analysis (i.e., listing all possible courses of action, sketching out possible consequences and contingencies, crudely working through the expected utility of each action). Such structuring has become part of the training of some medical students. The user of computerized information retrieval systems (e.g., Prestel, Teletext) might be usefully seen as making a series of decisions (such as: These alternatives are ambiguous--which gives me the best chance of getting the information I need? Is it worth my time and money to use the system on this problem? Is the answer I got complete enough or should I keep working?). A useful way to exploit existing research would be to translate it into crude aids, adapted to the conditions and problems of particular work settings (along with an evaluation of their efficacy).

o Unofficial decision makers. Senior officials in many organizations are too busy to make deliberative analyses of the many decisions they must consider. A common (and sensible) defense is to have aides conduct the analyses.
For this stratagem to work, the senior official must communicate well enough with the aide to ensure that the appropriate problem is addressed; the aide must communicate well enough with the senior official to ensure that the rationale behind the decision-making method and the implications of its conclusions are understood well enough to be properly represented and afforded due consideration. Communication problems are likely to be particularly great when the official must present the conclusions to some larger public or when the training of official and aide are quite different. Consider, for example, the difficulties experienced by public officials enunciating the policies devised by economists or those of junior executives trying to sell decision analyses to old-line senior executives. Better methods of communication (and for realizing the lack of it) would be a useful addition to the software accompanying any decision-making method. These methods could apply to the front end of an analysis (e.g., training films, practice exercises) or after it is complete (Federico, et al., 1980).

CONCLUSION

Decision aiding appears to be increasingly viable and popular. A variety of software packages are currently being marketed and used, each offering somewhat different operationalizations of the basic model. If their promises are not to outstrip their capabilities, they will need to be accompanied by behavioral research regarding how best to design and use that software. The five problem areas described in this chapter represent topics for which research is likely to be particularly useful and usable. These projects require primarily experimental methods, building on the theory and hardware already available. To be most effective they need a context that affords ready contact with decision theorists and practicing decision analysts.
The former can solve the questions of theory to which they are most suited; the latter can provide access to their machines (and perhaps to their clients) and facilitate the translation from research to practice.

REFERENCES

Bonczek, R. H., Holsapple, C. W., and Whinston, A. B. 1981 Foundations of Decision Support Systems. New York: Academic Press.
Brown, R., Kahr, A. S., and Peterson, C. 1974 Decision Analysis for the Manager. New York: Holt, Rinehart & Winston.
Corbin, R. 1980 Decisions that might not get made. In T. Wallsten, ed., Cognitive Processes in Choice and Decision Behavior. Hillsdale, N.J.: Lawrence Erlbaum.
Einhorn, H., and Hogarth, R. M. 1981 Behavioral decision theory: processes of judgment and choice. Annual Review of Psychology 32:53-88.
Federico, P. A., Brun, I. E., and McCalla, D. B. 1980 Management Information Systems and Organization Behavior. New York: Praeger.
Fischer, G. W. 1976 Multidimensional utility models for risky and riskless choice. Organizational Behavior and Human Performance 17:127-146.
Fischhoff, B. 1980 Clinical decision analysis. Operations Research 28:28-43.
Fischhoff, B., Goitein, B., and Shapira, Z. 1981 The experienced utility of expected utility approaches. In N. Feather, ed., Expectancy, Incentive and Action. Hillsdale, N.J.: Lawrence Erlbaum.
Fischhoff, B., Lichtenstein, S., Slovic, P., Derby, S., and Keeney, R. 1981 Acceptable Risk. New York: Cambridge University Press.
Fischhoff, B., Slovic, P., and Lichtenstein, S. 1980 Knowing what you want: measuring labile values. In T. S. Wallsten, ed., Cognitive Processes in Choice and Decision Behavior. Hillsdale, N.J.: Lawrence Erlbaum.
Gettys, C. F., and Fisher, S. D. 1979 Hypothesis plausibility and hypothesis generation. Organizational Behavior and Human Performance 24:93-110.
Hammond, K. R. 1971 Computer graphics as an aid to learning. Science 172:903-908.
Hogarth, R. M., and Makridakis, S. 1981 Forecasting and planning: an evaluation. Management Science 27:115-138.
Humphreys, P., and McFadden, W. 1980 Experiences with MAUD: aiding decision structuring versus bootstrapping the decision maker. Acta Psychologica 45:51-69.
Janis, I. L., and Mann, L. 1977 Decision Making. New York: Free Press.
Johnson, E. M., and Huber, G. P. 1977 The technology of utility assessment. IEEE Transactions on Systems, Man, and Cybernetics SMC-7:311-325.
Jungermann, H. 1980 Speculations about decision-theoretic aids for personal decision making. Acta Psychologica 45:7-34.
Kahneman, D., and Tversky, A. 1979 Prospect theory. Econometrica 47:263-292.
Klein, G. A., and Weitzenfeld, J. 1978 Improvement of skills for solving ill-defined problems. Educational Psychologist 13:31-41.
McNeil, B. J., Weichselbaum, R., and Pauker, S. G. 1978 Fallacy of the 5-year survival rate in lung cancer. New England Journal of Medicine 299:1397-1401.
Pitz, G. F., Sachs, N. J., and Heerboth, N. T. 1980 Structure for individual decision analysis. Organizational Behavior and Human Performance 26:65-80.
Raiffa, H. 1968 Decision Analysis. Reading, Mass.: Addison-Wesley.
Rohrbaugh, J. 1979 Improving the quality of group judgment. Organizational Behavior and Human Performance 24:73-92.
Sachman, R. 1975 Delphi Critique. Lexington, Mass.: Lexington Books.
Sage, A. P. 1981 Behavioral and organizational considerations in the design of information systems and processes for planning and decision support. IEEE Transactions on Systems, Man, and Cybernetics SMC-11:640-678.
Schuman, H., and Presser, S. 1979 Assessment of no opinion in attitude surveys. Sociological Methodology 10:241-275.
Slovic, P., Fischhoff, B., and Lichtenstein, S. 1979 Behavioral decision theory. Annual Review of Psychology 28:1-39.
Steiner, I. D. 1972 Group Process and Productivity. New York: Academic Press.
Svenson, O. 1979 Process descriptions of decision making. Organizational Behavior and Human Performance 23:86-112.
Turner, C., and Martin, E. 198X Surveying Subjective Phenomena. Panel on Survey-Based Measures of Subjective Phenomena, Committee on National Statistics, National Research Council. Publisher to come.
Tversky, A., and Kahneman, D. 1981 The framing of decisions and the psychology of choice. Science 211:453-458.
von Winterfeldt, D. 1980 Structuring decision problems for decision analysis. Acta Psychologica 45:71-93.
Wallsten, T. 1980 Cognitive Processes in Choice and Decision Behavior. Hillsdale, N.J.: Lawrence Erlbaum.
Wardle, A., and Wardle, L. 1978 Computer-aided diagnosis--a review of research. Methods of Information in Medicine 17:15-28.
Watson, S. R., and Brown, R. V. 1978 The valuation of decision analysis. Journal of the Royal Statistical Society Series A(141):69-78.
Wright, P. 1974 The harassed decision maker. Journal of Applied Psychology 59:555-561.

III ELICITING INFORMATION FROM EXPERTS

Many formal and informal processes in working organizations hinge on the effective communication of "expert information." Risk analyses may require a metallurgist to assess the likelihood of a valve's fracturing under an anticipated stress or a human factors expert to assess the likelihood of its failing to open due to faulty maintenance. Strategic analyses may require substantive experts to assess the growth rate of the Soviet economy or the proportion of its expenditures directed to arms. Tactical planning in marketing or the military may demand real-time reports by field personnel of what seems to be happening "at the front." Air traffic control typically requires succinct, unambiguous status reports from all concerned. Computerized career-counseling routines or procedures for establishing entitlement to social benefits assume that lay people can report on those aspects of their own lives about which they are the ranking experts. The U.S.
Census Bureau makes similar assumptions when asking people about their employment status, as a step toward directing federal policies and jobs programs. In product liability trials technical experts give evidence in a highly stylized manner.

As can be seen from these examples, experts may talk to the consumers of their advice directly, to elicitors who then translate what they say into a form usable by a computer, or to a computer. Insofar as computers have been designed by people, all of these communication modes assume some fairly high level of interpersonal understanding. The elicitors must ask questions that people can sensibly answer. The recipients of those answers must interpret them with an appreciation of the errors and ambiguities they may conceal. The quality of that communication is likely to depend on the novelty of the problems, the historic level of interaction between questioner and answerer, and the quickness with which miscommunications produce diagnostic signs. Poor elicitation by air traffic controllers may become visible very quickly, whereas employment surveys may elicit (and have elicited) biased responses and misdirect economic planning for years without the error's being detected. Particularly clumsy elicitation may lead users to reject the eliciting system, thereby avoiding mistakes but also wasting the resources that have been invested in its design.

New research about elicitation and the translation of existing research findings into more usable form could benefit a wide variety of enterprises. As this chapter discusses, elicitation is not a field of inquiry or application in and of itself, but a function that recurs in many problems. This creates special difficulties for the accumulation and dissemination of knowledge about it.

The principal author of this chapter is Baruch Fischhoff.
BACKGROUND

Perhaps because elicitation is a part of many problems but all of none, it has emerged neither as a discipline nor as an area that is seen to require special expertise. The typical assumption is that elicitation is not a particular problem, as long as things stay fairly simple and one uses common sense. The validity of that assumption may not be questioned until some egregious problem has clearly arisen from a particular failure. When problems arise, the lack of a coherent body of knowledge may encourage ad hoc solutions, with little systematic testing or accumulation of knowledge. Solutions are generated from the resources of those working on a particular problem and viewed from their narrow perspective.

One reason for aggregating these elicitation issues into a single chapter is to keep them from being orphaned, as parts of many problems for which there is no focus of responsibility. Another reason is to suggest that there are enough recurrent themes to generate a coherent body of knowledge, thereby reducing the degree to which each system designer faced with an elicitation problem must start from scratch. Although work may still focus on specific problems, conceptualizing them in a general way may increase both the pool of talent they draw on and the breadth of perspective with which their solutions are interpreted and reported. Because a common element of these projects is dealing with substantive experts, their cumulative impact should be to generate a better understanding of the judgmental processes of experts.

The research bases for the following projects are sufficiently diverse that further details are given within each context. In some cases, there is a distinct research literature on which new projects can be based.
In others, the proposed topic does not exist as a separate pursuit, or at least not within the context of human factors; the literature cited is suggestive of the kinds of approaches that have proven useful in other fields or related problems that might be drawn on.

RESEARCH ON ELICITATION

Ensuring a Common Frame of Reference

An obvious precondition for communication is ensuring that elicitor and respondent are talking about the same thing. In ordinary conversation the participants have some opportunity for detecting and rectifying misunderstandings. If questions are set down once for all respondents, then misunderstandings must be anticipated in advance. Some implicit theory of potential (mis)interpretations must guide the question composers for management systems, accident report forms, or automatic diagnostic routines that rely on expert judgment.

These problems are not, of course, unique to human factors. They are probably best understood by professionals whose central concern for the longest periods of time has been asking questions; these include anthropologists (Agar, 1980), linguists, historians (Hexter, 1971), survey researchers (Payne, 1952), philosophers, and some social psychologists (Rosenthal and Rosnow, 1969). Two general conclusions that one can derive from their work are that the opportunities for misinterpretation are much greater than most people would presuppose and that the nature of possible specific misinterpretations is hard to imagine intuitively.

The chances for miscommunication are likely to increase to the extent that elicitor and respondent come from different cultures and have had little opportunity to interact. Systems designed by technical experts for lay users often fall into this category, especially when the elicitation is far removed physically or temporally from the design effort.
Consider, for example, a computerized job search program that requires unemployed workers to characterize their experience in terms of one of the 12,000 categories of the Dictionary of Occupational Titles (DOT) code (e.g., handkerchief presser). Although a considerable intellectual effort has gone into imposing a semblance of order on the world of work, that order may be very poorly matched to the way in which applicants conceptualize their experience. Indeed, even those who elicit such information from job applicants and translate it into the DOT code on a full-time basis may have considerable difficulty. Similar problems may face a system designed to clarify entitlement to social services or a computerized system for diagnosing car or radio problems on the basis of a user's description of presenting symptoms. These problems may persist even with the clearest display and the most lucid users' manual.

Although the details of each problem are unique, seeing their common elements can enable designers to exploit a larger body of existing research and research methods. One strategy is literature reviews that make accessible the methods used by fields such as anthropology to uncover misunderstandings. Using these methods with small samples of users prior to designing systems or in the early stages of design could effectively suggest minor changes or even major issues (such as whether the system could ever stand alone, or whether it will always need an interpreter between it and the actual user). Such strategies are increasingly being used in survey design; they may even lead to some revision in the categories of Justice Department statistics so as to make them more compatible with the ways in which victims of crimes think about their experience (National Research Council, 1976).
Another research strategy is to review existing case studies of mishaps (e.g., in diplomacy, survey research, police work, or software design) for evidence of problems due to questioners and respondents unwittingly speaking different languages (Brooks and Bailar, 1978). Such studies would help establish the prevalence of such problems and create a stock of cautionary tales for educational and motivational purposes.

A third strategy involves experimental and observational studies of groups of individuals who regularly communicate with one another, in order to see how well they understand one another's perspectives. Software designers and less educated users, engineers and machine operators, and market researchers and consumers are a few such dyads. The intuitive beliefs of the elicitors in each of these dyads regarding the perspectives of their respondents might provide some productive hypotheses and reveal some misconceptions worthy of correction. Better ways of eliciting information should also suggest better ways of presenting it. Informing and counseling patients about medical risks is one area in which these problems are currently under active study (see Chapter 2).

Matching Questions to Mental Structures

A presumption of many elicitation efforts is that the respondent has an answer to any question that the elicitor can raise (Turner and Martin, 198X). One contributing factor to this belief is the fact that elicitors often cannot accept "no answer" for an answer, needing some best guess at the answer in order to get on with business. A second contributing factor may be the tendency, long known to surveyors, for respondents to offer opinions on even nonexistent issues, perhaps reflecting some feeling that they can, should, or must have opinions on everything.
A third factor may be the elicitors' (intuitive or scientific) models of memory that presume a coherent store of knowledge waiting to be tapped by whatever question proves most useful to the elicitor (Lindley, et al., 1979).

Coping with situations in which the respondent has little or no knowledge about the topic in question is dealt with in the next section, on how to elicit assessments of information quality. Alternatively, the respondent may have the needed information, but not in the form required by the question. Whenever there is incompatibility between the way in which knowledge is organized and the way in which it is elicited, the danger arises that the expert may not be used to best advantage, may provide misleading information, or may be seduced into doing a task to which his or her expertise does not extend. For example, risk assessment programs often require the designers of a technical system to describe it in terms of the logical interrelationships between various components (including its human operators, repair people, suppliers, etc.) and to assess the probability of these components' failing at various rates, perhaps as a function of several variables (Jennergren and Keeney, 1981). Given these judgmental inputs, these programs may perform miraculous simulations and calculations; however, the value of such analyses is contingent on the quality of the judgments. The processes by which experts are recruited may or may not take into consideration the need for these special skills. In some situations, no one may have them.

Research designed to improve the compatibility of questions with the way in which knowledge is stored should be guided by substantive theory about that storage as well as practical knowledge of the information needed. The citations given here represent different approaches to conceptualizing such mismatches between precise questions and differently organized or unorganized knowledge.
As an example of the kinds of testable hypotheses that emerge from these literatures, consider the possibility that many experts experience the topics of their expertise one by one, whereas elicitors often need a summary (e.g., of the rate of target detections by sonar operators, the conditional probability of misreading an altimeter given a particular number of hours of flying experience, the distribution of hearing deficits associated with various noise levels). If experts are not accustomed to aggregating their experience, then they will respond differently to procedures that request aggregate estimates immediately and those that focus first (and perhaps entirely) on the recall of individual incidents (Fischhoff and Whipple, 1981). This particular research could build somewhat on probability learning studies or attempts to distinguish between episodic and semantic memory.

Efforts to design the best response mode assume that respondents have the knowledge that the elicitor needs, but not organized in the most convenient form. A more troublesome situation arises when they do not have it organized at all. In that case the elicitor's task becomes to evoke all of the relevant bits and pieces, then devise some scheme for interpreting them. Doing so first requires discovering that incoherence exists, which may not be easy, insofar as a set of questions may elicit consistent responses simply because it has consistently imposed one of several possible perspectives. Although sensitive elicitors may already be poking around creatively, there are few codified and tested procedures. Such procedures might involve standard sets of questions designed to produce diverse perspectives, which the respondent would then integrate to provide a best guess (or set of best guesses) for the problem at hand. For example, one might always ask about case-by-case and aggregate estimates, in that order.
Such efforts might also prompt and be helped by the development of memory models allowing for multiple, incoherent representations.

Clarifying Information Quality

Before taking action on an expert's opinion, one wants to know how good that best guess is. Great uncertainty might prompt one to try to uncover its sources or to take alternative courses of action (e.g., hedging one's bets). Although explicit assessments of uncertainty are becoming a greater part of enterprises such as risk analysis (Fairley, 1977), weather forecasting (Murphy and Winkler, 1977), and strategic assessment (Daly and Andriole, 1980), such experiences are rare for most people. As one would expect in novel elicitation situations, the responses that people give are not always to be trusted.

Assessments of information quality (or confidence or probability) have been the subject of extensive research over the last decade (Lichtenstein, et al., 1982). It has produced a fairly robust set of methods for eliciting uncertainty and a moderately good understanding of human performance in this regard. The clearest finding is that people have a partial but not complete appreciation of the extent of their own knowledge. Most commonly, this partial knowledge expresses itself in overconfidence, which seems impervious to most attempts at debiasing, except for intensive training (Fischhoff, 1982).

Many practical problems could be solved in this area with a moderate investment in completing the research that has already been started. This research could use the stock of elicitation techniques already available to understand better the range and potency of overconfidence biases, to clarify how worrisome they are, and to determine the most effective training and how far it can be generalized.
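As a minimal sketch of how the overconfidence discussed above might be quantified, the following Python fragment compares the confidence that a respondent states with the proportion of answers that turn out to be correct. The function name, the input format (pairs of stated probability and outcome), and the sample numbers are hypothetical illustrations and are not taken from the studies cited.

```python
from collections import defaultdict


def calibration_summary(judgments):
    """Summarize calibration for (stated_probability, was_correct) pairs.

    Hypothetical helper: a positive "overconfidence" value means that
    stated confidence exceeded the observed proportion correct.
    """
    if not judgments:
        raise ValueError("need at least one judgment")

    n = len(judgments)
    mean_confidence = sum(p for p, _ in judgments) / n
    hit_rate = sum(1 for _, correct in judgments if correct) / n

    # Group outcomes by stated probability (rounded to one decimal place)
    # to obtain a crude calibration curve.
    outcomes_by_level = defaultdict(list)
    for p, correct in judgments:
        outcomes_by_level[round(p, 1)].append(1 if correct else 0)
    curve = {
        level: sum(outcomes) / len(outcomes)
        for level, outcomes in sorted(outcomes_by_level.items())
    }

    return {
        "mean_confidence": mean_confidence,
        "hit_rate": hit_rate,
        "overconfidence": mean_confidence - hit_rate,
        "curve": curve,
    }


# Made-up answers from one respondent: stated probability of being
# correct, and whether the answer was in fact correct.
sample = [
    (0.9, True), (0.9, False), (0.9, True), (0.9, False),
    (0.6, True), (0.6, False), (0.6, True), (0.6, True),
]
summary = calibration_summary(sample)
print(summary)
```

In this made-up sample the respondent is correct on only half of the answers given with 0.9 confidence, so the summary reports positive overconfidence. A real study would need many more judgments per confidence level before such a curve could be interpreted.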
Of particular interest is the extent to which experts are prone to these problems when making judgments in their areas of expertise; current evidence suggests that they are, but it is still inconclusive given the importance of the question (Spetzler and Stael von Holstein, 1975). The practical steps that can be taken subsequent to such research are developing and testing training procedures, identifying the least bias-prone elicitation methods for situations in which training is impossible or ineffective, and anticipating the extent of bias with different methods and situations in order to apply ad hoc corrections.

Choosing between these steps and implementing them efficiently will require a more detailed understanding of the cognitive processes involved in representing and integrating probabilistic information. Although existing research covers much of the ground between basic cognitive psychology and field applications, it has not quite touched base with either extreme. Coping with this practical problem might provoke some interesting theoretical work in the representation of knowledge.

Eliciting Systems

In the examples used in the preceding sections, the knowledge that experts were asked to provide dealt with the components of some large system (e.g., a failure probability, a job choice, a burnout rate). At times, however, experts are required to describe the entire system (Hayes-Roth, et al., 1981). Software packages that attempt to elicit a big picture include some of those used in decision structuring, failure probability modeling (U.S. Nuclear Regulatory Commission, 1981), map making, route planning, and economic analysis.
Once such systems have been programmed well enough to work at all, one must ascertain the degree of fidelity between the representations they produce and the conceptual or physical systems they are meant to model; attempts to develop better elicitation methods or to cope with known limits or errors should follow (Brown and Van Lehn, 1980).

The research strategies outlined below, based in part on the initial work already begun and in part on discussions with troubled system elicitors, may shed some light on these problems. In each case one would want to know whether a change in procedure made a difference and, if so, whether one method would be preferred in some or all situations. Because so little systematic knowledge is available on how results may vary with different elicitation procedures, generalizing the existing research findings should be done cautiously.

o Determining whether formally equivalent ways of eliciting the same information produce different responses. For example, a category of events may be judged differently when considered as a whole and when disaggregated into component categories.

o Evaluating the effectiveness of methods that require more or less "deep" (or analytical or inferential) judgments about system operation. For example, if a process produces a distribution of events (e.g., failure rates), one could assess that distribution directly or judge something about the data-generating process.

o Varying the amount of feedback provided about how the elicited system operates. For example, when a simulation of an industrial process is designed according to an expert's judgment, it may be run a few times, just to see if it produces more or less sensible results. The expert could then introduce apparently needed adjustments. Such tinkering should lead to successive improvements in the model; however, it can also prevent simulations from producing nonintuitive (i.e., surprising, informative) results.
It also threatens the putative independence of the models created by different experts in areas such as climatology and macroeconomics. The convergence of these models' predictions (about the future of the economy, for example) is used as a sign of their validity. In practice, however, econometricians monitor one another's models and adjust theirs if they produce outlying predictions.

o Assessing experts' ability to judge the completeness of a representation. How well can they tell whether all important components have been included? Available evidence suggests that considerations that are out of sight are also out of mind; once experts have begun to think about a model in a particular way, the accessibility of other perspectives is appreciably reduced (Fischhoff, et al., 1978). If this is generally true, an elicitor might try to evoke a variety of perspectives on the system superficially before pursuing any in depth (as a sort of intra-expert brainstorming).

Estimating Numerical Quantities

A common form of uncertainty is knowing something about a topic, but not a necessary fact. If that fact is a number (e.g., the number of tanks an enemy has or the percentage of those tanks that are in operating order), it may be possible to use the related facts in a systematic way if one can devise a rule or algorithm for composing them (Armstrong, 1977). The validity of such estimates depends on the appropriateness of the algorithms, the quality of the component estimates, and the accuracy of their composition. Used appropriately, algorithms can make otherwise impenetrable judgmental processes explicit and subject both to external criticism and to self-improvement, as one can systematically update one's best estimate whenever more is learned about any component (Singer, 1971).
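The idea of composing an unknown quantity from related facts can be sketched briefly. The example below assumes a purely multiplicative rule (units, times vehicles per unit, times the fraction in working order); the rule, the component names, and every number are hypothetical and serve only to show how an explicit algorithm exposes its components to criticism and updating.

```python
from math import prod


def compose(components):
    """Multiply component point estimates to obtain the target estimate."""
    return prod(components.values())


def compose_range(bounds):
    """Propagate (low, high) bounds through the product.

    Valid only when every bound is non-negative, as is assumed here.
    """
    low = prod(lo for lo, _ in bounds.values())
    high = prod(hi for _, hi in bounds.values())
    return low, high


# Hypothetical component estimates for "operational tanks".
components = {
    "divisions": 10,
    "tanks_per_division": 300,
    "fraction_operational": 0.8,
}
estimate = compose(components)

# Learning more about one component updates the estimate explicitly.
components["fraction_operational"] = 0.6
revised = compose(components)

# Hypothetical low/high bounds on each component.
bounds = {
    "divisions": (8, 12),
    "tanks_per_division": (250, 350),
    "fraction_operational": (0.5, 0.9),
}
low, high = compose_range(bounds)
```

Because each component is named, a critic can challenge a single input rather than the final number, and the spread between `low` and `high` gives a rough sense of how much the component uncertainty matters.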
Although there are many advocates of algorithmic thinking and anecdotal evidence of its power, there do not seem to be many empirical studies of their usefulness (Hogarth and Makridakis, 1981). Such studies of algorithm efficacy as do exist seem concentrated on the solving of deterministic logical problems for which all relevant evidence is presented to the respondent and a clear criterion of success exists, rather than estimation tasks in which the accuracy of the estimate will be unclear until some external validation is provided. Like any other judgmental technique, algorithmic thinking could be more trouble than it is worth if it increases confidence in judgment more than it improves judgment.

A primary research project here would be to compile a set of plausible and generally applicable algorithmic strategies. Process tracing of the judgmental processes of expert estimators might be one source. The algorithms discovered in the study of logical problem solving might be another. A subsequent project could attempt to teach people to use these algorithms, then, looking at the fidelity with which they can be applied, measure the accuracy of their results and their influence on confidence. The use of multiple algorithms and people's ability to correct the results of imperfect algorithms are also worth study. The best algorithms could then become part of management information systems, decision support systems, and the like.

Two integrative literature reviews might provide useful adjuncts to this research. One would look at work on mental arithmetic of the sort required when people must execute algorithms in their heads. Although computational devices should be able to eliminate the need for such exercises, judges may still be caught without their tools or may use unwritten mini-algorithms in order to produce component estimates (once they've gotten the general idea).
The second review would summarize, in a form accessible to designers, the psychophysics literature on stimulus-presentation and response-mode effects (Poulton, 1977). That literature shows the degree of variability in magnitude estimation that can arise from "artifactual" changes in procedure (e.g., order of alternative presentation, kind of numbers used).

Detecting Reporting Bias

The preceding sections have assumed that elicitor and respondent are engaged in an honest, unconflicted attempt to produce a best estimate of some quantity or relationship. When research identifies difficulties, one assumes a mutual good faith effort on the part of elicitors and experts to eliminate them. In the real world, however, many wrong answers are deliberate; their producers do not wish to have them either detected or corrected. If the citations given here are at all representative, systematic misrepresentation has been of greatest interest to those concerned with the social and economic context within which behavior takes place.

Such misrepresentations may be usefully divided into two categories. The first includes deliberate attempts to deceive in order to gain some advantage. For example, economists chronically mistrust verbal reports of people's preferences (i.e., surveys) for fear that respondents engage in strategic behavior, trying to "put one over" on the questioner and distort the survey's results (Brookshire, et al., 1976). Some critics of survey research are even advocating that respondents do so deliberately so as to stop the survey juggernaut (see Turner and Martin, in press), as do some people in organizations who feel threatened by computerized information systems and wish to see them fail.

The second category of misreports reflects cultural or subcultural norms. In a business or military unit, for example, optimism (or grousing) may be the norm for communication between members of some ranks (Tihansky, 1976).
Or there may be a norm of exaggerating one's wealth or weight. Those who share the norms know how to recode the spoken word to gain a more accurate assessment; however, mechanical systems designed by people outside the culture may take those reports at face value and thereby introduce systematic errors into their workings.

Although investigating misreporting is likely to be quite difficult, identifying it is part of systems design. One way to start is to review the relevant literature in fields that have dealt with these questions (e.g., sociology, economics). A second is to interview experts off the record about how (and how often) they try to manipulate systems that pose questions to them. A third is to observe ongoing elicitations for which it is possible to validate responses.

Difficulties, once identified, must still be treated. One method is to institute penalties for misreporting. A second is to make consistency checks to detect errors. A third is to eliminate the reasons for misreporting (e.g., ensuring confidentiality). A fourth is to correct misreports for known biases. For example, the Central Electricity Generating Board in Great Britain discovered that it could quite accurately predict the time needed to return a power station to operation by doubling the time estimates reported by the chief plant engineers. One difficulty with such adjustments is that people may change their reporting practices if they find out about them (Kidd, 1970).

Reporting Past Events

Many planning and design activities are heavily guided by reports of past events, particularly accidents or other failures (Pezoldt, 1977; Rasmussen, 1980). One reconstructs the way in which a system should have operated, contrasts that with the way in which it actually operated, and uses that comparison to improve future design (perhaps assigning guilt and enacting penalties along the way).
Such retrospections are inevitably colored by the reporter's knowledge of what has happened. As common sense suggests and the citations below partially document, that coloring can be the source of needed detail or of systematic distortion. It has been found, for example, that people seem to exaggerate in hindsight what could have been (and was) known in foresight; they use explanatory schemes so complicated and so poorly specified as to defy empirical test; they remember people as having been more like their present selves than was actually the case; they fail to remember crucial acts that they themselves performed. These problems seem to afflict both the garden-variety retrospections evoked in laboratory studies and those of professional historians, strategic analysts, and eyewitnesses (Fischhoff, 1975).

One needed project is to make these studies available to those engaged in eliciting or using retrospective reports. Another is to attempt to replicate them in human factors domains. Of particular interest are cases in which the direction of bias has been documented sufficiently to allow recalibration of biased retrospections. In cases in which distortions are less predictable, techniques should be developed to help experts reconstruct their view of the situation before, during, and after the event.

For example, such research may show that people exaggerate the probability they assigned (or would have assigned) to past events before they occurred by about 20 percent, on the average. That knowledge may make it possible to adjust retrospective probability assessments, but not to eliminate distortions in the way particular events and causal links are drawn. For assigning blame or understanding how an accident situation looked to an operator just before things started to go wrong, strict (accurate) reconstruction is essential.
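The kind of recalibration contemplated here can be written out as simple arithmetic. The sketch below assumes that the hypothetical 20 percent exaggeration is multiplicative (a reported 0.60 stands for an original 0.50); the text does not say whether the figure would be multiplicative or additive, so that choice, the function name, and the sample values are all assumptions.

```python
def adjust_hindsight(reported, inflation=0.20):
    """Deflate a retrospective probability report.

    Assumes, for illustration only, that hindsight inflates the original
    probability multiplicatively by `inflation` on average.
    """
    if not 0.0 <= reported <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return reported / (1.0 + inflation)


# Hypothetical retrospective reports of "the probability I would have given".
reports = [0.60, 0.30, 0.90]
adjusted = [adjust_hindsight(p) for p in reports]
```

A correction of this form addresses only the average size of the exaggeration. It says nothing about which events or causal links a particular respondent has redrawn, which is the limitation noted above.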
For understanding how the system actually operates, one needs to be wary of the danger that experts have learned too much from a particular event, thereby misinterpreting the importance and generality of the causal forces involved. Generals who prepare for the last war may fit this stereotype, as may the operators of supervisory control systems who respond to each mishap by ensuring that it will not happen again, then rest confident that the system as a whole is now fail-safe.

Three research strategies appear to offer some promise for clarifying these questions. One is to review the reports of historians, judges, journalists, and others about how they detect and avoid biases. A second is to do theory-based experiments, looking at how memory accommodates new information, particularly to see which processes are reversible. The third is research on debiasing, looking at the effect of directly warning people, of raising the stakes riding on a decision, or of instructing them to change the structure of the task to one that uses their intellectual skills to better advantage.

CONCLUSION

Eliciting information from experts successfully is important to a variety of systems and organizations. The care taken in elicitation varies greatly, from detailed studies of the elicitation of some specific recurrent judgments, to careful deliberations unsupported by empirical research, to casual solutions. Even though elicitation is not a discipline per se, research such as that suggested in this chapter could focus more attention on it and make a body of knowledge accessible to designers. In part, that knowledge would be borrowed from related fields (with suitable translations); in part it would be created expressly to solve human factors problems.
Some of these projects could be undertaken in their own right; others would be best developed as part of ongoing projects, with more emphasis on elicitation than might otherwise be the case. The interdisciplinary aspect of many projects may generate interest in human factors problems on the part of workers in other fields (e.g., memory representation, workplace culture), and their expertise could contribute to human factors research.

REFERENCES

Agar, M. 1980 The Intimate Stranger. New York: Academic Press.
Armstrong, J. S. 1977 Long Range Forecasting. New York: Wiley.
Brooks, A., and Bailar, B. A. 1978 An Error Profile: Employment as Measured by the Current Population Survey. Statistical Policy Working Paper 3. Washington, D.C.: U.S. Department of Commerce.
Brookshire, D. S., Ives, B. C., and Schulze, W. D. 1976 The valuation of aesthetic preferences. Journal of Environmental Economics and Management 3:325-346.
Brown, J. S., and Van Lehn, K. 1980 Repair theory: a generative theory of bugs in procedural skills. Cognitive Science 4:379-426.
Daly, J. A., and Andriole, S. J. 1980 The use of events/interaction research by the intelligence community. Policy Sciences 12:215-236.
Fairley, W. B. 1977 Evaluating the "small" probability of a catastrophic accident from the marine transportation of liquefied natural gas. In W. B. Fairley and F. Mosteller, eds., Statistics and Public Policy. Reading, Mass.: Addison-Wesley.
Fischhoff, B. 1982 Debiasing. In D. Kahneman, P. Slovic, and A. Tversky, eds., Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Fischhoff, B. 1975 Hindsight ≠ foresight: the effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance 1:288-299.
Fischhoff, B., Slovic, P., and Lichtenstein, S. 1978 Fault trees: sensitivity of estimated failure probabilities to problem representation.
Journal of Experimental Psychology: Human Perception and Performance 4:330-344.
Fischhoff, B., and Whipple, C. 1981 Risk assessment: evaluating errors in subjective estimates. The Environmental Professional 3:272-281.
Waterman, D., and Hayes-Roth, J. 1982 An Investigation of Tools for Building Expert Systems. Report prepared for the National Science Foundation. Santa Monica, Calif.: Rand Corporation.
Hexter, J. H. 1971 The History Primer. New Haven: Yale University Press.
Hogarth, R. M., and Makridakis, S. 1981 Forecasting and planning: an appraisal. Management Science 27:115-138.
Jennergren, L. P., and Keeney, R. L. 1981 Risk assessment. In Handbook of Applied System Analysis. Laxenburg, Austria: International Institute of Applied Systems Analysis.
Kidd, J. B. 1970 The utilization of subjective probabilities in production planning. Acta Psychologica 34:338-347.
Lichtenstein, S., Fischhoff, B., and Phillips, L. D. 1982 Calibration of probabilities: state of the art to 1980. In D. Kahneman, P. Slovic, and A. Tversky, eds., Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Lindley, D. V., Tversky, A., and Brown, R. V. 1979 On the reconciliation of probability assessments. Journal of the Royal Statistical Society, Series A 142(Part 2):146-180.
Murphy, A. H., and Winkler, R. 1977 Can weather forecasters formulate reliable probability forecasts of precipitation and temperature? National Weather Digest 2:2-9.
National Research Council 1976 Surveying Crime. Panel for the Evaluation of Crime Surveys, Committee on National Statistics. Washington, D.C.: National Academy of Sciences.
Payne, S. L. 1952 The Art of Asking Questions. Princeton, N.J.: Princeton University Press.
Pezoldt, V. J. 1977 Rare Event/Accident Research Methodology. Washington, D.C.: National Bureau of Standards.
Poulton, E. C. 1977 Quantitative subjective assessments are almost always biased, sometimes completely misleading. British Journal of Psychology 68:409-425.
Rasmussen, J. 1980 What can be learned from human error reports. In K. D. Duncan, M. Gruneberg, and D. Wallis, eds., Changes in Working Life. New York: Wiley.
Rosenthal, R., and Rosnow, R. 1969 Artifact in Experimental Design. New York: Academic Press.
Singer, M. 1971 The vitality of mythical numbers. The Public Interest 23:3-9.
Spetzler, C. S., and Stael von Holstein, C-A. 1975 Probability encoding in decision analysis. Management Science 22:340-358.
Tihansky, D. 1976 Confidence assessment of military air frame cost predictions. Operations Research 24:26-43.
Turner, C., and Martin, E., eds. in press Surveying Subjective Phenomena. Panel on Survey-Based Measures of Subjective Phenomena, Committee on National Statistics, National Research Council. New York: Russell Sage.
U.S. Nuclear Regulatory Commission 1981 Fault Tree Handbook. Washington, D.C.: U.S. Nuclear Regulatory Commission.

SUPERVISORY CONTROL SYSTEMS

In the past 15 years the introduction of automation into working environments has created more and more jobs in which operators are given very high levels of responsibility and very little to do. The degree of responsibility and the amount of work vary from position to position, but the defining properties of such jobs are:

(1) The operator has overall responsibility for control of a system that, under normal operating conditions, requires only occasional fine tuning of system parameters in order to maintain satisfactory performance.
(2) The major tasks are to program changes in inputs or control routines and to serve as a backup in the case of a failure or malfunction in a system component.
(3) Important participation in system operation occurs infrequently and at unpredictable times.
(4) The time constraints associated with participation, when it occurs, can be very short, of the order of a few seconds or minutes.
(5) The values and costs associated with operator decisions can be very large.
(6) Good performance requires rapid assimilation of large quantities of information and the exercise of relatively complex inference processes.

The principal authors of this chapter are Thomas B. Sheridan, Baruch Fischhoff, Michael Posner, and Richard W. Pew.

These kinds of jobs are found in the process control industries, such as chemical plants and nuclear power plants. They are involved in the control of aircraft, ships, and urban rapid transit systems, robotic remote control systems for inspection and manipulation in the deep ocean, and computer-aided manufacturing. They are involved in medical patient-monitoring systems and law enforcement information and control systems. As computer aids are introduced into military command and control systems, such jobs become involved in that area. For example, the Army alone currently has 70 automated or computer-aided systems at the concept development stage (U.S. Army Research Institute, [year illegible]). The other services have similar projects under development.

The human factors problems involved in supervisory control systems can be classified into five categories.

1. Display. In the past these systems have used large arrays of meters and gauges or large situation boards and control panels to display information, with the general goal of displaying everything, because one never knows exactly what will be needed. Little attention has been paid to the need to assimilate diverse information sources into coherent patterns for making inferences simply and directly. Today computers are being used more and more in the control of these operations; large display panels are being collapsed into computer-generated displays that can call up the needed information on demand.
These developments in physical technology are pushing human factors engineers to devise better ways of coding and formatting large collections of information to facilitate interpretation and reliable decisions by operators. Also needed are better means of accessing information, means that are not opaque and do not leave operators confused in urgent and stressful situations.

2. Command. The emergence of powerful computers and robotic devices has necessitated the development of better "command languages," by which operators can convey instructions to a lower-level intelligence, perhaps giving examples or hints and providing criteria or preferences, and doing it in a communication mode that is natural and adaptable to different people and linguistic styles.

3. Operator's Model. We also lack well-developed methodologies for identifying the internal conceptual model on the basis of which an operator attempts to solve a problem. (This has also been called the operator's system image, picture, or problem space.) Incorrect operator's models can lead to disastrous results (e.g., Three Mile Island); it is obviously a matter of utmost importance for operators of military command and control systems to acquire proper conceptual models and keep them updated on a moment-by-moment basis in times of crisis.

4. Workload. We have no good principles of job design for operations in supervisory control systems, in part because it has proved extremely difficult to measure or estimate the mental workloads involved. They tend to be highly transient, varying from light and boring when the work is routine to extremely demanding when action is critical. At present there is no consensus on what mental workload is or how to measure it, especially in the context of supervisory control.

5. Proficiency and Error.
Issues of training and proficiency maintenance are critical in this kind of operation because each event is in some sense unique and is drawn from an extremely large set of possibilities, most of which will never occur during the operating life of the system. It is not easy to anticipate what types of errors will occur or how to train to prevent them.

SUPERVISORY CONTROL IN DIFFERENT APPLICATIONS

This section, adapted from Sheridan (1982), provides brief comparisons and contrasts among different applications of supervisory control systems: process control, vehicle control, and manipulators.

Process Control

The term process usually refers to a dynamic system, such as a fossil fuel or nuclear power generating plant or a chemical or oil production facility, that is fixed in space and operates more or less continuously in time. Typically time constants are slow--many minutes or hours may elapse after a control action is taken before most of the system response is complete.

Most such processes involve large structures with fluids flowing from one place to another and involve the use of heat energy to affect the fluid or vice versa. Typically such systems involve multiple personnel and multiple machines, and at least some of the people move from one location of the process to another. Usually there is a central control room where many measured signals are displayed and where valves, pumps, and other devices are controlled.

Supervisory control has been emerging as an element in process control for several decades. Starting with electromechanical controllers or control stations that could be adjusted by the operator to maintain certain variables within limits (a home thermostat is an example), special electronic circuits gradually replaced the electromechanical function. In such systems the operator can become part of the control loop by switching to manual control.
Usually each control station displays both the variable being controlled (e.g., room temperature for the thermostat) and the control signal (e.g., the flow of heat from the furnace). Many such manual control devices may be lined up in the control room, together with manual switches and valves, status lights, dials and recording displays, and as many as 1,500 alarms or annunciators--windows that light up to indicate what plant variable has just gone above or below limits. From the pattern of these alarms (e.g., 500 in the first minute of a loss-of-coolant accident and 800 in the second minute, by recent count, in a large new nuclear plant) the operator is supposed to divine what is happening.

The large, general-purpose computer has found its way into process control. Instead of multiple, independent, conventional proportional-integral-derivative controllers for each variable, the computer can treat the set of variables as a vector and compute the control trajectory that would be optimal (in the sense of quickest, most efficient, or whatever criterion is important). Because there are many more interactions than the number of variables, the variety of displayed signals and the number of possible adjustments or programs the human operator may input to the computer-controller are potentially much greater than before. Thus there is now a great need, accelerated since the events at Three Mile Island, to develop displays that integrate complex patterns of information and allow the operator to issue commands in a natural, efficient, and reliable manner. The term system state vector is a fashionable way to describe the display of minimal chunks of information (using G. A. Miller's well-known terminology) to convey more meaning about the current state vector of variables, where it has been in the past, and where it is likely to go in the near future.
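The conventional single-variable controller mentioned above can be illustrated with a short sketch. This is a textbook discrete proportional-integral-derivative loop, not a description of any particular plant's control station; the class name `PIDController`, the gains, the time step, and the readings are assumptions chosen for illustration.

```python
class PIDController:
    """Textbook discrete PID loop for one controlled variable (illustrative only)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None  # no derivative term is available on the first sample

    def update(self, setpoint, measurement):
        """Return the control signal for one sampling interval."""
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical thermostat-style loop: the gains and readings are made up.
controller = PIDController(kp=2.0, ki=0.5, kd=0.1, dt=1.0)
first_signal = controller.update(setpoint=10.0, measurement=4.0)
second_signal = controller.update(setpoint=10.0, measurement=7.0)
```

A plant built on this scheme runs one such loop per variable, each unaware of the others. The vector approach described in the text would instead compute all control signals jointly, which is what multiplies the number of adjustments an operator may have to consider.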
Vehicle Control

Unlike the processes described above, vehicles move through space and carry their operators with them or are controlled remotely. Various types of vehicles have come under a significant degree of supervisory control in the last 30 years.

We might start with spacecraft because, in a sense, their function is the simplest. They are launched to perform well-defined missions, and their interaction with their environment (other than gravity) is nil. In other words, there are no obstacles and no unpredictable traffic to worry about. It was in spacecraft, especially Apollo, that human operators who were highly skilled at continuous manual control (test pilots or "joy stick jockeys") had to adapt to a completely new way of getting information from the vehicle and giving it commands--this new way was to program the computer. The astronauts had to learn to use a simple keyboard with programs (different functions appropriate to different mission phases), nouns (operands, or data to be addressed or processed), and verbs (operations, or actions to be performed on the nouns).

Of course, the astronauts still performed a certain number of continuous control functions. They controlled the orientation of the vehicle and maneuvered it to accomplish star sighting, thrust, rendezvous, and lunar landing. But, as is not generally appreciated by the public, control in each of these modes was heavily aided. Not only were the manual control loops themselves stabilized by electronics, but also nonmanual, automatic control functions were being simultaneously executed and coordinated with what the astronauts did.

In commercial and military aircraft there has been more and more supervisory control in the last decade or two. Commercial pilots are called flight managers, indicative of the fact that they must allocate their attention among a large number of separate but complex computer-based systems.
Military aircraft are called flying computers, and indeed the cost of the electronics in them now far exceeds the cost of the basic airframe.

By means of inertial measurement, a feature of the new jumbo jets as well as of military aircraft, the computer can take a vehicle to any latitude, longitude, and altitude within a fraction of a kilometer. In addition there are many other supervisory command modes intermediate between such high-level commands and the lowest level of pure continuous control of ailerons, elevators, and thrust. A pilot can set the autopilot to provide a display of a smooth command course at fixed turn or climb rates to follow manually or can have the vehicle slaved to this course. The autopilot can be set to achieve a new altitude or a new heading. The pilot can lock onto radio beams or radar signals for automatic landing. In the Lockheed L-1011, for example, there are at least 10 separate identifiable levels of control. It is important for the pilot to have reliable means of breaking out of these automatic control modes and reverting to manual control or some intermediate mode. For example, when in an automatic landing mode the pilot can either push a yellow button on the control yoke or jerk the yoke back to manually get the aircraft back under direct control.

Air traffic control poses interesting supervisory control problems, for the headways (spacing) between aircraft in the vicinity of major commercial airports are getting tighter and tighter, and efforts both to save fuel and to avoid noise over densely populated urban areas require more radical takeoff and landing trajectories. New computer-based communication aids will supplement purely verbal communication between pilots and ground controllers, and new display technology will help the already overloaded ground controllers monitor what is happening in three-dimensional space over larger areas, providing predictions of collision and related vital information.
The CDTI (cockpit display of traffic information) is a new computer-based picture of weather, terrain hazards such as mountains and tall structures, course information such as way points, radio beacons and markers, and runways and command flight patterns as well as the position, altitude, heading (and even predicted position) of other aircraft. It makes the pilot less dependent on ground control, especially when out-the-window visibility is poor.

More recently ships and submarines have been converting to supervisory control. Direct manual control by experienced helmsmen, which sufficed for many years, has been replaced both by the installation of inertial navigation, which calls for computer control and provides capability never before available, and by the trends toward higher speed and long time lags produced by larger size (e.g., the new supertankers). New autopilots and computer-based display aids, similar to those in aircraft, are now being used in ships.

Manipulators and Discrete Parts Handling

In a sense, manipulators combine the functions of process control and vehicle control. The manipulator base may be carried on a spacecraft, a ground vehicle, or a submarine, or its base may be fixed. The hand (gripper, end effector) is moved relative to the base in up to three degrees of translation and three degrees of rotation. It may have one degree of freedom for gripping, but some hands have differentially movable fingers or otherwise have more degrees of freedom to perform special cutting, drilling, finishing, cleaning, welding, paint spraying, sensing, or other functions.

Manipulators are being used in many different applications, including lunar moving vehicles, undersea operations, and hazardous operations in industry. The type of supervisory control and its justification differ according to the application.
The three-second time delay in the earth-lunar control loop, which results from round-trip radio transmission from earth, leads to instabilities unless an operator waits three seconds after each of a series of incremental movements. This makes direct manual control time-consuming and impractical. Sheridan and Ferrell (1967) proposed having a computer on the moon receive commands to complete segments of a movement task locally, using local sensors and local computer program control. They proposed calling this mode supervisory control. Delays in ... control loops that report position and velocity. If the parts conveyor is sufficiently reliable, welding or painting nonexistent objects seldom occurs, so that more sophisticated feedback, involving touch or vision, is usually not required. Manufacturing assembly, however, has proven to be a far more difficult task.

In contrast to assembly line operations, in which every task is prespecified even if there is a mix of products, in many new applications of manipulators with supervisory control each new task is unpredictable to a considerable extent. Some examples are mining, earth moving, building construction, building and street cleaning and maintenance, trash collection, logging, and crop harvesting, in which large forces and power must be applied to external objects. The human operator is necessary to program or otherwise guide the manipulator in some degrees of freedom, to accommodate each new situation; in other respects certain characteristic motions are preprogrammed and need only to be initiated at the correct time.

In some medical applications, such as microsurgery, the goal is to minify rather than enlarge motions and forces, to extend the surgeon's hand tools through tiny body cavities to cut, to obtain tissue samples, to remove unhealthy tissue, or to stitch.
Again, the surgeon controls some degrees of freedom (e.g., of an optical probe or a cauterizing snare), while automation controls other variables (e.g., air or water pressure).

THEORY AND METHOD

There are a number of limited theories and methods in the human factors literature that should be brought to bear on the use of supervisory control systems. A great deal remains to be done, however, to apply them in this context. The discussion that follows deals with five aspects of the problem. The first considers current formal models of supervisory control. The second discusses display and command problems. The third takes up computer knowledge-based systems and their relation to the internal cognitive model of the operator for on-line decision making in supervisory control. The fourth deals with mental workload, stress, and research on attention and resource allocation as they relate to supervisory control. The fifth is concerned with issues of human error, system reliability, trust, and ultimate authority.

Modeling Supervisory Control

In the area of real-time monitoring and control of continuous dynamic processes, the optimal control model (Baron and Kleinman, 1969) describes the perceptual motor behavior of closed-loop systems having relatively short time constants. Limited experimentation on this topic suggests that this class of model may be broadened to represent monitoring and discrete decision behavior in dynamic systems in which control is infrequent (Levison and Tanner, 1971). There are also attempts to extend this work to explore its applicability to more complex systems (Baron et al., 1981; Kok and Stassen, 1980).
An increasing number of supervisory control systems can be represented by a hierarchy of three kinds of interaction (Sheridan, 1982): (1) a human operator interacting with a high-level computer, (2) low-level computers interacting with physical entities in the environment, and (3) the resulting multilevel and multiloop interaction, having interesting symmetrical properties (Figure 4-1). Since there are three levels of intelligence (one human, two artificial), the allocation of cognitive and computational tasks among the three becomes central. Using Rasmussen's (1979) categorization of behavior into knowledge-based, rule-based, and skill-based behavior, the operator may assign rule-based tasks (e.g., pattern recognition, running planning and predictive models, organizing) to the high-level computer (Figure 4-2). Similarly, skill-based tasks (filtering, display generation, servo-control) may be assigned to various low-level computers. The operator must concentrate on the environmental tasks that compete for his or her attention, allocating that attention among five roles: (1) planning what to do next, (2) teaching, or on-line programming of, the computer(s), (3) monitoring the (semi)automatic behavior of the system for abnormalities, (4) intervening when necessary to make adjustments, maintaining, repairing, or assuming direct control, and (5) learning from experience.

[FIGURE 4-1 Multiloop Interaction in a Supervisory Control System. Blocks shown: human operator, displays, controls, HIS computer, SAS computer, sensors, actuators, task. Legible legend entries: 1. Task is observed directly by human operator. 2. Task is observed indirectly through sensors, computers, and displays; this SAS feedback interacts with that from within HIS. 3. Task is controlled within SAS automatic mode. 4. Task is affected by the process of being sensed. 5. Task affects actuators and in turn is affected. 6. Human operator directly affects task. 7. Human operator affects task indirectly through controls, HIS computers, and actuators; this control interacts with that from within SAS. 10. Human operator adjusts display parameters.]

[FIGURE 4-2 Multilevel Allocation of Tasks in a Supervisory Control System. Levels shown: human operator (knowledge-based behavior), Human Interactive Subsystem, HIS (rule-based behavior), and Semi-Automatic Subsystems, SAS (skill-based behavior).]

Display and Command

Design of integrated computer-generated displays is not a new problem, and the military services and space agencies have pioneered developments in this area for aircraft and various command and control systems. But the technology continues to create more possibilities.

Operators of supervisory control systems need to have fewer displays, not more, telling them what they want or need to know when they want or need to know it. An additional design problem is that what operators think they need and what they really need may differ. As computer collaborators become more and more sophisticated, a useful type of display would tell the operator what the computer knows and assumes, both about the system and about the operator, and what it intends to do.

An important source of guidance regarding the design of displays has been and will continue to be the intuitive beliefs of experienced operators. The designer needs to know how much credence to give to these intuitions. Too little attention may mean forfeiting a valuable source of information; too much may result in inappropriate designs that fit untested folk wisdom (a pilot's belief in the value of verisimilitude in displays is an example of the latter problem).
Ericsson and Simon's taxonomy (1980) of situations in which introspection is more and less valid is one point of departure for research. Studies of metacognition, people's understanding of their own cognitive processes (as contrasted with current psychological understanding), are a second (Cavanaugh and Borkowski, 1980). The studies of clinical judgment conducted in the 1950s and 1960s (Goldberg, 1968) are a third. These studies found that in the course of their diagnoses expert clinicians imagine that they rely on more variables and use them in a more complex manner than appears to be the case from attempts to model their diagnostic processes.

Although good-quality computer-generated speech is both available and cheap, and although it can give operators warnings and other information without their prior attention being directed to it, little imaginative use of such a capability has been made as yet in supervisory control.

The use of command language has arisen more recently in conjunction with teaching or programming robot systems. A more primitive form of it is found in the new autopilot command systems in aircraft. Giving commands to a control system by means of strings of symbols in syntax is a new game for most operators. Progress in this area depends on careful technology transfer from data processing that is self-paced to dynamic control in which the pace is determined by many factors. Naturalness in use of such language is also an important goal.

Command, in many circumstances, is not a solitary task. The operator must interact with many individuals in order to get a job done. This may be particularly the case when the nature of the emergency means that the technical system cannot be trusted to report and respond reliably--that is, an interacting human system may assume (and perhaps interface with) some of the functions of the interacting technical system.
The kinds of human interaction possible include requesting information, monitoring the response of the system, notifying outsiders (e.g., for evacuation, to provide special skills), and terminating unnecessary communications. When are these interactions initiated? How valid are the cues? What features of technical systems make such intervention more and less feasible? How does having others around affect operators' thoughts and actions (e.g., are they more creative, more risk-averse, more careful)?

Another question that arises with multiperson systems is whether one individual (or group) should both monitor for and cope with crises. In medicine it is not always assumed that the same individual has expertise in both diagnosis and treatment. Perhaps in supervisory control systems the equivalent functions should be separated, and different training and temperament called for in monitoring and in intervention.

Computer Knowledge-Based Systems and the Operator's Internal Cognitive Model

It is not a new idea that, in performing a task, people somehow represent the task in their heads and calculate whether, given certain constraints, doing this will result in that. Such ideas derive from antiquity.

Human-Machine Control

In the 1950s the development of the "observer" in control systems theory formalized this idea. That is, a differential equation model of the external controlled process is included in the automatic controller and is driven by the same input that drives the actual process. Any discrepancy between the output of this computerized model of the environmental process and the actual process is fed back as a correction to the internal model to force its variables to be continuously the same as the actual process.
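The observer idea can be made concrete with a short numerical sketch. The process below (a position and velocity pair driven by a commanded acceleration, with only position measured), the function names, the time step, and the correction gains are all assumptions chosen for illustration; they are not taken from the report. The internal model is driven by the same input as the process, and the discrepancy between measured and modeled position is fed back to correct the model, so that the unmeasured velocity is recovered.

```python
def step_process(state, u, dt):
    """Advance a hypothetical process: position and velocity under input u."""
    pos, vel = state
    return (pos + dt * vel, vel + dt * u)


def step_observer(est, u, y, gains, dt):
    """Advance the internal model and correct it with the output discrepancy.

    est   : (position estimate, velocity estimate) held by the internal model
    u     : the same input that drives the actual process
    y     : measured position of the actual process
    gains : (l1, l2) correction gains, assumed values for this sketch
    """
    l1, l2 = gains
    discrepancy = y - est[0]          # actual output minus model output
    pred_pos, pred_vel = step_process(est, u, dt)
    return (pred_pos + l1 * discrepancy, pred_vel + l2 * discrepancy)


def simulate(steps=60, dt=0.1, gains=(1.0, 2.5)):
    """Run process and observer side by side from mismatched starting states."""
    true_state = (0.0, 2.0)           # actual velocity is unknown to the model
    est = (0.0, 0.0)                  # the model starts with a wrong velocity
    for _ in range(steps):
        u = 1.0                       # constant commanded acceleration
        y = true_state[0]             # only position is measured
        est = step_observer(est, u, y, gains, dt)
        true_state = step_process(true_state, u, dt)
    return true_state, est


if __name__ == "__main__":
    true_state, est = simulate()
    print("true velocity:     ", true_state[1])
    print("estimated velocity:", est[1])
```

With these assumed gains the estimation error shrinks by roughly half at each step, so after a few seconds of simulated time the model's velocity tracks the actual velocity even though velocity is never measured.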
Then any and all state variables as represented (observed) in the internal model may be used to directly control the process when direct measurement of those same variables in the actual environment is costly, difficult, or impossible. This physical realization of the traditional idea of the internal model probably provoked much of the current research in cognitive science.

Running in fast-time, updating initial conditions at each of a succession of such calculations, the model becomes a "predictor display" that provides the operator with a projection of what will happen under given assumptions of input (Kelly, 1968). Further comparisons can be made between outputs of such real-time models run in the computer and those of the operator's own internal model, not only for control but also for failure detection and isolation (Sheridan, 1981). Tsach has developed a realization of this as an operator aid for application to process control (Tsach et al., 1982). Ideally the computer should keep the operator informed of what it is assuming and computing, and the operator should keep the computer informed of what he or she is thinking.

Cognitive Science

In the last several years cognitive psychology has contributed some theories about human inference that make the application of knowledge-based systems particularly relevant to supervisory control. The idea is that reasoning and decision making consist of the developing and searching of complex problem spaces (Newell and Simon, 1972) and of applying one or more inference procedures about information in a knowledge base that represents the decision maker's understanding of the situation (Collins and Loftus, 1975). This is similar to but more inclusive and less well developed than the internal process model used by control theorists. Rasmussen's (1979) qualitative model of human decision making about process control is entirely compatible with this view.
And the contribution of specialists in artificial intelligence concerning knowledge-based systems provides one way to implement the computer portion of such human-computer interaction.

A number of human factors problems relate to people's ability to hold in mind the basic workings of a complex system and to update that view depending on the current state of the system. Recent studies of cognitive processes in skilled operators such as taxi drivers (Chi et al., 1980) or chess players (Chase and Simon, 1973) begin to provide the kind of information that will be needed by human factors designers evaluating these issues. For example, how can people best be trained to develop effective problem spaces? What is the optimal mix of analog and digital representation? How can the computer's data base system be used to aid the individual in developing and updating such an internal model? What means can be used to ensure that the current state of the model fits with the current state of the system? With what frequency should a person be interrogated about his or her current view of the model to make sure that he or she is still "with it" in control of the system? For human supervision to be really effective, a detailed understanding of how the human controller grasps a complex system at any moment in time and updates it over time is necessary.

How can we determine a given operator's internal cognitive model of a given task at a given time? One method is to ask the operator to express it in natural language, but the obvious difficulty is that each operator's expression is unique, making it very difficult either to measure discrepancy from reality or to compare across operators. Verbal protocol techniques (Bainbridge, 1974) make use of key words and relations.
More formal psychometric techniques (multiattribute utility assessment, conjoint or multidimensional scaling, interpretive structural modeling, policy capturing, and fuzzy set theory) offer some promising ways of telling a computer one's knowledge and values in structural form.

A likely (and perhaps common) source of difficulty is a mismatch between the mental models of a system held by those who design it and those who operate it. Operators who fail to recognize this disparity are subject to unpleasant surprises when the system behaves in unexpected ways. Operators who do recognize it may fail to exploit the full potential of the system for fear of surprises if they push it into unfamiliar territory (Young, 1981). On a descriptive level, it would be useful to understand the correspondence between the mental models of designers and operators as well as to know which experiences signal operators that there is a mismatch and how they cope with that information. On a practical level, it would be useful to know more about the possibility of improving the match of these two models by steps such as involving operators more in the design process or showing them how the design evolved (rather than giving them a reconstruction of its final state). The magnitude of these problems is likely to grow to the extent that designers and operators have different training, experience, and intensity of involvement with systems.

Mental Workload

The concept of mental workload as discussed in this section is not unique to supervisory control, but it is sufficiently important in this context to be included here as a special consideration.

Human-Machine Control

(This section is adapted from Sheridan and Young, 1982.)

During the last decade "mental workload" has become a concept of great controversy, not because of disagreement over whether it is important, but because of disagreement over how to define and measure it.
Military specifications for mental workload are nevertheless being prepared by the Air Force, based on the assumption that mental workload measures will predict--either at the design stage or during a flight or other operation--whether an operation can succeed. In other words, it is believed that measurements of mental workload are more sensitive in anticipating when pilot or operator performance will break down than are conventional performance measures of the human-machine system.

At the present time "mental workload" is a construct. It must be inferred; it cannot be observed directly like human control response or system performance, although it might be defined operationally in terms of one or several or a battery of tests. There is a clear distinction between mental and physical workload: the latter is the rate of doing mechanical work and expending calories. There is consensus on measurements based on respiratory gases and other techniques for measuring physical workload.

Of particular concern are situations having sustained mental workloads of long duration. Many aircraft missions continue to require such effort by the crew. But the introduction of computers and automation in many systems has come to mean that for long periods of time operators have nothing to do; the workload may be so low as to result in boredom and serious decrement in alertness. The operator may then suddenly be expected to observe events on a display and make critical judgments--indeed, even to detect an abnormality, diagnose what failed, and take over control from the automatic system. One concern is that the operator, not being "in the loop," will not have kept up with what is going on, and will need time to reacquire that knowledge and orientation to make the proper diagnoses or take over control.
Also of concern is that at the beginning of the transient the computer-based information will be opaque to the operator, and it will take some time even to figure out how to access and retrieve from the system the needed information.

There have been four approaches to measuring mental workload. One approach, used by the aircraft manufacturers, avoids coping directly with measurements of the operator per se and bases workload on a task time-line analysis: the more tasks the operator has to do per unit of time, the greater the workload. This provides a relative index of workload that characterizes task demand, other factors being equal. It says nothing about the mental workload of any actual person and indeed could apply to a task performed by a robot.

The second approach is perhaps the simplest: to use the operator's subjective ratings of his or her perceived mental workload. This may be done during or after the events judged. One form of this is a single-category scale similar to the Cooper-Harper scale for rating aircraft handling quality. Perhaps more interesting is a three-attribute scale, there being some consensus that "fraction of total time busy," "cognitive complexity," and "emotional stress" are rather different characteristics of mental workload and that one or two of these can be large when the other(s) are small. These scales have been used by the military services as well as aircraft manufacturers. A criticism is that people are not always good judges of their own ability to perform in the future. Some pilots may judge themselves to be quite capable of further sustained effort at a higher level when in fact they are not.

The third approach is the so-called secondary task or reserve capacity technique.
In it a pilot or operator is asked to allocate whatever attention is left over from the primary task to some secondary task, such as verbally generating random numbers, tracking a dot on a screen with a small joy stick, etc. Theoretically, the better the performance on the secondary task, the less the time required and therefore the less the mental workload of the primary task. A criticism of this technique is that it is intrusive; it may itself reduce the attention allocated to the primary task and therefore be a self-contaminating measure. And in real flight operations the crew may not be so cooperative in performing secondary tasks.

The fourth and final technique is really a whole category of partially explored possibilities: the use of physiological measures. Many such measures have been proposed, including changes in the electroencephalogram (ongoing or steady-state), evoked response potentials (the best candidate is the attenuation and latency of the so-called P300, occurring 300 milliseconds after the onset of a challenging stimulus), heart rate variability, galvanic skin response, pupillary diameter, and frequency spectrum of the voice. All of these have proved to be noisy and unreliable.

Both the Air Force and the Federal Aviation Administration currently have major programs to develop workload measurement techniques for aircraft piloting and traffic control.

If an operator's mental workload appears to be excessive, there are several avenues for reducing it or compensating for it. First, one should examine the situation for causal factors that could be redesigned to be quicker, easier, or less anxiety-producing. Or perhaps parts of the task could be reassigned to others who are less loaded, or the procedure could be altered so as to stretch out in time the succession of events loading the particular operator. Finally, it may be possible to give all or part of the task to a computer or automatic system.
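The task time-line approach described earlier in this section can be illustrated with a small sketch. The report gives no formula, so the index used here (required task time within a window divided by the length of the window) is an assumed, simplified form, and the function name and the sample task list are hypothetical.

```python
def timeline_workload(tasks, window_start, window_end):
    """Return required task time in a window divided by the window length.

    tasks is a list of (start_time, duration) pairs in seconds. Concurrent
    tasks are counted separately, so the index can exceed 1.0 when the
    time line demands more than one task at once.
    """
    window = window_end - window_start
    if window <= 0:
        raise ValueError("window_end must be later than window_start")
    busy = 0.0
    for start, duration in tasks:
        end = start + duration
        # Count only the part of the task that falls inside the window.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            busy += overlap
    return busy / window


if __name__ == "__main__":
    # Hypothetical time line: two overlapping tasks and one short late task.
    tasks = [(0.0, 4.0), (2.0, 4.0), (8.0, 1.0)]
    print(timeline_workload(tasks, 0.0, 10.0))  # whole period
    print(timeline_workload(tasks, 0.0, 5.0))   # busiest stretch
```

As the text notes, an index of this kind characterizes task demand only; it says nothing about the state of any particular operator. Stretching out the succession of events, one of the remedies mentioned above, lowers the index in the busiest window without changing the total work.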
Cognitive Science

It is important, for purposes of evaluating both mental workload and cognitive models as discussed in the previous section, to note that there has been an enormous change in models of mental processing in both psychology and computer science. In their recent paper, Feldman and Ballard (in press) argue that

"Contemporary computer science has sharpened our notions of what is 'computable' to include bounds on time, storage and other resources. It does not seem unreasonable to require that computational models and cognitive science be at least as plausible in their postulated resource requirement. The critical resource that is most obvious is time. Neurons, whose basic computational speed is a few milliseconds must be made to account for complex behaviors which are carried out in a few hundred milliseconds . . . (Posner, 1978). This means that higher complex behaviors are carried out in less than a hundred time steps. It may appear that the problem posed here is inherently unsolvable and that we have made an error in our formulation, but recent results in computational complexity theory suggest that networks of active computing elements can carry out at least simple computations in the required time range--these solutions involve using massive numbers of units and connections and we also address the question of limitations on these resources."

There is also evidence from experimental psychology (Posner, 1978) that the human mind is, at least in part, a parallel system. From neuropsychological considerations there is reason to suppose that a parallelism is represented in regional areas of the brain responsible for different sorts of cognitive functions. For example, we know that different visual maps (Covey, 1979) underlie object recognition and that separate portions of the cortex are involved in the comprehension and production of language.
We also know more about the role of subcortical and cortical structures in motor control.

The study of mental workload has simply not kept up with these advances in the conceptualization of the human mind as a complex of subsystems. The majority of researchers of human workload have studied the interference of one complex task with another. There is abundant evidence in the literature that such interference does occur. However, this general interference may account for only a small part of the variance in total workload. More important may be the effects of the specific cognitive systems shared by two tasks. Indeed, Kinsbourne and Hicks (1978) have recently formulated a theory of attention in which the degree of facilitation or interference between tasks depends on the distance between their cortical representation. The notion of distance may be merely metaphorical, since we do not know whether it represents the actual physical distance on the cortex or whether it involves a relative interconnectivity of cortical areas; the latter idea seems more reasonable.

Viewing humans in terms of cognitive subsystems changes the perspective on mental workload (see Navon and Gopher, 1979). It is unusual for any human task to involve only a single cognitive system or to occur at any fixed location in the brain. Most tasks differ in sensory modality, in central analysis systems, and in motor output systems. There is need for basic research to understand more about the separability and coordination of such cognitive systems. We also need a task analysis that takes advantage of the new cognitive systems approach to ask how tasks distribute themselves among different cognitive systems and when performance of different tasks may draw on the same cognitive system. There is also an obvious connection between a cognitive systems approach and analysis of individual differences based on psychometric and information processing concepts, and much needs to be done to link analysis of individual abilities to the ability to time-share activity within the same cognitive system or across different systems (Landman and Hunt, 1982).

An emphasis on separable cognitive systems does not necessarily mean that a more unified central controlling system is unnecessary. Indeed, widespread interference between tasks of very different types (Posner, 1980) suggests that such a central controller is a necessary aspect of human performance. There are a number of theoretical views addressing the problem of self-regulation of behavior, particularly in stressful situations.

Two principles have been applied by human factors engineers. The first is that attention narrows under stress. Thus, more attention is allocated to central aspects of the task while less attention is allocated to more peripheral or secondary aspects. Sometimes this principle has been applied to positions in visual space, arguing that peripheral vision is sacrificed more than central vision under stress. The degree to which the general principle applies automatically to positions in visual space or to allocation of function within tasks is simply not very well understood--but it should be.

A second principle of the relationship between stress and attention suggests that under stress habitual behaviors take precedence over new or novel behaviors. The idea is that behaviors originally learned under stressful conditions tend to return when conditions are again stressful. This view is particularly important with respect to the process of changing people from one task layout to another. If the original learning takes place under high stress conditions while transition occurs under relatively low stress conditions, a stressful situation may tend to reinstate the responses learned in the original configuration.
Recently cognitive psychologists have begun to take into account emotional responses produced under conditions of stress (Bower, 1981). One development emphasizes links between individual differences in emotional responding and attention (see Posner and Rothbart, 1980, for a review). Although it is a highly speculative hypothesis at this time, this work suggests that attention may be viewed as a method for controlling the degree of emotional responding that occurs during stressful conditions. In particular, differences in personality and temperament may affect the degree to which attention and other mechanisms are successful in managing stress. These new models relate emotional responding to more cognitive processes. They have the potential of helping us understand more about the effects of emotion and how it may guide cognition and behavior under stressful conditions. Since this work has just begun, there are few general principles to link the emotional responses to cognition as yet. Developments along this line could be useful for human factors engineers, particularly those involved in training and retraining and those involved in management of stress under battlefield conditions.

For the most part, this discussion has been from the viewpoint of the overloaded operator. For much of the time, however, the operator may be underloaded. In the field of vigilance research, which is concerned with human behavior in systems in which signal detection is required but the signals are infrequent and difficult to detect, a great deal is known about exactly what parameters of signal presentation affect performance. The signal detection model (Green and Swets, 1966) has been shown to be useful in analyzing such behavior. Again, its applicability has not been evaluated in more complex tasks in which signals are represented by more complex patterns of activity, as would be the case in supervisory control systems of the types described above.
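As a brief illustration of how the signal detection model is commonly applied to monitoring data, the sketch below computes the two standard indices of the equal-variance Gaussian form of the model: sensitivity (d') and response criterion (c). The function name and the sample rates are hypothetical, and the equal-variance assumption is made here for simplicity rather than drawn from the report.

```python
from statistics import NormalDist


def detection_indices(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    hit_rate         : proportion of signals that were reported
    false_alarm_rate : proportion of non-signals reported as signals
    Both rates must lie strictly between 0 and 1, because the inverse
    normal transform is undefined at the extremes.
    """
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa            # separation of signal and noise
    criterion = -0.5 * (z_hit + z_fa) # positive values mean a cautious observer
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical rates from early and late in a monitoring watch.
    print(detection_indices(0.80, 0.20))
    print(detection_indices(0.60, 0.05))
```

Separating the two indices matters for vigilance work: a drop in reported signals over a watch can reflect either reduced sensitivity or a shift toward a more cautious criterion, and the raw hit rate alone cannot distinguish the two.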
Human Proficiency and Error; Culpability, Trust, and Ultimate Authority

Designers of the large, complex, capital-intensive, high-risk-of-failure systems we have been discussing would like to automate human operators out of their systems. But they know they must depend on them to plan, program, monitor, step in when failures occur with the automation, and generalize on system experience. They are also terrified of human error. Both the commercial aviation and the nuclear power industries are actively collecting data on human error and trying to use it analytically in conjunction with data on failures in physical components and subsystems to predict the reliability of overall systems. The public and the Congress, in a sense, are demanding it, on the assumption that it is clear what human error is, how to measure it, and even how to stop it.

Human error is commonly thought of as a mistake of action or judgment that could have been avoided had the individual been more alert, attentive, or conscientious. That is, the source of error is considered to be internal and therefore within the control of the individual and not induced by external factors such as the design of the equipment, the task requirements, or lack of adequate training.

Some behavioral scientists may claim that people err because they are operating "open loop"--without adequate feedback to tell them when they are in error. They would have supervisory control systems designers provide feedback at every potential misstep. Product liability litigants sometimes take a more extreme stance--that equipment should be designed so that it is error proof, without the opportunity for people to (begin to) err, get feedback, then correct themselves.

The concept of human error needs to be examined.
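
One simple way to use human error data "in conjunction with" component failure data is a series model, in which the system succeeds only if every element, human or physical, succeeds. The sketch below assumes independent failures and a purely series structure, neither of which the text asserts; the function name and all probabilities are invented for illustration.

```python
from math import prod


def series_reliability(failure_probs):
    """Probability that every element succeeds, assuming independent failures."""
    for p in failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(1.0 - p for p in failure_probs)


# Hypothetical per-demand failure probabilities.
hardware = [0.001, 0.002]   # two physical components
human = [0.01]              # one operator action

r_hardware_only = series_reliability(hardware)
r_with_operator = series_reliability(hardware + human)
print(f"hardware only: {r_hardware_only:.6f}")
print(f"with operator: {r_with_operator:.6f}")
```

The calculation presupposes that a human error probability can be defined and measured at all, which is exactly the assumption questioned here. Dependent or common-mode failures would also violate the independence assumption, so the product should be read as an illustration rather than a prediction.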
The assertion that an error has been committed implies a sharp and agreed-upon dividing line between right and wrong, a simple binary classification that is obviously an oversimplification. Human decision and action involve a multidimensional continuum of perceiving, remembering, planning, even socially interacting. Clearly the fraction of errors in any set of human response data is a function of where the boundary is drawn. How does one decide where to draw the line dividing right from wrong across the many dimensions of behavior? In addition, is an error of commission (e.g., actuating a switch when it is not expected) equivalent to an error of omission (e.g., failing to actuate a switch when it is expected)? Is it useful to say, in both these instances, that an error has been committed? What then exactly do we mean by human error?

People tend to differ from machines in that people are more inclined to make "common-mode errors," in which one failure leads to another, presumably because of concurrency of stimuli or responses in space or time. Furthermore, as suggested earlier, if a person is well practiced in a procedure ABC, and must occasionally do DBE, he or she is quite likely in the latter case to find himself or herself doing DBC. This type of error is well documented in process control, in which many and varied procedures are followed. In addition, when people are under stress of emergency they tend more often to err (sometimes, however, analysts may assume that operators are aware of an emergency when they are not). People are also able to discover and correct their own errors, which they surely do in many large-scale systems to avert costly accidents.

Presumably the rationale for defining human error is to develop means for predicting when errors are likely to occur and for reducing their frequency (Swain and Guttman, 1980). Various taxonomies of human error have been devised.
There are errors of omission and errors of commission. Errors may be associated with sensing, memory, decision making, or motor skill. Norman (1981) distinguishes mistakes (wrong intention) from slips (correct intention but wrong action). But at present there is no accepted taxonomy on which to base the definition of human error, nor is there agreement on the dimensions of behavior that should be invoked in such a taxonomy.

There is usefulness in both a case study approach to human error and in the accumulation of statistics on errors that lead to accidents. Both these approaches, however, require that the investigator have a theory or model of human error or accident causation and the framework from which to approach the analysis. In addition there is a need to understand the causal chain between human error and accident.

One has only to examine a sampling of currently used accident reporting forms to realize the importance of the need for a framework for analyzing human error. They range from medical history forms to equipment failure reports. None that we have examined deals satisfactorily with the role of human behavior in contributing to the accident circumstances.

Furthermore, for accident reports to be useful, their aim needs to be specified. There is an inherent conflict between the goals of understanding what happened and attempting to fix blame for it. The former requires candor, whereas the latter discourages it. Other potential biases in these reports include: (a) exaggerating in hindsight what could have been anticipated in foresight; (b) being unable to reconstruct or retrieve hypotheses about what was happening that no longer make sense in retrospect; (c) telescoping the sequence of events (making their temporal course seem shorter and more direct); (d) exaggerating one's own role in events; (e) failing to see the internal logic of others' actions (from their own perspective).
Variants of these reporting biases have been observed elsewhere (Nisbett and Ross, 1980). Their presence and virulence in accident reports on supervisory control systems merit attention.

In addition to these fundamental research needs, there is a variety of related issues particularly relevant to supervisory control systems that should be addressed.

In supervisory control systems it is becoming more and more difficult to establish blame, for the information exchange between operators and computers is complex, and the "error," if there ever was any, could be in hardware or software design, maintenance, or management.

Most of us think we observe that people are better at some kinds of tasks than computers, and computers are better at some others. Therefore, it seems that it would be quite clear how roles should be allocated between people and computers. But the interactions are often so subtle as to elude understanding. It is also conventional wisdom to say that people should have the ultimate authority over machines. But again, in actual operating systems we usually find ourselves ill prepared to assert which should have authority under what circumstances and for how long.

Operators in such systems usually receive fairly elaborate training in both theory and operating skills. The latter is or should be done on simulators, since in actual systems the most important (critical) events for which the operator needs training seldom occur. Unfortunately there has been a tendency to standardize the emergencies (classic stall or engine fire in aircraft, large-break loss-of-cooling accident in nuclear plants) and repeat them on the simulator until they become fixed patterns of response. There seldom is emphasis on responding to new, unusual emergencies, failures in combination, etc., which the rule book never anticipated. Simulators would be especially good for such training.
A frustrating, and perhaps paradoxical, feature of "emergency" intervention is that supervisors must still rely on and work with systems that they do not entirely trust. The nature and success of their intervention is likely to depend on their appraisal of which aspects of the system are still reliable. Research might help predict what doubts about related malfunctions are and are not aroused by a particular malfunction. Does the spread of suspicion follow the operator's mental model (e.g., lead to other mechanically connected subsystems) or proceed along a more associative line (e.g., mistrust all dials)?

A related problem is how experience with one malfunction of a complex system cues the interpretation of subsequent malfunctions. Is the threshold of mistrust lowered? Is there an unjustified assumption that the same problem is repeating itself, or that the same information-searching procedures are needed? How is the expectation of successful coping affected? Do operators assume that they will have the same amount of time to diagnose and act? Finally, how does that experience generalize to other technical systems? Do bad experiences lead to a general resistance to innovation?

A key to answering these questions is understanding the operators' own attribution processes. Do they subscribe to the same definition of human error as do those who evaluate their performance? What gives them a feeling of control? How do they assign responsibility for successful and unsuccessful experiences? Although their mental models should provide some answers to these questions, others may be sought in general principles of causal attribution and misattribution (Harvey et al., 1976).

CONCLUSIONS AND RECOMMENDATIONS

Supervisory control of large, complex, capital-intensive, high-risk systems is a general trend, driven both by new technology and by the belief that this mode of control will provide greater efficiency and reliability.
The human factors aspects of supervisory control have been neglected. Without further research they may well become the bottleneck and most vulnerable or most sensitive aspect of these systems. Research is needed on:

(1) How to display integrated dynamic system relationships in a way that is understandable and accessible. This includes how best to allow the computer to tell the operator what it knows, assumes, and intends.
(2) How best to allow the operator to tell the computer what he or she wants and why, in a flexible and natural way.
(3) How to discover the internal cognitive model of the environmental process that the operator is controlling and improve that cognitive representation if it is inappropriate.
(4) How to aid the cognitive process by computer-based knowledge structures and planning models.
(5) Why people make errors in system operation, how to minimize these errors, and how to factor human errors into analyses of system reliability.
(6) How mental workload affects human error making in systems operation, and how to refine and standardize definitions and measures of mental workload.
(7) Whether human operator or computer should have authority under what circumstances.
(8) How to coordinate the efforts of the different humans involved in supervisory control of the same system.
(9) How best to learn from experience with such large, complex, interactive systems.
(10) How to improve communication between the designers and operators of technical systems.

Research is needed to improve our understanding of human-computer collaboration in such systems and of how to characterize it in models. The validation of such models is also a key problem, not unlike the problem of validating socioeconomic or other large-scale system models. In view of the scale of supervisory control systems, closer collaboration between researchers and systems designers in the
development of such systems may be the best way for such research, modeling, and validation to occur. And perhaps data collection should be built in to the normal--and abnormal--operation of such systems.

REFERENCES

Bainbridge, L.
1974 Analysis of verbal protocols from a process control task. In E. Edwards and F. Lees, eds., The Human Operator in Process Control. London: Taylor and Francis.

Baron, S., Zacharias, G., Muralidharan, R., and Lancraft, R.
1981 PROCRU: A model for analyzing flight crew procedures in approach to landing. In Proceedings of the Eighth IFAC World Congress, Tokyo.

Baron, S., and Kleinman, D.
1969 The human as an optimal controller and information processor. IEEE Trans. Man-Machine Systems MSS-10(11):9-17.

Bower, G.
1981 Mood and memory. American Psychologist 36:129-148.

Cavanaugh, J. C., and Borkowski, J. G.
1980 Searching for meta-memory-memory connections. Developmental Psychology 16:441-453.

Chase, W. G., and Simon, H. A.
1973 The mind's eye in chess. In W. G. Chase, ed., Visual Information Processing. New York: Academic Press.

Chi, M. T. H., Chase, W. G., and Eastman, R.
1980 Spatial Representation of Taxi Drivers. Paper presented to the Psychonomics Society, St. Louis, November.

Collins, A. M., and Loftus, E. F.
1975 A spreading activation theory of semantic processing. Psychological Review 82:407-428.

Cowey, A.
1979 Cortical maps and visual perception. Quarterly Journal of Experimental Psychology 31:1-17.

Ericsson, A., and Simon, H.
1980 Verbal reports as data. Psychological Review 87:215-251.

Feldman, J. A., and Ballard, D. H.
in press Connectionist models and their properties. In J. Beck and A. Rosenfeld, eds., Human and Computer Vision. New York: Academic Press.

Goldberg, L. R.
1968 Simple models or simple processes? Some research on clinical judgments. American Psychologist 23:483-496.

Green, D. M., and Swets, J. A.
1966 Signal Detection Theory and Psychophysics. New York: John Wiley.

Harvey, J. H., Ickes, W. J., and Kidd, R. F., eds.
1976 New Directions in Attribution Research. Hillsdale, N.J.: Lawrence Erlbaum.

Kelly, C.
1968 Manual and Automatic Control. New York: John Wiley.

Kinsbourne, M., and Hicks, R. L.
1978 Functional cerebral space: a model for overflow transfer and interference effects in human performance. In J. Requin, ed., Attention and Performance VIII. Hillsdale, N.J.: Lawrence Erlbaum.

Kok, J. J., and Stassen, H. G.
1980 Human operator control of slowly responding systems: supervisory control. Journal of Cybernetics and Information Sciences 3:124-174.

Landman, N., and Hunt, E. B.
1982 Individual differences in secondary task performance. Memory and Cognition 10:10-25.

Levison, W. H., and Tanner, R. D.
1971 A Control-Theory Model for Human Decision Making. National Aeronautics and Space Administration CR-1953, December.

Navon, D., and Gopher, D.
1979 On the economy of the human-processing system. Psychological Review 86:214-230.

Newell, A., and Simon, H. A.
1972 Human Problem Solving. Englewood Cliffs, N.J.: Prentice-Hall.

Nisbett, R., and Ross, L.
1980 Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, N.J.: Prentice-Hall.

Norman, D. A.
1981 Categorization of action slips. Psychological Review 88:1-15.

Posner, M. I.
1978 Chronometric Explorations of Mind. Hillsdale, N.J.: Lawrence Erlbaum.
1980 Orienting of attention. Quarterly Journal of Experimental Psychology 32:3-25.

Posner, M. I., and Rothbart, M. K.
1980 Development of attentional mechanisms. In J. Flowers, ed., Nebraska Symposium. Lincoln: University of Nebraska Press.

Rasmussen, J.
1979 On the Structure of Knowledge--A Morphology of Mental Models in a Man-Machine System Context. RISO National Laboratory Report M-2192. Roskilde, Denmark.

Sheridan, T.
1981 Understanding human error and aiding human diagnostic behavior in nuclear power plants. In J. Rasmussen and W. Rouse, eds., Human Detection and Diagnosis of System Failures. New York: Plenum Press.
1982 Supervisory Control: Problems, Theory and Experiment in Application to Undersea Remote Control Systems. MIT Man-Machine Systems Laboratory Report. February.

Sheridan, T., and Ferrell, W. R.
1967 Supervisory control of manipulation. Pp. 315-323 in Proceedings of the 3rd Annual Conference on Manual Control. NASA SP-144.

Sheridan, T., and Young, L. R.
1982 Human factors in aerospace. In R. Dehart, ed., Fundamentals of Aerospace Medicine. Philadelphia: Lea and Febiger.

Swain, A. D., and Guttman, H. E.
1980 Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications. NUREG/CR 1278. Washington, D.C.: Nuclear Regulatory Commission.

Tsach, U., Sheridan, T. B., and Tzelgov, J.
in press A New Method for Failure Detection and Location in Complex Systems. Proceedings of the 1982 American Control Conference. New York: Institute of Electrical and Electronics Engineers.

U.S. Army Research Institute
1979 Annual Report on Research, 1979. Alexandria, Va.: Army Research Institute for the Behavioral and Social Sciences.

Young, R. M.
1981 The machine inside the machine: users' models of pocket calculators. International Journal of Man-Machine Studies 15:51-85.

Zajonc, R. B.
1980 Feeling and knowledge: preferences need no inferences. American Psychologist 35:151-175.

USER-COMPUTER INTERACTION

INTRODUCTION

Electronic computers have probably had a more profound effect on our society, on our ways of living, and on our ways of doing business than any other technological creation of this century. Computers help manage our finances, checking accounts, and charge accounts.
They help schedule rail and air travel, book theatre tickets, check out groceries, diagnose illnesses, teach our children, and amuse us with sophisticated games. Computers make it possible to erase time and distance through telecommunications, thereby giving us the freedom to choose the times and places at which we work. They help guide planes, direct missiles, guard our shores, and plan battle strategies. Computers have created new industries and have spawned new forms of crime. In reality, computers have become so intricately woven into the fabric of daily life that without them our civilization could not function as it does today. Small wonder that all these effects have been described as the results of a computer revolution.

The principal authors of this chapter are Alphonse Chapanis, Nancy S. Anderson, and J. C. R. Licklider.

Gantz and Peacock (1981) estimate that the total computer power available to U.S. businesses increased tenfold in the last decade, and that it is expected to double every two to four years. According to the most recently available estimates (U.S. Bureau of the Census, 1979), there are currently about 15 million computers, terminals, and electronic office machines in the United States. That number is expected to grow to about 30-35 million by 1985, at which time there will be roughly one computer-based machine for every three persons employed in the white-collar work force. Spectacular advances in computer technology have made this growth possible, decreasing the cost of computer hardware at the rate of about 30 percent a year during the past few decades (Dertouzos and Moses, 1980).

Computers are still not as widely accepted as they might be. In a study by Zoltan and Chapanis (1982) on what professionals think about computers, over 500 certified public accountants, lawyers, pharmacists, and physicians in the Baltimore area filled out a 64-item questionnaire on their experiences with and attitudes toward electronic computers.
Six factors emerged from a factor analysis of the data. Factor I, the largest in terms of the variance accounted for, is a highly positive grouping of adjectives attesting to the competence and productivity of computers, such as efficient, precise, reliable, dependable, effective, and fast. Factor II, the second largest in terms of the variance accounted for, is made up of highly negative adjectives: dehumanizing, depersonalizing, impersonal, cold, and unforgiving.

Still another factor in the Zoltan-Chapanis study indicates discontent with computers in terms of their ease of use. The respondents thought that computers are difficult and complicated and that computing languages are not simple to understand. These views are apparent in their responses to such statements as: "I would like a computer to accept ordinary English statements" and "I would like a computer to accept the jargon of my profession," both of which they agreed with strongly.

The findings of that study are generally in agreement with more informal reports in the popular press and other media about difficulties people have with computers and their use. Indeed, concerns about making computers easy to use can have serious economic consequences that may have to be faced by more and more computer manufacturers. For example, a small company in California was recently awarded a verdict for substantial monetary damages because of the inadequate performance of a computer that the company had purchased (Bigelow, 1981). In rendering his opinion substantiating the award, the presiding judge said, "It's a particularly serious problem, it seems to me, in the computer industry, particularly in that part of the industry which makes computers for first-time users, and seeks to expand the use of computers by
targeting as purchasers businesses that have never used computers before, who don't have any experience in them, and who don't know what the consequences are of a defect and a failure" (Bigelow, 1981:94).

In Europe resistance to computerization has taken a somewhat different form than that in the United States. Television programs roughly equivalent to the American program 60 Minutes have been broadcast about the real and imagined evils of computers. Several countries--Austria, England, France, Germany, and Sweden among them--have prepared strict standards for the design of computer systems and have enacted federal laws restricting hours of work at computer terminals. Similar regulations may soon be in effect in this country. One difficulty is that current standards and regulations about computers are sometimes based on skimpy and unreliable data and sometimes on no data at all (Rupp, 1981).

Whatever their origins, these events and trends are symptoms of fairly widespread uneasiness and malaise about computers, their usefulness, and usability. No one denies that computers are here to stay. The important question is: "How can we best design them for effective human use?" This chapter describes some of the research needed to answer that question.

Research needs are identified throughout the chapter. However desirable it might appear to assign specific priorities to each, we feel that it is difficult and risky to do so for at least three reasons.

First, computer hardware, software, and interface design features are changing very rapidly (for a summary of the trends and progress in computer development see Branscomb, 1982). So, for example, the increased availability of modularly arranged components for microcomputers for personal use, in the office and at school, as well as new networking and communications features, allows design improvements to be made quickly by trial and error.
As Nickerson (1969) has pointed out, such trial-and-error design improvements can be made more quickly than they could be by careful laboratory research studies.

Second, practical considerations are likely to be significant determinants of what research can be performed. Operational computer systems rarely can be disrupted for research purposes, and up-to-date hardware and software as well as appropriate groups of users are not always available. Under these circumstances it takes great ingenuity to conduct human factors research on user-computer interactions that can produce useful, generalizable results. Constraints and opportunities are therefore more likely than assigned priorities to dictate what research is performed.

Third, there is a definite need for good human factors research in all the areas we discuss, even with the caveat that technology is changing rapidly and good research is difficult to conduct.

With these qualifications in mind, we do provide, at certain places in this chapter, short summaries indicating those research needs that we feel have higher priorities than others.

THE COMPUTER SYSTEM

Computer systems and their environments have been diagrammed and modeled in various ways. Figure 5-1 illustrates elements that are important from a human factors standpoint: the user, the task, the hardware, the software, the procedures, and the work environment. Together they cluster around what is commonly called the user-computer interface--that invisible surface that binds the various elements together. Diagramming a computer system in this way is to a large extent artificial, because the various elements cannot really be considered in isolation. As will be apparent later on, there are interactions among all of them. The figure is merely a convenient way of structuring and organizing the subtopics of this chapter, which are described briefly below and treated in detail in subsequent sections.
[Figure 5-1 diagram not reproduced. Legible labels include USER-SYSTEM INTERFACE, SOFTWARE, and TASK, with annotations such as data base, computer capabilities, paper files, forms, manuals, and documentation.]

FIGURE 5-1 Important Elements of Computer Systems
Source: Adapted from Chapanis (1982).

1. The Users. Beginning with the users is a natural starting point for any discussion of the human factors involved in computer systems. Focusing on users implies what is sometimes referred to as user-oriented design, rather than machine-oriented design. Perhaps the most important questions about users are "Who exactly are the users?" "What are their characteristics?" and "How can user requirements be translated into design requirements?"

2. The Task. The second element is the task or the job that the user has to do with the computer. The complexity of the job, the kinds of information the operator needs to perform the job, and the constraints under which jobs must be performed are all relevant considerations in the human factors design of computer systems. Task requirements are discussed in the section on users.

3. The Hardware. Hardware means input devices, output displays and signaling devices, and the work station that the computer operator has to use.

4. The Software. Software generally refers to the data bases, computer programs, and procedures available in a computer system.

5. Procedures. Procedures, manuals, and documentation are often included under software. They are shown separately in Figure 5-1 because the problems associated with manuals and documentation are somewhat different from those associated with programming languages, commands, and menus.

6. The Work Environment. Generally speaking, computers and computer systems are found in relatively benign work environments. Nonetheless, some features of the work environment--excessive glare, noise, and
sometimes dirt and vibration--have to be considered in the design of the user-computer interface. Since standard human factors recommendations and good engineering practice are usually adequate guides for designing most work environments in which computers are located, we do not cover environmental variables in this chapter.

USERS AND TASKS

Computer users today are almost as varied as people in general. Although there have been a number of attempts to categorize or classify computer users into various groups or along various dimensions, there is today no generally accepted way of doing either. Computer tasks, by contrast, can be classified under the same headings as are used in task analyses. Proceeding from the more global to the more detailed they are jobs, functions, tasks, and subtasks. According to Ramsey and Atwood (1979), most of the literature about computer tasks is at the job level. Some people think, however, that computer tasks cannot be classified in isolation, but that tasks interact with users and that the two must be treated together. Examples are: professional programmers designing systems, professionals using application programs with command languages, occasional users using application programs with menus. In short, classifying computer users and tasks is clearly in need of systematic work, and it is treated more fully in the sections that follow. We rely in our discussion on the exemplary review of the literature on human-computer interaction by Ramsey and Atwood (1979), which was supported by the Office of Naval Research.

Users

Attempts to classify users have followed one of several quite different approaches. The first is to categorize users into more-or-less distinct groups on the basis of their familiarity or sophistication with computers. This way of classifying users has yielded a large collection of names.
Examples, in alphabetical order, are: casual users (Martin, 1973), computer professionals (Barnard et al., 1981), dedicated users (Martin, 1973), discretionary users (Bennett, 1979), experienced users (Shackel, 1981), familiar users (Ledgard et al., 1981), first-time users (Al-Awar et al., 1981), the general public (Shackel, 1981), general users (Miller and Thomas, 1977), inexperienced users (Dzida et al., 1978), naive users (Thompson, 1969), noncomputer specialists (Shackel, 1981), nonprogrammers (Martin, 1973), occasional users (Hammond et al., 1980), programmers (Martin, 1973), regular users (Dzida et al., 1978), and untrained users (Martin, 1973).

Another way of categorizing users has focused more on the nature of the user's job. This has produced such categories as: analysts (S. L. Smith, 1981), clerical workers (Stewart, 1974), managers (Eason, 1974), operators (Smith, 1981), programmers (Martin, 1973), rugged operators (Martin, 1973), service personnel (Smith, 1981), specialists (Stewart, 1974), and technical users (Ramsey and Atwood, 1979).

Quite a different way of classifying users is in terms of underlying personal characteristics. Thus, Ramsey and Atwood suggest obtaining data about users' abilities, acquired skills, general background (including formal education), sex, age, attitude measures, mechanical (perhaps also spatial) aptitudes, vocabulary test performance, recency and length of training periods, training scores, cognitive decision style, and general intelligence. Another classification of users' characteristics would include data on the following:

1. Sensory capacities, e.g., visual acuity
2. Motor abilities, e.g., typing skills
3. Anthropometric dimensions, for hardware design
4. Intellectual capacities, e.g., general intelligence and special abilities in order to evaluate reading levels for information presented
5. Learned cognitive skills, including familiarity with the English language
6. Mathematical and logical skills
7. Experience with computers and proficiency in training
8. Personality, e.g., attitudes toward computers

Shneiderman (1980), by contrast, classifies users only according to their semantic and syntactic knowledge about computers. This way of classifying users yields the simple matrix shown in Figure 5-2.

                                   Syntactic Knowledge
                       little                    a lot
Semantic    little     naive user                data entry job control language (JCL) novices
Knowledge   a lot      infrequent novice user    frequent professional user

FIGURE 5-2 Classification of Users According to the Extent of Their Semantic and Syntactic Knowledge
Source: Adapted from Shneiderman (1980).

The diversity of approaches that have been taken to this problem indicates that we need research to understand and identify which of many possible user characteristics are important for software design. In addition, research is needed to understand how to express and translate user characteristics into terms that can be used in systems design, i.e., into specifications for designers of system software.

It is important to recognize that all users, whether they are seasoned systems programmers or less experienced users, continue to learn as newer systems are developed and/or updated. For that reason, Cuff (1980) has suggested that we need to consider the casual user of computers as well as expert or naive users. Additional dimensions of user behaviors could give us evidence of the functionality of systems, e.g., the range of tasks users can perform with a given system, how long it takes a user to learn a system or a system update, and the time it takes a user to perform a particular task or job.
We need to know what kinds of errors users make when learning new systems as well as how many errors are made and how often they are made or repeated, how well users adapt to changes in system software (robustness) that are "upward compatible,"* and how users rate subjectively the quality of the output or product and the systems that perform their set of tasks.

*Upward compatible means that commands and features used in an older version of software are still available in a newer version, although the newer version may provide new commands or features that are more efficient for accomplishing the same ends.

When we look at what is currently known about the novice compared with the expert user, it appears that the former is generally engaged in problem solving and is very susceptible to task-structure variations. The expert systems programmer typically interacts with a computer as a routine cognitive skill and is somewhat immune to structural variations in the tasks performed (see Moran, 1981; Mayer, 1981). A simple dialog in the software that is computer-initiated and tutorial in nature is probably more appropriate for the occasional and naive user, but an abbreviated, user-initiated dialog appears to be more appropriate for the experienced user. It is clear that we need to gather more data about problem-solving strategies and preferences across different types of tasks for different levels of users.

Of particular concern is that the research methods used in evaluating user characteristics for hardware design have been used in studies evaluating user characteristics for software design. It is not known if these research methods are appropriate for evaluating software use or which methods will provide the most information to designers. Moran (1981) has addressed this issue in part.
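
The two-by-two scheme of Figure 5-2 can be stated compactly. The Python sketch below is only an illustration: the function name `classify_user` and the reduction of each kind of knowledge to a yes/no flag are assumptions made here, not part of Shneiderman's (1980) classification. The cell labels are copied from the figure.

```python
# Illustrative sketch only. The four labels are taken from Figure 5-2;
# treating each kind of knowledge as a simple True/False flag is an
# assumption made for this example.
FIGURE_5_2 = {
    # (semantic knowledge "a lot"?, syntactic knowledge "a lot"?): label
    (False, False): "naive user",
    (False, True): "data entry job control language (JCL) novices",
    (True, False): "infrequent novice user",
    (True, True): "frequent professional user",
}


def classify_user(has_semantic_knowledge: bool, has_syntactic_knowledge: bool) -> str:
    """Return the Figure 5-2 cell label for one user (hypothetical helper)."""
    return FIGURE_5_2[(bool(has_semantic_knowledge), bool(has_syntactic_knowledge))]


if __name__ == "__main__":
    # A user who understands the task domain but rarely uses the command syntax.
    print(classify_user(has_semantic_knowledge=True, has_syntactic_knowledge=False))
```

How a real user would be assigned to "little" or "a lot" on either dimension is exactly the kind of measurement question the text identifies as unresolved.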
Perhaps the two most pressing research needs in this area are to find some meaningful way of classifying or categorizing users and to translate user characteristics into specific recommendations that can be used in the design of computer hardware, software, and documentation.

Tasks

Most computer and human factors specialists agree that a task taxonomy is needed and that system designers need a set of benchmark tasks to evaluate hardware/software development and changes. A task structure provides the rules of the game that determine the range of actions users can and cannot take (Moran, 1981). Tasks can vary in several ways. They may (1) fulfill different functions for the user, e.g., professional, educational, or home hobby functions, (2) require different forms of language such as natural language, BASIC, COBOL, or APL, and (3) be performed on different kinds of systems.

In addition, almost all system designers recognize that the user's interface with a computer system changes as tasks or jobs change. The user interface includes any part of the computer system that the user comes in contact with physically, perceptually, or conceptually. The user's conceptual model of the system to be used to perform a given task is part of that interface. Thus, we also need research to understand how to discover a user's conceptual model(s) when he or she is interacting with the computer.

Models suggested by Moran (1981) involve explicit information processes that spell out step-by-step the mental operations the user must go through to complete the task application. These models need to be based on a psychological theory of users. One example of specific models that describe individual user differences in understanding calculator languages is described by Mayer and Bayman (1981).
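
A step-by-step model of the kind described above can be made concrete by assigning a time to each elementary operation and summing over the steps of a task, in the spirit of a keystroke-level estimate. The Python sketch below is a hedged illustration: the operator codes, the times in seconds, and the function `estimate_task_time` are assumptions chosen for the example and are not taken from this report.

```python
# Illustrative sketch only. Operator codes and times (seconds) are assumed
# round values for the example; real values would have to be measured for
# the users and equipment in question.
OPERATOR_SECONDS = {
    "K": 0.20,  # press one key (assumed)
    "P": 1.10,  # point at a target with a pointing device (assumed)
    "H": 0.40,  # move hands between keyboard and device (assumed)
    "M": 1.35,  # mentally prepare for the next action (assumed)
}


def estimate_task_time(operators, system_response_seconds=0.0):
    """Sum assumed operator times for a sequence of steps (hypothetical helper)."""
    total = float(system_response_seconds)
    for op in operators:
        if op not in OPERATOR_SECONDS:
            raise ValueError(f"unknown operator: {op!r}")
        total += OPERATOR_SECONDS[op]
    return total


if __name__ == "__main__":
    # Hypothetical task: think, then type a four-letter command and a terminator.
    steps = ["M", "K", "K", "K", "K", "K"]
    print(f"estimated time: {estimate_task_time(steps):.2f} s")
```

Such an estimate covers only routine, error-free performance by a practiced user; it says nothing about the novice's problem solving discussed earlier.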
It would be helpful if a subset of the task taxonomy or benchmark tasks could in some way be integrated into the accounting systems of computers so that system designers could be provided with statistical data about tasks and users. These statistics on users should include information about the user type and systems used as well as errors in usage. One example of a keystroke-level model for evaluating performance is described by Card et al. (1980).

Of primary need are systematic studies of the conceptual models of users when they interact with a variety of hardware and software systems to do specified sets of tasks, e.g., text editing, numerical problem solving, or querying data bases. These studies should choose successful methodologies for producing results that can be directly applied to system design, or they should include new methods for evaluating the interactions of user characteristics with task requirements. Another pressing problem is the development of a meaningful task taxonomy that includes both behavioral and cognitive elements for a set of four or five different representative tasks.

COMPUTER HARDWARE

Computer hardware cannot be designed in isolation because the kind of hardware available on a computer terminal determines in part the kinds of dialog and the kinds of command languages that can be implemented in the system. Ideally, decisions about important aspects of computer dialogs should precede decisions about terminal hardware. In practice, the reverse often occurs. While recognizing that these interactions exist and that they are important in design, we discuss the human factors aspects of computer hardware with only passing reference to their software implications.
[TABLE 5-1 Computer input devices, modified from Ramsey and Atwood (1979)]

Input Devices

Designers of interactive computer systems can select from a very large number of devices for inserting information into computers. Table 5-1, modified from the work of Ramsey and Atwood (1979), lists 16 different kinds of input devices, comments on some of their features, and identifies the principal references to studies of these devices. Since the situation has not changed materially since the Ramsey-Atwood report was issued, its findings are still valid.

By far most of the work on computer input devices has been done on keyboards; the literature is large and varied. Seibel's chapter in the Van Cott and Kinkade (1972) handbook is a good starting point for anyone interested in these problems. Ramsey and Atwood reference a number of studies done after Seibel's chapter was written, and there is a fair amount of even newer work, e.g., Hirsch (1981) and Hornsby (1981). The available literature on keyboards is sufficient to answer most practical questions. This is no longer an area urgently in need of extensive research.
The situation with regard to alternative input devices, such as light pens, touch panels, and hand printing, is different. Most of the work that has been done on these devices has compared two or more input devices in specific applications. There are not many studies of this kind in the literature, although Card et al. (1978) did evaluate the speed and accuracy of four devices for text selection. Research is needed that will lead to a set of recommendations about the kinds of input devices that are best suited to general classes of tasks (e.g., text input, input of numerical data, selection of commands and operands from displays, discrete positional [graphical] input, and continuous positional [graphical] input) and perhaps to general classes of work environments.

A much more serious concern is that there have been practically no studies of the optimal design of input devices, except for keyboards. That is, given that a light pen is better than a keyboard for some applications, how exactly would one design the best light pen for the job? Research is clearly needed on the optimal design parameters of all input devices other than keyboards.

Voice input to computers deserves special treatment because (1) it does not involve a physical mechanism that the user manipulates as such and (2) speech as a human output is distinctly different from the movements of fingers, hands, or feet that are required for the activation of most conventional computer input devices. Speech has a number of characteristics that theoretically make it an attractive candidate for computer inputs. It is fast, effective, versatile, flexible, and requires little effort. Moreover, almost everyone knows how to talk, so that training is generally unnecessary. One of the principal reasons why speech input is not widely used, however, is that technology has not been able to provide us with speech recognition capabilities that even begin to approximate those of human listeners.
Nonetheless, the state of the art is advancing rapidly. There are now some very good speech recognition devices available and their capabilities are certain to increase greatly in the foreseeable future.

Although speech has some distinct advantages as a medium of communication, it is also easy to identify applications in which speech input to computers would not be desirable. Some of these applications involve certain kinds of users (for example, persons with speech impediments), others the task (for example, intricate mathematical and chemical formulae are not easily described orally), and still others the work environment (speech input is not very efficient in noisy environments). For more reliable guidance about applications in which the voice should or should not be used, the only source of help is recommendations comparing visual and auditory forms of presentation (see Table 5-2).

Table 5-2 and others like it in the human factors literature suffer from four major defects. First, the recommendations are oriented more toward output devices than input devices; that is, they do not compare speech with other possible forms of data input. However attractive speech may appear as an input medium, some data are available suggesting that it is not necessarily the solution for all situations (see, for example, Braunstein and Anderson, 1961). Second, recommendations such as those in Table 5-2 are not specifically oriented toward computer applications. Third, these comparisons are not sufficiently comprehensive to be of much use to computer designers. For example, none of these comparisons considers in detail user characteristics or the work environment in which computers are used. Some environments have rows and rows of computer terminals in close proximity.
Imagine the babble that might result if 50 operators were inputting information by voice simultaneously into computers! Finally, existing comparisons of vision and audition provide information that is too vague to be of any practical use to a computer designer. For example, how is a designer to decide whether a message is simple or complex? What we clearly need is a detailed, comprehensive, and quantitative set of guidelines about the precise conditions under which speech input to computers is and is not desirable. These guidelines should consider the user, the task, and the work environment in which computers are located.

TABLE 5-2 Recommendations for the Use of Auditory and Visual Forms of Presentation

Use auditory presentation if:              Use visual presentation if:

1. The message is simple.                  1. The message is complex.
2. The message is short.                   2. The message is long.
3. The message will not be                 3. The message will be referred
   referred to later.                         to later.
4. The message deals with events           4. The message deals with
   in time.                                   location in space.
5. The message calls for immediate         5. The message does not call
   action.                                    for immediate action.
6. The visual system of the person         6. The auditory system of the
   is overburdened.                           person is overburdened.
7. The receiving location is too           7. The receiving location is
   bright or dark-adaptation                  too noisy.
   integrity is necessary.
8. The person's job requires               8. The person's job allows for
   continual movement.                        a stationary position.

Source: Deatherage (1972).

Although some very good speech recognition machines are available, they have some important limitations. First, they all are word recognition devices; that is, they do not recognize continuous speech. Second, they are capable of responding only to vocabularies of restricted size.
Third, they are user-dependent; that is, they must be programmed to learn to recognize words spoken by a particular person and will generally respond accurately only to that person's voice. Speech recognition machines that can respond to connected speech or that are speaker-independent are well beyond the current state of technology.

Despite these important limitations, speech input to computers can be successful and useful. There is not, however, a good base of research findings on the conditions under which speech recognition machines can be used effectively even with their limitations. For example, how much useful work can be done with vocabularies of various sizes? How effectively can people be trained to leave pauses between words in connected speech so that individual words can be recognized? How effortful is it to speak while deliberately leaving pauses between words? If vocabularies of restricted size must be used, how effectively can one construct complex inputs with the available words? What rules of grammar and syntax must be observed if one is restricted to a limited vocabulary? What should that vocabulary be? The conditions under which speech recognition devices can be used most effectively are virtually an unexplored area of research that should be vigorously pursued. One example of research on the use of voice input to operate a distributed computer network is the work conducted at the Naval Postgraduate School by Poock (1980).

Output Devices

Although teletypewriters and alphanumeric cathode ray tube (CRT) displays are the most common forms of output devices used in computer systems, there are numerous other possibilities: plasma displays; light-emitting diodes (LED) and liquid crystal displays; tactile displays; audio displays, including synthetic speech; graphical displays; laser displays; and even psychophysiological output devices.
The state of the art of these various output devices is summarized in Table 5-3, which is based on Ramsey and Atwood (1979).

[TABLE 5-3 State of the art of computer output devices, based on Ramsey and Atwood (1979)]

CRT Displays

Enough research has been done on CRT displays to support guidelines for their design (Galitz, 1981; Shurtleff, 1980). Although the two handbooks available do not answer all the questions designers may have, they cover a substantial number of them. Most of their recommendations are supported by research data, and those that are not seem reasonable. The two most important unresolved questions concern the size of displays and the use of colored displays.

With regard to size, Shurtleff (1980) has devoted a chapter to questions of legibility as related to display size, but he has nothing to say about the more important question of how much information can be presented on screens of various sizes. Computer displays in some military applications, for example, in cockpits, must be small by necessity. How small can they be and still be legible? How can information best be presented on small displays? The converse problem may occur when many people must view the same display. In that case the relevant questions are: How large can displays be? How can information best be presented on large displays? These are not questions relating simply to the legibility of the information presented on displays of various sizes; such questions can easily be resolved on the basis of available data.
What is needed is research on the interactions between display size and the amount of information that can be most effectively presented.

Questions on the use of color on CRT displays are also still essentially unresolved. The advantages of color coding for identification purposes are, of course, well documented, but the long-term effects of working with colored CRT displays for data entry, inquiry, or interactive dialog are not known. Although many people seem to like colored displays, others find them annoying and garish. The scanty research evidence available seems to show that colored CRT displays produce no substantial performance benefits. More research may enable designers to make informed decisions about the possible benefits of color on CRTs versus their cost and other disadvantages.

Alternatives to CRT Displays

Very little human factors research has been done on displays other than CRTs. Of particular interest are synthetic speech displays. Computer-generated speech is now available in a variety of devices, and the quality of the speech in some of these devices is quite good. The situations in which computer-generated speech is a viable alternative to visual displays, however, are not known. Basic research paralleling that on speech input is needed to produce defensible recommendations about applications in which speech output can or should be used.

Workplace Design

Computer displays and input devices are generally assembled into work stations consisting of terminals, consoles, desks, and chairs. There is, of course, a very large and useful literature on the physical layout of workplaces (see, for example, Van Cott and Kinkade, 1972), but there is very little empirical research on work station design specifically for computer-related tasks and settings.
The importance of these problems is highlighted by a great deal of literature, mostly from Europe, about complaints from workers using CRT devices (see, for example, Grandjean and Vigliani, 1980). Similar complaints from a consortium of labor unions in the United States were received by the National Institute for Occupational Safety and Health (NIOSH) in 1979. The general nature of these complaints was that employees using CRT terminals experienced a variety of symptoms including headaches, general malaise, eyestrain, and other visual and musculoskeletal problems.

In response to these complaints NIOSH conducted an extensive investigation of computer work stations in three companies in the San Francisco Bay area (Murray et al., 1981). The study consisted of four phases: (1) radiation measurements, (2) industrial hygiene sampling, (3) a survey of health complaints and psychological mood states, and (4) ergonomics and human factors measurements.

Although radiation from CRTs had long been suspected as a potential health hazard, the NIOSH study seems to have conclusively ruled it out. X-ray, ultraviolet, and radio-frequency radiation at all sites and at all work stations tested was either not detectable or was well below acceptable occupational levels. Similar negative conclusions were reached about the chemical environment. Hydrocarbon, carbon monoxide, acetic acid, and formaldehyde levels in and around work stations were not appreciably different from what one would find in an ordinary living environment.

The results of the survey of health complaints were quite different, however. They show that operators of visual display terminals (VDT) experienced a greater number of health complaints, particularly related to emotional and gastrointestinal problems, than did comparable operators who did not work with VDTs.
These findings, according to the NIOSH report, demonstrate a level of emotional distress for the VDT operators that could have potential long-term health consequences. The NIOSH study concludes, however, that it is quite likely that the emotional distress shown by the VDT operators is more related to the type of work activity than to the use of VDTs per se. With the growing number of VDTs in our society, it is clearly of considerable importance to establish how many of the worker complaints can be traced to VDTs and how many to other factors (Ketchel, 1981; M. J. Smith, 1981). This is a research question that urgently needs to be investigated.

The NIOSH report has more to say about the ergonomic and human factors aspects of the computer workplace than about any other aspect of computer work. Keyboard heights, table and chair designs, viewing distances and viewing angles, copy holders, and other aspects of work station design all come in for criticism. Computer work stations in America appear to be as poorly designed as those in Europe (see Grandjean and Vigliani, 1980; Brown et al., 1982), forcing operators to adopt strained postures and to contend with glare and generally substandard viewing conditions (Ketchel, 1981). Although basic data for good work station design are available, they need to be assembled in a good set of guidelines specifically oriented toward such design. This also appears to be an urgent research need.

General Problems

Three general problems relating to computer hardware have received almost no attention: (1) the design of transportable terminals and data, (2) the design of robust computer systems for military purposes, and (3) the design of computer terminals for use in unusual or exotic environments, for example, in moving vehicles or under water.

Spectacular advances in microelectronics have made it possible to package enormous computing power into small packages.
The full potential of this miniaturization has not yet been realized or explored. We need human factors research leading to the design and use of transportable terminals, including input and output devices and data in the form of cassettes.

Most computer systems are designed for use in benign environments. As the use of computers becomes more common in the military services, data will be urgently needed on how to design them for the rough treatment they are almost certain to receive under operational conditions. Vibration, high-g forces, immersion in water, and perhaps other environmental conditions affect machines as well as their operators. Certain input devices, for example, light pens or even keyboards, may be difficult or impossible to use when the computer and the operator are subjected to excessive movement, vibration, or g-forces. We have essentially no information about the usability of computers or the design of computers for use under such conditions. Although this may not be an immediate problem, it is certain to become increasingly important as computers are integrated into complex systems for use in harsh, exotic, or unusual environments.

COMPUTER SOFTWARE

Software has many different meanings to computer scientists and computer analysts who develop or use computer programs that include command languages, dialog systems, and specialized applications systems with data bases. Software may have originally been synonymous with computer programs, but in general software now consists of "the operational requirements for a system, its specifications, design, and programs, all its user manuals and guides, and its maintenance documentation" (Hills, 1980:417).

Research in human factors in software has evaluated the human-computer interface with command languages, programming languages, dialog systems, and feedback and error management.
Frequently the human factors studies have emphasized ease of use and ease of learning as well as efficiency of completing the problem-solving tasks on the computer. The recent experimental and observational studies were summarized in the special issue on human factors in Computing Surveys (1981), the IBM Systems Journal (1981), and in articles in Human Factors, the International Journal of Man-Machine Studies, and Ergonomics. In addition, there are exemplary technical reports, such as Williges and Williges (1981), Ledgard et al. (1981), Shneiderman (1980), and the proceedings of the Conference on Human Factors in Computer Systems (Institute for Computer Sciences and Technology, 1982). The more popular trade magazines, e.g., the April 1982 issue of BYTE, also feature articles on human factors in software design.

Many authors express the need for additional careful research studies in software design and criticize many current results as incomplete and inconsistent due to poor methodology, use of subject populations limited to particular types of users (e.g., college students), inadequate experimental designs, and misuse or poor use of statistics. Selected useful guidelines for software designers are found in Engle and Granda (1975) and the recent reports by Williges and Williges (1981) and Ehrenreich (1981). Although there exist guidelines as well as selected research studies in human factors issues in software, considerable research needs to be done in order to provide information of use to system designers of software.

The research efforts needed in human factors in software design can be divided into two areas: (1) methodological studies and (2) substantive studies of software design features for the end user. The two areas are not always independent, and some research studies require attention to both.
In either case we are concerned about human factors research in software systems with which end users interact or interface, not about research in programming language design per se; this is usually the concern of the computer programmer or systems analyst.

In the methodological area, research is needed on how to develop a suitable simulation capability for the design of dialog and interface systems. We need to understand how to evaluate present software systems as well as how to mock up new systems for testing and evaluation with end users. The choice of dependent variables in evaluating software is not clear. We know little about how to collect user statistics on the ease of learning of new software, how to record errors and complex, response-time metrics from end users in time-sharing systems, and how to measure user satisfaction. Research is needed on what components of usability are most important for different kinds of users and applications (see Shackel, 1981).

One of the problems in this area is that we don't know how to do research on these topics. There is no agreed-upon set of empirical methodologies for conducting research studies about software issues. The studies that have been done are frequently context-specific and/or about one or two software features and are difficult to generalize and integrate with other data in the area. Examples include evaluations of a given command by asking users to translate the abbreviated form into English, effects of modifications of conditional nesting structures in FORTRAN, user efficiency of indentations to locate single bugs in PASCAL, and modifications in a language used in teaching at the University of Toronto.
A research program undertaken by Williges and Ehrich with a multidisciplinary group at Virginia Polytechnic Institute and State University, sponsored by the Office of Naval Research [human-computer interaction and decision behavior, NR SRO-101], is attempting to develop principles of effective human-computer interaction, including establishment of a user's model of command languages. This research is interdisciplinary and programmatic in nature.

Another set of methodological studies is needed to discover how to develop guidelines and what kinds of guidelines for software characteristics are most useful for system designers and engineers; for example, Smith has described his ideas and progress in this area in the proceedings of the Conference on Human Factors in Computer Systems (Institute for Computer Sciences and Technology, 1982).

In a substantive area, research is needed to understand the control of users' input accuracy through "clever" or "novel" feedback during actual user experiences as well as what the "format structures" should be for providing feedback on errors that users make. Data need to be collected on how best to provide effective error correction features and help messages, and on what range of default procedures should be provided to aid user efficiency. We need research to evaluate how important feedback and system response time are for improving user efficiency or ease of use. There is a need for methodology and quantification of user ease and efficiency. At present, studies evaluate different types of commands in a laboratory rather than in real-use settings, and it is not clear that the most effective commands in the laboratory are applicable in applied system uses. We need information on what length of commands (one, two, or three words) or how many (enter only one and wait for system response or enter six at once) are preferred by casual users rather than expert software programmers.
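
As a hedged illustration of what quantifying user ease and efficiency might involve, the following Python sketch summarizes a log of user commands into an error rate and a mean system response time. The record fields (`command`, `is_error`, `response_seconds`) and the function `summarize_session` are assumptions introduced for this example; the report does not prescribe any particular measures.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CommandRecord:
    """One logged user command (hypothetical record layout)."""
    command: str
    is_error: bool            # True if the system rejected the command
    response_seconds: float   # time the system took to respond


def summarize_session(records):
    """Return simple usage statistics for one session (illustrative only)."""
    if not records:
        raise ValueError("no commands were logged")
    error_count = sum(1 for r in records if r.is_error)
    return {
        "commands": len(records),
        "error_rate": error_count / len(records),
        "mean_response_seconds": mean([r.response_seconds for r in records]),
    }


if __name__ == "__main__":
    # Invented log entries, used only to exercise the function.
    log = [
        CommandRecord("list", False, 1.0),
        CommandRecord("delte report", True, 2.0),
        CommandRecord("delete report", False, 3.0),
        CommandRecord("print", False, 2.0),
    ]
    print(summarize_session(log))
```

Counts of this kind are easy to collect in a laboratory or in real-use settings, but whether they capture ease of use or user satisfaction is itself one of the open methodological questions.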
A variety of studies are needed in order to evaluate how best to develop natural language dialog systems and in particular what kinds of language-based models of human communication are most appropriate for commands in operating systems, editing systems, knowledge-based systems, and query systems for human-computer interactions (e.g., Reisner, 1981).

Additional research is needed to understand how to develop knowledge-based systems for a variety of users. Knowledge-based systems are developed by formulating the application problem, designing and constructing the knowledge base of expertise, developing schemes of inference, search, or problem solving, winning the confidence of experts, and evaluating the programs for production versions. Examples of knowledge-based systems, frequently referred to as expert systems, include systems that assist users in such tasks as: (1) deducing molecular structures from the output of mass spectrometers, (2) advising when and where to drill for ore, and (3) diagnosing blood infections. It should be noted that there are three different kinds of end users of these systems, only the first of which is a user in a conventional information retrieval system: (1) in getting answers to problems, the user as client, (2) in improving the system's knowledge, the user as tutor, and (3) in harvesting the knowledge base, the user as pupil. A summary of recent research related to knowledge-based or expert systems can be found in L. C. Smith (1980). Some of the major features of these systems, including the schemes of inference or problem-solving approaches used in defining structures for the knowledge bases, are reviewed by Feigenbaum (1978).

A recently developed specialty is software associated with special graphics displays. At present the development of both hardware and software for graphics use is at the gadget stage.
We need to know how to design software modules for graphics use, what modules are best for various graphics features in addition to points, lines, and circles, and how to mix keyboard and pen inputs in ways other than up and down arrows and drawing pad devices. Most graphic software has hierarchical levels for command use; it is unknown whether different levels are needed, how many are needed, and which commands are best to use at each level. Also, the best ways of moving among the hierarchically ordered levels of commands for draw and edit and the best method for terminating are unknown. We need more information about what icons, menus, and special symbols should be used in creating graphics. Methods have been developed for partitioning a display screen into multiple, sometimes overlapping windows, each monitoring an independent process. There has been very little research on how best to make use of this kind of capability. We know little about how to use color effectively for different kinds of graphics displays and applications.

Several of the above research recommendations have been recognized by Moran (1981), who also suggests that further research is needed to understand users' conceptual models in interacting with a variety of software systems. In addition, Thomas and Carroll (1981) and Miller (1981) have emphasized that the areas of most needed research are in the human-to-computer communication process, including research on the advantages and disadvantages of natural language software systems for different tasks. Computers have become more a part of all office systems today, and we need to study what impact the new computer technology has on organizations and their structures as well as the effects on decision making of the new management information systems (Federico, 1980).
As a final point, it should be noted that we need research on the interaction between hardware and software design features as new developments such as voice input and video disks become more commonly incorporated into all types of computer systems.

Important research that should be done involves, first, the design and analysis of new methodologies for conducting software research and, second, users' conceptual models of software systems, including natural language systems for a variety of tasks. Also, we need to understand how to develop and evaluate additional knowledge-based systems for users as client, tutor, and/or pupil. Also needed are studies to understand what software features would facilitate effective use of graphics in different tasks.

DOCUMENTATION

Documentation was once defined as printed matter that describes or explains how a system of some kind works or should be used. The documentation was necessarily separate from the system unless the system itself was a thing of print on paper. In the context of the computer, however, documentation can be part and parcel of the system it describes or explains.

Recent experience indicates that on-line documentation has many advantages over print-on-paper documentation. It cannot get lost or separated from the system. Inasmuch as the user is working with the computer, the computer can monitor what the user is doing and help find the parts of the documentation that are pertinent to the user's current activity and current quandary. When the user thinks he or she understands what to do, the computer can help do it--and may be able to try it out in a tentative way that will not cause much trouble if the user's understanding is faulty. The possibilities are obviously revolutionary. Because on-line documentation is relatively new, however, not much is known about how to design and implement it effectively.
Clearly the first priority for research in documentation is to explore, evaluate, and improve techniques of on-line documentation.

On-line documentation within the system is not the answer to all needs for documentation, of course. Some computer systems (such as batch-processing systems and automatic process-control systems) are noninteractive, and others (such as many avionics systems) do not have enough memory or storage to make on-line documentation feasible. Documentation for such systems is, by and large, not very satisfactory. There is still need, therefore, for improved external documentation, documentation that is associated with the system but not in it. Wright (1981) has several useful suggestions for documentation designers, including suggested aids that take the form of heuristics for analyzing the user's interaction with the text. Her suggestions also consider types of users and take the user's (reader's) purpose rather than the producer-designer's (writer's) purpose as a classification for documents.

Of course, external documentation need not necessarily be print-on-paper documentation. It is an interesting idea to associate a "documentation computer" with the system to which the documentation pertains. In some instances, the documentation computer might be a small machine, even a portable one, taking the place of a few manuals; other instances--those that have veritable libraries of documentation--might require a documentation computer system of significant size. In an experimental system on an aircraft carrier, for example, the computer system that handles documentation is a network of about 30 PERQs* that are 16-bit, chip-based "personal" computers of substantial capability.

Documentation as Part of an Overall System

The aircraft carrier project introduces a concept that will no doubt be very important in the future: Documentation and what users do with it are parts of a larger system.
If the use of documentation leads to the discovery of a defective part, inventory must be checked and ordering may have to be done. If the use of documentation leads to isolation of a software bug, software maintenance work must be done. It would be convenient and would foster efficiency if the same system that handled documentation also handled inventory and software maintenance. To improve the overall effectiveness of documentation, research is needed on the interactions of documentation with other parts of the overall task support system.

*PERQ is a trademark of the Three Rivers Computer Corporation.

Computer-Based Versus Print-on-Paper Documentation

The discussion thus far has focused on computer-based documentation, even when the system being documented is not itself an interactive computer system. That choice reflects the judgment that research in computer-based documentation is more likely to yield a major payoff than ongoing research in print-on-paper documentation. The latter research has led to many improvements and the total effect has been significant, but, insofar as conventional documentation is concerned, diminishing returns have set in. Computer-based documentation, by contrast, with the capability of the computer, offers hope of a very major advance.

While computer-based documentation is not a new concept by any means, it has just recently begun to be studied systematically. The "help systems" and the "tutorials" of the 1960s and 1970s were written without the benefit of research of the kind that was devoted, for example, to programming languages. As a result, it has been said, the help systems needed help systems and the tutorials needed tutors. Our conclusion is that now is the time to make a strong research attack on computer-based
documentation, including self-instructional programs, coherent system-wide help systems, documentation keyed to the behavior of programs (so that an error calls forth an explanation of what went wrong), and programming languages that write programs to explain themselves.

Capturing the Intent of the Creators of the System

As suggested earlier, documentation must be viewed as a part of the overall system that interacts with other parts of the overall system. The time dimension--the history--of the overall system is a very important base of the interaction. Most systems are developed through efforts to improve earlier systems, and those that are not are developed from some kind of design activity in the minds of system designers. (Programs are systems, of course, so the same can be said of programs.) The intentions of the improvers and designers are crucially important to understanding what the systems do, how they work, and how they should be used--but intentions tend not to be captured in the plans and designs. A computer program, for example, usually tells how to do something, not what it is that is being done, and it is very difficult to reconstruct the programmer's intentions from the program. Research on this topic may or may not improve the situation, but it is clear that the situation needs to be improved.

A broad view of documentation is important. The right approach may be to create computer-based design and upgrading metasystems, within which improvers and designers would work under constant monitoring, with as much emphasis on recording intentions and goals as on devising the means for achieving them. Note that this notion, if not developed with sensitivity to privacy issues, could lead to serious ethical problems.
Dynamic Graphics and Documentation

Although documentation was, in earlier days, primarily print on paper, some documentation has been available in other media, such as recorded speech and movies. The latter offered, at considerable cost, the advantages of kinematic graphics and moving gray-scale and color pictures. The computer promises to reduce the cost of preparing kinematic graphics by having a single, static program create dynamic multidimensional patterns that develop over time. The video disk promises to reduce the cost of storing and playing back all kinds of information, especially pictorial information. Together the computer and the video disk may open up a new era for dynamic graphic documentation.

At present the computer can select and present in a few milliseconds any one of the approximately 55,000 pictures on a video disk. It can run off sequences of continuous frames as a movie or skip around under program control and show fast slide sequences. What it selects can be conditioned, of course, by the responses of the viewer or viewers. These capabilities present an exciting opportunity to explore and develop new approaches to documentation.

Another exciting opportunity is being studied under the rubric of program visualization. The computer is capable, of course, of displaying representations of its own internal operation. It can present sequences of symbols representing the program that is being executed and the data on which the program is operating. Alternatively, it can present graphs, diagrams, and pictures to tell the person at the console what the program should be doing and what it is in fact doing. This latter approach to documentation, which requires sophisticated graphic display not widely available in the past, is now economically as well as technically feasible.
The hope is that iconic displays will prove superior to symbolic displays in presenting the broad picture of the behavior of computer programs and systems and in helping people deal with their intrinsic complexity. With the iconic approach, it may be possible to provide something analogous to a zoom lens, through which one would be able to monitor and control the broad picture as long as everything proceeds according to plan, then focus on the offending details as soon as trouble arises.

Documentation in the Form of Knowledge Bases

Conventional documentation takes the forms of natural language text, diagrams, sketches, pictures, and tables of data; it is designed exclusively to be read by eye. New forms of documentation are becoming essential: pointer structures, semantic networks, procedural networks, and production rules, documentation designed to be interpreted by computer programs. Such documentation will probably be used first in interactive computer systems to help end users or programmers and maintenance workers, but in due course it will be used also in fully automatic systems sophisticated enough to read their own documentation and restructure themselves to overcome difficulties and maximize performance. Some work has already been done on such documentation in the field of artificial intelligence; much more needs to be done. It is essential to couple research on documentation closely with other research pertinent to the systems in which it will be used--for example, with work on interactive tutorial systems for end users, interactive maintenance systems, and robotic manufacturing systems.

Computer Systems to Facilitate Conventional Documentation

The foregoing emphasis on computer-based documentation expresses our conviction that it is the high-payoff area within the documentation field, but it should not be taken to imply that conventional documentation is dead.
We think that two main foci have the greatest potential payoff for research in conventional documentation: (1) understanding the target group of people that the documentation is intended to help and the tasks in which they will be engaged when they use the documentation and (2) using computer systems, with good editors, formatters, and composers, to facilitate creation and production of conventional documentation. The theme of understanding the users is developed elsewhere in this chapter. Great advances have been made in the last few years in the design of computer-based systems for creating and producing conventional documents, and research in that area has much new technology to work on. Indeed, research is needed to develop the capability to make the new editors, formatters, and composers easy to use in order to facilitate the preparation of documentation that will make them and other systems easy to use. Kruesi, for example, supported by the Office of Naval Research (NR 196-160), is investigating the relationship between the types of documentation provided to programmers and their performance on a wide variety of software-related tasks.

In summary, research should be emphasized in several areas pertinent to documentation: (1) techniques of on-line documentation, (2) interactions and information flows between document subsystems and other subsystems, (3) efforts to capture the intent of designers and upgraders of systems, (4) dynamic graphics and the video disk, (5) dynamic graphics and program visualization, (6) knowledge bases, (7) understanding the uses and users of documentation, and (8) computer-based systems for the development of conventional documentation. Of these suggestions, two primary research needs are to know how and when to use display documentation with graphics and what program visualization techniques are most helpful to users.
SUMMARY AND CONCLUSIONS

The primary research recommendations in the areas of users, tasks, hardware, software, and documentation include a major emphasis on developing new methodologies to evaluate what is meant by ease of use in human-computer interaction. Does ease of use mean the extent to which it is easy to learn to use a computer; does it imply good design of hardware and software for a variety of naive, casual, and professional users; does it mean that any task can be done quickly and without errors; does it encompass a component of judged satisfaction about use; or does it mean all of these?

We need to know what user characteristics are important determinants of successful human-computer interaction for a specified set of tasks, such as data base inquiries, computation and accounting problems, and editor or word processing functions. In the area of hardware design, more research is needed to evaluate alternatives to keyboard input (including voice input), uses of color in displays, the best sizes of displays, and alternatives to CRT displays.

Studies evaluating software are barely beginning to provide data for design use. We do not yet know how to conduct systematic research studies in software design, what independent variables are most important, and what dependent variables of human-computer interaction should be recorded. We do not have data to support the design of a simulation facility to effectively evaluate commands in operating systems, editing systems, knowledge-based systems, and query systems. We need to understand users' conceptual models in interacting with specific software systems, and we need more information about the advantages and disadvantages of natural language software systems.

Documentation may well become part of the available software for users; when and how to display documentation is an important area for research.
Research is needed on how best to use graphics and special knowledge bases to facilitate uses of documentation either on line or in manuals. Current documentation is designer-oriented rather than user-oriented, and the perspective should be changed so that documentation is used more effectively.

Although the research needs outlined are numerous, a major emphasis in this chapter is on systematic studies that include all four substantive variables--user and task characteristics, hardware, software, and documentation--and the interaction of these components, together with a clear-cut set of studies to define ease of use.

REFERENCES

Addis, T. R.
1972 Human behaviour in an interactive environment using a simple spoken word recognizer. International Journal of Man-Machine Studies 4:255-284.

Al-Awar, J., Chapanis, A., and Ford, W. R.
1981 Tutorials for the first-time computer user. IEEE Transactions on Professional Communication PC-24:30-37.

Alden, D. C., Daniels, R. W., and Kanarick, A. F.
1972 Keyboard design and operation: a review of the major issues. Human Factors 14:275-293. (A very similar paper by the same authors is Technical Report 12.180-FRIa, Honeywell Systems and Research Center, St. Paul, Minn., March 1970.)

Apsey, R. S.
1976 Human factors of constrained handprint for OCR. Pp. 466-470 in Proceedings, IEEE International Conference on Cybernetics and Society, November 1976. New York: Institute of Electrical and Electronics Engineers, Inc.

Barnard, P. J., Hammond, N. V., Morton, J., and Long, J. B.
1981 Consistency and compatibility in human-computer dialogue. International Journal of Man-Machine Studies 15:87-123.

Bennett, J. L.
1979 Incorporating usability into system design. Design '79 Symposium. Monterey, Calif., April 1979.

Bezdel, W.
1970 Some problems in man-machine communication using speech. International Journal of Man-Machine Studies 2:157-168.

Bigelow, R. P.
1981 Two is the prime number in love, war, and lawsuits. Infosystems 28(11):92, 94.

Branscomb, Lewis M.
1982 Electronics and computers: an overview. Science 215:755-760.

Braunstein, M., and Anderson, N. W.
1961 A comparison of the speed and accuracy of reading aloud and key-punching digits. IEEE Transactions on Human Factors in Electronics HFE-2:56-57.

Brown, B. S., Rinalducci, E. J., and Dismukes, R. K.
1982 Video Display Terminals and Vision of Workers: Summary and Overview of a Symposium. Committee on Vision, National Research Council. Behaviour and Information Technology 1(2):121-140.

Card, S. K., English, W. K., and Burr, B. J.
1978 Evaluation of mouse, rate-controlled isometric joystick, step keys, and text keys for text selection on a CRT. Ergonomics 21:601-631.

Card, S. K., Moran, T. P., and Newell, A.
1980 The keystroke-level model for user performance time with interactive systems. Communications of the ACM 23:396-410.

Chapanis, A.
1972 Design of controls. Pp. 345-379 in H. P. Van Cott and R. G. Kinkade, eds., Human Engineering Guide to Equipment Design. Revised edition. Sponsored by the Joint Army-Navy-Air Force Steering Committee. Washington, D. C.: U. S. Government Printing Office.
1975 Interactive human communication. Scientific American 232(3):36-42.
1981 Interactive human communication: some lessons learned from laboratory experiments. Pp. 65-114 in B. Shackel, ed., Man-Computer Interaction: Human Factors Aspects of Computers and People. Alphen aan den Rijn, The Netherlands: Sijthoff and Noordhoff.
1982 Humanizing computers. In Proceedings ITT Europe Human Factors Symposium, 18-19 May. London: ITT Europe Engineering Support Centre, 20-53.

Cornog, D. Y., and Rose, F. C.
1967 Legibility of Alphanumeric Characters and Other Symbols: II. A Reference Handbook. National Bureau of Standards Miscellaneous 262-2. Washington, D. C.: U. S. Government Printing Office.

Cuff, R.
1980 On casual users.
International Journal of Man-Machine Studies 12:163-187.

Deatherage, B. H.
1972 Auditory and other sensory forms of presentation. Pp. 123-160 in H. P. Van Cott and R. G. Kinkade, eds., Human Engineering Guide to Equipment Design. Revised edition. Washington, D. C.: U. S. Government Printing Office.

Dertouzos, M. L., and Moses, J.
1980 The Computer Age: A Twenty-Year View. Cambridge, Mass.: MIT Press.

Devoe, D. B.
1967 Alternatives to handprinting in the manual entry of data. IEEE Transactions on Human Factors in Electronics HFE-8:21-32.

Dolotta, T. A.
1970 Functional specifications for typewriter-like time-sharing terminals. Computing Surveys 2:5-31.

Dzida, W., Herda, S., and Itzfeldt, W. D.
1978 User-perceived quality of interactive systems. IEEE Transactions on Software Engineering SE-4:270-276.

Eason, K. D.
1974 The manager as a computer user. Applied Ergonomics 5:9-14.

Ehrenreich, S. L.
1981 Query languages: design recommendations derived from the human factors literature. Human Factors 23:709-726.

Engelbart, D. C.
1973 Design considerations for knowledge workshop terminals. AFIPS Conference Proceedings 42:221-227.

Engle, Stephen E., and Granda, Richard E.
1975 Guidelines for Man/Display Interfaces. IBM Poughkeepsie Laboratory Technical Report TR 00.2720, December 19, 1975.

English, W. K., Engelbart, D. C., and Berman, M. L.
1967 Display-selection techniques for text manipulation. IEEE Transactions on Human Factors in Electronics HFE-8:5-15.

Federico, Pat-Anthony
1980 Management Information Systems and Organizational Behavior. New York: Praeger.

Feigenbaum, Edward A.
1978 The art of artificial intelligence--themes and case studies of knowledge engineering. In Sakti P. Ghosh and Leonard Y. Liu, eds., Proceedings of the American Federation of Information Processing Societies. Volume 47. Montvale, N. J.: AFIPS Press.

Galitz, W. O.
1981 Handbook of Screen Format Design. Wellesley, Mass.: Q.E.D.
Information Science, Inc.

Gantz, J., and Peacock, J.
1981 Computer systems and services for business and industry. Fortune 103(8):39-84 (advertisement).

Goodwin, N. C.
1975 Cursor positioning on an electronic display using lightpen, lightgun, or keyboard for three basic tasks. Human Factors 17:289-295.

Grandjean, E., and Vigliani, E., eds.
1980 Ergonomic Aspects of Visual Display Terminals. Taylor and Francis.

Hammond, N. V., Long, J. B., Clark, I. A., Barnard, P. J., and Morton, J.
1980 Documenting human-computer mismatch in the interactive system. Proceedings of the Ninth International Symposium on Human Factors in Telecommunications. Holmdel, N. J., September 17-24, 1980.

Hirsch, R. S.
1981 Procedures of the human factors center at San Jose. IBM Systems Journal 20:123-171.

Hlady, A. M.
1969 A touch sensitive X-Y position encoder for computer input. AFIPS Conference Proceedings 35:545-551.

Hornsby, N. E.
1981 A comparison of full- and reduced-alpha keyboards for aircraft data entry. P. 257 in Proceedings of the Human Factors Society, 25th Annual Meeting.

Institute for Computer Sciences and Technology, National Bureau of Standards, and Washington, D. C. Chapter, Association for Computing Machinery
1982 Proceedings: Human Factors in Computer Systems. March 15-17, Gaithersburg, Md.

Irving, G. W., Horinek, J. J., Walsh, D. H., and Chan, P. Y.
1976 ODA Pilot Study II: Selection of an Interactive Graphics Control Device for Continuous Subjective Functions Applications. Report No. 215-2. Santa Monica, Calif.: Integrated Sciences Corp.

Johnson, J. K.
1977 Touching data. Datamation 23(1):70-72.

Ketchel, J.
1981 Visual display terminal research--the opportunity and the challenge. Human Factors Society Bulletin 24(10):2-3.

Kulp, R. A., and Kulp, H. J.
1972 A comparison of mark sensing and handprinting coding methods. Pp. 416-421 in Proceedings of the Sixteenth Annual Meeting of the Human Factors Society. Santa Monica, Calif.: Human Factors Society.
Landis, D., Slivka, R. M., Jones, J. M., Harrison, S., and Silver, C. A.
1967 Evaluation of Large Scale Visual Displays. Technical Report No. RADC-TR-67-57. Griffiss AFB, Rome, N. Y.: Rome Air Development Center. (NTIS No. AD 651372)

Ledgard, H., Singer, A., and Whiteside, J.
1981 Directions in human factors for interactive systems. In Lecture Notes in Computer Science 103. New York: Springer-Verlag.

Lewis, R. A.
1972 Legibility of capital and lowercase computer printout. Journal of Applied Psychology 56:280-281.

Ling, R. F.
1973 A computer generated aid for cluster analysis. Communications of the ACM 16:355-361.

Martin, J.
1973 Design of Man-Computer Dialogues. Englewood Cliffs, N. J.: Prentice-Hall.

Masterson, J. L., and Hirsch, R. S.
1962 Machine recognition of constrained hand-written Arabic numbers. IEEE Transactions on Human Factors in Electronics HFE-3:62-65.

Mayer, Richard E.
1981 The psychology of how novices learn computer programming. Computing Surveys 13:121-141.

Mayer, R. E., and Bayman, P.
1981 Psychology of calculator languages: a framework for describing differences in users' knowledge. Communications of the ACM 24:511-520.

Miller, L. A.
1974 Programming by non-programmers. International Journal of Man-Machine Studies 6:237-260.
1981 Natural language programming: styles, strategies, and contrasts. IBM Systems Journal 20:184-215.

Miller, L. A., and Thomas, J. C.
1977 Behavioral issues in the use of interactive systems. International Journal of Man-Machine Studies 9:509-536.

Mills, H. D.
1980 Management of software engineering systems, Part I. Principles of Software Engineering. IBM Systems Journal 19:415-420.

Moran, Thomas P.
1981 An applied psychology of the user. Computing Surveys 13:1-11.

Murray, W. E., Moss, C. E., and Parr, W. H.
1981 A radiation and industrial hygiene survey of video display terminal operations. Human Factors 23:413-420.

Murray, W. E., Moss, C. E., Parr, W. H., Cox, C., Smith, M. J., Cohen, B. F. C., Stammerjohn, L. W., and Happ, A.
1981 Potential Health Hazards of Video Display Terminals. U. S. Department of Health and Human Services, Public Health Service, Center for Disease Control, National Institute of Occupational Safety and Health, Division of Biomedical and Behavioral Science, Division of Surveillance, Hazard Evaluations and Field Studies. Washington, D. C.: U. S. Government Printing Office.

Myer, T. H.
1968 How well do people point? Grafacon Interface 2 (Bolt Beranek & Newman Inc., Cambridge, Mass.).

Neal, A. S.
1977 Time intervals between keystrokes, records, and fields in data entry with skilled operators. Human Factors 19:163-170. (Also: Technical Report HFC-8. San Jose, Calif.: IBM Corp., System Development Division, Human Factors Center, October 1974.)

Nickerson, R. S.
1969 Man-computer interaction: a challenge for human factors research. Ergonomics 12:501-517.

Noll, A. M.
1972 Man-machine tactile communication. SID Journal 1(2):5-11.

Poock, Gary K.
1980 Experiments with Voice Input for Command and Control: Using Voice Input to Operate a Distributed Computer Network. Technical Report, Navy Electronic Systems Command, Washington, D. C.

Poulton, E. C., and Brown, C. H.
1968 Rate of comprehension of an existing tele-printer output and of possible alternatives. Journal of Applied Psychology 52:16-21.

Ramsey, H. R., and Atwood, M. E.
1979 Human Factors in Computer Systems: A Review of the Literature. Technical Report SAI-79-111-DEN, 21 September 1979. Englewood, Colo.: Science Applications, Inc.

Reisner, Phyllis
1981 Human factors studies of database query languages: a survey and assessment. Computing Surveys 13:13-32.

Rupp, B. A.
1981 Comments on certain German video display terminal regulations. Human Factors Society Bulletin 24(10):3-4.

Seibel, R.
1972 Data entry devices and procedures. Pp. 311-344 in H. P. Van Cott and R. G. Kinkade, eds., Human Engineering Guide to Equipment Design. Revised ed.
Washington, D. C.: U. S. Government Printing Office.

Shackel, Brian
1981 The Concept of Usability. Paper presented at the Software and Information Usability Symposium, IBM, Poughkeepsie, N. Y. (15-18 September 1981) and at the ITT Symposium on Human Factors and the Usability of Software, ITT Advanced Technology Center, Shelton, Conn. (5 October 1981).

Shneiderman, Ben
1980 Software Psychology: Human Factors in Computer and Information Systems. Cambridge, Mass.: Winthrop Publishers.

Shurtleff, D. A.
1980 How To Make Displays Legible. La Mirada, Calif.: Human Interface Design.

Slack, W.
1971 Computer-based interviewing system dealing with nonverbal behavior as well as keyboard responses. Science 171:84-87.

Smith, L. B.
1967 A comparison of batch processing and instant turnaround. Communications of the ACM 10:495-500.

Smith, L. C.
1980 Artificial intelligence applications in information systems. In Martha E. Williams, ed., Annual Review of Information Science and Technology. White Plains, N. Y.: Knowledge Industry Publications Inc.

Smith, M. J.
1981 Job stress and VDT work. Human Factors Society Bulletin 24(10):4-5.

Smith, S. L.
1981 The Usability of Software: Design Guidelines for the User-System Interface. Paper presented at the ITT Symposium on Human Factors and the Usability of Software, ITT Advanced Technology Center, Shelton, Conn. (5 October 1981).

Smith, S. L., and Duggar, B. C.
1965 Do large shared displays facilitate group effort? Human Factors 7:237-244. (NTIS No. AD 633262)

Smith, S. L., and Goodwin, N. C.
1970 Computer-generated speech and man-computer interaction. Human Factors 12:215-223.

Steele, K. A.
1971 CPM/PERT. In Proceedings, 2nd Man-Computer Communications Seminar. Ottawa, Canada: National Research Council of Canada, 81-84.

Stewart, T. F. M.
1974 Ergonomic aspects of man-computer problem solving. Applied Ergonomics 5:209-212.

Strub, M. H.
1971 Evaluation of Man-Computer Input Techniques for Military Information Systems. Technical Research Note 226. Arlington, Va.: U. S. Army Behavior and Systems Research Laboratory, May 1971. (NTIS No. AD 730315)

Thomas, J. C., and Carroll, J. M.
1981 Human factors in communication. IBM Systems Journal 20:237-263.

Thompson, D. A.
1969 Man-computer system: toward balanced co-operation in intellectual activities. In Proceedings, International Symposium on Man-Machine Systems. IEEE Conference Record Number 69C58-MMS. Volume 1. New York: Institute of Electrical and Electronics Engineers.

Turn, R.
1974 Speech as a Man-Computer Communication Channel. Report No. P-5120. Santa Monica, Calif.: Rand Corp.

U. S. Bureau of the Census
1979 Statistical Abstracts of the United States. 100th ed.: Table Number 685, p. 415. Washington, D. C.: U. S. Department of Commerce.

Van Cott, H. P., and Kinkade, R. G., eds.
1972 Human Engineering Guide to Equipment Design. Revised edition. Sponsored by the Joint Army-Navy-Air Force Steering Committee. Washington, D. C.: U. S. Government Printing Office.

Wargo, M. J., Kelley, C. R., Mitchell, M. D., and Prosin, J. J.
1967 Human Operator Response Speed, Frequency, and Flexibility: A Review Analysis and Device Demonstration. Report No. CR-874. Washington, D. C.: National Aeronautics and Space Administration.

Williges, Robert C., and Williges, Beverly H.
1981 Users' Considerations in Computer Based Information Systems. Technical Report CSIE-81-2. Virginia Polytechnic Institute and State University, September 1981. (NTIS No. AD A106194)

Witten, I. H., and Madams, P. H. C.
1977 The telephone enquiry service: a man-machine system using synthetic speech. International Journal of Man-Machine Studies 9:449-464.

Wright, P.
1981 Problems to be Solved When Creating Usable Documents. Paper presented at the Software and Information Usability Symposium, 15-18 Sept. IBM, Poughkeepsie, N. Y.

Zoltan, E., and Chapanis, A.
1982 What do professional persons think about computers? Behaviour and Information Technology 1:55-68.

VI POPULATION GROUP DIFFERENCES

Many areas of research in human factors have concentrated on systems that fit the average person. In those studies, individual differences traditionally have been treated as little more than an error problem. Thus few data are available in many areas of human factors on the interaction of different systems with variables such as ability levels or age levels.

Attempts to classify, describe, predict, and exploit individual and group differences extend to the beginnings of recorded history. Some of the earliest decipherable samples of writings include references to the physical and mental differences between men and women, serfs and noblemen, slaves and masters, and barbarians and civilized persons. It was not until the nineteenth century, however, that the study of individual and group differences assumed the systematic and rigorous qualities of scientific investigation. The attempts of Sir Francis Galton (1822-1911) to describe the nature of individual differences are the foundations of what is sometimes referred to as differential psychology.

The principal authors of this chapter are Irwin L. Goldstein and Alphonse Chapanis.

Since Galton, investigations of individual and group differences carried out by psychologists, anthropologists, and sociologists number in the hundreds of thousands. There is a psychological journal, The Journal of Cross-Cultural Psychology, entirely devoted to studies of this kind. One of the most important applications of this work in psychology has been the development of a multimillion dollar testing industry.
Psychologists have devised hundreds of tests of ability, achievement, skills, knowledge, and personality (Buros, 1978) that are used routinely for classifying and selecting employees for thousands of jobs and occupations.*

One of the most ambitious and thorough attempts to relate individual characteristics of workers to job requirements is the Dictionary of Occupational Titles (U.S. Department of Labor, 1977). This compendium gives profiles of the educational, aptitude, interest, physical, and temperament characteristics required of a worker to achieve average successful job performance in thousands of occupations.

The military services have tried to do something similar on a more modest scale. In the preparation of personnel requirements data, the Air Force Design Handbook (Air Force Systems Command, 1969) specifies that tasks should be rated along six dimensions: ambient environment, equipment characteristics, mental demands, physical demands, hazard exposure, and task criticality. Figure 6-1 shows the three levels of mental demands that may be required of people by various duties and tasks.

*Tests are also used for other purposes, for example, diagnosing and classifying mental illnesses, but our concern here is with job-related activities.

CODE 1 requires little or no formal training, just a basic introduction to the task; ability to follow relatively simple written or oral instructions; little judgment, since only elementary decisions involved; little concentration; little or no recall of relevant knowledge for decisions or inference; only precise determinations, such as GO/NO-GO, UP/DOWN, MORE/LESS, YES/NO, ALL/NONE, CORRECT/INCORRECT, etc.
CODE 2 requires moderate technical knowledge and training; some ability to adjust to changing situations; occasional exercise of judgment involving use of technical knowledge; ability to understand and use technical manuals; some initiative and ingenuity required; occasional recall of relevant knowledge and experience of the practical type for decisions or inferences; decisions involving somewhat detailed procedures or measurements, as in assembling, disassembling, installing, removing, inspecting, testing, operating, adjusting, computing, monitoring, servicing, etc.

CODE 3 requires a high degree of complex and varied technical knowledge, with considerable formal and informal training; a high degree of continuous concentration, with attention to advanced and involved elements of the task; continuous exercise of a high degree of judgment, with decisions based on varied and complex factors requiring understanding of underlying principles and procedures; extensive recall of relevant and precise knowledge and experience for decisions and inferences; frequent decisions at the theoretical and abstract level; precise and detailed analysis, correlating, computing, organizing, and sequencing of processes or data, as in variable emergency procedures, troubleshooting, planning, scheduling, etc.

FIGURE 6-1 Classification of the Mental Demands Made on Personnel by Duties and Tasks
Source: Air Force Systems Command (1969).

Although it is seldom explicitly stated, the underlying rationale of most of these classifications is that the job or the occupation is a given, a fixed quantity. The aim of personnel selection is therefore to find persons who have the abilities, skills, and other characteristics required to perform particular jobs.
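The selection rationale just described can be made concrete with a small sketch. It is only an illustration, not the Air Force procedure: it assumes that each of the six task dimensions named in the handbook is rated with a code from 1 to 3 (the source shows the three codes only for mental demands), and the names `DIMENSIONS`, `validate_rating`, and `unmet_dimensions`, as well as the example ratings, are hypothetical.

```python
# Hypothetical sketch: a task rated on the six dimensions named in the text,
# compared with the levels a candidate is judged able to handle.
# Assumption: every dimension uses codes 1-3, as Figure 6-1 does for mental demands.

DIMENSIONS = (
    "ambient_environment",
    "equipment_characteristics",
    "mental_demands",
    "physical_demands",
    "hazard_exposure",
    "task_criticality",
)
VALID_CODES = (1, 2, 3)


def validate_rating(rating):
    """Return a copy of the rating after checking dimensions and codes."""
    missing = [d for d in DIMENSIONS if d not in rating]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    bad = {d: c for d, c in rating.items()
           if d not in DIMENSIONS or c not in VALID_CODES}
    if bad:
        raise ValueError(f"invalid entries: {bad}")
    return dict(rating)


def unmet_dimensions(task_rating, person_levels):
    """List the dimensions on which the task demands more than the person offers."""
    task = validate_rating(task_rating)
    return [d for d in DIMENSIONS if person_levels.get(d, 0) < task[d]]


# Example data (invented for illustration only).
troubleshooting_task = {
    "ambient_environment": 1,
    "equipment_characteristics": 2,
    "mental_demands": 3,
    "physical_demands": 1,
    "hazard_exposure": 2,
    "task_criticality": 3,
}
candidate = {
    "ambient_environment": 2,
    "equipment_characteristics": 2,
    "mental_demands": 2,
    "physical_demands": 3,
    "hazard_exposure": 2,
    "task_criticality": 3,
}
gaps = unmet_dimensions(troubleshooting_task, candidate)
```

Under the selection view, a nonempty `gaps` list screens the candidate out; under a design view, the same list points at the task dimensions that would have to be changed to fit the person.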
From the standpoint of human factors, however, a job is not a fixed quantity but rather something that can be modified and designed to fit people with varying characteristics. Thus it becomes important to know in what ways people vary and by how much. In this area there are serious gaps in our knowledge.

The most thorough translation of individual difference data into design requirements has been done in the field of anthropometry, which involves measurement of the human body. It is possible to write equipment design specifications so that the equipment will fit 90 percent, 95 percent, or any other proportion of a particular user population. The information necessary to write equally precise design specifications for other human dimensions and characteristics, however, is not available. Attempts have been made to do that, but further research is needed on this complex problem. The Air Force's six task dimensions of ambient environment, equipment characteristics, mental demands, physical demands, hazard exposure, and task criticality are a good initial effort (see Table 6-1), yet the Air Force Design Handbook acknowledges its limitations: "Because of the broad range of equipment characteristics, complete criteria are not presented here. The following are merely suggested guidelines" (Section DN4C3, p. 13). For example, the manual states that Code 1 equipment is ". . . complex but adequately designed for ease of use. . . ." What the definition does not specify is ease of use for whom. Something that is easy for an astronaut to use may be completely beyond the capabilities of an individual with only an elementary school education. To state the problem explicitly, we do not know exactly how to design complex equipment so that it can be used with ease by people with average IQs, people with IQs as low as 80, people with fifth-grade reading abilities, or people for whom English is a second language.

[Table 6-1 is not recoverable from the source.]

THE IMPACT OF FEDERAL ANTIDISCRIMINATION LEGISLATION

Antidiscrimination legislation has focused attention on human factors issues related both to complying with legislative requirements and maintaining the productivity of a work force with greater diversity than in the past. As a result there is increased concern over the interaction of individual differences with programs such as job redesign and training as well as over organizational attitudes toward various populations (e.g., the elderly) that may constrain their performance.

As a result of the U.S. Civil Rights Act, federal guidelines have been developed concerning personnel decisions that affect protected classes, which include American Indian or Alaskan natives, blacks not of Hispanic origin, Hispanics, and Asian or Pacific Islanders. In addition, federal legislation has made it illegal to discriminate on the basis of sex, age, or disability.
Any personnel action resulting in adverse impact against any of these groups can result in litigation. In this context, personnel decisions are not limited to selection or promotion but rather refer to any personnel practice, such as job and workplace redesign, selection for training, and the use of training as a basis for promotion.

Legal actions resulting from charges of discrimination have stimulated research on the procedures necessary to assess the validity of these types of personnel practices; however, most of the emphasis has been on the establishment of procedures to validate selection tests (American Psychological Association, 1980). Similar concerns are being expressed about methodologies for evaluating training and job redesign (Bartlett, 1978). The research emphasis has been on establishing data bases, so that it is possible to design programs that do not have adverse impact.

As a consequence of antidiscrimination legislation as well as social and economic factors, people from special population groups are moving into occupations that were previously considered nontraditional for them. An example is women who are entering managerial and blue-collar jobs and the military services. The military services are also accepting more people (male and female) who have lower ability as measured by traditional academic aptitude measures. These changes in the composition of the work force and the armed services have revealed an important problem in addition to the human factors issues of designing jobs, equipment, and training to accommodate individual differences: It has only recently been recognized that organizational attitudes toward people entering nontraditional jobs may adversely affect productivity by hindering their performance and constraining occupational aspirations.

SEX AND JOB PERFORMANCE

Sheridan's (1975) description of the American Telephone and Telegraph
Company's experience in placing women in craft jobs illustrates the implications of human factors for sex and job performance. Despite rigorous recruiting and comprehensive training efforts, the women recruited into a particular job dropped from training at an average rate of 50 percent, and the women who completed training usually did not last a full year on the job. A task analysis of the job indicated that the physical tasks were extremely difficult for women to perform; furthermore, this analysis determined which tasks were causing the most difficulty. Some of the most serious problems centered on the use of a ladder that weighed approximately 80 lbs. and was 14 feet long before being extended. Women had great difficulty placing the ladder against a building because they had to apply force below the midpoint of the ladder just as the force required to raise it was increasing. A fiberglass tube was connected to the top rungs of the ladder, which enabled the worker to push the ladder against the building much more easily. As a result, workers who were 5 feet 2 inches tall and weighed 120 pounds were able to raise a 72 lb. ladder with one hand. These and other design modifications not only allowed women to perform the job but also resulted in fewer back injuries for men.

AGE AND JOB PERFORMANCE

Two considerations are important with regard to age and job performance: the average age of the population is increasing, and both age discrimination legislation and rulings against forced retirement are resulting in a larger number of older people in the work force. Many of these individuals will require additional training as a result of job shifts, technological changes, or simply interest in a new career. The biases operating against these people are made obvious by Britton and Thomas's (1973) study of the views of employment interviewers.
They noted that 50-year-old workers were viewed as the most difficult to place during a recession, the most difficult for an employer to train, and the least able to maintain production schedules. These views are based on preconceived beliefs that older workers cannot perform as well on the job and cannot easily acquire new skills. Data relevant to these questions are virtually nonexistent; a thorough review (Fozard and Popkin, 1978) of perceptual and cognitive data analyzed by age reinforces the view that there are few data relevant to work situations. Much of that review is based on data from laboratory experiments on topics such as paired associate learning, iconic memory, and visual discrimination, making generalizations to work situations hazardous at best.

The deficient state of this research is summarized in Sheppard's (1970) generalizations about basic research on aging and job performance: The research fails to differentiate various aspects of the work situation, including physical, psychomotor, sensory, and social characteristics; most of the emphasis is on average performance, with little, if any, attention to the substantial number of individual differences; and there is a blind faith in trend extrapolations. If workers ages 30-40 have lower morale than workers ages 20-30, it is simply assumed that workers ages 40-50 will have even lower morale.

A good example of the implications of our lack of knowledge is the continuing controversy concerning airline pilot age, health, and performance. An Institute of Medicine (1981) report notes that although the average risk of acute incapacitation increases with age, there are large individual differences. In addition, while there are decreases in capacity, speed or accuracy of attention, memory, and intellectual skills with increasing age, there is also evidence that well-practiced skills may not show any age-related decline.
The report concludes that there is a need for research on age-related changes among pilots and a need for research on pilot performance on tasks that are representative of actual work situations.

Of more immediate relevance to this report are the relationships between group variables such as age and equipment design. For example, as they age, many people require the use of bifocals. How does the use of bifocals relate to the need to read information from displays such as those found on word processing equipment? Is it possible that the displays must be designed differently or that the information must be displayed differently depending on the age of the operator? Questions such as these constitute a largely unexplored topic for research.

INTERACTIONS AMONG VARIABLES

Another serious gap in our knowledge is how various individual and group differences interact to affect job performance. For example, there are considerable data available relating aging to maximum oxygen uptake, which determines the capacity of an individual to do prolonged heavy work (Astrand and Rodahl, 1977). These data show that there is a steady decrement in aerobic power beginning at about age 20, such that a 60-year-old attains about 70 percent of the maximum of a 25-year-old. Unfortunately, there are few data on most population differences or individual differences as they are related to work situations. McFarland and O'Doherty (1959) concluded the following regarding the relationship of aging and work performance (pp. 454-455):

    Although most studies show an unrelieved picture of decline in capacities, it is well to remember that this constantly changing balance between physiological and psychological impairment, on the one hand, and increased experience, wisdom, and judgment, on the other, occasionally results in actual improvement of capacities, especially in those functions which are of greatest importance in daily living.
These and other interactions of variables are another almost completely untapped area of research.

NATIONAL AND ETHNIC DIFFERENCES

There are, of course, other important differences in population characteristics that should be considered in job redesign and training systems. National and ethnic differences have implications for equipment design that have just recently begun to be investigated (Chapanis, 1975). These differences are reflected in anthropometric, physiological, psychological, language, and cultural variables that affect equipment design. For example, Ruffell-Smith (1975) notes that telegraph systems were originally used as communication devices in air traffic control systems; however, with the increased amount and speed of air traffic, voice communication systems replaced telegraph devices. Obviously, the use of the different languages of the many nations involved in air travel was a serious impediment to the operation of voice systems. After World War II English was chosen as the language of use because at that time most aircraft were operated by English-speaking countries. Yet there is a wide variation in English dialects and pronunciation, to the extent that some dialects, such as that spoken in Newcastle, are not understood by people elsewhere in the British Isles. Obviously the problems are more severe when the speaker's native language is not English. Ruffell-Smith's analysis of communication errors indicates that this problem can be serious in air traffic communication, especially when the speed of reaction is a critical element in avoiding an accident. Clearly, the implications of these population differences should be considered in design decisions.

Ethnic Variables in Human Factors Engineering (Chapanis, 1975) provides other examples of equipment design complexities caused by language differences.
One chapter (Hanes, 1975) shows the variety of accounting keyboards that have been designed to accommodate some of the European and Mideast languages. Another chapter (Brown, 1975) illustrates the design problems that were encountered in designing a computer terminal for Japanese, a language that is markedly different from the Indo-European languages.

In general, there is little appreciation of the problems involved in designing equipment for diverse national and ethnic groups. The Human Engineering Guide to Equipment Design (Van Cott and Kinkade, 1972) is the best single source of human factors data available, yet it is almost entirely concerned with American and European data. It is necessary to learn to what extent its data and design recommendations need to be modified or supplemented for international use.

INDIVIDUAL DIFFERENCES AND TRAINING

Closely related to problems of equipment design are those associated with the training of individuals to operate complex equipment. Here again our information is seriously deficient. An approach that has some promise is the aptitude-treatment interaction (ATI) model. The goal of this approach is to match a particular mode of instruction to an individual's distinctive characteristics so that each person is assigned the most appropriate learning procedure. A disordinal aptitude-treatment interaction is one in which individuals with high aptitude perform best with one treatment (e.g., training or display), while those with lower aptitude perform best with another treatment. Thus, the aptitude level of the individual determines the form of treatment that has the best chance of success. Aptitude in this context refers to any personal characteristics that relate to learning and so can include a broad range of variables, such as styles of thought, personality, and various scholastic aptitudes.
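A disordinal interaction of the kind just defined can be sketched numerically. The example below is a minimal illustration under strong assumptions that are not in the source: expected performance under each treatment is taken to be a straight-line function of a single aptitude score, and the treatment names, intercepts, slopes, and score range are invented. The interaction is called disordinal here when the two lines cross inside the range of aptitude scores considered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TreatmentLine:
    """Expected performance as a linear function of aptitude (hypothetical model)."""
    name: str
    intercept: float
    slope: float

    def predict(self, aptitude: float) -> float:
        return self.intercept + self.slope * aptitude


def crossover(a: TreatmentLine, b: TreatmentLine) -> Optional[float]:
    """Aptitude score at which the two lines intersect; None if they are parallel."""
    if a.slope == b.slope:
        return None
    return (b.intercept - a.intercept) / (a.slope - b.slope)


def is_disordinal(a: TreatmentLine, b: TreatmentLine, low: float, high: float) -> bool:
    """True when the lines cross strictly inside the aptitude range [low, high]."""
    x = crossover(a, b)
    return x is not None and low < x < high


def best_treatment(a: TreatmentLine, b: TreatmentLine, aptitude: float) -> str:
    """Name of the treatment with the higher predicted performance."""
    return a.name if a.predict(aptitude) >= b.predict(aptitude) else b.name


# Invented parameters: the self-paced mode rewards high aptitude more steeply,
# while the structured mode gives low-aptitude learners a higher starting point.
self_paced = TreatmentLine("self-paced", intercept=20.0, slope=0.75)
structured = TreatmentLine("structured", intercept=50.0, slope=0.25)

cross_point = crossover(self_paced, structured)            # 60.0
disordinal = is_disordinal(self_paced, structured, 0, 100)
for_high = best_treatment(self_paced, structured, 80.0)    # "self-paced"
for_low = best_treatment(self_paced, structured, 40.0)     # "structured"
```

With these made-up numbers the lines cross at an aptitude score of 60, so learners above that score would be assigned the self-paced mode and learners below it the structured mode. If the crossing point fell outside the range of scores, one treatment would be better for everyone and there would be nothing to match.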
Treatment has typically referred to instructional modes like programmed instruction, computer-assisted instruction, visual versus verbal presentations, etc.; it can be generalized, however, to any intervention, including job redesign.

An exhaustive review of this appealing strategy is provided in the text by Cronbach and Snow (1977). They examined a large number of potential aptitudes, such as learning rates, abilities, and personality, and considered their interactions with various instructional techniques. While early reviews of this topic were more pessimistic, Cronbach and Snow's extensive review and reanalyses of data have led them to conclude that aptitude-treatment interaction effects are real phenomena. They note that the findings that most clearly suggest ATI effects are those dependent on prior learning experience: The technique that works best is the one that an individual has already experienced. However, ATI effects have not often been generalized or replicated. Goldstein (1980) notes the need for systematic empirical and theoretical research that matches individual differences among learners to various instructional strategies. The haphazard assignment of individuals with particular abilities to any available instructional technique is not likely to produce dividends.

BARRIERS TO SUCCESSFUL PERFORMANCE

Another important topic is the identification of barriers to successful performance for different groups. For example, some employment interviewers perceive women as more likely to be absent and to have fewer skills, even though they have no evidence to support these beliefs (Britton and Thomas, 1973). Similarly, the elderly are viewed as difficult to train (Britton and Thomas, 1973). Researchers concerned with these issues emphasize that the identification of organizational constraints, in military organizations for example, is a first step in understanding and resolving their serious retention problem.
One study (Boyd et al., 1975) of 1,573 women in their first tour in the Army's basic training program was critical of the program's failure to provide realistic expectations about the training process. Subsequent to the basic training program, supervisors reported that the main difference between good and poor performers was job-related attitudes (discipline, following orders, military courtesy) that were not adequately presented in basic training.

RECOMMENDATIONS FOR RESEARCH ON POPULATION GROUP DIFFERENCES

A research program to explore issues concerning population group and individual differences would need to take several approaches:

(1) It is necessary to conduct literature reviews and examinations of reports that forecast which type of population group variables (such as age and sex) and which type of work situation parameters (such as visual displays on a word processor) will be important in the future.

(2) It is necessary to collect and examine available theories and empirical data about the relevant parameters (e.g., changes in information processing capability as a function of age).

(3) Research should be sponsored on a number of topics:

    o The relationship between population group variables and performance on relevant work tasks.
    o The interaction between population group differences and various interventions, such as job redesign and training.
    o The specification of design changes based on research findings resulting from these research recommendations.

(4) In addition, data should be collected and analyzed to identify and remove organizational constraints that serve as barriers to the successful performance of various population groups, such as women and aged and handicapped people.

REFERENCES

Air Force Systems Command
1969 AFSC Design Handbook, Series 1-10, General; AFSC DH 1-3; Personnel Subsystems. First edition. Andrews Air Force Base, Washington, D.C.: Air Force Systems Command.

American Psychological Association, Division of Industrial/Organizational Psychology
1980 Principles for the Validation and Use of Personnel Selection Procedures. Second edition. Berkeley, Calif.: American Psychological Association.

Astrand, P. O., and Rodahl, K.
1977 Textbook of Work Physiology. New York: McGraw-Hill.

Bartlett, C. J.
1978 Equal employment opportunity issues in training. Human Factors 20:179-188.

Boyd, H. A., Dufilho, L. P., Hungerland, J. E., and Taylor, J. E.
1975 Performance of First-Tour WAC Enlisted Women: Data Base for the Performance Orientation of Women's Basic Training. HumRRO Technical Report, FR-WD-CA 75-10. Alexandria, Va.

Britton, J. O., and Thomas, K. R.
1973 Age and sex as employment variables: views of employment service interviewers. Journal of Employment Counseling 10:180-186.

Brown, C. R.
1975 Human factors problems in the design and evaluation of key-entry devices for the Japanese language. In A. Chapanis, ed., Ethnic Variables in Human Factors Engineering. Baltimore, Md.: Johns Hopkins University Press.

Buros, O. K., ed.
1978 The Eighth Mental Measurements Yearbook. Highland Park, N.J.: Gryphon.

Chapanis, A., ed.
1975 Ethnic Variables in Human Factors Engineering. Baltimore, Md.: Johns Hopkins University Press.

Cronbach, L. J., and Snow, R. E.
1977 Aptitudes and Instructional Methods. New York: Irvington.

Fozard, J. L., and Popkin, S. J.
1978 Optimizing adult development: ends and means of an applied psychology of aging. American Psychologist 33:975-989.

Goldstein, I. L.
1980 Training in work organizations. Annual Review of Psychology 22:565-602.

Hanes, L. F.
1975 Human factors in international keyboard arrangement. In A. Chapanis, ed., Ethnic Variables in Human Factors Engineering. Baltimore, Md.: Johns Hopkins University Press.

Institute of Medicine, National Academy of Sciences
1981 Airline Pilot Age, Health and Performance.
Washington, D.C.: National Academy Press.

McFarland, R. A., and O'Doherty, B. M.
1959 Work and occupational skill. In J. E. Birren, ed., Handbook of Aging and the Individual. Chicago: University of Chicago Press.

Ruffell-Smith, H. P.
1975 Some problems of voice communication for international aviation. In A. Chapanis, ed., Ethnic Variables in Human Factors Engineering. Baltimore, Md.: Johns Hopkins University Press.

Sheppard, H. L.
1970 On age discrimination. In H. L. Sheppard, ed., Towards an Industrial Gerontology. Cambridge, Mass.: Schenkman.

Sheridan, J. A.
1975 Designing the Work Environment. Paper presented at the American Psychological Association, Chicago.

U.S. Department of Labor
1977 Dictionary of Occupational Titles. Fourth edition. Washington, D.C.: U.S. Department of Labor.

Van Cott, H. P., and Kinkade, R. G., eds.
1972 Human Engineering Guide to Equipment Design. Revised edition. Washington, D.C.: U.S. Government Printing Office.

VII APPLIED METHODS IN HUMAN FACTORS

As part of an engineering team, human factors specialists apply their knowledge and skills to system definition, design, development, and evaluation in order to optimize the capabilities and performance of human-machine combinations. Their task can be formidable in complex system development. For example, military standard MIL-H-46855B of the Department of Defense details the human factors requirements that must be addressed in the development of military systems; an outline of these requirements appears as Figure 7-1. The outline is also a reasonable representation of the human factors considerations that may be relevant to the development of any system.

The principal authors of this chapter are Alphonse Chapanis and Robert T. Hennessy. It is based on a workshop on applied methods held in December 1981 under the sponsorship of the Committee on Human Factors. The workshop participants and, therefore, the principal contributors to this chapter are Alphonse Chapanis (workshop chairman), Johns Hopkins University; Stuart K. Card, Xerox Palo Alto Research Center; David Meister, U.S. Navy Personnel Research and Development Center; Donald L. Parks, Boeing Aerospace Company; Richard W. Pew, Bolt Beranek & Newman Inc.; Erich P. Prien, Memphis State University; John B. Shafer, IBM Corporation; and Robert T. Hennessy, National Research Council.

3.1 General Requirements
  3.1.1 Scope and Nature of Work
    o Analysis
    o Design/Development
    o Test and Evaluation
  3.1.2 Human Engineering Program Plan and Other Data
    3.1.2.1 Human Engineering Program Plan
    3.1.2.2 Changes to the Human Engineering Program Plan
    3.1.2.3 Other Data
  3.1.3 Nonduplication (of Effort)
3.2 Detail Requirements
  3.2.1 Analysis
    3.2.1.1 Defining and Allocating System Functions
      3.2.1.1.1 Information Flow and Processing Analysis
      3.2.1.1.2 Estimates of Potential Operator/Maintainer Processing Capabilities
      3.2.1.1.3 Allocation of Functions
    3.2.1.2 Equipment Identification
    3.2.1.3 Analysis of Tasks
      3.2.1.3.1 Gross Analysis of Tasks
        1. Determine System Performance Can Be Provided by Proposed Personnel-Equipment Capabilities
        2. Assure Human Performance Requirements Do Not Exceed Human Capabilities
        3. Input Data for
           o Preliminary Manning Levels
           o Equipment Procedures
           o Skill/Training Requirements
           o Communication Requirements
        4. Critical Human Performance
        5. Possible Unsafe Practice
        6. Promising Improvements in Operating Efficiency
      3.2.1.3.2 Analysis of Critical Tasks
        1. Identifying
           o Information Required by Man, Including Task Initiation Cues
           o Information Available to Man
           o Evaluation Process
           o Decision Reached After Evaluation
           o Action Taken
           o Body Movements Required by Action
           o Workspace Envelope Required by Action
           o Workspace Available
           o Location/Condition of Work Environment
           o Frequency/Tolerances for Action
           o Time Base
           o Feedback on Action Adequacy
           o Tools and Equipment Required
           o Number of Personnel Required and Specialties/Experience
           o Job Aids/References Required
           o Special Hazards Involved
           o Operation Interaction Where More Than One Crewman is Involved
           o Operational Limits of Man (Performance)
           o Operational Limits of Machine (State-of-the-Art)
        2. Covering All Affected Mission/Phases, Including Degraded Modes of Operation
      3.2.1.3.3 Loading Analysis
        1. Individual Crew Member Workload Analysis Compared with Performance Criteria
        2. Crew Workload Analysis Compared with Performance Criteria
    3.2.1.4 Preliminary System and Subsystem Design
  3.2.2 Human Engineering Studies, Experiments and Laboratory Tests
    3.2.2.1 Studies, Experiments and Laboratory Tests
      3.2.2.1.1 Mockups and Models
      3.2.2.1.2 Dynamic Simulation
    3.2.2.2 Equipment Detail Design Drawings
    3.2.2.3 Work Environment, Crew Stations and Facilities Design
      o Atmospheric Conditions
      o Weather and Climate
      o Range of Accelerative Forces
      o Acoustic Noise, Vibration and Impact Forces
      o Provision for Human Performance During Weightlessness
      o Provision for Minimizing Disorientation
      o Space for Crew, Activity and Equipment
      o Physical, Visual and Auditory Links for All Man-Equipment Interfaces
      o Safe, Efficient Walkways, Stairways, Platforms, Inclines
      o Provision to Minimize Psychophysiological Stresses
      o Provision to Minimize Fatigue--Physical, Emotional, Work-Rest Cycle
      o Protection from Hazards--Chemical, Biological, Toxicological, Radiological, Electrical, Electromagnetic
      o Optimum Illumination Per Visual Tasks
      o Sustenance, Storage and Sanitation
      o Crew Safety Protection Relative to Mission Phase and Control-Display Tasks
    3.2.2.4 Human Engineering in Performance and Design Specifications
  3.2.3 Equipment Procedure Development
  3.2.4 Human Engineering Test and Evaluation
    3.2.4.1 Planning
    3.2.4.2 Implementation (Include As Applicable)
      o Simulation or Actual Conduct of Mission/Work Cycle
      o Human Participation Critical to Speed, Accuracy, Reliability, Cost
      o Representative Sample of Non Critical Scheduled/Unscheduled Maintenance Tasks
      o Proposed Job Aids
      o Use of Representative User Personnel, Clothing and Equipment
      o Task Performance Data Collection
      o Task Performance Discrepancies--Required vs. Obtained
      o Criteria for Acceptable Performance
    3.2.4.3 Failure Analysis (Human Error Factors)
  3.2.5 Cognizance and Coordination (Interdisciplinary Integration)
3.3 Data Requirements Per Contract Data List
3.4 Data Availability to Procuring Activity
3.5 Drawing Approval by HFE for Man-Machine Interface

FIGURE 7-1 Outline of Human Factors Requirements in the Development of Military Systems
Source: Adapted from Parks and Springer (1976).

In designing and creating systems human factors specialists use a variety of analytic and data-gathering techniques to assess problems, develop machine and human requirements and functions, and evaluate system or subsystem performance. Although many of these problems would ideally be solved with the experimental methods used in scientific research, practicing human factors specialists rarely have the luxury of using properly counterbalanced experimental designs, with a range of levels of factors and the precise control of unmanipulated variables. This is not to minimize the importance of experimental methods, which are used whenever possible and have provided much of the basic data in human factors handbooks. However, applied methods are necessary both as supplements to experimental methods, e.g., for problem analysis and structuring, and as substitutes when the pressures and constraints of the engineering design environment preclude experimental investigations.

Most practical work in human factors is done under conditions that involve the incomplete specification of system functions, complex combinations of conditions that cannot be separated or controlled, restricted sets of alternatives, limited time and opportunities for investigation, and pressure to produce definitive results quickly. From necessity, human factors specialists have evolved an armamentarium of applied methods that are appropriate to these conditions and that are unfamiliar to most academic researchers. These applied methods are formal means for acquiring or organizing information about human factors characteristics that arise in the context of system design, development, and evaluation.

Applied methods are diverse, reflecting the many purposes for which human factors information is used.
Some of them come from psychology, for example, questionnaires and techniques for acquiring, summarizing, and analyzing data. Some have been borrowed, with or without modification, from other fields, such as industrial engineering and time and motion engineering. For example, analytic methods draw heavily on the engineering practice of systems analysis, which identifies inputs, outputs, the functions performed, the range of values that variables may assume, process flow, the sequence of events, and the timing of the interrelations of system components. Other methods, such as the critical incident technique and link analysis, appear to have been created by human factors specialists to meet their needs in solving particular problems.

Whatever their origins, applied methods have been developed as tools to help answer questions when there are constraints of time, dollars, and freedom of action and when experimental methods are not suitable to answer the questions that arise in system development. Although it is characteristic of applied methods that they make it possible to acquire and produce data and information only to the degree of resolution and reliability sufficient for a particular purpose, these methods are systematic and objective procedures. That is, the procedures are repeatable and input and output data are operationally defined.

The importance of applied methods in human factors work is clear from the number of technical reports and journal articles that discuss one or more applied methods. Two recent reports (Williges and Topmiller, 1980; Geer, 1981) list human factors procedures necessary for Air Force system analysis, design, and evaluation; the latter report gives brief descriptions and critiques of approximately 48 human engineering procedures, the majority of which are applied methods.
Figure 7-2 lists applied methods that appeared in keyword lists of articles published between 1976 and 1981 in Human Factors, the journal of the Human Factors Society.

      Accident studies
      Activity analyses
      Attitude studies
      Cost-benefit analysis
      Critical incident studies
      Decision analysis
      Delphi techniques
      Failure mode analysis
      Fault tree analysis
      Flow analysis
      Functional analysis
      Job analysis
      Lapse time photography
      Link analysis
      Near-accident studies
      Network flow analysis
      Operational sequence analysis
      Questionnaires
      Requirements analysis
      Task analysis

FIGURE 7-2 Applied Method Names Appearing in Keyword Lists of Articles in Human Factors Between 1976-1981

Despite this wide variety of applied methods, there is general agreement among human factors specialists that we need to improve existing methods and develop new ones (Topmiller, 1981; Meister, 1982). Advances in technology, particularly in the speed, power, and memory of computers, have generated concern recently with the human factors elements of computer software. At the same time, the explosive growth of computer use, with resultant increases in the complexity and integration of system components, the automation of functions, and the use of artificial intelligence, all have profound methodological implications for the analysis and description of the role of humans and computers in such systems.

Applied methods have never previously been treated as a single topic deserving attention in its own right.* Consequently, information has never been gathered on the number and varieties of applied methods available and the frequency and adequacy with which they are used. The workshop held by the Committee on Human Factors, on which the discussion in this chapter is based, was an attempt by committee members and a group of acknowledged experts in applied methods to identify problems and needs with respect to applied methods.
Even in the absence of data on the variety and frequency of use of applied methods, we have been able to identify several major problems and to recommend solutions, which may make substantial improvements in practice possible. Three major problems are discussed: (1) the lack of adequate documentation; (2) the limited opportunities available to learn applied methods, either in colleges and universities offering human factors courses or as part of the continuing education of human factors specialists; and (3) the lack of research to improve existing methods and to develop new methods that will provide the data and information needed in current and future practical human factors work.

*This situation contrasts with experimental methods, for which there are many textbooks and source books for readers at all levels of sophistication.

DOCUMENTATION OF APPLIED METHODS

The practical work of human factors specialists, unlike scientific research, does not result in an orderly progression and an orderly accumulation of knowledge. Human factors projects (i.e., participation in the design of systems) and the solution of special problems come and go in great variety. Typically work is performed, reported, and forgotten as new systems and problems develop. Codified, archival repositories of practical work--i.e., review books and articles that summarize the knowledge and procedures used in human factors applications to some point in time--are rare. As a result the historical memory of human factors methods resides largely in the heads and in the report files of practitioners. By contrast, in the literature on scientific research, the methods used by investigators are maintained and disseminated in the curricula of university departments and preserved on library bookshelves.
As an important first step toward improving knowledge about and use of applied methods, we therefore recommend that one or more projects be initiated to compile and review the available information on applied methodologies used in human factors and related fields, such as industrial and organizational psychology, personnel selection, and instructional psychology. The object of the review would be to determine what methods have been used, how they have been used, where they are used, and what their advantages and disadvantages are. The project should also include a critical analysis of the methods. Other purposes of the review would be to structure or codify the methods and to document them for subsequent educational and research purposes.

It would also be extremely valuable to practitioners, educators, and researchers in human factors to have a compendium that codifies and provides standard or generic descriptions of applied methods that are used in practical human factors work. Development of such a compendium would require a great deal of judicious and careful effort. One of the primary difficulties would be to decide which methods are viable, valid, and useful. Because such a compendium would necessarily be an implicit endorsement of the methods described, we recommend that eight criteria be used in the selection process. Methods that meet the criteria listed below could be regarded as having sufficient stature to be of value in a variety of human factors applications:

      Importance--Does the method produce needed information?
      Cost--Is the method efficient in terms of effort and time?
      Utility--Can procedures for using the method be easily interpreted and implemented?
      Available Input Parameters--Can the necessary data be collected in a direct, objective, and reliable way?
      Usable Output--Does the method produce results that are interpretable and useful for decision making?
      Validity/Verification--Has the method been found, or can it be shown, to produce the information it is supposed to?
      Theoretical Foundation--Is the method supported by accepted behavioral or measurement principles?
      Robustness--Can the method be applied to a variety of problems or in different contexts?

These criteria imply that the approach to documenting standard definitions of applied methods should be conservative. That is, only those methods for which there is evidence of practicality and validity should be selected for inclusion in a compendium. Methods used in workload assessment provide an example of the importance of using these criteria. Measurement of workload is a current topic of intense research interest; consequently a large number of theories, approaches, and positions have been put forward. Since most of the recent work has not been validated through practical application, it would be inappropriate to describe these newer approaches as standard, accepted methods. Older methods exist for assessing imposed workload that, while perhaps wanting in certain respects, have been proven through repeated use to be practical, reliable, and valid (Parks and Springer, 1976) and are likely to meet our criteria. Nevertheless, there will be hard choices to make in deciding what constitutes an accepted, standard form of a method.

Multiple variations of a method should probably not be included. A compendium that includes only a set of core methods that meet the criteria would be of great value both for practical work on system development and as a foundation for the education of human factors students at colleges and universities. Attempting comprehensive coverage of all variations of methods would unnecessarily complicate the task of documentation and delay the compilation, causing confusion and consequently inhibiting its acceptance.
A single, solid definition of each particular method would be most useful, since by its nature an applied method undergoes some variation in each instance of its use because of the requirements and constraints of a particular project. In the meantime, additional documentation and research to extend or refine the standard methods can be carried out.

In the course of compiling a reasonably comprehensive list of the most generally known applied methods (see Figure 7-3), it became apparent that the methodologies could be grouped into five categories according to their purpose: analysis, identification of needs, data collection, prediction, and evaluation. Each methodology appears only under one heading, although several of them are appropriate to more than one category.

The organization of Figure 7-3 is probably a useful guide to the scope of work involved in documenting applied methods. The categories reflect a sequence of methods used, from the early concept definition of a system to its evaluation. There is also a rough correlation between the difficulty and detail involved in particular methods and the stage of application in the process of system development.

ANALYSIS
      System Analysis
      Function/Task Analysis
      Information Analysis
      Scenario Analysis
      Workload Analysis
      Time-Line Analysis
      Operational Sequence Analysis
      Failure Mode Analysis
      Fault Tree Analysis
      Link Analysis
      Function Allocation
      Anthropometric Analysis
      Decision Analysis
      Display Evaluation Index

IDENTIFICATION OF NEEDS
      Critical Incident Technique
      Surveys/Questionnaires
      Accident Investigation
      Interviews/Group Techniques
      Definition of User Population

DATA COLLECTION
      Activity Analysis
      Time Lapse Photography
      Real Time Film/Video Recording
      Direct Observation
      Physiological Recording
      Quantitative Performance Recording and Analysis

PREDICTION
      The Human Error Rate Procedure (THERP)
      Data Store
      Human Operator Simulator (HOS)
      Control Theory
      Accuracy Theory
      Predetermined Time Analysis
      Readability Indices

EVALUATION
      Test Plan Evaluation
      Simulation
      Mock-Ups
      Walk Through
      Check List
      Ratings

FIGURE 7-3 Generally Known Applied Methods Categorized by Purpose

Documentation of applied methods necessarily requires review of the technical literature to extract descriptions of applied methods. To expect a single expert or a small group of experts to adequately review and document the entire range of applied methods would be impractical; a more feasible approach would be to subdivide the work according to the five categories of purpose. The individual tasks would thereby be more tractable and make better use of the skills of individuals whose knowledge and expertise is likely to be confined to a single category rather than the full range of methods. This approach would also allow the work on each subset of methods to be performed concurrently. Whatever the approach taken, producing a compendium of standard, usable descriptions of proven applied methods would be an extremely valuable contribution to the field of human factors and consequently to the future development of human-machine systems.

SURVEY OF HUMAN FACTORS SPECIALISTS ON APPLIED METHODS

Because of the dearth of information on the variety and use of applied methods in human factors work, we recommend a survey of human factors practitioners concerned with the acquisition, design, development, and evaluation or modification of equipment and systems.
Such a survey would determine the importance and frequency of use of existing applied methods in their work; the kind of information most needed in human factors applications for which existing applied methodologies are inadequate or nonexistent; and the methods for which descriptions and guidance for use are most needed.

The survey would provide the necessary information on which to base documentation, education, and research efforts. Review, codification, standardization, and documentation of existing methods should proceed according to the priorities of importance and frequency of use derived from the survey. Information from the survey would be useful in shaping human factors curricula in colleges and universities so that students can be trained in applied methods that they will subsequently need on the job. The continuing education needs of human factors specialists could also be met by means of tutorials and symposia on the applied methods for which there is the greatest need for information. Finally, the results of the survey would provide a sound basis for basic research efforts to extend or improve existing methods or develop new methods to meet these needs.

Construction of the survey instrument itself would require a review of the technical literature for descriptions and definitions of applied methods, which the survey recipients would be expected to rate. The literature review would also provide additional data, complementary to the anticipated survey, on the variety and frequency of use of applied methods reflected in the technical literature.
A product of this review would be a relatively comprehensive bibliography of technical reports and journal articles that discuss applied methods in more than a cursory fashion; this bibliographic information would be extremely valuable for subsequent efforts on the codification and documentation of existing methods and the initiation of research efforts to extend these methods or develop new ones.

EDUCATION IN APPLIED METHODS

Education in Colleges and Universities

The absence of codified information and the lack of easy access to source reports inhibit instruction in applied methods at colleges and universities that offer degree programs or courses in the field of human factors. General human factors textbooks give at best only a cursory overview of a few applied methods and present case study examples that highlight the substantive issues and results rather than the methods. There are no texts suitable either for college-level instruction or as a reference for practicing human factors specialists that adequately treat applied methods. The single exception, Research Techniques in Human Engineering (Chapanis, 1959), discusses only a limited set of methods. For the most part, instructors must rely on their own experience and the descriptions of applied methods gleaned from the technical literature to develop course material. They have no current and comprehensive reference works from which to develop a balanced and thorough course in applied methods.

Human factors work is diverse and is performed in many settings--i.e., military research and development centers, other government facilities, and commercial organizations. Ideally, instruction in applied methods would emphasize the methods of most use in real-life settings. Without data on the variety and frequency of use it is difficult to decide which applied methods should be taught in human factors courses at the undergraduate and graduate levels.
Clearly the development of a compendium of applied methods, as recommended in the previous section, would be of substantial benefit for formal educational purposes. Until such a compendium exists and survey data are compiled on the variety, frequency of use, and capabilities of applied methods, no meaningful recommendations can be made to improve education in applied methods in colleges and universities.

Continuing Education in Applied Methods

Of equal concern is the lack of suitable continuing education courses in applied methods for practicing human factors specialists. The problem of inadequate methodological preparation in formal education extends to the work setting. At present it appears that many presumably well-trained human factors specialists work without adequate knowledge of applied methods, and what knowledge they do have about these methods is acquired on the job. Currently employed human factors specialists could benefit greatly from continuing education in applied methods specifically related to their current work.

Development at colleges and universities of educational programs in applied methods that provide a thorough treatment of a range of applied methods would require a substantial amount of planning and course design work. Undoubtedly the broad inception of these programs, and the realization of their eventual benefits in practice, will be some time in coming. Unlike formal education in applied methods, however, the development of courses for continuing education could be done more easily and produce more immediate positive effects. Human factors professionals are likely to be more easily educated because of their general knowledge of human factors techniques and the likelihood that they have at least a working familiarity with some applied methods. Because of their previous education and experience, continuing education courses for them can be much more practical, with less emphasis on theoretical foundations.
Based on the membership of the Human Factors Society, which numbers nearly 3,000, a reasonable estimate of the actual number of practicing human factors specialists in this country who could benefit from continuing education in applied methods is between 5,000 and 10,000.

Fostering and promoting continuing education by means of tutorials on applied methods is one of the most important and immediate ways to improve the field of human factors. Moreover, this kind of activity could most easily be initiated by military and other federal agencies charged with advancing scientific and engineering knowledge and practice. These tutorials could directly benefit human factors specialists employed by the government as well as those employed by civilian organizations that develop equipment and systems for the government. It is therefore recommended that initial tutorials on applied methods be developed and conducted under the sponsorship of one or more government agencies. While we suggest below some methods to be discussed in the tutorial, it would be more prudent to base the choice on a needs analysis of the data derived from the survey recommended above.

Such a tutorial could serve several purposes besides the obvious one of improving the professional competence of human factors specialists. First, the materials generated for the tutorial would contribute to the development of standard definitions and documentation of applied methods, since the course materials would have to describe the subject methods with sufficient care and detail to allow human factors specialists to use them easily and properly. Second, the tutorials would be a means for validating a prior needs analysis of which applied methods are considered most important to human factors practitioners. Attendance at the tutorials would also help answer a more fundamental question: Is there genuine interest in learning about applied methods?
Third, the initial tutorial would serve as a test to evaluate instructional methods and course structures for training in the use of applied methods.

It is suggested that the initial tutorial should consist of three parts: (1) an introductory review of the applied methodologies within each of the five categories listed in Figure 7-3; (2) a comparison of techniques within each category and a discussion of how to select the appropriate method for a particular application; and (3) detailed instruction and practical work on a few selected methods. We suggest five particular methodologies as subjects for the initial tutorial: task analysis; time line analysis; activities analysis; simulation; and information analysis. Because these methods as well as others are either poorly or inconsistently defined, brief definitions of the five methods recommended for the first tutorial are given in Appendix A. It would not be practical to cover more than five methodologies at the initial tutorial; five may even be too many.

There are a number of other specific concerns relevant to the form and development of a tutorial on applied methods. Experience has shown tutorials to be only the first step in learning to use a particular technique properly. Generally, an individual needs several days of supervised application to become competent in using a particular method. Therefore, the tutorial should not be simply a symposium but rather should be a workshop in which the attendees could gain hands-on experience. A by-product of the initial tutorial would be the development and testing of the structure and effectiveness of the initial instructional methods.

A tutorial on applied methods would probably require 10 to 40 hours of planning and preparing for each hour of instructional time. Since the tutorial should include practical workshop exercises in addition to lecture, a good part of the effort of preparation would have to be devoted to development of materials.
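
As a rough worked example of the preparation estimate above, the following Python sketch converts a planned number of instructional hours into a range of preparation hours, using the ratio of 10 to 40 hours per instructional hour cited in the text. The function name and the 16-hour tutorial length are illustrative assumptions, not figures from this report.

```python
def preparation_hours(instructional_hours, low_ratio=10, high_ratio=40):
    """Return (minimum, maximum) preparation hours for a tutorial.

    The default ratios are the 10 to 40 hours of preparation per
    instructional hour cited in the text; the function itself is a
    hypothetical illustration.
    """
    if instructional_hours < 0:
        raise ValueError("instructional_hours must be non-negative")
    return instructional_hours * low_ratio, instructional_hours * high_ratio


# Assumed example: a two-day workshop with 16 hours of instruction.
low, high = preparation_hours(16)
print(f"Preparation effort: {low} to {high} hours")
```

Under these assumptions a 16-hour workshop implies 160 to 640 hours of preparation, which suggests why the planning effort is best treated as a project in its own right.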
It is likely that the practicum would require one or more assistants in addition to the instructor.

An individual or small group should be selected to develop a master plan for the tutorial workshop. The primary goal would be to choose the methods to be taught in the tutorial. This determination should be based largely on the needs analysis of the data gathered from the methods survey of human factors practitioners recommended above. The individual or group should also address such issues as the number of days the tutorial should run, whether it should be conducted independently or in association with a national meeting, the estimated costs, and the selection of instructors.

The most obvious audience for the first tutorial is human factors practitioners, although the needs of other groups of professionals that could benefit from learning about applied methods, such as engineers, managers, students, and university teachers, should be considered at some point. Engineers are an important audience since they are likely to need to use applied methods in the course of system design and development and they are not likely to know where to seek information on methodologies. Managers are important because of their influential role in equipment and system development. Due to their position of authority, managers are able to influence the practices of their employees. College and university teachers are a relevant audience, since what they learn would be passed on to their students. And students, especially students in engineering and human factors, are a particularly important potential audience because of their receptivity to new techniques and the apparent lack of adequate education in applied methods in colleges and universities. The tutorial format appropriate for human factors professionals may not be suitable for these other groups.
If the first tutorial proves to be beneficial to human factors specialists, it would be worthwhile to design others tailored to the backgrounds and needs of these other groups. We recommend that tutorials for these other groups be developed first for engineers and subsequently for the remaining groups.

For all audiences the tutorials should be repeated at several times and locations both to make the experience available to all who are interested and to recover the initial development costs.

RESEARCH ON APPLIED METHODS

Each applied method was originated to fill some particular need for information to support system design, evaluation, or problem analysis. Through a succession of repeated, successful uses in different contexts, methods have evolved and have become known and accepted as tools of the trade in human factors work. Because they were developed as a means to some practical end and so vary in form depending on the situations in which they are used, there has never been very much concern about their refinement or extension. That is, an applied method has rarely been regarded as an important topic worthy of research investigation in its own right, independent of a particular use. This lack of status is partly reflected in and partly caused by the absence of standard documentation of applied methods. In addition, the people who use applied methods are practitioners and, in some sense, generalists in human factors rather than specialists in methodology. There is no body of experts who devote their careers to the study and development of applied methods rather than their actual use, as there is for experimental design and statistical analysis.

Applied methods, however, are the principal means by which human factors work is accomplished. In light of their contribution to systems work, applied methods are a sufficiently important topic to deserve research attention.
Advances should not depend solely on incidental efforts made by human factors specialists in the course of their work. Basic research specifically devoted to the validation, refinement, and extension of existing methods and to the development of new methods is essential.

Improvement and Extension of Existing Applied Methods

As previously discussed, fundamental problems are the lack of documented definitions and descriptions of existing applied methods and the lack of knowledge about what information is needed in human factors work. Documentation and survey work is necessary to provide baseline descriptions and to help identify the particular problems and shortcomings of existing methods. Without this information it is difficult to specify what research on which particular methods would have the greatest value in terms of its contribution to the improvement of human factors work. Nonetheless, we propose some existing methods as subjects deserving research attention because from our experience it is apparent that these methods are widely used, are critical to system design and development work, and could be substantially improved: workload analysis; function allocation; task analysis; survey techniques; and protocol analysis.

Workload analysis is already the subject of many ongoing research programs; however, it is important enough to merit expanded support for research on workload assessment methods. While the five methods named above are, in our opinion, most deserving of research attention, the order of presentation should not be construed as indicating priorities among them. There is insufficient knowledge about the needs of the human factors community to assign priorities.

Development of New Applied Methods

In discussing current and future problems and trends in human factors applications to system development, Meister (1980, 1982) has identified those informational requirements of human factors specialists that imply needs for the development of new applied methods. On the basis of these suggestions, we make general recommendations for research leading to the development of five new applied methods:

   1. Methods for interpreting or extrapolating task/system requirements into personnel requirements;
   2. Performance measurement methods that express measures in terms relative to base rates for particular system characteristics and/or demands;
   3. Training technology methods for translating task/abilities requirements into training programs;
   4. System evaluation methods--static, dynamic, and comparative; and
   5. Methods for describing and evaluating task or system impact on affective responses of personnel.

SUMMARY

There is a serious disparity between the importance of applied methodologies for human factors work, particularly systems and equipment design, and the efforts being made to document and codify them in a standard manner; to educate behavioral science and engineering students in their use in colleges and universities; to provide continuing education in applied methods to working human factors specialists; and to engage in research to improve existing applied methodologies and develop new ones. It is of great importance to document what is currently known about applied methods. Increasing the accessibility of information on existing methods would be more valuable than developing new methods. What follows is a summary of our recommendations with respect to applied methods.

o Existing methodologies should be assessed and documented in a codified compendium that provides standard descriptions of the most useful applied methods.
This compendium would serve both as a comprehensive and readily available source for learning about applied methods and as a basis for determining specific research needs.

o Human factors practitioners should be surveyed to determine the importance and frequency of use of existing applied methods in their work; the kinds of information most needed in human factors applications for which existing applied methods are inadequate or nonexistent; and methods for which they require descriptions and guidance for use.

o Tutorials on applied methods should be developed to meet the continuing educational needs of human factors specialists. Methods recommended for the initial tutorial are: task analysis; time line analysis; activities analysis; simulation; and information analysis.

o Basic research should be performed to improve and extend existing applied methods. Methods in need of research include: workload analysis; function allocation; task analysis; survey techniques; and protocol analysis.

o Basic research is also required to develop new methods that can provide the information needed by human factors specialists to do their work. New methods needed include: (1) methods for interpreting or extrapolating task/system requirements into personnel selection requirements; (2) performance measurement methods that express measures in terms relative to base rates for particular system characteristics and/or demands; (3) training technology methods for translating task/abilities requirements into training programs; (4) system evaluation methods--static, dynamic, and comparative; and (5) methods for describing and evaluating task or system impact on affective responses of personnel.

APPENDIX A

SHORT DEFINITIONS OF APPLIED METHODS RECOMMENDED AS SUBJECTS FOR TUTORIAL

Task Analysis

Task analysis is the process of analyzing functional requirements of a system to ascertain and describe the tasks that people must perform.
Task analysis has two major aspects: The first specifies and describes the tasks; the second and more important analyzes the specified tasks to determine the number of people needed, the skills and knowledge they should have, and the training necessary. Results of task analysis are used in the development of operating procedures and technical manuals and in the determination of critical equipment characteristics and task demands imposed on people. The analytic method involves decomposition of task content into its constituent elements, such as stimulus input, required response, equipment output, and feedback information.

Simulation

Simulation is used (1) to allow users to experience, in advance of its operation, portions of a system that are more complex, more dangerous, or more expensive than an experiment could allow for or (2) to predict performance of systems that do not exist. Simulation is a human factors methodology only when it is combined with one of the observational or measurement methodologies. Extrapolating the observations or measurements to the real world requires a determination of the extent to which things that affect the observations of interest are realistically portrayed in the simulation. How to make this determination (cost/transfer function, part versus whole task simulation, which things to simulate) is the key part of the technology that is still largely unresolved. In the absence of other effective means of predicting the behavioral consequences of system design, simulation is crucial.

Time Line Analysis

Time line analysis organizes a detailed task list for the operational scenario and procedures into serial order and plots the times of individual tasks in sequence against a time base. It portrays sequential, parallel, repeated, and/or intermittent tasks according to what is done. The resulting accumulation of tasks and total performance time can be used to appraise:

1. The validity of the operations to be performed in contributing to system objectives;
2. The feasibility of performing required tasks within the required time;
3. Antecedent hardware and operations conditions to ensure that the requirements of each task element are met;
4. The compatibility of demands on the operator, ensuring that antecedent tasks are identified and performed, required skills and performances are feasible and practical, and difficult, complex, or conflicting demands are avoided; and
5. Workload demands, by comparing the time required to complete a task series with the time available for completion within the constraints of a given system.

Information Analysis

Information analysis identifies information and its flow through a system, usually as perceived from a user's viewpoint. For example, the flow of information necessary for the operation of an office differs from the flow of documents through that office. Certain system actions are performed on the information received, which in turn becomes input to subsequent actions. Information analyses enable human factors specialists to assess and design the information requirements of the user interfaces.

Activity Analysis

In many situations involving field environments, simulations, or mock-ups, it is desirable and useful to catalog the distribution and/or sequential dependencies of workers' activities. In activity analysis an observer periodically or aperiodically samples the work being performed and classifies the results into a set of categories. The data may be
These analyses are especially uspful for documenting the way in which task requirements change with alternative system designs or environments or for estimates of relative cost effectiveness, manning requirements, or simply for understanding how individuals or groups spend their time. 7.7 32 REFERENCES Chapanis, A. 1959 Research Techniquesin Human E e , Baltimore , Md.: Johns Hopkins University Press. Geer, C. W. 198. HumanEn Ineering Procedures Guide Report AFAMR-TR-81.3 . Wright-Patterson Air Force 5 , * Base, Ohio: Air Force Aerospace Medical Research Laboratory. Meeister, D. 1980 Human Factors for the Future--Trends and Speculations. In Proceeding. of Syp.sjlum onHuman FactorsIn Systems Development: xpenrience and Trends. Karlstadt, Sweden: National Defense Research Institute. Meister, D. 1982 The role of human factors in system development. ApLIed Ergonomlcs 13:119-124. aL i :÷,c ~ 4 4 4 44 33 Parks, D. L., and Springer, W. 1. 1976 Human Factors Engineering Analytic Process Definition and Criterion Development for CAPES (Computer Aided Function-Allocation Evaluation System). Warminster, Pa.: Naval Air Development Center. TopmIller, D. A. 1981 Methods: past approaches, current trends and future requirements. In M. J. Horsal and K. F. Kraiss, eds., Manned System Design. Neo York: Plenum Press. Williges, R. C., and Topmiller, D. A. 1980 Task III: Technology Assessment of Human Factors Engineering In the Air Force. Wright-Patterson Air Force Base, Ohio: Air Force Systems Command. 062083