Given the recent discussions on clients, and interface design <bow to
Caliba>, the following seems both pertinent and insightful:
http://www.acm.org/cacm/AUG96/antimac.htm
Dona Gentner and Jakoba Nielsen
The Anti-Mac Interface
By exploring alternative interfaces that transcend the principles
behind conventional graphical interfaces, a human-computer interface
emerges that is based on language, a richer representation
of objects, expert users, and shared control.
At recent user interface conferences, several speakers have lamented
that the human interface is stuck. We seem to have settled on the WIMP
(windows, icons, menus, pointer) model, and there is very little real
innovation in interface design anymore.
Physicists and mathematicians often stretch their imaginations by
considering what the world would be like if some of their basic
assumptions and principles were violated (for example, see [1]). This
has led to new concepts such as non-Euclidean geometry, positrons,
antimatter, and antigravity. At the least, violating basic assumptions
is a useful mental exercise, but a surprising number of the resulting
concepts have provided useful descriptions of the real world.
In this article, we explore the types of interfaces that could result
if we violate each of the Macintosh human interface design
principles. We focus on the Macintosh interface because it is a prime
example of the current interface paradigm, and Apple Computer has
published an explicit list of Macintosh human interface design
principles [2]. These principles have not significantly changed since
the introduction of the Macintosh; style guides for other popular
graphical interfaces, such as Motif, OPEN LOOK, and Windows [16, 18,
22], list a very similar set of principles as the basis for their
interfaces.
We should state at the outset that we are devoted fans of the
Macintosh human interface and frequent users of Macintosh
computers. Our purpose is not to argue that the Macintosh human
interface guidelines are bad principles, but rather to explore
alternative approaches to computer interfaces. The Anti-Mac interface
is not intended to be hostile to the Macintosh, only different. In
fact, human interface designers at Apple and elsewhere have already
incorporated some of the Anti-Mac features into the Macintosh desktop
and applications. The Macintosh was designed to be "the computer for
the rest of us" and succeeded well enough that it became, as Alan Kay
once said, "the first personal computer good enough to be criticized."
This article should be taken in the same spirit. The Macintosh was
designed under a number of constraints, including:
It needed to sell to "naive users," that is, users without any
previous computer experience. It was targeted at a narrow range of
applications (mostly office work, though entertainment and multimedia
applications have been added later in ways that sometimes break
slightly with the standard interface). It controlled relatively weak
computational resources (originally a non-networked computer with
128KB RAM, a 400KB storage device, and a dot-matrix printer). It was
supported by highly impoverished communication channels between the
user and the computer (initially a small black-and-white screen with
poor audio output, no audio input, and no other sensors than the
keyboard and a one-button mouse). It was a standalone machine that at
most was connected to a printer.
These constraints have all been relaxed somewhat during the 12 years since the
introduction of the Macintosh, but we will explore what might happen if they
were to be eliminated completely.
The Macintosh Human Interface Design Principles
According to the Macintosh guidelines [2], the design of human interfaces for
Macintosh system software and applications is based on a number of
fundamental principles of human-computer interaction. These principles have
led to excellent graphical interfaces, but we wonder: How do these principles
limit the computer-human interface? What types of interfaces would result
from violating these principles?
We will address these two questions for each of the Macintosh human interface
design principles. (The Macintosh design principles and their corresponding
Anti-Mac principles are summarized in Table 1.)
Metaphors
The first Macintosh principle states that the interface should be
based on metaphors with the familiar noncomputer world around us. In
the Macintosh interface, computer files are represented as documents
in paper folders that are placed on a desktop. Files are deleted by
dragging them to the trash can. Many recent interfaces have tried to
overcome the limitations of the desktop metaphor by extending it to
some other room or building metaphor (e.g., Bob or Magic Cap, Figure
1) or to a village metaphor (e.g., eWorld). These 3D designs try to
emulate virtual reality on a flat screen but often seem to introduce a
level of clunky indirectness in achieving common user goals. They are
navigationally cumbersome, asking users to go to the "other end of
town" to pick up their email from the Post Office, and interactionally
cumbersome, overloading users with additional windows and other
interface elements necessitated by the metaphor but not by the user's
task. Although the use of metaphor may ease learning for the computer
novice, it can also cripple the interface with irrelevant limitations
and blind the designer to new paradigms more appropriate for a
computer-based application. The designers of the Phelps farm tractor
in 1901 based their interface on a metaphor with the interface for the
familiar horse: farmers used reins to control the tractor. The tractor
was steered by pulling on the appropriate rein, both reins were
loosened to go forward and pulled back to stop, and pulling back
harder on the reins caused the tractor to back up [5]. It's clear in
hindsight that this was a dead end, and automobiles have developed
their own user interfaces without metaphors based on earlier
technologies. Nonetheless, people today are designing
information-retrieval interfaces based on metaphors with books, even
though young folks spend more time flipping television channels and
playing video games than they do turning the pages of books.
The three classic problems with metaphors [9] are: The target domain
has features not in the source domain (e.g., telling the user that "a
word processor is like a typewriter" would not lead the user to look
for the replace command). The source domain has features not in the
target domain (a typewriter has the ability to mark up any form you
receive in the mail, but a user trying to do that on current computer
systems will normally fail). Some features exist in both domains but
work very differently (the treatment of white space representing space
characters, tabs, and line feeds is very different on typewriters and
in word processors). Therefore, users may have trouble seeing beyond
the metaphor to use the system in the ways it was intended. It is
possible to design interfaces to map very closely to metaphors (e.g.,
Bob Mack's NOSE-no-surprise editor-really did act like a typewriter in
all respects), but those interfaces will often be very low-powered and
suited mainly for walk-up-and-use situations.
As an example of the limitations of the desktop metaphor, consider the
trash can on the Macintosh desktop. It is a fine metaphor for the
wastebasket in an office. In both cases we dispose of objects by
putting them in the trash can and we're able to retrieve documents we
had thrown out until the trash is emptied. However, the limits of the
single-trash-can metaphor has led to a system that fails to meet the
user's needs and causes confusion by masking the realities of the
implementation. In the underlying implementation, there are separate
trash containers for each volume, such as a hard disk or a floppy
disk, but to avoid the confusion of multiple trash cans in the
interface, the contents of the trash containers for all mounted
volumes are combined and shown as a single desktop trash can. Because
there is only one trash can in the metaphor, if the user empties the
trash to create room on a floppy disk, the trash contents on both the
floppy and the hard disk are deleted, even though there was no need to
delete files from the hard disk's trash can. The desktop metaphor has
enforced a limitation on the interface that does not serve the user's
real needs. An alternative approach would be to simply cross out
files and have them disappear (perhaps to a temporary limbo before
being permanently discarded). This approach avoids the problems
caused by the trash can metaphor, even though it does not have an
analogy in the real world.
Metaphors not only constrain and mislead users, they can also limit
designers' ability to invent more powerful interface mechanisms. For
example, we are as guilty as anybody for using the tired book
metaphor: When we recently designed the user interface to 300MB worth
of Sun's online documentation, we used a book metaphor to unify the
icons, the vocabulary, and the hierarchical structure of the user's
navigation. Our excuse was that the interface will display information
originally written as separate printed manuals and that it would have
been confusing to use a free-form hyperspace model to represent this
text. There is no doubt, however, that the book metaphor prevented us
from introducing desirable features like the ability to reorder the
chapters according to their relevance scores after a search.
The desktop metaphor assumes we save training time by taking advantage
of the time that users have already invested in learning to operate
the traditional office with its paper documents and filing
cabinets. But the next generation of users will make their learning
investments with computers, and it is counterproductive to give them
interfaces based on awkward imitations of obsolete
technologies. Instead, we need to develop new interface paradigms
based on the structure of computer systems and the tasks users really
have to perform, rather than paradigms that enshrine outmoded
technology. The way to advance the interface is not to develop
ever-more-faithful imitations of the desktop, but instead to escape
the limitations of the desktop especially as computers themselves
become ubiquitous [21] and are used away from the desk.
Direct Manipulation
Using direct manipulation, users interact directly with objects in the
interface [17]. The archetypal example is to move a file from one
directory to another by opening the original folder and using the
mouse pointer to drag the file icon to the destination folder. This
procedure works well for simple actions with a small number of
objects, but as the number of actions or objects increases, direct
manipulation quickly becomes repetitive drudgery. The dark side of a
direct manipulation interface is that you have to directly manipulate
everything. Instead of an executive who gives high-level
instructions, the user is reduced to an assembly line worker who must
carry out the same task over and over.
Direct manipulation also means that users must always operate at the
atomic level. They cannot group a related series of basic actions into
one high-level action or use conditionals. Suppose we have a group of
images and want to convert all of the PICT files into icons (see
Figure 2). If the conversion requires several steps, this will be a
very tedious process with direct manipulation, but a simple scripting
language provides a natural means for specifying this task. Direct
manipulation also limits the precision of our actions to the precision
achieved with eye-hand-mouse coordination. Language and mathematics
can be more precise ("Place the bottom of the triangle level with the
middle of the circle") and more dynamic ("Maintain the height of this
histogram bar at 37% of that of the bar to its left"). Finally, direct
manipulation requires the user to be involved in every action, but
sometimes the user may not know what to do. For example, as
applications become more complex and involve many files spread
throughout the computer, installation and removal of applications
exceeds the understanding of most users. Rather than directly draging
files to the proper places, most users would prefer pressing a single
button on an installation program and have the computer move the files
to the appropriate places. Indeed, install and uninstall programs
have become an essential adjunct of complex software packages.
See-and-Point
The see-and-point principle states that users interact with the
computer by pointing at the objects they can see on the screen. It's
as if we have thrown away a million years of evolution, lost our
facility with expressive language, and been reduced to pointing at
objects in the immediate environment. Mouse buttons and modifier keys
give us a vocabulary equivalent to a few different grunts. We have
lost all the power of language, and can no longer talk about objects
that are not immediately visible (all files more than one week old),
objects that don't exist yet (future messages from my boss), or
unknown objects (any guides to restaurants in Boston).
If we want to order food in a country where we don't know the language
at all, we're forced to go into the kitchen and use a see-and-point
interface. With a little understanding of the language, we can point
at menus to select our dinner from the dining room. But language
allows us to discuss exactly what we would like to eat with the waiter
or chef. Similarly, computer interfaces must evolve to let us utilize
more of the power of language. Adding language to the interface allows
us to use a rich vocabulary and gives us basic linguistic structures
such as conditionals. Language lets us refer to objects that are not
immediately visible. For example, we could say something like "Notify
me if there is a new message from Emily." Note we are not advocating
an interface is based solely on language. Neither does the interface
have to understand full natural language. Real expressive power comes
from the combination of language, examples, and pointing.
Consistency
Consistency is one of those principles that sounds like a great idea
when you first come across it, but it is very difficult to apply the
principle in any real situation where there is a wide array of
conflicting things with which you can be consistent [8].
The basic advantage of consistency is the hope that learning will be
reduced if objects with a similar function always look and behave the
same. People will recognize applications more easily if they all have
similar icons. Yet in the real world, people have no difficulty
switching between ballpoint pens and fibertip pens even though they
look somewhat different and have different controls. They are similar
enough that they are both recognized as pens, whereas their varying
appearances provide pleasure and clues to their slightly different
functionality. It is the rich and fine-grained representation of
objects in the real world that allows pens or books to have a wide
variety of appearances and still be easily recognizable. As
representations of objects in the computer interface become richer and
more fine-grained, the need for complete consistency will drop.
Note that consistency is not symmetrical. Whereas we argue that
objects with similar functions need not have consistent appearances or
controls, it is still important objects with similar appearances have
similar behavior and functions. Except as a joke, a pen that looks
like a tennis shoe will not be very useful.
WYSIWYG
What You See Is What You Get (WYSIWYG) means your document, as it
appears on the screen, accurately reflects what it will look like when
it is printed. This is certainly a big improvement over earlier
computer-based text formatting systems such as troff, in which there
was no obvious relation between the document's appearance on the
screen and what was produced by the printer, and it took many trials
to get a new document to print properly.
The problem with WYSIWYG is that it is usually equivalent to WYSIATI
(What You See Is All There Is). A document has a rich semantic
structure that is often poorly captured by its appearance on a screen
or printed page. For example, a word may be printed in italic font for
emphasis, as part of a book title, or as part of a quotation, but the
specific meaning is lost if it is represented only by the fact that
the characters are italicized. A WYSIWYG document shows only the final
printed representation; it does not capture the user's intentions.
Other text representations, such as Standard Generalized Markup
Language (SGML), preserve the semantic meaning inherent in the text
and have rules for converting the text and semantics into their
appearance on the printed page. For example, a string of text may be
labeled as a book title in SGML. In one situation that text might be
printed in an italic font; in another case it might be printed in a
bold font, and in a third case it might be accessed by a bibliographic
retrieval program.
WYSIWYG assumes there is only one useful representation of the
information: that of the final printed report. Although we are not
arguing against a print preview function, even when the goal is to
produce a printed document, it may be useful to have a different
representation when preparing the document. For example, we may want
to see formatting symbols or margin outlines, or it may be useful to
see index terms assembled in the margin while we are composing.
In some sense, WYSIWYG is equivalent to the horselike tractor we
discussed earlier in connection with metaphors. WYSIWYG assumes people
want paper-style reports: Information might be produced (and even
consumed) on screen, but it should still be structured the same way it
was on paper. These days, however, who has time to read all the
documents we get? In reality, people rarely read information from
beginning to end the way they would have read a report in the good old
days where even busy managers got at most a single report per
day. Instead, electronic information should be modularized and
presented in ways that encourage people to read as little as possible
by popping information of interest to the specific user to the top,
while enabling each user to link to backup information as
needed. Object-oriented authoring and reading would allow the same
piece of information to be presented at multiple levels and possibly
from different perspectives. We should also mention that not all users
can see. For blind users, information that is encoded with SGML-like
rich intentional attributes can be represented in alternative formats
(e.g., sound) in a much more usable fashion than is possible with
simple screen readers that are limited to reading aloud from the
pixels on the screen without knowing what they mean.
User Control
It has become almost a religious crusade among WIMP advocates that the
user should be in control---the user, not the computer, should
initiate and control actions. The negative side of user control is
that the user has to be in control. There are many situations where
we do not want to be in control: flying an airliner, for example, or
watching our toast in the morning to see that it doesn't burn. Many
activities in life are either so difficult or so boring and repetitive
that we would like to delegate them to other people or to
machines. Similarly, on the computer, there are many activities that
we either do not want to control or do not know how to control. This
is prime territory for agents and daemons, computer processes that
tirelessly stand guard, take care of routine tasks, or have the
knowledge that we lack to handle complex tasks.
Most computer programmers gave up complete control some time ago when
they stopped writing in machine language and let assemblers,
compilers, and interpreters worry about all the little details. And
these days, few users still specify the exact routing for their email
messages. We've learned that it's not worth the trouble and that
computers can often do a better job than we can. Computers and people
differ widely in their expertise. For example, people are notoriously
bad at vigilance tasks, so it makes sense to let the computer take
control of periodically saving our word processor documents, backing
up our hard drives, and reminding us of our meetings.
Even if it were desirable, full control is becoming impossible with
networked computers. User control assumes you are the only actor in
the system, but these days, millions of people are on the Internet and
can change the system behind your back. Indeed, it is one of the
benefits of the Internet that new resources appear without any work on
your part. It would be possible to conceptualize the Internet as a
groupware application with 40 million users, but in reality, it is
better thought of as a system without any centralized control in which
things just happen without other users' asking you what you would
like.
Feedback and Dialog
This principle states that the computer interface should provide the
user with clear and immediate feedback on any actions initiated by the
user. It is closely connected with the previous principle of user
control: If the user is required to be in control of all the details
of an action, then the user's needs detailed feedback. But if a
sequence of activities can be delegated to an agent or encapsulated in
a script, then there is no longer a need for detailed and continuous
feedback. The user doesn't have to be bothered unless the system
encounters a problem that it cannot handle.
A supervisor interacts very intensely with a new employee and solicits
detailed progress reports, but as the employee gains experience and
the supervisor gains confidence, the need for this detailed feedback
and dialog decreases. The supervisor can assign complex tasks and be
confident that "no news is good news." Rather than always providing
the user with feedback on activities, the computer should be more
flexible in the amount of feedback it provides. Initially, the
computer could provide detailed feedback to familiarize the user with
its operations and instill confidence; then the feedback could be
scaled back over time and restricted to unusual circumstances or times
when the user requests more feedback.
Forgiveness
The forgiveness principle states that user actions should generally be
reversible and that users should be warned if they try to do something
that will cause irreversible data loss. But even forgiveness can have
a negative side. Consider as an example the common case where a user
wants to copy a file to a floppy that does not have enough free
space. The Macintosh gives an error message explaining there is not
enough room and we must throw away 125K. But when we throw away some
files and again attempt to copy the file, we get another error message
stating there is not enough room unless we empty the trash, and asking
if we want to empty the trash now. In this case forgiveness becomes a
nuisance. The underlying problem here is that the computer has a very
meager understanding of the interaction history. A stateless interface
is doomed to present the inappropriate message because it cannot
relate the user's action to its own prior recommendation to delete
files. The computer needs to build a deeper model of our intentions
and history.
Perceived Stability
Perceived stability means that elements in the computer interface
should not be changed without the user's involvement. For example,
icons on the desktop and windows should reappear as they were when the
user last closed them. But it is a sure sign of a lack of confidence
if a person's last words before leaving the room are, "Don't move
until I get back." In a computer interface, the principle of perceived
stability implies the computer will most likely make things worse on
its own and the user will be unable to function in a changing
environment. We're still children in the computer age, and children
like stability. They want to hear the same bedtime story or watch the
same video again and again. But as we grow more capable and are better
able to cope with a changing world, we become more comfortable with
changes and even seek novelty for its own sake.
One of the most compelling aspects of computer games and some
computer-based learning environments is the lack of stability that
derives from dividing control between the user and the computer or
even among users on networked computers [7]. This should not be
surprising, because the world we live and act and play in is an arena
of mixed control with initiatives from individuals, their associates,
and the larger environment. There are many things computers and other
people can do for us if we decide they don't have to maintain
perceived stability. Today, electronic messages and news articles
appear unbeckoned on our computer desktops. Applications have existed
for some time that can automatically scan and sort incoming mail
[12]. Programs such as Magnet, a Macintosh application that searches
for files meeting a specified criterion and moves them into the user's
folders, act as simple agents to build an improved environment for the
computer user.
The World-Wide Web is a prime illustration of the advantages of shared
control in user interfaces. The Web changes constantly, and every time
users connect to a home page, they might be presented with a
completely new interface. For example, AT&T changes its home page
daily [3], and in designing Sun's home page we decided we needed to
change it drastically every month to keep the users' interest [14]:
Stability can be boring! Denying the principle of perceived stability
can often make the interface simpler and more intuitive. We can be
easily overwhelmed by all the capabilities of a large application, but
if the application discreetly rearranges the interface from time to
time to offer us only the features of current interest to us, we can
get the feeling that everything we need is easy to find and readily at
hand.
Aesthetic Integrity
The principle of aesthetic integrity states that the graphic design of
the interface should be simple, clean, and consistent. Screens should
be visually pleasant and easy to understand. Part of the need for
aesthetic integrity derives from the limited expressiveness of current
computers. If computers could communicate with a richer language, it
would not be so important that everything have a "single look." With
networking, a person's computer world extends beyond the bounds of his
or her desktop machine. Just as a city designed by a single architect
with a consistent visual appearance would be difficult to navigate and
somewhat boring to visit, a variety of visual designs would make our
computer world more interesting, more memorable, and more
comprehensible.
As an extreme example, the old user interface to the Internet had
great aesthetic integrity, with one "ls" listing in ftp looking pretty
much like any other, and with all needed information (modification
date, size, and file type as indicated by the extension) shown in a
severely minimalist manner. Even so, the wild and woolly extremes of
Web home-page design have taken over. Users seem to prefer more
illustrative indications of location and content in cyberspace. Note
we do not argue for complete anarchy: With Darrell Sano we recently
designed SunWeb (Sun's internal Web pages) to have a unified, though
flexible, look [15]. We did want SunWeb to be recognizably different
from, say, Silicon Surf (SGI's Web pages), but we did not want the
Help buttons to appear in 20 different places. As our computer world
expands (especially over the network) to encompass millions of
objects, these objects should not all look the same. Totally uniform
interfaces will be drab and boring and will increase the risk of
users' getting lost in hyperspace. Richer visual designs will feel
more exciting. From a functionality perspective, richness will
increase usability by making it easier for users to deal with a
multiplicity of objects and to derive an understanding of location and
navigation in cyberspace.
Modelessness
Modelessness means the computer interface should not have distinct
modes that restrict the user's actions depending on the mode he or she
is in. Users should be able to perform any task at any time. Although
modelessness seems to be an object of veneration among some Macintosh
interface designers, even the section on modelessness in the Macintosh
Human Interface Guidelines [2] is primarily devoted to explaining how
to use modes successfully. The basic problem presented by modelessness
is that the user cannot cope with everything at once. Users need the
interface to narrow their attention and choices so they can find the
information and actions they need at any particular time. Real life is
highly moded [11]: What you can do in the swimming pool is different
from what you can do in the kitchen, and people can easily distinguish
between the two because of the richness of the experience and the ease
with which we can move between environments.
The Anti-Mac Interface
Although this article starts as an exploration of alternatives to the
individual Macintosh human interface principles, it has not quite
ended up that way. Just as the Macintosh design principles are
interrelated and give a resulting coherence to the Macintosh
interface, violation of those principles also points to a coherent
interface design that we call the Anti-Mac interface. The basic
principles of the Anti-Mac interface are:
The central role of language
A richer internal representation of objects
A more expressive interface
Expert users
Shared control
The Central Role of Language
Over the past million years, humans have evolved language as our major
communication mode. Language lets us refer to things not immediately
present, reason about potential actions, and use conditionals and
other concepts not available with a see-and-point interface. Another
important property of language missing in graphical interfaces is the
ability to encapsulate complex groups of objects or actions and refer
to them with a single name. An interface that can better exploit human
language will be both more natural and more powerful. Finally, natural
languages can cope with ambiguity and fuzzy categories. Adding the
ability to deal with imprecise language to the computer interface will
increase the computer's flexibility and its fit with people's normal
expressive modes. As Susan Brennan has shown [4], natural language, in
addition to being natural for people, has several advantages as a
medium for human-computer interaction, including the role of
negotiated understanding, use of shared context, and easy integration
with pointing and other input/output channels.
We are not proposing, however, what AI researchers would call a
"natural language interface." It seems a computer that can hold a
normal conversation with the user will remain in the realm of science
fiction for some time yet, and we are interested in computer-human
interfaces for the near future. Instead, we have in mind something
more like the interfaces of text-based adventure games, with their
understanding of synonyms, relatively simple syntax, and tolerance for
error in the input----a pidgin language for computers. The critical
research question is, "How can we capture many of the advantages of
natural language input without having to solve the 'AI-Complete'
problem of natural language understanding?" Command line interfaces
have some of the advantages of language, such as the large number of
commands always available to the user and the rich syntactic
structures that can be used to form complex commands. But command line
interfaces have two major problems. First, although the user can type
anything, the computer can understand only a limited number of
commands and there is no easy way for the user to discover which
commands will be understood. Second, the command line interface is
very rigid and cannot tolerate synonyms, misspellings, or imperfect
grammar. We believe that both these deficiencies can be dealt with
through a process of negotiation. Think of the way a new library user
might interact with a reference librarian. A librarian who had a
command line interface would understand only a limited number of
grammatically perfect queries, and the novice user would have to
consult an obscure reference manual to learn which queries to write
out. A reference librarian with a WIMP interface would have a set of
menus on his or her desktop; the user would search the menus and point
to the appropriate query. Neither interface seems very
helpful. Instead, real reference librarians talk with the user for a
while to negotiate the actual query. Similarly, we envision a
computer interface that utilizes a thesaurus, spelling correction,
displays of what is possible, and knowledge of the user and the task
to take part in a negotiation of the user's command. We could imagine,
for example, a dialog in which the user makes a free-form request, the
computer responds with a list of possible tasks that seem to match the
request, and both engage in a dialog to focus on the request the user
actually intended.
A Richer Internal Representation of Objects
Current interfaces have access to very little information about the
objects he user deals with. For example, the only information known
about a file may be its name, size, and modification date; the type of
data it contains; and the application that created it. In order to
deal more effectively with these objects, the computer needs a much
richer representation of them. For a document, this representation
could include its authors, topic matter, keywords, and importance;
whether there are other copies; what other documents it is related to;
and so forth. The list of attributes is similar to what would be
needed by a good secretary who was expected to handle the documents
intelligently. It is not necessary for the secretary to fully
understand the document's contents, but he or she must have a general
sense of what the document is about and of its significance. If a
computer interface is to handle documents intelligently, it must have
the same sorts of information. Much of this information does not
require natural language understanding. We already have techniques in
full-text search systems that could be used to automatically extract
this information from the document. As we move to text systems with
tags based on meaning rather than a WYSIWYG system, much of this
information will be available in the document itself. Further, if we
allow the computer more control over the interface, it could monitor
what we do with the objects and change their representations
accordingly [10]. For example, if two objects were almost always used
together, the computer could create a hypertext link between them.
A More Expressive Interface
The richer internal representation of objects will allow more
intelligent interpretation of user commands, but it will also be
reflected in a more expressive interface with a richer external
representation. The Macintosh interface was originally designed for a
9-inch display that contained less than 200,000 black-and-white
pixels. It is remarkable how well it has translated to today's 21-inch
displays that contain 2,000,000 color pixels---an increase of a factor
of 240 in information capability. This trend will continue until
displays approach the size of a desk and the practical resolution of
the human eye (an additional factor of 340) and the interface should
take advantage of this change to increase the expressiveness of the
objects it displays. Notice in Figure 3 how the books on a bookshelf
have a wide variety of appearances and yet are all recognizable as
books. This variety adds visual interest and helps us quickly locate a
particular book. Yet in our computer interfaces, all documents of a
given application usually appear identical. This practice is starting
to change----some graphics applications represent documents with
thumbnail images, but that is only a beginning. Improved displays and
sound can give us a computer world that is as comfortable to navigate
as the real world.
In addition to richer output, richer input devices will allow the user
more flexibility in manipulating objects, and monitoring devices like
active badges [20] will allow the computer to accommodate the user's
needs without explicit commands.
Expert Users
The GUIs of contemporary applications are generally well designed for
ease of learning, but there often is a trade-off between ease of
learning on one hand, and ease of use, power, and flexibility on the
other hand. Although you could imagine a society where language was
easy to learn because people communicated by pointing to words and
icons on large menus they carried about, humans have instead chosen to
invest many years in mastering a rich and complex language. Today's
children will spend a large fraction of their lives communicating with
computers. We should think about the trade-offs between ease of
learning and power in computer-human interfaces. If there were a
compensating return in increased power, it would not be unreasonable
to expect a person to spend several years learning to communicate with
computers, just as we now expect children to spend 20 years mastering
their native language.
It is sometimes claimed that humans have limited cognitive abilities
and therefore will not be able to handle increasingly complex
computers. Even though the human brain may stay about the same for the
foreseeable future, it is demonstrably not true that a fixed human
brain implies fixed abilities to use specific systems. As an example,
consider the ability to survive in a jungle without modern
weapons. Most readers of this article would probably fare rather
poorly if they were dropped in a jungle and told to catch dinner with
their bare hands. At the same time, anybody born and raised in the
same jungle would probably perform the dinner-catching task
superbly. The difference in performance between the two groups is due
to the benefits of having grown up in an environment and fully
internalized its signals and rules. As the example shows, two people
with the same brain capacity can have very different capabilities when
it comes to using a given system. We predict that people who have
grown up with computers will be much more capable as computer users
than the current generation of users. Thus, they will be able to use
(and will in fact, demand) expressive interfaces with advanced means
of expressing their wants.
Shared Control
But the entire burden should not be on the individual
user. Computer-based agents have gotten attention from computer
scientists and human interface designers in recent years, but their
capabilities have not yet progressed beyond the most simple-minded
tasks. Before people will be willing to delegate complex tasks to
agents, we need agents skilled both in specific task areas and in
communicating with users. The recent rapid rise in use of the Web
illustrates the advantage of sharing control of your computer world
with other people. By relinquishing control over a portion of the
world, you can utilize the products of other people's efforts,
knowledge, and creativity.
A computer environment in which both humans and computer-based agents
have active roles implies a mixed locus of control [13]. The user's
environment will no longer be completely stable, and the user will no
longer be totally in control. For general sanity, users still need to
determine the balance of control in the different parts of their
environment, they need the means to find out what their agents have
been up to, and they need mechanisms for navigating in the wider
networked world; but the Anti-Mac interface trades some stability for
the richer possibilities of a shared world. In the real world, we
don't want anyone rearranging the mess on our desks, but we don't mind
if someone puts a Post-It note on our computer screen, empties the
wastebasket overnight, or refills the newspaper stand with the latest
edition.
Mutually Reinforcing Design Principles
During the relatively short history of computing, several changes have
taken place in the ways computers are used and in the people who form
the main user community. These changes in usage and users lead to
important changes in user interfaces. Highly usable designs reflect a
set of principles that are mutually reinforcing and that match the
needs of the main users and their tasks.
For example, the original text-based Unix user interface was very
appropriate when it was designed. The command-line interface was
suited for a user community dominated by programmers who had the
expertise, interest, and time to understand how the many different
system features could be combined for optimal use through mechanisms
like piping and shell scripts. Early Unix users manipulated relatively
few data objects (their own code, text files, and memos) that were
mostly plain Ascii text. The Unix design principles of modular
commands, with many parameters and options, worked well for the chosen
users and their tasks. As evidenced by several attempts to build
graphical user interfaces on top of Unix, it is not possible to modify
the design by simply changing one or two aspects (e.g., by
representing files as icons), since the full set of design attributes
were chosen to work together. The successful modifications of Unix
have been those that combine several new design principles to achieve
a new mix of mutually reinforcing design elements (e.g., combining
icons with drag-and-drop and multiple workspaces to structure the
user's tasks).
Similarly, the Macintosh was optimized for a certain user community
with a certain main type of data: knowledge workers without any
computer science background who wanted to manipulate graphic
documents, but not overwhelmingly many of them. The various Macintosh
user interface principles work well together and help the chosen user
category achieve its main goals. Since the principles are mutually
reinforcing, it may not be easy to break just a few as changing users
and tasks lead to new requirements. For example, as argued earlier, it
becomes impossible to make all objects visible in the interface as
system usage changes to deal with millions of information objects. But
having invisible objects not only breaks the design principle of
see-and-point, it makes direct manipulation impossible, risks
weakening the chosen metaphors, and breaks perceived stability if some
objects are made visible anyway at certain times.
The Anti-Mac principles outlined here are optimized for the category
of users and data that we believe will be dominant in the future:
people with extensive computer experience who want to manipulate huge
numbers of complex information objects while being connected to a
network shared by immense numbers of other users and computers. These
new user interface principles also reinforce each other, just as the
Mac principles did. The richer internal representation of objects
naturally leads to a more expressive external representation and to
increased possibilities for using language to refer to objects in
sophisticated ways. Expert users will be more capable of expressing
themselves to the computer using a language-based interface and will
feel more comfortable with shared control of the interface because
they will be capable of understanding what is happening, and the
expressive interface will lead to better explanations.
Conclusions
First, to avoid misunderstandings, let us reiterate that we are not
criticizing the Macintosh interface because we think it is bad, but
because we view it as a starting point for considering the differing
needs future user interfaces must meet. Table 2 compares some of the
characteristics of the original Macintosh and Anti-Mac user
interfaces. Bruce Tognazzini's Starfire film [19] was an attempt to
show how a high-end workstation might look in the year 2004. The
interface visualized in the film has several similarities to our
Anti-Mac design: The system understands objects well enough to
integrate them seamlessly (it can construct a 3D texture map of a
person from a 2D video image); the screen is very large, with fairly
expressive interface elements; the computer has some independent
initiative (when the heroine searches for an article reprint, the
agent automatically decides to search further and display subsequent
articles with links to the original article); and the computer
monitors the user (it does not attempt speech recognition while she is
talking with a colleague).
Despite its Anti-Mac aspects, Starfire is still a very recognizable
computer with many similarities to current user interfaces and
research systems. Indeed, there are some Anti-Mac features in evidence
in current commercial products. For example, products like On
Location and SearchIt show autonomy in taking the initiative to index
the user's file system when needed and allow users to retrieve email
objects by content and other rich attributes even though each email
message may be part of a large text file when viewed through the
standard system interface.
Any realistic person who reads the preceding outline of the Anti-Mac
interface will realize it is mostly impractical with the current state
of computers and software. Language understanding is still in its
infancy; most of the population are having their first encounters with
computers; and most users know better than to trust computer-based
agents with a free rein on their systems. Today's standard WIMP
interfaces are fairly well suited to current hardware and software
capabilities. But hardware designers are not slowing down, and the
challenge for application and interface designers is to take advantage
of the coming computing power to move the computer-human interface to
a new plateau. Interface designers need to start work on these issues
now so that solutions will be available when the next generation of
computers arrives, and also to help guide the directions of hardware
and software development.
To realize the full benefits from the Anti-Mac approach, we argue, it
will not be sufficient to retrofit its features one at a time to
systems that basically follow current user interface architectures. We
believe an integrated design that builds a new user interface from the
bottom up will be more coherent and will have a better potential for
increasing user productivity by at least an order of magnitude. A full
Anti-Mac system will likely have to be based on deep object structures
in an architecture that supports network-distributed objects and
detailed attributes that are sufficiently standardized to be shared
among multiple functionality modules. Realistically speaking, such a
complete redesign will take some time to appear. Even if the full
system were not to appear for several years to come, it is necessary
for detailed research to begin now to develop the needed features and
to collect usability data on how the Anti-Mac characteristics should
be shaped to truly meet users' needs.
Acknowledgments
We would like to thank Bruce Tognazzini, Ellen Isaacs, Robin Jeffries,
Jonathan Grudin, Tom Erickson, and the members of the human interface
group at SunSoft for valuable comments on this article.
References
1. Alfven, H. Worlds-Antiworlds. W. H. Freeman, San Francisco, 1966.
2. Apple Computer. Macintosh Human Interface Guidelines.
Addison-Wesley, Reading, Mass., 1992.
3. AT&T. Home page at URL
http://www.att.com
4. Brennan, S. E. Conversation as direct manipulation: An iconoclastic
view. In The Art of Human-Computer Interface Design, B. Laurel, Ed.
Addison-Wesley, Reading, Mass, 1990, pp. 393--404.
5. Clymer, F. Treasury of Early American Automobiles. McGraw-Hill,
New York, 1950.
6. Cypher, A. Eager: Programming repetitive tasks by example. In
Proceedings of the ACM CHI'91 Conference Human Factors in
Computing Systems (New Orleans, April 28--May 2) 1991,. pp.
33--39.
7. Gentner, D. R. Interfaces for learning: motivation and locus of
control. In Cognitive Modelling and Interactive Environments in
Language Learning, F. L. Engel, D. G. Bouwhuis, T. Boesser, and G.
d'Ydewalle, Eds. NATO ASI Series, vol. F 87, Springer-Verlag, Berlin,
1992.
8. Grudin, J. The case against user interface consistency. Commun.
ACM 32, 10 (1989), 1164--1173.
9. Halasz, F., and Moran, T. P. Analogy considered harmful. In
Proceedings of the ACM Conference on Human Factors in Computer
Systems (Gaithersburg, Md., March 15--17) 1982, pp. 383--386.
10. Hill, W.C., Hollan, J.D., Wroblewski, D., and McCandless, T. Edit wear
and read wear. In Proceedings of the ACM CHI'92 Conference on
Human Factors in Computing Systems., (Monterey, Calif., May 3--7)
1992, pp. 3--9.
11. Johnson, J. Modes in non-computer devices. Int. J. Man-Mach.
Stud. 32 (1990), 423--438.
12. Malone, T. W., Grant, K. R., Turbak, R. A., Brobst, S. A., and Cohen,
M. D. Intelligent information-sharing systems. Commun. ACM 30, 5
(1987), 484--497.
13. Nielsen, J. Noncommand user interfaces. Commun. ACM 36, 4
(1993), 83--99.
14. Nielsen, J. Interface design for Sun's WWW site. (June 1995). URL:
http://www.sun.com/sun-on-net/uidesign
15. Nielsen, J., and Sano, D. SunWeb: User interface design for Sun
Microsystems' internal web. In Proceedings of the Second
International WWW Conference '94: Mosaic and the Web. (Chicago,
Oct. 17--20).
http://www.ncsa.uiuc.edu/SDG/IT94/
Proceedings/HCI/nielsen/sunweb.html
16. Open Software Foundation.OSF/Motif Style Guide. Prentice-Hall,
Englewood Cliffs, N.J., 1993.
17. Shneiderman, B. Direct manipulation: A step beyond programming
languages. IEEE Comput. 16, 8 (1983), 57--69.
18. Sun Microsystems, Inc. OPEN LOOK: Graphical User Interface
Application Style Guidelines. Addison-Wesley, Reading, Mass, 1990.
19. Tognazzini, B. The Starfire video prototype project: A case
history. In Proceedings of the ACM CHI'94 Conference on Human
Factors in Computing Systems (Boston, April 24--28), pp. 99--105.
Film clips available at
http://www.sun.com/tech/projects/ starfire
20. Want, R., Hopper, A., Falco, V., and Gibbons, J. The active badge
location system. ACM Trans. Inf. Syst. 10, 1 (1992), 91--102.
21. Weiser, M. The computer for the 21st century. Sci. Am. 265, 3
(1991), 94--104.
22. The Windows Interface: An Application Design Guide. Microsoft
Press, Redmond, Wash., 1992.
About the Authors:
DON GENTNER is Senior Staff Engineer in Human Computer Interaction at
Sun Microsystems. He works on human interface designs for current
products and thinks about how people interact with technology. Email:
don.gentner@sun.com, URL:
http://www. mediacity.com/~judith/don.gentner/
JAKOB NIELSEN is Distinguished Engineer for Desktop Technology at Sun
Microsystems, where he works on next-generation user interface
concepts, user interface design for the WWW. Email: jakob@eng.sun.com,
URL:
http://www.sun.com/columns/jakob
Authors' Present Address: Sun Microsystems, MTV 21-225, 2550 Garcia
Ave., Mountain View, Calif. 94043.
Permission to make digital/hard copy of part or all of this work for
personal or classroom use is granted without fee provided that copies
are not made or distributed for profit or commercial advantage, the
copyright notice, the title of the publication and its date appear,
and notice is given that copying is by permission of ACM, Inc. To copy
otherwise, to republish, to post on servers, or to redistribute to
lists requires prior specific permission and/or a fee.
©ACM 0002-0782/96/0800 $3.50