Embodiment
Principle
Definition
Giving human-like qualities to an AI system, either to interface with its environment or to align its behavior with human expectations. When used indiscriminately, embodiment undermines perceived utility, because users quickly run into the system's incomplete capabilities.

Overview

One might concisely define AI as human mimicry, so it is no wonder that designers tend to represent AIs as virtual beings with avatars, names, and even voices. Embodiment is the act of giving AI such tangible human-like features, and it carries both opportunities and drawbacks. The number and pervasiveness of embodied technologies are increasing rapidly with the proliferation of voice assistants, chatbots, and even anthropomorphic robots. However, AI by no means requires, or even benefits from, embodiment. Instead, embodiment should be treated as a deliberate choice whose benefits are weighed against a host of complex design challenges.

The Human Touch

Our society has a mythology around machines imitating humans in order to create interactions and experiences that are more approachable (or at least, more futuristic). Moreover, designers of all stripes are naturally inclined to use human-relative terms when describing improvements to technology—friendly, empathetic, inviting, trustworthy. Many of us want our everyday systems and services, too, to have this ‘human touch’, so that everything from our HR department to our local sandwich shop can make every interaction as wonderful as possible.

However, we should not mistake these desires for a more humane world as desires for a world with more human-like machines. A virtual avatar does not make every technology more friendly, no matter how cute or playful it looks. After all, that virtual avatar will still have to represent a product that antagonizes users.

Products Versus Services

Are you designing a product, or a service? Embodiment often blurs this line. With a product, users expect a standard, fixed thing; with a service, they expect a customized, individualized solution. Services often require interfacing with a human who can answer questions or handle exceptions. Unfortunately, an embodied interface can make a product feel like a service, creating badly mismatched expectations. When the 'service' can only deliver pre-scripted behavior, users become distrustful or confused. Some products even attempt to weave in a service experience by having a human agent magically replace the 'bot'. Designers should make sure that users receive an interaction that matches their expectations, or at least know when they are talking to a real human.

Figure: The difference between a product and a service.

Human Interaction

Many designers would like to condition users into treating an automated service like a human, or, even better, for users not to notice the difference. However, a human using a machine for a task will never treat it the way they would another human, with politeness, greetings, and other pleasantries. When machines insist on these niceties, humans often find the result abrasive.

When Embodiment Works

Embodiment works when it makes interaction more seamless and accessible by reducing barriers rather than erecting them. Voice assistants have done this with some success. Unlike tapping at a rectangular pane of glass, speaking to another person is natural and intuitive. For a first-time phone user, simply listening to music means first learning a variety of concepts and deciphering unfamiliar shapes and UIs. Voice assistants instead let users make requests in plain language, without tapping or swiping through an unfamiliar technology. Gesture recognition may take this even further by opening the interface up to speakers of other languages or to sign-language users. Embodiment works when natural actions yield expected results that match our lived experience.

Clarity and Obfuscation

Unfortunately, embodiment often undermines its own intended design. We hold a variety of expectations of other people, and we carry those expectations over to embodied technologies as well. Selecting features for your embodied technology may require some challenging decisions about age, gender, and personality. It will also invite potentially awkward and embarrassing behaviors: one of the most commonly uttered phrases to chatbots is “I love you”[1]. We expect anything speaking to us in natural language to have a wide range of conversational abilities, such as opinions on various subjects, a sense of humor, and even individual quirks. When those features do not exist, the interface ‘breaks the fourth wall’ and we stop interacting in a natural way. Without clear boundaries around what these systems can do, they will inevitably fall short of our expectations and dissuade us from interacting.

The Unfriendly Valley

The ‘uncanny valley’ describes the unease we feel toward almost-but-not-quite-human figures, including quasi-human AIs. However, a second ‘unfriendly valley’ may exist as well. Because these conversational agents have been programmed to complete a task, they lack real empathy or nuance. Most will eventually be seen as unfriendly, stubborn, or single-minded as they constantly revert to their intended design. Even if these agents are mistaken for real people, they may not be people we want to chat with.

Unlike graphical user interfaces, conversational interfaces often lack clarity of purpose. Users often remain unaware of the majority of a system’s capabilities. A GUI typically contains a button or other interactive element that surfaces every capability to the user. In a conversational setting, a user may have to ask “what can you do?” or “how do you do X?” to discover what the system offers. In designing a conversational agent, consider ways of conveying capabilities to the user, such as visual aids, tutorials, or other indicators (see Prototyping AI with Humans).
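As a rough sketch of one such indicator (the capability list and the quick-reply structure below are illustrative assumptions, not tied to any particular chat platform), capabilities can be surfaced up front rather than waiting for the user to ask:

```python
# Sketch: surface an embodied agent's capabilities proactively instead of
# relying on the user to ask "what can you do?". The capability list and the
# quick-reply structure are hypothetical placeholders.

CAPABILITIES = [
    "Book a room",
    "Change a reservation",
    "Ask about amenities",
]

def greeting_with_suggestions() -> dict:
    """Build a greeting that carries visible suggestion chips for each capability."""
    return {
        "text": "Hi! Here are a few things I can help with:",
        "quick_replies": CAPABILITIES,  # rendered as tappable buttons in the UI
    }

if __name__ == "__main__":
    print(greeting_with_suggestions())
```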

Individuality

Perhaps a common cause of the overuse of embodiment is the designer’s expectation that AI will ‘eventually’ mimic humans extremely well. The media has further muddied the waters by optimistically reporting that each new technology may finally replace human interaction.

However, consider whether your product even needs to be packaged in a human-like way at all. We call this design bias the myth of individuality. Take the simple example of booking a hotel room. You might design a hotel-booking service as a virtual clerk who fields questions and concerns from the user. Of course, the same service could also be represented as an input form, where users select dates and pick from available options in a list with checkboxes. Many user behaviors that the simple form enables become laborious with a virtual clerk. People often spend inordinate amounts of time fiddling with forms, trying date ranges and then narrowing down their final choice. That would be equivalent to asking the clerk hundreds of questions, repeating yourself over and over, and changing your mind mid-conversation. Ironically, the self-service form with checkboxes might be the more human-centered approach.
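As a rough illustration (the room data, field names, and filter logic below are invented for this example), the form model can be re-run cheaply each time the user nudges a date or toggles a checkbox, which is exactly the rapid back-and-forth that would exhaust a virtual clerk:

```python
# Sketch: a form-style hotel search that can be re-filtered instantly as the
# user adjusts dates and checkboxes. All data and field names are made up.

from datetime import date

ROOMS = [
    {"name": "Queen", "free_from": date(2025, 6, 1), "free_to": date(2025, 6, 30), "breakfast": True},
    {"name": "King", "free_from": date(2025, 6, 10), "free_to": date(2025, 6, 20), "breakfast": False},
]

def filter_rooms(check_in: date, check_out: date, want_breakfast: bool) -> list:
    """Return rooms available for the whole stay, optionally with breakfast."""
    return [
        room for room in ROOMS
        if room["free_from"] <= check_in
        and room["free_to"] >= check_out
        and (not want_breakfast or room["breakfast"])
    ]

if __name__ == "__main__":
    # The user can tweak these values dozens of times at no conversational cost.
    print(filter_rooms(date(2025, 6, 12), date(2025, 6, 15), want_breakfast=True))
```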

Design Questions

Re-design your product without embodiment (or vice-versa if it is not embodied already).
Does your system truly require an embodied AI?
What human-like tendencies will users expect from your system that it doesn’t have?
Are any features easier or more intuitive without embodiment?
Determine how users will find out all the capabilities of your embodied AI.
How restricted are the capabilities of your system?
Are users expected to learn new skills or interactions with the AI over time?
Does your system contain hints or mentions of additional capabilities elsewhere?
List out all the particular features (e.g. voice, gender, personality) of your embodied AI.
If a real human had these features, how would they make users feel?
How can you implicitly communicate more subjective qualities like personality traits?

Considerations

Agentive Systems

Agents should be regarded as a separate kind of interface rather than a replacement for an in-person experience.

Similar to how a puppet show is not a direct representation of theater, agent-based interfaces are distinct from human interactions. A virtual 3D model of a secretary is naturally not equivalent to a human secretary. Embodied systems are typically much narrower in functionality than humans, and their imitation of human behavior is often exaggerated. Instead of fruitlessly attempting to mimic human behavior, consider opportunities to leverage the virtual system for what it is and what it does best.

Default Conditions

Beware of default behaviors for agents, as they will appear far more often than any other behavior.

Virtual agents typically have a few “default conditions” that allow them to respond even when they do not recognize the input. A common example is a voice agent saying “I don’t understand” in response to an unusual query. These default conditions will appear disproportionately often in user interactions and will break the fourth wall of the embodied agent. Consider ways to mitigate the default condition, and think critically about where the agent could instead offer a more helpful, actionable response.
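As a minimal sketch (assuming a hypothetical keyword-matching agent; the intents and wording are invented for illustration), the default condition can steer users toward things the agent can actually do instead of ending in a dead-end apology:

```python
# Sketch: replace a bare "I don't understand" fallback with an actionable
# response. The intents and matching logic are hypothetical placeholders for
# whatever NLU the real system uses.

KNOWN_INTENTS = {
    "play music": "Starting playback from your library.",
    "set a timer": "Setting a timer.",
    "check the weather": "Here is today's forecast.",
}

def respond(utterance: str) -> str:
    text = utterance.lower()
    for intent, reply in KNOWN_INTENTS.items():
        if intent in text:
            return reply
    # Default condition: point to concrete capabilities rather than apologizing.
    options = ", ".join(f"'{intent}'" for intent in KNOWN_INTENTS)
    return f"I can't help with that yet. You can ask me to {options}."

if __name__ == "__main__":
    print(respond("tell me a joke"))
```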

Self-Service vs. Counter Service

Determine whether your users would prefer ‘self-service’ or ‘counter-service’ interactions.

When choosing whether to include embodiment, consider the difference between a self-service buffet and counter-service fast food. Counter services excel at helping users make choices and pick from various packaged options. Self-service restaurants excel at letting users spend time deciding and comparing, mixing and matching from many options. Self-service interfaces typically suffer from embodiment, while counter services can benefit.

Expectation Management

Embodiment creates vast expectations for capabilities, as well as running the risk of uncanniness.

Users regard a virtual embodiment as ‘uncanny’ when it tries to mimic human behavior without a ‘sentient’ core behind it; they perceive a ‘hollow’ embodiment rather than a true person. Instead of attempting to solve the uncanny valley head-on, consider simply not mimicking human behaviors as strongly in the first place.

Internal Voice

It often helps to have an interface provide prompts through an ‘internal voice’ to avoid the design challenges of embodiment.

Tools that provide human-level capabilities can avoid embodiment entirely by framing themselves as an ‘internal voice’ of the user. That way, the tool is not a bidirectional communication partner but a reflection of the user’s internal state, and users can act without maintaining a mental representation of another person. For example, instead of having a bot ask the user what they want to do, the user could be prompted to complete the sentence “I want to [blank]”. This use of the internal voice may improve the system’s ability to capture user intent while removing the complexities of embodiment.
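A minimal sketch of this pattern follows (the actions and their mapping are hypothetical): the interface asks the user to finish the sentence “I want to ...” and maps the completion to an action, rather than staging a dialogue with a character.

```python
# Sketch: the 'internal voice' pattern. The user completes "I want to ...",
# and the completion is mapped to an action. Action names are invented.

ACTIONS = {
    "book a room": "open_booking_form",
    "cancel my reservation": "open_cancellation_flow",
    "talk to a person": "route_to_human_agent",
}

def complete_sentence(completion: str) -> str:
    """Map the user's completion of 'I want to ...' to an internal action."""
    return ACTIONS.get(completion.strip().lower(), "show_suggestions")

if __name__ == "__main__":
    print("I want to book a room ->", complete_sentence("book a room"))
```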

Product/Service Mismatch

Agentive tools can mistakenly turn a product experience into a service experience, where users expect the product to be delivered as a customized solution.

Many services are now provided over a digital interface that removes most human elements. For example, many banks now offer access to tellers over chat or SMS, letting you speak with a live agent without visiting a physical branch. This creates a challenge for automated systems, whose interfaces may look nearly identical. Give users a clear indication of whether they are speaking with an automated system or a real person, and let them know whether they are using a product or interacting with a service.

Further Resources

Footnotes


  1. “AI makes the heart grow fonder”, Nikkei Asian Review