Kinda Human / Artifact Review - Conversational Symbiosis

Dates: August 2018 to October 2018
Team: Scott Dombkowski
Advisors: Stacie Rohrbach and Molly Wright Steenson
Work Type: Academic

To better understand how conversational symbiosis could be achieved between humans and artificial agents, I studied fifteen interfaces and one interactive experience with a specific focus on the relationship between a human and a particular artificial agent.

Musicolour (1953)

Musicolour was "a sound-actuated interactive light show" (Bird & Di Paolo, 2008) designed by Gordon Pask. It is especially noteworthy because it provides an example of a conversational interface that disrupts the black-box model we see in the majority of today's interfaces. Its users were aware of its interpretation of their performance, enabling them to reevaluate their actions. Pask's design also shows how cooperative action between a system and its users can be the result of a specific implementation. Musicolour created a dialogue between musicians and itself, which in turn led users to commit to engaging with the system. Interfaces that invoke similar cooperation to conjure the "thoughts and words" (Conversation, 2017) their systems require to create exchanges beyond the "predictable" (Pangaro, 2011) could provide numerous benefits to intimate partners today.

Pask, Musicolour

ELIZA (1966)

ELIZA, designed by Joseph Weizenbaum, enabled a user to communicate through a typewriter with a simulated psychologist. Weizenbaum chose the context of a conversation with a psychologist because it is "one of the few examples of categorized dyadic natural language communication in which one of the ... [participants in the psychiatric interview] is free to assume the pose of knowing almost nothing of the real world" (Weizenbaum, 1966) and enables "the speaker to maintain his sense of being heard and understood" (Weizenbaum, 1966). Weizenbaum was interested in whether users would immediately recognize the limits of the interface, enabling them to concentrate on communicating with the machine and leading to improved expression and understanding. ELIZA is ultimately an example of what happens when one attends to the environment in which an experience resides.

ELIZA was also an attempt to create "[an] engagement in mutually beneficial, peer-to-peer exchange" (Dubberly & Pangaro, 2009). Implementations of "categorized dyadic natural language communication" (Weizenbaum, 1966) like ELIZA or similar instruments, especially when users commit to engaging in a conversation, could enable improved interactions on conversational interfaces.
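
To make ELIZA's mechanism concrete, below is a minimal sketch of keyword matching with pronoun reflection in the spirit of Weizenbaum's approach; the rules and phrasings are illustrative stand-ins, not the original DOCTOR script.

```python
import random
import re

# Pronoun swaps applied to the user's words before they are echoed back.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

# Illustrative decomposition/reassembly rules in the spirit of ELIZA's script.
RULES = [
    (re.compile(r"i need (.*)", re.I),
     ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (re.compile(r"i am (.*)", re.I),
     ["How long have you been {0}?", "Why do you think you are {0}?"]),
    (re.compile(r"(.*)", re.I),  # catch-all keeps the conversation going
     ["Please tell me more.", "How does that make you feel?"]),
]

def reflect(fragment: str) -> str:
    """Swap first- and second-person words so the echo reads naturally."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(utterance: str) -> str:
    for pattern, templates in RULES:
        match = pattern.match(utterance)
        if match:
            return random.choice(templates).format(*map(reflect, match.groups()))

print(respond("I need a break from work"))  # e.g., "Why do you need a break from work?"
```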

Weizenbaum, ELIZA

URBAN5 (1973)

URBAN5 was designed by Nicholas Negroponte and MIT's Architecture Machine Group to "study the desirability and feasibility of conversing with a machine about environmental design project... using the computer as an objective mirror of the user's own design criteria and to form decisions; reflections formed from a larger information base than the user's personal experience" (Negroponte, 1970, p. 71). It achieved this by establishing a visual language that represented cubes and a question-and-answer dialogue between a user and a machine.

It hoped to establish an environment where users would become aware of the restrictions of the application and their purpose within it. URBAN5 also attempted to establish a "shared language" (Dubberly & Pangaro, 2009) by employing a block as its primary mode of manipulation and by creating a shared understanding between users and the interface of what a block is and what it can do within the environment. But it was ultimately unsuccessful at developing well-designed instruction and at integrating objects, terms, and language familiar to a user, which would have been needed to create a symbiotic relationship between the user and the artificial agent.

Negroponte, URBAN5

The Coordinator (1987)

The Coordinator, one of the systems described in my literature review, was designed by Terry Winograd to "provide facilities for generating, transmitting, storing, retrieving, and displaying messages that are records of moves in conversations" (Winograd, 1987). It enabled a user to express themselves with little concern for the structure of that expression. Whereas a typical conversational interface provides one way to construct a message, The Coordinator offered numerous options. For example, "when Request is selected, templates appear prompting the user to specify an addressee, others who will receive copies, a domain, which groups or categorizes related conversations, and an action description, corresponding to the subject header in traditional mail systems" (Winograd, 1987). If a user were to select a different option, they would be provided with a different template designed for that specific request.
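
A minimal sketch of how such conversational-move templates might be modeled follows; the field names track Winograd's description of the Request template, while the class structure and example values are my own illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ConversationalMove:
    """One move in a conversation, modeled after The Coordinator's templates."""
    move_type: str            # e.g., "Request" -- other moves get other templates
    addressee: str
    copies: list[str] = field(default_factory=list)  # others who receive copies
    domain: str = ""          # groups or categorizes related conversations
    action_description: str = ""  # akin to a subject header in traditional mail

# Selecting "Request" would surface a template prompting for these fields.
move = ConversationalMove(
    move_type="Request",
    addressee="alex@example.com",
    copies=["sam@example.com"],
    domain="budget-planning",
    action_description="Send Q3 figures by Friday",
)
print(move.move_type, "->", move.action_description)
```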

The Coordinator demonstrates how making a user's line of thought visible to the other agents interacting with them can help a conversation progress in a beneficial direction. Similar mechanisms for making thoughts visible could be particularly helpful in interfaces designed for intimate partners.

Majestic (2001)

Majestic was an alternate-reality multiplayer game developed by Electronic Arts. Instead of engaging users on one platform, it engaged them on multiple platforms, as "new subscribers disclosed their phone number, fax number, email, instant messenger names, and other personal contact information" (Salvador, 2015). Whichever mediums a user disclosed, the game would then send messages pertaining to the game through those specific mediums. The game took place on a unique timeline: if a character needed to drive to a town an hour away, a user would have to wait an hour for that character to arrive and could not fast-forward through that period of time.

Unlike regular life-simulation games, which take users to an alternative world, Majestic took users to an alternative world within their own world. It also serves as an example of how one could immerse users in a simulation. For instance, observing another couple's conversations could help an intimate partner discern which behaviors are and are not beneficial in their own relationship. This process may also help a partner analyze their conversations objectively and apply what they learn to their relationship.

Lemonade (2015)

Lemonade Insurance is a "property and casualty insurance company that is transforming the very business model of insurance" (About Lemonade, n.d.). Instead of completing a typical insurance application through an online form, users message with a chatbot that uses a real individual's avatar image to replicate the experience one would have with a more traditional insurance company.

Lemonade serves as an example of how an environment can potentially create the illusion of personal interaction. To what extent that illusion is successful is unknown.

M (2015)

Facebook M was a piece of functionality within Facebook's messaging platform, Messenger. It utilized "human trainers [who] gamely do their best when they receive tough queries like 'arrange for a parrot to visit my friend,'" (Simonite, 2017) queries that are impossible for a machine-learning algorithm. Misunderstandings were common because of users' incorrect mental models of the tool. For instance, Facebook M received numerous unachievable requests: because users recognized that M was different from Siri and Alexa and could complete requests those assistants could not, their notion of what was possible became flawed, leading to ineffective exchanges.

Facebook M's implementation of human backups serves as inspiration for how to overcome limitations in natural language processing models.
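
A minimal sketch of the human-fallback pattern M embodied: route a query to an automated model first and escalate to a human trainer when the model's confidence falls below a threshold. The function names and threshold are hypothetical, not Facebook's implementation.

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cutoff for escalating to a human

def automated_model(query: str) -> tuple[str, float]:
    """Stand-in for an NLP model; returns (answer, confidence)."""
    if "reservation" in query.lower():
        return ("Booked a table for two at 7pm.", 0.93)
    return ("", 0.12)  # low confidence on anything unfamiliar

def human_trainer(query: str) -> str:
    """Stand-in for routing the query to a human operator's queue."""
    return f"[escalated to human] {query}"

def respond(query: str) -> str:
    answer, confidence = automated_model(query)
    if confidence >= CONFIDENCE_THRESHOLD:
        return answer
    return human_trainer(query)  # the human handles what the model cannot

print(respond("Make a dinner reservation"))
print(respond("Arrange for a parrot to visit my friend"))
```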

Facebook M

Allo (2016)

Google Allo is "a smart messaging app that helps you say more and do more" (Google, 2019). One way Allo addresses the complexity of conversation is with its "Smart Reply" functionality (very similar to Gmail's Smart Compose), which suggests responses based on algorithms hidden in its backend.

Allo provides an example of an artifact that lacks the ability to explain itself. For instance, a user will never really understand how Allo's smart replies are generated, because the way Allo determines their "personality" (Google, 2019) remains an open question. Additionally, a user who wishes to influence the intelligence Allo provides has no direct method to affect it. If implemented in a way that allowed for feedback, Smart Reply could create a shared language between a user and an agent.
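
A minimal sketch of what such a feedback loop might look like: suggested replies carry weights that a user's acceptances and rejections adjust over time. The scoring scheme is my own illustration, not Allo's actual algorithm.

```python
from collections import defaultdict

class SmartReplyWithFeedback:
    """Illustrative reply suggester whose rankings a user can influence."""

    def __init__(self, candidates: list[str]):
        self.weights = defaultdict(lambda: 1.0, {c: 1.0 for c in candidates})

    def suggest(self, n: int = 3) -> list[str]:
        # Rank candidate replies by their learned weight.
        return sorted(self.weights, key=self.weights.get, reverse=True)[:n]

    def feedback(self, reply: str, accepted: bool) -> None:
        # Explicit feedback nudges future suggestions, making the agent's
        # behavior visible and adjustable rather than opaque.
        self.weights[reply] *= 1.2 if accepted else 0.8

suggester = SmartReplyWithFeedback(["Sounds good!", "On my way.", "Can't talk now."])
suggester.feedback("On my way.", accepted=True)
suggester.feedback("Sounds good!", accepted=False)
print(suggester.suggest())  # "On my way." now ranks first
```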

Google Allo

Hatchimal (2016)

Hatchimals by Spin Master are "magical creature[s] inside colorful speckled eggs" (Hatch Club, n.d.). Unlike a regular toy, which a child can play with immediately after unboxing, a Hatchimal must be cared for before it hatches from its egg. Users' interactions with a Hatchimal evolve as it grows from an egg to a hatching egg, to a baby, to a toddler, and eventually to a child. While interacting, users receive feedback from the sounds a Hatchimal makes and its changing eye colors (e.g., light blue eyes representing a Hatchimal that is cold, teal eyes representing a Hatchimal that is learning to talk).

The novelty and interaction patterns of a Hatchimal provide an example of an artifact that communicates without words. Whether through sounds (e.g., baby sounds) that users already understand or differently colored eyes whose meanings they need to learn, users can glean information from a small set of feedback mechanisms. Similar strategies can be applied to an experience regardless of its complexity. One might even see an argument for limiting the mechanisms an experience can invoke.
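
A minimal sketch of this kind of wordless signaling: a small, fixed mapping from internal states to eye colors. The first two mappings come from the description above; the third is a hypothetical addition.

```python
from enum import Enum

class HatchimalState(Enum):
    COLD = "light blue"        # from the description above
    LEARNING_TO_TALK = "teal"  # from the description above
    HAPPY = "pink"             # hypothetical extra state for illustration

def eye_color(state: HatchimalState) -> str:
    """The artifact's entire visual 'vocabulary' is one small, learnable mapping."""
    return state.value

for state in HatchimalState:
    print(f"{state.name.lower().replace('_', ' ')} -> {eye_color(state)}")
```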

Hatchimal

Jacquard (2016)

Jacquard by Google is a jacket that enables a wearer to interact with their phone through gestures on the jacket's cuff. The jacket is billed as an entirely "new take on wearables that lets you do more than ever with the things that you love and wear every day" (Jacquard, n.d.).

Jacquard serves as an example of an artifact that facilitates interaction at an environmental level. Instead of adding a device, users interact with an artifact they would already be using. For my project, one can examine the artifacts already present in intimate relationships and discover potential opportunities to embed agents.

Jacquard

Objectifier (2016)

The Objectifier was designed by Bjørn Karmann, a student at the Copenhagen Institute of Interaction Design. It was designed to empower "people to train objects in their daily environment to respond to their unique behaviors" (Objectifier, n.d.). For instance, a user could train the Objectifier to turn on a light when it recognized the cover of a book and turn off the light when it no longer recognized the cover. To train the Objectifier, a user takes snapshots of the environment so that the device recognizes on and off states. While the Objectifier gives a user an understanding of how a model can be trained with a yes-or-no state, it does not teach how that photo or sound recording is decoded and then used to differentiate future states.
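
A minimal sketch of the underlying idea: a binary classifier fit on snapshots labeled as on and off states. It uses scikit-learn on synthetic feature vectors as a generic illustration; Karmann's actual model and pipeline may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-ins for snapshots a user takes of the environment,
# flattened to feature vectors (e.g., 8x8 grayscale -> 64 values).
rng = np.random.default_rng(0)
on_snapshots = rng.normal(loc=1.0, size=(20, 64))    # e.g., book cover visible
off_snapshots = rng.normal(loc=-1.0, size=(20, 64))  # e.g., book cover absent

X = np.vstack([on_snapshots, off_snapshots])
y = np.array([1] * 20 + [0] * 20)  # 1 = turn the light on, 0 = turn it off

# The user-facing "training" is just fitting a yes/no classifier on labeled states.
model = LogisticRegression().fit(X, y)

new_snapshot = rng.normal(loc=1.0, size=(1, 64))
print("light on" if model.predict(new_snapshot)[0] == 1 else "light off")
```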

The Objectifier ultimately serves as an inspiration for how an artifact can empower an individual to develop an understanding of how it is programmed.

The Objectifier

Internet Phone (2017)

The Internet Phone was designed and created by James Zhou, Sebastian Hunkeler, Isak Frostå, and Jens Obel, students at the Copenhagen Institute of Interaction Design. The artifact is their "attempt to make the intangible processes of the internet tangible in order to inspire people to learn more about it" (The Internet Phone, n.d.).

This project serves as an example of how different modes of interaction (e.g., article token, developer token, incognito token, and history token) help users understand different technical aspects of an artifact. Ensuring that users grasp the aspects they might construe as unfriendly is especially important if they are to understand the capabilities and limits of artificial agents.

The Internet Phone

Replika (2017)

Replika "is an AI friend that is always there for you" (Pardes, 2017) that you grow through conversation. It provides an environment that one is comfortable to express themselves in ways they would not normally. Replika is built on top of CakeChat, "a dialog system that is able to express emotions in a text conversation" (CakeChat, n.d.). CakeChat is described as a tool for constructing responses similar to those created by the individual communicating on Replika.

Replika and CakeChat provide an example of contemporary natural language processing models' capacity to effectively enter a conversation with a human. They also reveal potential areas of improvement, including CakeChat's relatively limited emotional range of anger, sadness, joy, fear, and neutral.
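
A minimal sketch of emotion-conditioned response selection in the spirit of CakeChat's five conditions; the canned candidates and selection logic are my own illustration, not CakeChat's actual API or model.

```python
# CakeChat conditions its generated responses on one of five emotions.
EMOTIONS = {"anger", "sadness", "joy", "fear", "neutral"}

# Hypothetical canned candidates standing in for a generative model's output.
CANDIDATES = {
    "joy": "That's wonderful news, I'm so happy for you!",
    "sadness": "I'm sorry to hear that. Do you want to talk about it?",
    "anger": "That would frustrate me too.",
    "fear": "That sounds scary. Are you okay?",
    "neutral": "I see. Tell me more.",
}

def respond(utterance: str, emotion: str = "neutral") -> str:
    """Select a response conditioned on the requested emotion."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unsupported emotion: {emotion}")
    return CANDIDATES[emotion]

print(respond("I got the job!", emotion="joy"))
```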

Replika

Duplex (2018)

Google Duplex is "a new technology for conducting natural conversations to carry out 'real world' tasks over the phone" (Leviathan & Matias, 2018) that utilizes Google Voice Search and WaveNet. It targets particular tasks and is constrained to closed domains (e.g., in its demo, Google presented booking a haircut appointment and making a restaurant reservation as two such domains). Google restricted the demo to haircuts and restaurant reservations so that it could extensively understand those domains and build models that enable natural conversations.

Duplex provoked a critical but mixed public reaction to the technology and to how it imitated a human without disclosing that it was not human. This work indicates the importance and benefit of clearly establishing expectations and avoiding deception when creating an artificial agent.

Google Duplex

Project Oasis (2018)

Project Oasis is "a self-sustaining plant ecosystem that reflects outside weather patterns by creating clouds, rain, and light inside a box" (Sareen, 2018). Users command a Google Assistant to show the weather in a specific location; Project Oasis then reflects that weather.
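
A minimal sketch of the control loop such a system implies: look up the weather for the requested location, then map the condition to the box's actuators. The weather lookup is stubbed and the actuator mapping is hypothetical; neither reproduces the project's actual code.

```python
def fetch_weather(location: str) -> str:
    """Hypothetical stand-in for a call to a real weather API."""
    return {"Seattle": "Light rain", "Phoenix": "Sunny"}.get(location, "Partly cloudy")

# Hypothetical mapping from weather conditions to the box's actuators.
ACTUATIONS = {
    "Sunny": ["lights: bright"],
    "Partly cloudy": ["lights: dim", "fog: low"],
    "Light rain": ["pump: drip", "fog: medium"],
}

def reflect(location: str) -> None:
    """Mirror the requested location's weather inside the box."""
    condition = fetch_weather(location)
    for action in ACTUATIONS.get(condition, ["lights: neutral"]):
        print(f"{location}: {condition} -> {action}")

reflect("Seattle")  # Seattle: Light rain -> pump: drip / fog: medium
```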

The project shows how, with the creation of an alternate world, a user can test different scenarios. Such scenarios are not limited to weather; they could extend to situations intimate partners might find themselves in.

Quantified Self (2018)

Quantified Self was "an immersive theater show centered on ethical uses of personal data" (Skirpan et al., 2018) created by Jacqueline Cameron, Michael Skirpan, and Tom Yeh. Through the show, Cameron, Skirpan, and Yeh saw how an individual could learn more from creating the show than from a typical educational setting. They also found that some of the discoveries users took away from Quantified Self were gained when participants talked with others engaged in the same activity about their unique experiences. Similar interactions could be facilitated between intimate partners if they were able to converse with other couples about their own unique experiences.

The creators of Quantified Self also saw the importance of what a participant brings to a conversation and how the uniqueness of prior experiences should and can be adequately addressed by varying the content of the designed experience to align with user expectations.