Hi! Kudos for finding your way here. Just a heads up: This essay is still a work in progress, so I don't really think you want to spend any time reading it. A first, camera ready draft will hopefully find its way up soon™
Multiverse is an ongoing research project that explores new ways to read and write fiction with artificial intelligence. It asks questions about Human–AI interaction by investigating the future of interactive literature. Central is the eponymous prototype, a real-world interactive literature system that acts as an evolving test bed for literary experimentation.
The project evolved from my master's thesis in interaction design. It has found use in research workshops, teaching and as a tool for writing.
The first period of the work primarily investigated ways of reading AI–written fiction, documented in the published material and this essay. Current research explores novel interaction modalities for co-writing fiction with AI. If that sounds interesting to you, get in touch.
Multiverse is a collaboration with Maliheh Ghajargar, Jeffrey Bardzell, Anuradha Reddy and Line Henriksen. The prototype is distributed as open source software but requires a GPT-3 API key to function.
Anthropocentrism in AI
The present moment of AI creativity is in a rather peculiar state. Generative image models, like the canonical DeepDream, are "allowed" radical exercises in form. Models like Stable Diffusion and Craiyon flood social media with a tide of fully automated mashup extravagance celebrated as a new era in creativity. However, the same playful attitude does not seem to apply to our AIs' literary works. Texts created by large language models get judged against conservative human authorship. Pictures, worth a thousand words each, are creatures of interpretation. A text is, at least on a surface level, very concrete and far more prone to scrutiny. AI-made images capture our imagination through strange hallucinations. AI writing stands out as drunken impersonation.
This discrepancy intrigued me. Are the writings of these models nothing but stochastic ramblings? Or, is our anthropocentric gaze blind to this thing, weird and novel? Multiverse began as an attempt to approach and appreciate AI writing as a novel medium.
Anthropocentrism refers to the notion of the human as separate from nature. It takes our experience as the base of comparison and understanding. It follows that the "intelligence" in artificial intelligence approximates the "human." Who or what the "human" might be is often left implicit, a function of the ruling culture. In approaching AI literature, I argue that we must let go of this anthropocentrism. To do so requires an open mind and a critical re-evaluation of what literature can be.
Literature as sequence
Conventional human literature follows a linear structure. A story has a beginning and an end. Between these comes a linear sequence of passages that unfold the narrative. This ordering is a fundamental property that affords authors control over the experience. Literary devices like suspense (withheld information) or plot development (new information) function sequentially.
Contemporary AI models lack an "understanding" of human notions of story or sequence. They struggle especially with continuity, intentionality, and disambiguation of characters. But is this a shortcoming of the model or the medium? Can we imagine a fiction free of sequence with room for something beyond the current convention?
Of course, such a challenge to the status quo is nothing new. The I Ching, written three thousand years ago in Chinese antiquity, uses chance as a method and motive. The modern movement was full of formal experiments like Dadaist cut-up poetry. Where the post-modernists questioned authorship through contradictory, intertextual works, personal computing is inseparable from the ambitions of hypertext. The latter is of particular interest as it manifests a non-sequential literary medium.
Hypertext's distinguishing feature is the interlinking that affords arbitrary non-linear access. The granularity of this linking and sequence is up to the author. The Web has settled into a convention of linking complete, individual documents. Contrast this to hypertext fiction, which uses linking to create non-sequential narratives.
Despite the ubiquity of the Web, non-sequential hypertext literature remains exotic. Carter (2002) identifies the "expectation of order" as the leading blocker for would-be hypertext writers. Interconnected narratives converge toward exponential branching, resulting in a seemingly endless text to wire and edit. Whether cultural or a hard limitation of the human mind, hypertext seems a tough sell in a culture of sequence. However, these attributes make hypertext compelling for an AI free of cultural blockers and cognitive limits.
Despite its anthropocentric ambition, contemporary AI lacks many of hypertext's cultural blockers. An LLM is, by and large, a function that inputs existing text and outputs something new. For example, where a human would struggle to author a single branching narrative, an AI has already written five. The fragmented nature of linked pieces suits an author that struggles with long form. Instead of creating a finished work in one go, AI hypertext affords interactive, co-creative discovery. Imagine a literary medium where a simple starting point unfolds into a branching, infinite possibility space.
Multiverse presents a literary model that combines hypertext's AI compatibility with limited sequentiality. This combination, arrived at through experimentation with hypertext networks, manifests in a tree structure. Trees are simple to manipulate programmatically and map well to the prompting models. They also introduce limited sequence, branching from a stable root prompt. Early user testing found that this aided in comprehending foreign hypertext-like structures. Pope (2006) argues that "idiosyncratic and unfriendly" hypertext interfaces discourage readers. Having a "stable point" affords a grounding that can encourage discovery and experimentation. However, a data structure alone does not make an interface. The following section details the major design decisions in the Multiverse prototype, shaped by listening to human and AI feedback.
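To make the tree model a little more concrete, here is a minimal Python sketch. It is not the prototype's actual code: the `Passage` class and its method names are hypothetical, and the assumption that a node's prompt is simply the text along the path from the root is mine. It only illustrates why a tree with a stable root maps neatly onto prompting.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Passage:
    """One node in the story tree (hypothetical name): a text fragment
    plus the alternative continuations that branch from it."""

    text: str
    parent: Passage | None = field(default=None, repr=False)
    children: list[Passage] = field(default_factory=list)

    def add_continuation(self, text: str) -> Passage:
        """Attach a new branch below this passage and return it."""
        child = Passage(text=text, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[Passage]:
        """Return the passages from the root down to this one."""
        nodes = []
        node = self
        while node is not None:
            nodes.append(node)
            node = node.parent
        nodes.reverse()
        return nodes

    def prompt(self) -> str:
        """Assumed prompting scheme: join the text along the path."""
        return " ".join(passage.text for passage in self.path())


# The root acts as the stable starting point for every branch.
root = Passage("The lighthouse had been dark for a year.")
blink = root.add_continuation("Then, one night, it blinked.")
forgot = root.add_continuation("Nobody in the village remembered why.")
flashes = blink.add_continuation("Three short flashes, like a question.")

print(flashes.prompt())
```

Each branch shares the same opening sentence, so a reader who wanders off can always find their way back to familiar ground.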
Designing multiversal literature
Single screen interfaces
Users of Multiverse spend most of their time in the story view, the place for reading and co-creating with the AI. Initially, the interface appears dense, more akin to something from a video game than iBooks. The broad strokes of this design have remained mostly intact from the original prototypes, initially inspired by the dialogue trees of role-playing games like Disco Elysium or Kentucky Route Zero.
At its core, Multiverse follows a strict design philosophy where the literary exploration takes place in a modeless, single-screen interface. It features three main sections: The current story ➊, possible continuations ➋, and the map ➌.
Displaying the state of a system with multiple tightly coupled views is not particularly novel. Information designer Edward Tufte (1997) discusses the value of complementary representations for understanding both large and small multiples of data. Matthew Davis (2019) of Subset Games employs highly fluid "single screen interfaces" that serve two purposes: They afford the player agency in complex real-time situations and give the designers a consequential limitation. Anything you add must fit into the already dense single screen through integration or addition to the existing elements. Most desktop "document-based" applications constitute a kind of single-screen interface. However, they tend to grow inward, hiding auxiliary information in modal views. Multiverse forgoes the pedagogical ambitions of progressive disclosure in favor of a clear interaction hierarchy. Each section has a clear direction:
Selecting a continuation in ➋ progresses the story, moving in a downward direction of the current path.
Selecting a previously visited sentence ➊ performs the reverse, moving the reader up the path.
Exploratory or comparative movements go via the map ➌, providing direct access to any part of the tree.
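The three movements can be sketched as operations on a single "current position" in the tree. This is an illustrative Python sketch under my own assumptions, not the prototype's implementation: `Node`, `Reader`, and the method names `continue_with`, `return_to`, and `jump_to` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A passage in the story tree (hypothetical minimal model)."""

    text: str
    parent: Node | None = field(default=None, repr=False)
    children: list[Node] = field(default_factory=list)

    def add(self, text: str) -> Node:
        child = Node(text=text, parent=self)
        self.children.append(child)
        return child


class Reader:
    """Tracks where the reader currently is in the tree."""

    def __init__(self, root: Node) -> None:
        self.root = root
        self.current = root

    def story(self) -> list[Node]:
        """The current story: every passage from the root to here."""
        nodes = []
        node = self.current
        while node is not None:
            nodes.append(node)
            node = node.parent
        nodes.reverse()
        return nodes

    def continue_with(self, index: int) -> Node:
        """Move down the path by picking one of the continuations."""
        self.current = self.current.children[index]
        return self.current

    def return_to(self, depth: int) -> Node:
        """Move up the path to a previously visited passage."""
        path = self.story()
        if not 0 <= depth < len(path):
            raise IndexError("depth is outside the current story")
        self.current = path[depth]
        return self.current

    def jump_to(self, node: Node) -> Node:
        """Map access: go directly to any passage in this tree."""
        top = node
        while top.parent is not None:
            top = top.parent
        if top is not self.root:
            raise ValueError("node does not belong to this story tree")
        self.current = node
        return self.current


root = Node("The door was open.")
inside = root.add("She stepped inside.")
away = root.add("She walked away.")
dark = inside.add("It was dark.")

reader = Reader(root)
reader.continue_with(0)  # down: "She stepped inside."
reader.continue_with(0)  # down: "It was dark."
reader.return_to(0)      # up: back to the root passage
reader.jump_to(away)     # sideways via the map
print([node.text for node in reader.story()])
```

Note that all three movements only change the current position. The tree itself is never modified, which is part of what lets the views stay tightly coupled.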
Users often discover the auxiliary directions upon "bottoming out" a path. This phenomenon happens when the narrative has reached a place where the Human has lost interest. There are multiple reasons for this, most commonly repetition or incoherence.
Error correction as a part of the story
Bottoming out, or error correction, can be a good thing! (Spoiler: it was OK, but really mostly a bad undo system. Oh well.)
Early iterations featured a conventional typewriter-esque styling. Later iterations featured cool animations. Swooong!
Human-AI interfaces to come
Three years of on-and-off work on Multiverse has made it clear that many Human-AI interfaces are left to make. One might argue that this is an obvious statement. However, like many obvious things, this realization possesses surprising depth. We suddenly find ourselves in a situation that requires interfaces for applied ethics and a co-existence that reaches beyond the screen.
Linus Lee (2022) has begun charting a course towards AI interfaces that go beyond remixes of the prompt, a space Multiverse arguably occupies. The evolutionary path of Human-Computer interfaces from textual to graphical representation is evident in hindsight, channeling a rich history of human knowledge expression. However, simply reenacting this process with AI is sure to fail. Prompt-based interfaces are a powerful modality, yet they operate on a level of uncertainty that prohibits deep understanding and mastery. Currently existing AI is by all accounts a weak approximation of AGI, yet it is already so complex that its inner workings resist comprehensible representation.
Unable to visualize AI, we must invent new interfaces and languages. For example, we use different lingual modalities when interacting with animals and other non-human entities, communication where both parties have adapted to a different baseline of understanding. The work done with Multiverse is far from anything that approximates this goal. However, the research indicates the potential of tightly coupled, interlinked, and multimodal interactions in building intuition for complex uncertainties. The elaboration of these principles is one of the main aspects that guide the project's next phase.
Writing fiction with AI
The focus on reading AI-written fiction was an essential first step in finding ways to co-create with non-human beings. At its core, to read is to take something seriously, to validate its value as a creation worth our attention. However, there is a clear limit where current models struggle to produce narratives of greater length and substance, even with multiversal methods. As such, the next phase of this project aims to investigate new ways of writing fiction with AI and humans as active co-authors.
The goal is not to create a "Grammarly for fiction." Instead, the basic principle behind the first phase of Multiverse still stands: a non-anthropocentric approach to AI that imagines novel mediums and their interfaces. The composition of different prompting modalities forms a Turing complete design space. Summarization, description, and completion appear as primitives from which we will build new writing methods.
Let us imagine that we are writing a short science fiction story. The first few pages flow easily, after which the text seems to develop some intangible friction. However, diagnosing the cause of such a rut can be difficult. Is the message incoherent, are the scenes getting too cluttered, or is it just not possible to find a way forward? It is easy to see how one could configure an LLM like GPT-3 to provide "interpretations" of passages using summarization and description, suggesting alterations that better fit a stated vibe. Such prototypes are already in the wild, but they often focus on single features. A key enabler will be when we can run these processes simultaneously, with multiple interpretations. Therefore, the model should not be a small, pop-up part of the interface; it should be the interface. The current differentiation between prompt and output is arguably a technical dependency. One of the goals of the next phase of this research is to find ways to blur this line, affording a much-needed integration with a non-anthropocentric voice.
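As a rough illustration of running several interpretations of the same passage simultaneously, here is a small Python sketch. Everything in it is an assumption made for the example: `summarize`, `describe`, and `complete` are trivial placeholder functions standing in for model-backed primitives (no language model is called), and `interpret_all` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# An interpreter takes a passage and returns some reading of it.
Interpreter = Callable[[str], str]


def summarize(passage: str) -> str:
    """Placeholder primitive: keep only the first sentence."""
    return passage.split(". ")[0].rstrip(".") + "."


def describe(passage: str) -> str:
    """Placeholder primitive: report a surface property of the text."""
    return f"{len(passage.split())} words"


def complete(passage: str) -> str:
    """Placeholder primitive: mark where a continuation would go."""
    return passage + " ..."


def interpret_all(passage: str, interpreters: dict[str, Interpreter]) -> dict[str, str]:
    """Run every interpreter on the same passage concurrently and
    collect the results under each interpreter's name."""
    workers = max(1, len(interpreters))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            name: pool.submit(interpret, passage)
            for name, interpret in interpreters.items()
        }
        return {name: future.result() for name, future in futures.items()}


passage = "The ship drifted. Nobody spoke."
readings = interpret_all(
    passage,
    {"summary": summarize, "description": describe, "completion": complete},
)
for name, reading in readings.items():
    print(f"{name}: {reading}")
```

Because each primitive is just a function from text to text, the output of one can feed another, which is the kind of composition the design space above depends on.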
Allowing for the artificial
Early iterations of Multiverse used OpenAI's GPT-2, the predecessor to GPT-3. The earlier model is, by most accounts, inferior. It is less proficient at zero- and few-shot learning and much less coherent. Where GPT-3 has a decent grasp on recurring motifs, its predecessor confuses pronouns, names, and places. Yet, these "deficiencies" also made it a lot more interesting to read and write with. GPT-3 is polished but predictable. When it fails, it does so in rather dull ways. Instead, it imitates a human by reminding you that most things we write are uninteresting. This simulated domesticity is a feature of GPT-3 as a production model, since you would not want a chatbot to transform into a rogue slam poet, but it limits the model's ability to do creative writing.
One way to solve the issue of limited creative prowess is by making more advanced and appropriate language models. For example, the inspirational work of researchers like Mirowski et al. (2022) clearly shows that there is much to gain from this approach. However, a technological trajectory still sidesteps questions of how we relate and interface with our AI companions. For example, our research found that the activity of reading with Multiverse often turned into a task of "caring for the AI," making creative decisions based on an intuitive likelihood of garnering the most coherent generations.
Caring for a dull reflection of ourselves raises a question: What could we create if we abandoned the conceit of AI as an imitation of the human mind? If we model our AI on echoes of ourselves, can it create something novel? How would a feral GPT-3 scale model communicate its strange conception of the world? Would intelligence made from numbers not have to be a different being all of its own?
Thanks to my collaborators Maliheh Ghajargar, Jeffrey Bardzell, Anuradha Reddy, and Line Henriksen, my thesis advisor Anne-Marie Hansen, Benjamin Maus for all of his UI feedback, and Malte Dahlberg for proof-reading this essay.
- Carter, L. (2002). Argument in hypertext: Writing strategies and the problem of order in a nonsequential world. Computers and Composition, 20(1), 3-22.
- Pope, J. (2006). A future for hypertext fiction. Convergence, 12(4), 447-465.
- Tufte, E. (1997). Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press.
- Davis, M. (2019). Into the Breach Design Postmortem. Game Developers Conference.
- Lee, L. (2022). Supposing better interfaces to language models.
- Mirowski, P., Mathewson, K. W., Pittman, J., & Evans, R. (2022). Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals. arXiv.