What is CoAuthor?

CoAuthor is a human-AI collaborative dataset that captures rich interactions between 63 writers and 4 instances of GPT-3 across 1445 writing sessions.

In each writing session, a human writer was presented with a prompt and given an instance of GPT-3. The writer freely wrote, requested suggestions from GPT-3, accepted or dismissed suggestions, and edited accepted suggestions or previous text in any order they chose. In CoAuthor, all interactions between writers and the system (e.g., inserting or deleting text, moving the cursor, and getting suggestions) were preserved with timestamps. These rich, nuanced, and fine-grained data allow the writing sessions to be replayed, so designers can examine the same sessions from multiple analytical perspectives to better understand language model capabilities.
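To make the idea of replaying a session concrete, here is a minimal sketch in Python. It assumes a simplified, hypothetical event schema (the `Event` class and its `kind`, `position`, `text`, and `length` fields are illustrative names, not the dataset's actual log format), and it only reconstructs the text, ignoring cursor and suggestion events.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One logged interaction (hypothetical, simplified schema)."""
    timestamp: float   # time of the event; the unit is an assumption
    kind: str          # "insert" or "delete"; other kinds are ignored below
    position: int = 0  # character offset where the event applies
    text: str = ""     # text added by an "insert" event
    length: int = 0    # number of characters removed by a "delete" event


def replay(events, until=None):
    """Rebuild the document text by applying events in timestamp order.

    If `until` is given, stop before the first event later than that time,
    which yields the document as it looked at that moment.
    """
    doc = ""
    for ev in sorted(events, key=lambda e: e.timestamp):
        if until is not None and ev.timestamp > until:
            break
        if ev.kind == "insert":
            doc = doc[:ev.position] + ev.text + doc[ev.position:]
        elif ev.kind == "delete":
            doc = doc[:ev.position] + doc[ev.position + ev.length:]
        # Cursor moves and suggestion events do not change the text.
    return doc


# A made-up session used only to illustrate the replay logic.
events = [
    Event(0.0, "insert", position=0, text="Once upon a time"),
    Event(1.5, "cursor_move", position=4),
    Event(2.0, "delete", position=0, length=5),
    Event(3.2, "insert", position=0, text="Twice "),
]

final_text = replay(events)
text_at_2s = replay(events, until=2.0)
```

Because every event carries a timestamp, the same log can be stopped at any point (`until`) to inspect an intermediate draft, which is what makes it possible to revisit one session from several analytical angles.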

Overview

  • Types of writing - Creative & Argumentative
    • Creative writing: 830 stories written by 58 writers
    • Argumentative writing: 615 essays written by 49 writers
  • Interface - Text editor where writers can press the tab key to query the system and get five suggestions whenever they want
  • Writing logs - All interactions in the text editor are recorded with associated timestamps and can be used to replay writing sessions from beginning to end
  • Writers - Qualified crowd workers from Amazon Mechanical Turk

Basic statistics

  • Length of stories and essays: 418 words on average
  • Number of queries: 11.8 queries per writing session on average
  • Acceptance rate of suggestions: 72.3%
  • Percentage of text written by humans: 72.6%
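As a rough illustration of how per-session figures like these could be derived from a writing log, the sketch below counts queries and computes an acceptance rate. The event names `suggestion_get` and `suggestion_accept` are hypothetical labels, and defining the acceptance rate as accepted suggestions divided by queries is an assumption made for this example.

```python
def session_stats(event_kinds):
    """Return (number of queries, acceptance rate) for one session.

    `event_kinds` is a sequence of event-type labels (hypothetical names).
    The acceptance rate is taken here to be accepted / queries.
    """
    queries = sum(1 for kind in event_kinds if kind == "suggestion_get")
    accepted = sum(1 for kind in event_kinds if kind == "suggestion_accept")
    rate = accepted / queries if queries else 0.0
    return queries, rate


# A made-up session: four queries, three of which led to an accepted suggestion.
example_session = [
    "insert",
    "suggestion_get", "suggestion_accept",
    "insert",
    "suggestion_get", "suggestion_dismiss",
    "suggestion_get", "suggestion_accept",
    "suggestion_get", "suggestion_accept",
]

num_queries, acceptance_rate = session_stats(example_session)
```

Averaging `num_queries` and `acceptance_rate` over all sessions would give dataset-level numbers of the kind listed above.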



Getting Started

Example replay of a writing session in CoAuthor:

Example of replay

Browse more writing sessions in CoAuthor (best viewed on desktop):

Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):



Have Questions?

Ask us questions at minalee@cs.stanford.edu!