Tracery and Python

by Allison Parrish

This is a tutorial on how to use Tracery in Python. Tracery is a computer language for random text generation originally developed by Kate Compton.

The easiest way to use Tracery in Python is to install pytracery, my Python port of Kate's original code. You can install it on the command line with pip:

$ pip install tracery

(You may need to type sudo in front of pip: sudo pip install tracery.) If you're running this in Jupyter Notebook, you can execute the cell below:

In [2]:
!pip install tracery
Collecting tracery
  Downloading tracery-0.1.1.tar.gz
Building wheels for collected packages: tracery
  Running setup.py bdist_wheel for tracery ... done
  Stored in directory: /Users/allison/Library/Caches/pip/wheels/23/56/b4/6160d182b2df47d125be5a2fd0b120a5efe21f33cb066bdf7c
Successfully built tracery
Installing collected packages: tracery
Successfully installed tracery-0.1.1

Then you need to import the tracery library:

In [3]:
import tracery

You don't need Python to use Tracery! Here's a version of this tutorial that you can use with any implementation of Tracery. I recommend Beau Gunderson's Tracery writer as a kind of Tracery playground. You can also use Kate Compton's Tracery tutorial, which has a visual editor, or Cheap Bots Done Quick, which has a built-in editor for writing Tracery grammars for Twitter bots with a minimum amount of fuss.

You might be interested in reading Nora Reed's explanation of how @nerdgarbagebot works, which takes you through the process of ideating and implementing a Tracery grammar for a Twitter bot. (Nora Reed makes a lot of amazing bots with Tracery, including @thinkpiecebot.)

Rules and expansions

A Tracery grammar is a series of rules that tell the computer how to put text together, piece by piece. Tracery grammars consist of a series of rules and expansions. The goal of writing a Tracery grammar is to write rules and expansions that, when followed by the computer, produce interesting (funny, insightful, poetic) text. The word for generating a text from a grammar is "expand"---we'll be talking a lot below about "expanding" the grammar into a text. (Hopefully the reasons for using this word will become clear!)

In Python, Tracery rules and expansions are written as dictionaries, where the rules are keys and the expansions are values. Here's an example of a complete, but very boring, Tracery grammar:

In [5]:
rules = {
  "origin": "Hello, world!"
}

To generate text from this grammar, first create a Tracery Grammar object like so, passing the rules as the only parameter:

In [6]:
grammar = tracery.Grammar(rules)

Then call the .flatten() method of the Grammar object with "#origin#" as the only parameter. (I'll talk about what #origin# means in a second.)

In [7]:
grammar.flatten("#origin#")
Out[7]:
'Hello, world!'

This grammar can produce only one text: Hello, world!. Not very interesting, but helpful for the moment to illustrate how a grammar is put together and how to make it produce some output.

Here's a Tracery grammar with two rules, written again as a dictionary, where each rule and its expansion are key/value pairs:

In [9]:
rules = {
  "origin": "Hello, #noun#!",
  "noun": "galaxy"
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[9]:
'Hello, galaxy!'

This grammar, again, can only ever produce one text: Hello, galaxy! But it accomplishes it in a slightly more sophisticated way. Notice in the expansion for the origin rule the following text:

#noun#

When the Tracery generator encounters text that looks like this---a word surrounded by # signs---it looks in the grammar for a rule with the same name as the word, and replaces the text with the expansion for that rule.

Let's add a third rule to this grammar, just to see how it looks:

In [19]:
rules = {
  "origin": "#greeting#, #noun#!",
  "greeting": "Howdy",
  "noun": "galaxy"
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[19]:
'Howdy, galaxy!'

EXERCISE: Add another rule for the punctuation at the end of the sentence, so that the grammar produces the text "Howdy, galaxy?"

Adding alternatives

The examples above are really boring, because they can only ever produce one output. In order for a grammar to be able to produce different outputs, we need to make the expansions of our rules have alternatives for the computer to choose between. Rules with alternatives look like this:

"rule": ["alternative one", "alternative two", "alternative three"]

That is: the value of the rule is a list of strings (instead of an individual string). When Tracery expands a rule whose value is a list, it will select one item from the list at random.

Here's our "Hello, world!" grammar, now with multiple alternatives for what we're greeting:

In [20]:
rules = {
  "origin": "#greeting#, #noun#!",
  "greeting": "Howdy",
  "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[20]:
'Howdy, local cluster!'

Run the cell over and over again and you'll see different outputs. (Sometimes it'll look like it isn't working, but that's just because the computer randomly selected the same alternative twice in a row. It can happen!)

Let's make this "Hello, world!" example even more interesting by adding alternatives for the greeting rule:

In [57]:
rules = {
  "origin": "#greeting#, #noun#!",
  "greeting": ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"],
  "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[57]:
'Howdy, galaxy!'

Sometimes for debugging purposes, it's nice to generate multiple outputs from the same grammar in one cell execution. To do this, print() the value returned by the .flatten() method in a for loop. (You don't have to re-create the Grammar object each time.)

In [62]:
rules = {
  "origin": "#greeting#, #noun#!",
  "greeting": ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"],
  "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
for i in range(5):
    print(grammar.flatten("#origin#"))
Howdy, universe!
Hi, world!
Greetings, local cluster!
What's up, galaxy!
Greetings, universe!

Remember that in Python you can format dictionaries and lists with some flexibility. For example, your grammar might be a bit more readable if you write each option on a separate line:

In [35]:
rules = {
  "origin": "#greeting#, #noun#!",
  "greeting": [
    "Howdy",
    "Hello",
    "Greetings",
    "What's up",
    "Hey",
    "Hi"
  ],
  "noun": [
    "world",
    "solar system",
    "galaxy",
    "local cluster",
    "universe"
  ]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[35]:
'Greetings, world!'

You don't always have to write the expansions as string literals and list literals. You can use a variable with a list assigned to it, for example—this is especially helpful if you have a long list of things that you plan to use in multiple grammars, or if you want to get the list of things from another source (e.g., a text file).

Modifiers

Let's make a more sophisticated grammar that produces sentences in the format "Dammit Jim, I'm a X, not a Y!" popularized by the ground-breaking science fiction program, Star Trek. I happen to have a list of professions, which I'm going to put into a variable here. (I got this list from Darius Kazemi's Corpora Project—an excellent place to find lists of things. And they're already preformatted in a way that makes it easy to cut-and-paste them into your Tracery grammars.)

In [36]:
professions = [
    "accountant",
    "actor",
    "archeologist",
    "astronomer",
    "audiologist",
    "bartender",
    "curator",
    "detective",
    "economist",
    "editor",
    "engineer",
    "epidemiologist",
    "farmer",
    "flight attendant",
    "forest fire prevention specialist",
    "graphic designer",
    "hydrologist",
    "librarian",
    "mathematician",
    "middle school teacher",
    "nutritionist",
    "painter",
    "rancher",
    "referee",
    "reporter",
    "sailor",
    "sociologist",
    "stonemason",
    "surgeon",
    "tailor",
    "taxi driver",
    "teacher",
    "therapist",
    "tour guide",
    "umpire",
    "undertaker",
    "urban planner",
    "veterinarian",
    "web developer",
    "welder",
    "writer",
    "zoologist"
]

The grammar for generating our Star Trek phrase might look like this:

In [42]:
rules = {
  "origin": "#interjection#, #name#! I'm a #profession#, not a #profession#!",
  "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
    "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
  "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
  "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")
Out[42]:
"rats, Steve! I'm a tailor, not a economist!"

This is pretty good, but there are problems. The first is that we typed in all of the interjections in lower case, but they're supposed to have the first letter capitalized (since they're at the beginning of the sentence). The second problem is that the grammar occasionally produces something like

yes, George! I'm a economist, not a zoologist!

"A economist" isn't right. It should be "an economist." English indefinite articles are tricky that way!

There are several ways to solve these problems. We could just change all of our interjections to be capitalized, and add the appropriate article to the beginning of each profession. But (1) this would be time consuming and (2) it would mean that we could no longer re-use the unmodified versions of those rules elsewhere. What to do?

Thankfully, Tracery comes equipped with a series of modifiers that take the expansion of a rule and apply a transformation to it. The modifiers are included with pytracery, but they're in a separate module, so you need to import them in their own import statement:

In [39]:
from tracery.modifiers import base_english

And then you have to explicitly "add" them to the Grammar object after you create it, like so:

grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)

The two modifiers we're going to use are .a, which adds the appropriate indefinite article before the expansion of a rule, and .capitalize, which capitalizes the first letter of the expansion.

Use the modifiers by adding the modifier name (e.g., .a or .capitalize) inside the # signs, right after the name of the rule. For example, change:

#interjection#

to

#interjection.capitalize#

Here's our "Dammit Jim" generator with the modifiers in place:

In [45]:
rules = {
  "origin": "#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
  "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
    "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
  "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
  "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#origin#")
Out[45]:
"Congratulations, Jim! I'm an accountant, not an undertaker!"

Nice! Another modifier you can use is .s, which turns the text in the expansion into its plural version. Using this, we can modify the above example to be a Star Wars meme instead of a Star Trek one:

In [47]:
rules = {
  "origin": "These aren't the #profession.s# we're looking for.",
  "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#origin#")
Out[47]:
"These aren't the urban planners we're looking for."

The origin rule

By convention, the "starting" rule of Tracery grammars is called origin. A lot of tools that use Tracery grammars follow this convention, and for ease of interoperability it's probably best if you do too. But you can actually use any name you want, as long as you use that name in the call to .flatten(). For example, we could rewrite the above example like so:

In [55]:
rules = {
  "star_wars": "These aren't the #profession.s# we're looking for.",
  "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#star_wars#")
Out[55]:
"These aren't the referees we're looking for."

Like any other rule, the "starting" rule can have multiple options. We could use this to, for example, create a grammar that outputs Star Wars memes half the time and Star Trek memes the other half:

In [66]:
rules = {
  "origin": ["#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
             "These aren't the #profession.s# we're looking for."],
  "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
    "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
  "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
  "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
for i in range(10):
    print(grammar.flatten("#origin#"))
Hallelujah, John! I'm an urban planner, not a bartender!
Yes, Larry! I'm an archeologist, not a rancher!
Whoa, Tom! I'm a tailor, not a hydrologist!
Alas, George! I'm a tailor, not a farmer!
These aren't the urban planners we're looking for.
Oops, Steve! I'm a painter, not a reporter!
These aren't the tailors we're looking for.
Eureka, Tom! I'm an actor, not a sociologist!
These aren't the astronomers we're looking for.
Oops, Gary! I'm an actor, not an archeologist!

Rules within rules within rules

The grammars we've written together so far have replacement syntax (#somethinglikethis#) only in the expansions for the origin rule. But you can include that syntax in any expansion you want! This is a powerful tool for building sophisticated grammars from reusable parts. For example, this tiny model of English reuses the noun and verb rules in multiple places, thereby avoiding repetition in the grammar itself while increasing its expressiveness.

In [84]:
rules = {
    "origin": "#nounphrase.capitalize# #verbphrase#.",
    "nounphrase": ["the #noun#", "the #noun#", "#noun.a#", "#noun.a#", "the #noun# that #verbphrase#",
                   "the #noun# #prep# #nounphrase#"],
    "verbphrase": ["#verb#", "#verb# #nounphrase#", "#verb# #prep# #nounphrase#"],
    "noun": ["amoeba", "dichotomy", "seagull", "trombone", "corsage", "restaurant", "suburb"],
    "verb": ["awakens", "bends", "burns", "closes", "expands", "fails", "fractures", "gathers",
             "melts", "opens", "ripens", "scatters", "stops", "sways", "turns", "unfurls", "worries"],
    "prep": ["in", "on", "over", "against"]
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
for i in range(10):
    print(grammar.flatten("#origin#"))
The suburb against a restaurant melts on the dichotomy.
The corsage turns a suburb.
The restaurant fails the restaurant that stops.
The restaurant that opens the restaurant that gathers unfurls.
The dichotomy against a dichotomy ripens.
A trombone melts.
The suburb that unfurls an amoeba fractures.
The amoeba ripens the corsage that scatters the trombone.
The amoeba over a seagull opens against the corsage in the restaurant.
A restaurant ripens.

Next steps

Congratulations, you now know the basics of writing Tracery grammars and how to use them in Python.

Tracery has a number of features that we didn't go into here, including the ability to save the output of a rule to be re-used later in the same expansion. See Kate Compton's tutorial for more information. You might be interested in these advanced text generators that Kate Compton made with Tracery.

If you're a Javascript programmer and you want to incorporate Tracery into your own projects, the source code is available here (also available as a Node module).