Peter Norvig
29 December 2015

Refactoring a Crossword Game Program

In my CS 212 class on Udacity, the most complex lesson involved a crossword game program (for games such as Scrabble® and Words with Friends®). The program was developed incrementally. First I asked "what words can be made with a rack of seven letters?", then I asked "how can you place words onto a single row?", and finally, with the help of the students, I developed a program to find the highest scoring play anywhere on the board. This approach made for a good sequence of exercises, each building on the previous one. But the code ended up overly complicated (it accumulated technical debt) because it kept around ideas from each iteration.

In this notebook I will refactor the program to pay off the debt.

Vocabulary

Our program uses these concepts:

  • Dictionary: A set of all legal words.
  • Word: a string of letters. Words in the dictionary are all uppercase.
  • Tile: a letter (or a blank) that can be played on the board to form words.
  • Blank: a tile with no letter on it; the player who places it on the board gets to choose which letter it will represent.
  • Rack: a collection of up to seven tiles that a player may use to make words.
  • Board: a grid of squares onto which players play tiles to make words.
  • Square: a location on the board; a square can hold one tile. (The variable s will stand for a square number, and sq for the contents of a square.)
  • Bonus: some squares give you bonus scores: double or triple letter or word scores.
  • Play: a play consists of placing some tiles on the board to form a continuous string of letters in one direction (across or down), such that only valid words are formed, and such that one of the letters is placed on an anchor square.
  • Anchor square: Every play must place a letter on an anchor square: either the center start square or a square that is adjacent to a tile previously played on the board.
  • Direction: Every play must be in either the ACROSS or DOWN direction. (The variable dir stands for a direction.)
  • Cross word: a word formed in the other direction from a play. For example, a play forms a word in the across direction, and in doing so, places a letter that extends a word in the down direction. This new extended cross word must be in the dictionary.
  • Score: the points awarded for a play, consisting of the sum of the word scores for each word made (the main word and possibly any cross words), plus a bingo bonus if all seven letters are used.
  • Word score: Each word scores the sum of the letter scores for each tile (either placed by the player or already on the board but part of the word) times the word bonus score. The word bonus score starts at 1, and is multiplied by 2 for each double word square and 3 for each triple word square covered by a tile on this play.
  • Letter score: The letter score is the value on the letter tile (for example, 1 for A and 10 for Q) times the letter bonus score. The letter bonus is 2 when a tile is first placed on a double letter square (or the center star) and 3 when first placed on a triple letter square; it is 1 for a tile already on the board, or for a new tile played on a non-letter-bonus square. The letter score for a blank tile is always zero.
  • Bingo: a bonus gained by using all seven tiles in one play. In Words with Friends® the bingo bonus is 35; in Scrabble® it is 50.
  • Game: players take turns making plays until one player has no more tiles. After making a play, the player's rack is replenished with tiles until the player has 7 tiles or until the bag of tiles is empty.
  • Prefix: a string of zero or more letters that starts some word in the dictionary. Not a concept that has to do with the rules of the game; it will be important in our algorithm that finds valid plays.

This notebook uses these imports:

In [1]:
from __future__  import division, print_function
from collections import defaultdict, namedtuple
from IPython.display import HTML, display
import random

Dictionary and Words

We will represent the dictionary as a set of words. A word is an uppercase string of letters, like 'WORD'. There are several standard dictionaries used by different communities of players; we will use the ENABLE dictionary—we can cache a local copy with this shell command:

In [2]:
! [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt

Now we can define a word and load the dictionary:

In [3]:
def Word(w) -> str: return w.strip().upper()

DICTIONARY = {Word(w) for w in open('enable1.txt')}

def is_word(word) -> bool: 
    "Is this a legal word in the dictionary?"
    return word.upper() in DICTIONARY
In [4]:
len(DICTIONARY)
Out[4]:
172820
In [5]:
list(DICTIONARY)[:10]
Out[5]:
['DECIPHER',
 'ROUGHLY',
 'DULLSVILLES',
 'INDEHISCENT',
 'SAGEBRUSHES',
 'COMES',
 'WITCHERIES',
 'HOMES',
 'PECK',
 'GEEZ']
In [6]:
'WORD' in DICTIONARY
Out[6]:
True

Tiles, Blanks, and Racks

We'll represent a tile as a one-character string, like 'W'. We'll represent a rack as a string of tiles, usually of length 7, such as 'EELRTTS'. (I also considered a collections.Counter to represent a rack, but felt that str was simpler, and with the rack size limited to 7, efficiency was not a major issue.)

The blank tile causes some complications. We'll represent a blank in a player's rack as the underscore character, '_'. But once the blank is played on the board, it must be used as if it were a specific letter, even though it doesn't score the points of that letter. I chose to use the lowercase version of the letter to represent this. That way, we know what letter the blank is standing for, and we can distinguish between scoring and non-scoring tiles. For example, 'EELRTT_' is a rack that contains a blank; and 'LETTERs' is a word played on the board that uses the blank to stand for the letter S.

We'll define letters to give all the distinct letters that can be made by a rack, and remove to remove letters from a rack (after they have been played).

In [7]:
BLANK    = '_'     # The blank tile (as it appears in the rack)
cat      = ''.join # Function to concatenate strings

def letters(rack) -> str:
    "All the distinct letters in a rack (including lowercase if there is a blank)."
    if BLANK in rack:
        return cat(set(rack.replace(BLANK, ''))) + 'abcdefghijklmnopqrstuvwxyz'
    else:
        return cat(set(rack))
    
def remove(tiles, rack) -> str:
    "Return a copy of rack with the given tile(s) removed."
    for tile in tiles:
        if tile.islower(): tile = BLANK
        rack = rack.replace(tile, '', 1)
    return rack
In [8]:
is_word('LETTERs')
Out[8]:
True
In [9]:
letters('LETTERS')
Out[9]:
'TLRSE'
In [10]:
letters('EELRTT_')
Out[10]:
'TLREabcdefghijklmnopqrstuvwxyz'
In [11]:
remove('SET', 'LETTERS')
Out[11]:
'LTER'
In [12]:
remove('TREaT', 'LETTER_') 
Out[12]:
'LE'
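
For comparison, here is a rough sketch of what the collections.Counter representation mentioned above might look like. The names letters_counter and remove_counter are hypothetical and are not used anywhere else in this notebook; the sketch assumes the same conventions (underscore for a blank in the rack, lowercase for a blank played on the board).

```python
from collections import Counter

BLANK = '_'  # same convention as above: the blank tile as it appears in the rack

def letters_counter(rack: Counter) -> str:
    "Hypothetical: all distinct letters a Counter rack can supply (lowercase if it holds a blank)."
    real = ''.join(sorted(t for t in rack if t != BLANK and rack[t] > 0))
    return real + ('abcdefghijklmnopqrstuvwxyz' if rack[BLANK] > 0 else '')

def remove_counter(tiles: str, rack: Counter) -> Counter:
    "Hypothetical: a copy of rack with the given tiles removed; lowercase tiles use up a blank."
    used = Counter(BLANK if t.islower() else t for t in tiles)
    return rack - used  # Counter subtraction drops any count that falls to zero or below

rack = Counter('LETTER_')
print(letters_counter(rack))
print(remove_counter('TREaT', rack))
```

The logic is about the same length as the str version, but every rack then has to be converted to and from a Counter, which supports the choice of str for racks of at most seven tiles.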

The Board, Squares, Directions, and Bonus Squares

In the previous version of this program, the board was a two-dimensional matrix, and a square on the board was denoted by a (row, col) pair of indexes. There's nothing wrong with that representation, but for this version we will choose a different representation that is simpler in most ways:

  • The board is represented as a one-dimensional list of squares.
  • The default board is 15×15 squares, but we will include a border around the outside, making the board of size 17×17.
  • Squares are denoted by integer indexes, from 0 to 288.
  • To move in the ACROSS direction from one square to the next, increment the square index by 1.
  • To move in the DOWN direction from one square to the next, increment the square index by 17.
  • The border squares are filled with a symbol, OFF, indicating that they are off the board. The advantage of the border is that the code never has to check if it is at the edge of the board; it can always look at the neighboring square without fear of indexing off the end of the board.
  • Each square on the board is initially filled by a symbol indicating the bonus value of the square. When a tile is placed on a square, the tile replaces the bonus value.

How will we implement this? We'll define Board as a subclass of list and give it two additional attributes:

  • down: the increment to move in the down direction; 17 for a standard board.
  • directions: the four increments to move to any neighboring square; (1, 17, -1, -17) in a standard board.

Jupyter/IPython notebooks have a special convention for displaying objects in HTML. We will adopt it as a method of Board:

  • _repr_html_: return a string of HTML that displays the board as a table.
In [13]:
ACROSS = 1   # The 'across' direction; 'down' depends on the size of the board
OFF    = '#' # A square that is off the board
SL, DL, TL, STAR, DW, TW = EMPTY = '.:;*-=' # Single/double/triple letter; star, double/triple word bonuses

Square    = int # Squares are implemented as integer indexes.
Direction = int # Directions are implemented as integer increments

class Board(list):
    """A Board is a (linear) list of squares, each a single character.
    Note that board[s + down] is directly below board[s]."""

    def __init__(self, squares):
        list.__init__(self, squares)
        down = int(len(squares)**0.5)
        self.down = down
        self.directions = (ACROSS, down, -ACROSS, -down)
        
    def _repr_html_(self) -> str: return board_html(self)

We'll define WWF as the standard board for Words with Friends®.

In [14]:
WWF = Board("""
# # # # # # # # # # # # # # # # #
# . . . = . . ; . ; . . = . . . #
# . . : . . - . . . - . . : . . #
# . : . . : . . . . . : . . : . #
# = . . ; . . . - . . . ; . . = #
# . . : . . . : . : . . . : . . #
# . - . . . ; . . . ; . . . - . #
# ; . . . : . . . . . : . . . ; #
# . . . - . . . * . . . - . . . #
# ; . . . : . . . . . : . . . ; #
# . - . . . ; . . . ; . . . - . #
# . . : . . . : . : . . . : . . #
# = . . ; . . . - . . . ; . . = #
# . : . . : . . . . . : . . : . #
# . . : . . - . . . - . . : . . #
# . . . = . . ; . ; . . = . . . #
# # # # # # # # # # # # # # # # #
""".split())
In [15]:
assert len(WWF) == 17 * 17
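
To make the index arithmetic concrete, here is a small sketch that converts between the old (row, col) pairs and the new integer squares. The helpers square and row_col are hypothetical (the program itself never needs them), and the width of 17 is assumed to match the bordered standard board.

```python
DOWN = 17  # assumed width of the bordered board: 15 playable squares plus 2 border squares

def square(row: int, col: int) -> int:
    "Hypothetical helper: the square index of (row, col), counting the border as row/column 0."
    return row * DOWN + col

def row_col(s: int) -> tuple:
    "Hypothetical helper: the (row, col) pair for square index s."
    return divmod(s, DOWN)

center = square(8, 8)
print(center)                  # 144, the center star
print(row_col(center + 1))     # (8, 9): one step in the ACROSS direction
print(row_col(center + DOWN))  # (9, 8): one step in the DOWN direction
```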

Displaying the Board in HTML

I want to display the board in HTML, as a table with different background colors for the bonus squares and gold-colored letter tiles. I also want to display the point values for each letter on the tiles; I'll use a defaultdict of {letter: int} named POINTS for that.

In [60]:
def board_html(board) -> str:
    "An HTML representation of the board."
    size = board.down - 2
    squares = [square_html(sq) for sq in board if sq != OFF]
    row = ('<tr>' + '{}' * size)
    return ('<table>' +  row * size + '</table>').format(*squares)
    
board_colors = {
     DL: ('lightblue',  66, 'DL'),
     TL: ('lightgreen', 66, 'TL'),
     DW: ('lightcoral', 66, 'DW'),
     TW: ('orange',     66, 'TW'),
     SL: ('whitesmoke', 66, ''),
     STAR: ('violet',  100, '&#10029;')}

def square_html(sq) -> str:
    "An HTML representation of a square."
    color, size, text = board_colors.get(sq, ('gold', 120, sq))
    if text.isupper(): 
        text = '<b>{}</b><sup style="font-size: 60%">{}</sup>'.format(text, POINTS.get(text, ''))
    style = "background-color:{}; font-size:{}%; width:25px; height:25px; text-align:center; padding:0px"
    return ('<td style="' + style + '">{}').format(color, size, text)

POINTS = defaultdict(int, 
         A=1, B=3, C=3, D=2,  E=1, F=4, G=2, H=4, I=1, J=8, K=5, L=1, M=3, 
         N=1, O=1, P=3, Q=10, R=1, S=1, T=1, U=1, V=4, W=4, X=8, Y=4, Z=10)
In [61]:
WWF
Out[61]:
[Rendered HTML table: the empty 15×15 Words with Friends board, with bonus squares labeled DL, TL, DW, and TW, and a star on the center square.]

Plays

A Play describes the placement of tiles on the board. We will implement Play as a named tuple of four components:

  • start: the index number of the square that holds the first letter in the word.
  • dir: the direction, with 1 indicating ACROSS and board.down (normally, 17) indicating DOWN.
  • letters: the letters of the word, in order, as a str. Blanks are lowercase. Some letters are from the rack; some may have been on the board.
  • rack: the letters that would remain in the player's rack after making this play. Not strictly necessary as part of the play, but useful information.

The function make_play returns a new board with the play made on it. It does not do any checking to see if the play follows the rules.

In [18]:
Play = namedtuple('Play', 'start, dir, letters, rack')

def make_play(board, play) -> Board:
    "Make the play on a copy of board and return the copy."
    copy = Board(board)
    end = play.start + len(play.letters) * play.dir
    copy[play.start:end:play.dir] = play.letters
    return copy
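
The key step in make_play is the extended slice assignment, which writes one letter onto every dir-th square. Here is a minimal sketch on a made-up 5×5 bordered board (a plain list, not the Board class), so the step for moving down is assumed to be 5 rather than 17.

```python
# A 5×5 board (3×3 playable squares plus a border), flattened to a list.
tiny = list('#####'
            '#...#'
            '#...#'
            '#...#'
            '#####')
down = 5                      # step to the square directly below

start, word = 7, 'AT'         # square 7 is row 1, column 2
end = start + len(word) * down
tiny[start:end:down] = word   # extended slice assignment: slice and word lengths must match

for r in range(5):
    print(''.join(tiny[r * down:(r + 1) * down]))
```

Unlike ordinary slice assignment, an extended slice raises a ValueError if the number of letters differs from the number of squares selected, so a play can never silently change the length of the board.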

Example Board

Let's test out what we've done so far. I'll put some words on a board, which I will call board:

In [19]:
DOWN = WWF.down
plays = {Play(145, DOWN,   'ENTER', ''),
         Play(144, ACROSS, 'BE', ''),
         Play(138, DOWN,   'GAVE', ''),
         Play(158, DOWN,   'MUSES', ''),
         Play(172, ACROSS, 'VIRULeNT', ''),
         Play(213, ACROSS, 'RED', ''),
         Play(198, ACROSS, 'LYTHE', ''),
         Play(147, DOWN,   'CHILDREN', ''),
         Play(164, ACROSS, 'HEARD', ''),
         Play(117, DOWN,   'BRIDLES', ''),
         Play(131, ACROSS, 'TOUR', '')}

board = Board(WWF)
for play in plays:
    board = make_play(board, play)
board
Out[19]:
[Rendered HTML table: the WWF board with the eleven example words placed on it. Each tile shows its letter and point value; the blank standing for E in VIRULeNT appears as a lowercase e with no point value.]

Strategy for Finding Legal Plays

This is our strategy for finding all possible legal plays on a board:

  • Find all anchor squares on the board. An anchor square is a square that is adjacent to a letter on the board—every legal move must place a letter on an anchor square. (On the game's first play, there are no letters on the board, and the STAR square in the center counts as the only anchor square.)
  • Using just the letters in the rack, find all prefixes of words in the dictionary. For example, with the rack ABC, we find that B, BA, and BAC are all prefixes of the word BACK (and the rack contains other prefixes of other words as well).
  • For each anchor square and for both directions (across and down):
    • Try each prefix before the anchor (that is, to the left or above the anchor). Don't allow a prefix to extend to another anchor or off the board. That means we won't have to worry about cross words for the prefix. If there are already letters on the board before the anchor point, use them as the prefix rather than prefixes from the rack.
    • Starting at the anchor, march forward one square at a time, trying to fill empty squares with each possible letter from the rack that forms a valid word prefix. If the march forward hits letters that are already on the board, make sure they form a valid prefix too. Also check that any cross words are valid words. When we make a complete word (with an empty or OFF square ahead), yield the play that made the word.

So, each legal play will have a prefix of zero or more letters, followed by one letter from the rack covering an anchor square, followed by zero or more additional letters, which can be from the rack or already on the board.

Prefixes

Here we define the set of all prefixes of all words in the dictionary:

In [20]:
def dict_prefixes(dictionary) -> set:
    "The set of all prefixes of each word in a dictionary."
    return {word[:i] for word in dictionary for i in range(len(word))}

PREFIXES = dict_prefixes(DICTIONARY)
In [21]:
len(PREFIXES)
Out[21]:
276374

That's too many prefixes to look at; let's try a smaller example below. Note that the empty string is a prefix, and we include HELP because it is a prefix of HELPER, but we don't include HELPER, because there is nothing we can add to it to make a word in this dictionary:

In [22]:
dict_prefixes({'HELLO', 'HELP', 'HELPER'})
Out[22]:
{'', 'H', 'HE', 'HEL', 'HELL', 'HELP', 'HELPE'}

The function rack_prefixes gives the set of prefixes that can be made just from the letters in the rack. Most of the work is done by extend_prefixes, which accumulates a set of prefixes into results:

In [23]:
def rack_prefixes(rack) -> set: 
    "All word prefixes that can be made by the rack."
    return extend_prefixes('', rack, set())

def extend_prefixes(prefix, rack, results) -> set:
    if prefix.upper() in PREFIXES:
        results.add(prefix)
        for L in letters(rack):
            extend_prefixes(prefix+L, remove(L, rack), results)
    return results
In [24]:
rack = 'ABC'
rack_prefixes(rack)
Out[24]:
{'', 'A', 'AB', 'AC', 'B', 'BA', 'BAC', 'C', 'CA', 'CAB'}

The number of prefixes in a rack is usually on the order of a hundred, unless there is a blank in the rack:

In [25]:
len(rack_prefixes('LETTERS'))
Out[25]:
155
In [26]:
len(rack_prefixes('LETTER_'))
Out[26]:
1590

Anchor Squares

An anchor square is either the star in the middle of the board, or an empty square that is adjacent to a letter:

In [27]:
def is_anchor(board, s) -> bool:
    "Is this square next to a letter already on the board? (Or is it a '*')?"
    return (board[s] == STAR or
            board[s] in EMPTY and any(board[s + d].isalpha() for d in board.directions))

def all_anchors(board) -> list:
    "A list of all anchor squares on the board."
    return [s for s in range(len(board)) if is_anchor(board, s)]
In [28]:
all_anchors(WWF)
Out[28]:
[144]
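
As a quick check of the "adjacent" rule, here is a sketch on a made-up 5×5 bordered board with a single tile on it. It assumes a variant of is_anchor that takes the neighbor offsets as an explicit argument (the real one reads them from board.directions), so that a plain list can stand in for a Board.

```python
STAR, EMPTY, OFF = '*', '.:;*-=', '#'

def is_anchor(board, s, directions) -> bool:
    "Same test as above, but with the neighbor offsets passed in explicitly."
    return (board[s] == STAR or
            board[s] in EMPTY and any(board[s + d].isalpha() for d in directions))

# 5×5 bordered board with a single tile, 'A', on the center square (index 12).
tiny = list('#####'
            '#...#'
            '#.A.#'
            '#...#'
            '#####')
directions = (1, 5, -1, -5)  # across, down, and their reverses on a width-5 board

print([s for s in range(len(tiny)) if is_anchor(tiny, s, directions)])  # [7, 11, 13, 17]
```

Only the four squares that share an edge with the tile are anchors; the diagonal neighbors (6, 8, 16, 18) are not, and border squares are never examined for neighbors because they fail the EMPTY test first.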

Plays on Example Board

Let's work through the process of finding plays on the example board. First, we'll find all the anchors:

In [29]:
anchors = all_anchors(board)
len(anchors)
Out[29]:
53

To visualize these anchors, we'll make each one be a star, on a copy of board:

In [62]:
board2 = Board(board)
for a in anchors:
   board2[a] = STAR
    
board2
Out[62]:
[Rendered HTML table: the example board with each of the 53 anchor squares displayed as a star.]

Now we'll define a rack, and find all the prefixes for the rack:

In [31]:
rack = 'ABCHKNQ'

prefixes = rack_prefixes(rack)
len(prefixes)
Out[31]:
88
In [32]:
' '.join(prefixes)
Out[32]:
' ANKH ACN CANK BHA ACK HA HACKN A HACKB AN ABN KAC AQ AK AC BH CAN BAK BAC HANC CN N NAB H ANC C ANK CHAN CHAK KH KHAN NAK HAK BA QAN HANK CAH Q BACK CAB KHA BACH HAC BAH BANK CAK CA NACH AHC KAN NAC KAH KACH AH HACK BANKC CHAQ BANQ ANCH BHAK KANB BAN KAB ACKN NA K AB BACKH ANH KN KB HAB KNA KNAC B KA ABH CHA CHAB HAN ACQ CH BANC KBA ACH QA BHAN'

We won't go through all the anchor/prefix combinations; we'll just pick one: the anchor above the M in MUSES:

In [63]:
board3 = Board(board)
anchor = 141
board3[anchor] = STAR
board3
Out[63]:
[Rendered HTML table: the example board with a star on square 141, directly above the M in MUSES.]

There's only room for prefixes of length 0 or 1, because anything longer than that would hit the anchor to the right of the G in GAVE; to avoid duplication of effort, we only allow words to run into other anchors on the right, not the left. Let's try the 1-letter prefix B first:

In [64]:
board3[140] = 'B'
board3
Out[64]:
[Rendered HTML table: the example board with B placed on square 140, immediately to the left of the anchor square 141.]

Now we can start to march forward. On the anchor square we can place any letter from the rack that makes a valid prefix, and that also turns .MUSES into a valid word. There's only one such letter, A:

In [35]:
board3[141] = 'A'
assert 'BA' in PREFIXES and is_word('A' + 'MUSES')
board3
Out[35]:
[Rendered HTML table: the example board with BA placed on squares 140 and 141; the A sits directly above MUSES, forming AMUSES in the down direction.]

We can continue marching forward, trying letters from the rack that form valid prefixes. Let's try the combination CK:

In [36]:
board3[142:144] = 'CK'
assert 'BACKBE' in PREFIXES
board3
Out[36]:
[Rendered HTML table: the example board with BACK placed on squares 140 to 143, running directly into the BE already on squares 144 and 145.]

We've spelled the word BACK, but we can't count it as a legal play, because we've hit two adjacent letters, BE, that are already on the board. We check that BACKBE forms a valid prefix, and continue to the next empty square, where we can choose an N:

In [37]:
board3[146] = 'N'
assert 'BACKBENC' in PREFIXES
board3
Out[37]:
[Rendered HTML table: the example board with N added on square 146, so that row 8 now reads BACKBENC through the C of CHILDREN on square 147.]

We continue to the next square (a double word square), and place an H, which completes a word, BACKBENCH, and simultaneously makes a cross word, THE:

In [38]:
board3[148] = 'H'
assert is_word('BACKBENCH') and is_word('THE')
board3
Out[38]:
[Rendered HTML table: the example board with H added on square 148, completing BACKBENCH across row 8 and the cross word THE in the down direction.]

We would record this play, and backtrack to consider other letters for this and other prefix/anchor combinations. Now let's code this up!

Code for Finding All Plays

The function all_plays generates all legal plays by first trying all prefix plays, and then trying to extend each one, one letter at a time. (Note that it also generates the empty play, because a player always has the option of passing.)

In [39]:
def all_plays(board, rack):
    """Generate all plays that can be played on board with this rack.
    Try placing every possible prefix before every anchor point; 
    then extend one letter at a time, looking for valid plays."""
    anchors  = all_anchors(board)
    prefixes = rack_prefixes(rack)
    yield Play(0, 1, '', rack) # The empty play (no letters, no points)
    for anchor in anchors:
        for dir in (ACROSS, board.down):
            for play in prefix_plays(prefixes, board, anchor, dir, rack):
                yield from extend_play(board, play)

Note the syntax yield from, new in Python 3.3: "yield from c" is the same as "for x in c: yield x".
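
A tiny illustration of that equivalence, using two made-up generator functions that are not part of the program:

```python
def flatten_loop(groups):
    "Yield every item of every group, using an explicit inner loop."
    for group in groups:
        for x in group:
            yield x

def flatten_yield_from(groups):
    "The same generator, written with yield from."
    for group in groups:
        yield from group

groups = ['AB', 'C', '']
print(list(flatten_loop(groups)))        # ['A', 'B', 'C']
print(list(flatten_yield_from(groups)))  # ['A', 'B', 'C']
```

In all_plays, each call to extend_play is itself a generator, so yield from passes along every play it finds without building intermediate lists.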

Now for the function prefix_plays, which returns a list of all partial plays consisting of a prefix placed before the anchor. Note that these are not legal plays; they are partial plays, some of which will end up being extended into legal plays.

There are two cases: if there are letters on the board immediately before the anchor, then those letters form the only allowable prefix. If not, we can use any prefix from the rack up to maxlen, which is the number of empty squares that do not run into another anchor, nor off the board.

In [40]:
def prefix_plays(prefixes, board, anchor, dir, rack) -> list:
    "Return all Plays of a prefix to the left/above anchor."
    if board[anchor-dir].isalpha(): # Prefix already on the board; only 1 prefix
        start = scan_letters(board, anchor, -dir)
        return [Play(start, dir, cat(board[start:anchor:dir]), rack)]
    else: # Prefixes from rack fit in space before anchor
        maxlen = (anchor - scan_to_anchor(board, anchor, -dir)) // dir
        return [Play(anchor - len(prefix) * dir, dir, prefix, remove(prefix, rack))
                for prefix in prefixes if len(prefix) <= maxlen]

Now extend_play takes a partial play, determines the square, s, that is one square past the end of the play, and tries all possible letters there. If adding a letter forms a valid prefix (and also does not form an invalid cross word), then we continue on (by calling extend_play recursively). If adding the letter forms a valid word, we yield the play.

In [41]:
def extend_play(board, play):
    "Explore all ways of adding to end of play; return ones that form full words."
    s = play.start + play.dir * len(play.letters)
    if board[s] == OFF: return
    cword = crossword(board, s, play.dir)
    # Keep a blank that is already on the board lowercase, so it still scores zero in this play
    possible_letters = board[s] if board[s].isalpha() else letters(play.rack)
    for L in possible_letters:
        prefix2 = play.letters + L
        if prefix2.upper() in PREFIXES and valid_crossword(cword, L):
            rack2 = play.rack if board[s].isalpha() else remove(L, play.rack)
            play2 = Play(play.start, play.dir, prefix2, rack2)
            if is_word(prefix2) and not board[s + play.dir].isalpha():
                yield play2
            yield from extend_play(board, play2)

def scan_letters(board, s, dir) -> Square:
    "Return the last square number going from s in dir that is a letter."
    while board[s + dir].isalpha():
        s += dir
    return s

def scan_to_anchor(board, s, dir) -> Square:
    "Return the last square number going from s in dir that is not an anchor nor off board."
    while board[s + dir] != OFF and not is_anchor(board, s + dir):
        s += dir
    return s

Crosswords

If adding a letter in, say, the ACROSS direction also adds on to a word in the DOWN direction, then we need to make sure that this cross word is also valid. The function crossword finds the cross word at square s and returns it with a '.' indicating the empty square where the new letter will be placed, so we would get '.MUSES' and 'T.E' for the two crosswords in the 'BACKBENCH' play.

In [42]:
def crossword(board, s, dir) -> str:
    """The word that intersects s in the other direction from dir.
    Use '.' for the one square that is missing a letter."""
    def canonical(L): return L if L.isalpha() else '.'
    d = other(dir, board)
    start = scan_letters(board, s, -d)
    end = scan_letters(board, s, d)
    return cat(canonical(board[s]) for s in range(start, end+d, d))

def valid_crossword(cword, L) -> bool:
    "Is placing letter L valid (with respect to the crossword)?"
    return len(cword) == 1 or cword.replace('.', L).upper() in DICTIONARY

def other(dir, board) -> Direction:
    "The other direction (across/down) on the board."
    return board.down if dir == ACROSS else ACROSS
In [43]:
crossword(board, 141, ACROSS)
Out[43]:
'.MUSES'
In [44]:
crossword(board, 148, ACROSS)
Out[44]:
'T.E'

The function valid_crossword checks if replacing the empty square with a specific letter will form a valid word:

In [45]:
valid_crossword('.MUSES', 'A')
Out[45]:
True
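
To show how this check filters the rack, here is a sketch that runs every letter of the example rack through valid_crossword. It assumes a four-word stand-in dictionary (TINY_DICTIONARY is hypothetical, passed in as an extra parameter) in place of the full DICTIONARY, so the results are illustrative only.

```python
TINY_DICTIONARY = {'AMUSES', 'MUSES', 'THE', 'TEE'}  # hypothetical stand-in for DICTIONARY

def valid_crossword(cword, L, dictionary=TINY_DICTIONARY) -> bool:
    "Same test as above, with the dictionary passed in so the sketch is self-contained."
    return len(cword) == 1 or cword.replace('.', L).upper() in dictionary

rack = 'ABCHKNQ'
print([L for L in rack if valid_crossword('.MUSES', L)])  # ['A']
print([L for L in rack if valid_crossword('T.E', L)])     # ['H']
print(valid_crossword('.', 'Q'))       # True: no neighboring letters, so no cross word is formed
print(valid_crossword('.MUSES', 'a'))  # True: a blank standing for A is uppercased before lookup
```

The len(cword) == 1 case matters because most squares have no tiles above or below them; there the "cross word" is just the empty square itself, and any letter is acceptable.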

We can now see all the prefix plays for the anchor at 141 (just above MUSES):

In [46]:
prefix_plays(rack_prefixes(rack), board, 141, 1, rack)
Out[46]:
[Play(start=141, dir=1, letters='', rack='ABCHKNQ'),
 Play(start=140, dir=1, letters='A', rack='BCHKNQ'),
 Play(start=140, dir=1, letters='N', rack='ABCHKQ'),
 Play(start=140, dir=1, letters='H', rack='ABCKNQ'),
 Play(start=140, dir=1, letters='C', rack='ABHKNQ'),
 Play(start=140, dir=1, letters='Q', rack='ABCHKN'),
 Play(start=140, dir=1, letters='K', rack='ABCHNQ'),
 Play(start=140, dir=1, letters='B', rack='ACHKNQ')]

And we can see all the ways to extend the play of 'B' there:

In [47]:
set(extend_play(board, Play(start=140, dir=1, letters='B', rack='ACHKNQ')))
Out[47]:
{Play(start=140, dir=1, letters='BA', rack='CHKNQ'),
 Play(start=140, dir=1, letters='BACKBENCH', rack='Q'),
 Play(start=140, dir=1, letters='BAH', rack='CKNQ'),
 Play(start=140, dir=1, letters='BAN', rack='CHKQ')}

Scoring

Now we'll show how to count up the points made by a play. The score is the sum of the word score for the play, plus a bingo score if all seven letters are used, plus the sum of the word scores for any cross words. The word score is the sum of the letter scores (where each letter score may be doubled or tripled by a bonus square when the letter is first played on the square), all multiplied by any word bonus(es) encountered by the newly-placed letters.

In [48]:
def score(board, play) -> int:
    "The number of points scored by making this play on the board."
    return (word_score(board, play) 
            + bingo(board, play) 
            + sum(word_score(board, cplay) 
                  for cplay in cross_plays(board, play)))

def word_score(board, play) -> int:
    "Points for a single word, counting word- and letter-bonuses."
    total, word_bonus = 0, 1
    for (s, L) in enumerate_play(play):
        sq = board[s]
        word_bonus *= (3 if sq == TW else 2 if sq == DW else 1)
        total += POINTS[L] * (3 if sq == TL else 2 if sq == DL else 1)
    return word_bonus * total

def bingo(board, play) -> int:
    "A bonus for using 7 letters from the rack."
    return BINGO if (play.rack == '' and letters_played(board, play) == 7) else 0

BINGO = 35

Here are the various helper functions:

In [49]:
def letters_played(board, play) -> int:
    "The number of letters played from the rack."
    return sum(board[s] in EMPTY for (s, L) in enumerate_play(play))
    
def enumerate_play(play) -> list:
    "List (square_number, letter) pairs for each tile in the play."
    return [(play.start + i * play.dir, L) 
            for (i, L) in enumerate(play.letters)]
            
def cross_plays(board, play):
    "Generate all plays for words that cross this play."
    cross = other(play.dir, board)
    for (s, L) in enumerate_play(play):
        if board[s] in EMPTY and (board[s-cross].isalpha() or board[s+cross].isalpha()):
            start, end = scan_letters(board, s, -cross), scan_letters(board, s, cross)
            before, after = cat(board[start:s:cross]), cat(board[s+cross:end+cross:cross])
            yield Play(start, cross, before + L + after, play.rack)

What should the BACKBENCH play score? The word covers two double-word bonuses, but no letter bonuses. The sum of the letter point values is 3+1+3+5+3+1+1+3+4 = 24, and 24×2×2 = 96. The cross word AMUSES scores 8, and THE is on a double word bonus, so it scores 6×2 = 12. There is one letter remaining in the rack, so no bingo, just a total score of 96 + 8 + 12 = 116.

In [50]:
score(board, Play(start=140, dir=1, letters='BACKBENCH', rack='Q'))
Out[50]:
116
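
The same arithmetic can be checked by hand, independently of the board. This sketch copies only the letter values it needs from POINTS and applies the word bonuses explicitly; face_value is a hypothetical helper, not part of the program.

```python
# Letter values copied from POINTS above (only the letters needed here).
POINTS = dict(A=1, B=3, C=3, E=1, H=4, K=5, M=3, N=1, S=1, T=1, U=1)

def face_value(word: str) -> int:
    "Hypothetical helper: sum of tile values, ignoring all bonus squares."
    return sum(POINTS[L] for L in word)

main_word = face_value('BACKBENCH') * 2 * 2  # the new B and the new H each cover a double-word square
amuses    = face_value('AMUSES')             # the new A lands on a plain square
the       = face_value('THE') * 2            # the new H covers a double-word square
bingo     = 0                                # Q is left in the rack, so no bingo

print(main_word, amuses, the, main_word + amuses + the + bingo)  # 96 8 12 116
```

Note that the double-word square under the H counts twice: once for BACKBENCH and once for THE, because both words include the newly placed tile.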

We can find the highest scoring play by enumerating all plays and taking the one with the maximum score:

In [51]:
def highest_scoring_play(board, rack) -> Play: 
    "Return the Play that gives the most points."
    return max(all_plays(board, rack), key=lambda play: score(board, play))
In [52]:
highest_scoring_play(board, rack)
Out[52]:
Play(start=140, dir=1, letters='BACKBENCH', rack='Q')
In [53]:
make_play(board, Play(start=140, dir=1, letters='BACKBENCH', rack='Q'))
Out[53]:
[Rendered HTML table: the example board after the BACKBENCH play, with BACKBENCH running across row 8 from square 140 to square 148.]

Playing a Game

Now let's play a complete game. We start with a bag of tiles:

In [54]:
BAG = 'AAAAAAAAABBCCDDDDEEEEEEEEEEEEFFGGGHHIIIIIIIIIJKLLLLMMNNNNNNOOOOOOOOPPQRRRRRRSSSSTTTTTTUUUUVVWWXYYZ__'
len(BAG)
Out[54]:
100

Then the function play_game will take a list of player strategies as input, and play those strategies against each other over the course of a game. A strategy is a function that takes a board and a rack as input and returns a play. For example, highest_scoring_play is a strategy. If the optional argument verbose is true, then the board is displayed after each play.
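
Because a strategy is just a function of (board, rack), other strategies are easy to write. As a sketch, here is one that prefers the longest word rather than the highest score. To keep the example self-contained, the play generator is passed in as a parameter and a stub generator (fake_all_plays, entirely made up) stands in for all_plays; make_longest_word_strategy is likewise a hypothetical name.

```python
from collections import namedtuple

Play = namedtuple('Play', 'start, dir, letters, rack')

def make_longest_word_strategy(all_plays):
    "Hypothetical: build a strategy that picks the play with the most letters in its word."
    def strategy(board, rack) -> Play:
        return max(all_plays(board, rack), key=lambda play: len(play.letters))
    return strategy

def fake_all_plays(board, rack):
    "Stub standing in for all_plays: yields the empty play and two made-up plays."
    yield Play(0, 1, '', rack)
    yield Play(144, 1, 'AB', 'C')
    yield Play(144, 1, 'CAB', '')

longest = make_longest_word_strategy(fake_all_plays)
print(longest(None, 'ABC'))  # Play(start=144, dir=1, letters='CAB', rack='')
```

Inside the notebook, the equivalent strategy could simply call all_plays directly and then be passed to play_game in the strategies list.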

In [55]:
def play_game(strategies=[highest_scoring_play, highest_scoring_play], verbose=True) -> list:
    "A number of players play a game; return a list of their scores."
    board = Board(WWF)
    bag = list(BAG)
    random.shuffle(bag)
    scores = [0 for _ in strategies]
    racks = [replenish('', bag) for _ in strategies]
    while True:
        old_board = board
        for (p, strategy) in enumerate(strategies):
            board = make_one_play(board, p, strategy, scores, racks, bag, verbose)
            if racks[p] == '':
                # Player p has gone out; game over
                return subtract_remaining_tiles(racks, scores, p)
        if old_board == board:
            # No player has a move; game over
            return scores

def make_one_play(board, p, strategy, scores, racks, bag, verbose) -> Board:
    """One player, player p, chooses a move according to the strategy.
    We make the move, replenish the rack, update scores, and return the new Board."""
    rack = racks[p]                        # The rack before the play (reported when verbose)
    play = strategy(board, rack)
    racks[p] = replenish(play.rack, bag)   # play.rack is what is left of the rack after the play
    points = score(board, play)            # Score against the board as it was before the play
    scores[p] += points
    board = make_play(board, play)
    if verbose:
        display(HTML('Player {} with rack {} makes {}<br>for {} points; draws: {}; scores: {}'
                     .format(p, rack, play, points, racks[p], scores)),
                board)
    return board

def subtract_remaining_tiles(racks, scores, p) -> list:
    "Player p has gone out: deduct the value of each player's leftover tiles and award it to player p."
    for i in range(len(racks)):
        points = sum(POINTS[L] for L in racks[i])
        scores[i] -= points
        scores[p] += points
    return scores

def replenish(rack, bag) -> str:
    "Fill rack with 7 letters (as long as there are letters left in the bag)."
    while len(rack) < 7 and bag:
        rack += bag.pop()
    return rack
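
To make the end-of-game adjustment concrete, here is a small worked example. It repeats subtract_remaining_tiles from above so that it runs on its own; the POINTS values are the ones shown on the tiles in the board displays (assumed here), and the racks and scores are invented for illustration.

```python
# Tile values as shown on the board displays (assumed; the notebook defines its own POINTS).
POINTS = dict(C=3, F=4, H=4, I=1)

def subtract_remaining_tiles(racks, scores, p) -> list:
    "Player p has gone out: deduct each player's leftover tile values and award them to p."
    for i in range(len(racks)):
        points = sum(POINTS[L] for L in racks[i])
        scores[i] -= points
        scores[p] += points
    return scores

# Hypothetical ending: player 1 goes out while player 0 still holds F, C, H, I (4+3+4+1 = 12).
final = subtract_remaining_tiles(['FCHI', ''], [100, 90], 1)
print(final)  # [88, 102]
```

Player p's own rack is empty, so the loop subtracts and adds 0 for that player, and no special case is needed.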
In [56]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999;
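
Because a strategy is just a function from (board, rack) to a play, alternatives to highest_scoring_play are easy to write. Below is a minimal sketch of a strategy that prefers whichever play leaves the fewest tiles on the rack. The names make_strategy, toy_all_plays and fewest_tiles_left are hypothetical, and toy_all_plays is a stand-in for the notebook's all_plays so that the example runs on its own; the Play tuple mirrors the fields printed in the output.

```python
from collections import namedtuple

# Same fields as the Play objects printed in this notebook.
Play = namedtuple('Play', 'start dir letters rack')

def make_strategy(all_plays, key):
    "Build a strategy: a function (board, rack) -> Play that maximizes key(board, play)."
    def strategy(board, rack):
        return max(all_plays(board, rack), key=lambda play: key(board, play))
    return strategy

def toy_all_plays(board, rack):
    "Hypothetical stand-in for all_plays: a fixed list of candidate plays."
    return [Play(144, 1, 'DOG',   'YVUE'),
            Play(144, 1, 'DOGEY', 'VU'),
            Play(144, 1, 'GO',    'YVUED')]

def fewest_tiles_left(board, play) -> int:
    "Higher is better, so negate the number of tiles remaining on the rack."
    return -len(play.rack)

dump_tiles = make_strategy(toy_all_plays, fewest_tiles_left)
print(dump_tiles(None, 'YVUGOED'))
```

With the real functions, make_strategy(all_plays, score) would behave like highest_scoring_play, and any such strategy could be passed to play_game in the strategies list.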
In [57]:
play_game()
Player 0 with rack YVUGOED makes Play(start=144, dir=1, letters='DOGEY', rack='VU')
for 20 points; draws: VUONPAA; scores: [20, 0]
[Board display after this play]
Player 1 with rack AESOEUD makes Play(start=162, dir=1, letters='SODA', rack='EEU')
for 22 points; draws: EEULGAO; scores: [20, 22]
[Board display after this play]
Player 0 with rack VUONPAA makes Play(start=128, dir=1, letters='NAP', rack='VUOA')
for 24 points; draws: VUOARRN; scores: [44, 22]
[Board display after this play]
Player 1 with rack EEULGAO makes Play(start=178, dir=1, letters='LEG', rack='EUAO')
for 22 points; draws: EUAOIIN; scores: [44, 44]
[Board display after this play]
Player 0 with rack VUOARRN makes Play(start=113, dir=1, letters='ARVO', rack='URN')
for 21 points; draws: URNVLER; scores: [65, 44]
[Board display after this play]
Player 1 with rack EUAOIIN makes Play(start=178, dir=17, letters='LIANE', rack='UOI')
for 10 points; draws: UOINEKA; scores: [65, 54]
[Board display after this play]
Player 0 with rack URNVLER makes Play(start=241, dir=1, letters='VENULE', rack='RR')
for 26 points; draws: RRCIENT; scores: [91, 54]
[Board display after this play]
Player 1 with rack UOINEKA makes Play(start=208, dir=17, letters='KNEE', rack='UOIA')
for 54 points; draws: UOIAPOG; scores: [91, 108]
[Board display after this play]
Player 0 with rack RRCIENT makes Play(start=48, dir=17, letters='TRICORNE', rack='')
for 46 points; draws: O_IAUDS; scores: [137, 108]
[Board display after this play]
Player 1 with rack UOIAPOG makes Play(start=205, dir=1, letters='PAIK', rack='UOOG')
for 30 points; draws: UOOGLJR; scores: [137, 138]
[Board display after this play]
Player 0 with rack O_IAUDS makes Play(start=58, dir=1, letters='DInOSAUR', rack='')
for 44 points; draws: UIFATL_; scores: [181, 138]
[Board display after this play]
Player 1 with rack UOOGLJR makes Play(start=44, dir=1, letters='JO', rack='UOGLR')
for 38 points; draws: UOGLRRE; scores: [181, 176]
[Board display after this play]
Player 0 with rack UIFATL_ makes Play(start=168, dir=17, letters='FIsTULA', rack='')
for 99 points; draws: IDFHWMM; scores: [280, 176]
[Board display after this play]
Player 1 with rack UOGLRRE makes Play(start=29, dir=17, letters='EGAL', rack='UORR')
for 32 points; draws: UORRQER; scores: [280, 208]
[Board display after this play]
Player 0 with rack IDFHWMM makes Play(start=39, dir=1, letters='WHIM', rack='DFM')
for 31 points; draws: DFMISCE; scores: [311, 208]
[Board display after this play]
Player 1 with rack UORRQER makes Play(start=214, dir=1, letters='ROQUET', rack='RR')
for 35 points; draws: RRTAITE; scores: [311, 243]
[Board display after this play]
Player 0 with rack DFMISCE makes Play(start=72, dir=1, letters='DEISM', rack='FC')
for 42 points; draws: FCEXBWH; scores: [353, 243]
[Board display after this play]
Player 1 with rack RRTAITE makes Play(start=74, dir=17, letters='IRRITATE', rack='')
for 47 points; draws: NYOEOTS; scores: [353, 290]
[Board display after this play]
Player 0 with rack FCEXBWH makes Play(start=222, dir=1, letters='EX', rack='FCBWH')
for 38 points; draws: FCBWHIA; scores: [391, 290]
[Board display after this play]
Player 1 with rack NYOEOTS makes Play(start=263, dir=1, letters='STONY', rack='EO')
for 36 points; draws: EOTZB; scores: [391, 326]
[Board display after this play]
Player 0 with rack FCBWHIA makes Play(start=248, dir=1, letters='WAB', rack='FCHI')
for 35 points; draws: FCHI; scores: [426, 326]
[Board display after this play]
Player 1 with rack EOTZB makes Play(start=124, dir=1, letters='ZIT', rack='EOB')
for 22 points; draws: EOB; scores: [426, 348]
TWTLTLE1
DLW4H4I1M3J8O1G2DLT1
DLDLD2I1nO1S1A1U1R1
TWD2E1I1S1M3L1I1TW