#!/usr/bin/env python # coding: utf-8 #

Peter Norvig
29 December 2015

#
# # Refactoring a Crossword Game Program
#
# In my [CS 212 class](https://www.udacity.com/course/design-of-computer-programs--cs212) on Udacity, the [most complex lesson](https://www.udacity.com/course/viewer#!/c-cs212/l-48634860) involved a crossword game program (for games such as Scrabble® and Words with Friends®). The program was developed *incrementally*. First I asked "what words can be made with a rack of seven letters?", then I asked "how can you place words onto a single row?", and finally, with the help of the students, I developed a program to find the highest scoring play anywhere on the board. This approach made for a good sequence of exercises, each building on the previous one. But the code ended up being overly complicated (it accumulated [technical debt](https://en.wikipedia.org/wiki/Technical_debt)) because it kept around ideas from each iteration.
#
# In this notebook I will refactor the program to pay off the debt.
#
# # Vocabulary
#
# Our program uses these concepts:
#
# * **Dictionary**: A set of all legal words.
# * **Word**: a string of letters. Words in the dictionary are all uppercase.
# * **Tile**: a letter (or a blank) that can be played on the board to form words.
# * **Blank**: a tile with no letter on it; the player who places it on the board gets to choose which letter it will represent.
# * **Rack**: a collection of up to seven tiles that a player may use to make words.
# * **Board**: a grid of squares onto which players play tiles to make words.
# * **Square**: a location on the board; a square can hold one tile. (The variable `s` will stand for a square number, and `sq` for the contents of a square.)
# * **Bonus**: some squares give you bonus scores: double or triple letter or word scores.
# * **Play**: a play consists of placing some tiles on the board to form a continuous string of letters in one direction (across or down), such that only valid words are formed, and such that one of the letters is placed on an anchor square.
# * **Anchor square**: Every play must place a letter on an anchor square: either the center start square or a square that is adjacent to a tile previously played on the board. # * **Direction:** Every play must be in either the `ACROSS` or `DOWN` direction. (The variable `dir` stands for a direction.) # * **Cross word**: a word formed in the other direction from a play. For example, a play forms a word in the across direction, and in doing so, places a letter that extends a word in the down direction. This new extended *cross word* must be in the dictionary. # * **Score**: the points awarded for a play, consisting of the sum of the word scores for each word made (the main word and possibly any cross words), plus a bingo bonus if all seven letters are used. # * **Word score**: Each word scores the sum of the letter scores for each tile (either placed by the player or already on the board but part of the word) times the word bonus score. The word bonus score starts at 1, and is multiplied by 2 for each double word square and 3 for each triple word square covered by a tile on this play. # * **Letter score**: The letter score is the value on the letter tile (for example, 1 for `A` and 10 for `Q`) times the letter bonus score. The letter bonus is 2 when a tile is first placed on a double letter square (or the center star) and 3 when first placed on a triple letter square; it is 1 for a tile already on the board, or for a new tile played on a non-letter-bonus square. The letter score for a blank tile is always zero. # * **Bingo**: a bonus gained by using all seven tiles in one play. In Words with Friends® the bingo bonus is 35; in Scrabble® it is 50. # * **Game**: players take turns making plays until one player has no more tiles. After making a play, the player's # rack is replenished with tiles until the player has 7 tiles or until the bag of tiles is empty. # * **Prefix**: a string of zero or more letters that starts some word in the dictionary. 
Not a concept that has to do with the *rules* of the game, but it will be important in our *algorithm* that finds valid plays.
#
# This notebook uses these imports:

# In[1]:

from __future__ import division, print_function
from collections import defaultdict, namedtuple
from IPython.display import HTML, display
import random

# # Dictionary and Words
#
# We will represent the dictionary as a set of words. A word is an uppercase string of letters, like `'WORD'`. There are several standard dictionaries used by different communities of players; we will use the ENABLE dictionary. We can cache a local copy with this shell command:

# In[2]:

get_ipython().system(' [ -e enable1.txt ] || curl -O http://norvig.com/ngrams/enable1.txt')

# Now we can define a word and load the dictionary:

# In[3]:

def Word(w) -> str: return w.strip().upper()

DICTIONARY = {Word(w) for w in open('enable1.txt')}

def is_word(word) -> bool:
    "Is this a legal word in the dictionary?"
    return word.upper() in DICTIONARY

# In[4]:

len(DICTIONARY)

# In[5]:

list(DICTIONARY)[:10]

# In[6]:

'WORD' in DICTIONARY

# # Tiles, Blanks, and Racks
#
# We'll represent a tile as a one-character string, like `'W'`. We'll represent a rack as a string of tiles, usually of length 7, such as `'EELRTTS'`. (I also considered a `collections.Counter` to represent a rack, but felt that `str` was simpler, and with the rack size limited to 7, efficiency was not a major issue.)
#
# The blank tile causes some complications. We'll represent a blank in a player's rack as the underscore character, `'_'`. But once the blank is played on the board, it must be used as if it were a specific letter. However, it doesn't score the points of the letter. I chose to use the lowercase version of the letter to represent this. That way, we know what letter the blank is standing for, and we can distinguish between scoring and non-scoring tiles.
For example, `'EELRTT_'` is a rack that contains a blank; and `'LETTERs'` is a word played on the board that uses the blank to stand for the letter `S`. # # We'll define `letters` to give all the distinct letters that can be made by a rack, and `remove` to remove letters from a rack (after they have been played). # In[7]: BLANK = '_' # The blank tile (as it appears in the rack) cat = ''.join # Function to concatenate strings def letters(rack) -> str: "All the distinct letters in a rack (including lowercase if there is a blank)." if BLANK in rack: return cat(set(rack.replace(BLANK, ''))) + 'abcdefghijklmnopqrstuvwxyz' else: return cat(set(rack)) def remove(tiles, rack) -> str: "Return a copy of rack with the given tile(s) removed." for tile in tiles: if tile.islower(): tile = BLANK rack = rack.replace(tile, '', 1) return rack # In[8]: is_word('LETTERs') # In[9]: letters('LETTERS') # In[10]: letters('EELRTT_') # In[11]: remove('SET', 'LETTERS') # In[12]: remove('TREaT', 'LETTER_') # # The Board, Squares, Directions, and Bonus Squares # # In the [previous version](https://www.udacity.com/course/viewer#!/c-cs212/l-48634860) of this program, the board was a two-dimensional matrix, and a square on the board was denoted by a `(row, col)` pair of indexes. There's nothing wrong with that representation, but for this version we will choose a different representation that is simpler in most ways: # # * The board is represented as a one-dimensional list of squares. # * The default board is 15×15 squares, but # we will include a *border* around the outside, making the board of size 17×17. # * Squares are denoted by integer indexes, from 0 to 288. # * To move in the `ACROSS` direction from one square to the next, increment the square index by 1. # * To move in the `DOWN` direction from one square to the next, increment the square index by 17. # * The border squares are filled with a symbol, `OFF`, indicating that they are off the board. 
# The advantage of the border is that the code never has to check if it is at the edge of the board; it can always # look at the neighboring square without fear of indexing off the end of the board. # * Each square on the board is initially filled by a symbol indicating the bonus value of the square. When a tile is placed on a square, # the tile replaces the bonus value. # # How will we implement this? We'll define `Board` as a subclass of `list` and give it two additional attributes: # # - `down`: the increment to move in the down direction; 17 for a standard board. # - `directions`: the four increments to move to any neighboring square; `(1, 17, -1, -17)` in a standard board. # # Jupyter/Ipython notebooks have a special convention for displaying objects in HTML. We will adopt it as a method of `Board`: # # - `_repr_html_`: return a string of HTML that displays the board as a table. # In[13]: ACROSS = 1 # The 'across' direction; 'down' depends on the size of the board OFF = '#' # A square that is off the board SL, DL, TL, STAR, DW, TW = EMPTY = '.:;*-=' # Single/double/triple letter; star, double/triple word bonuses Square = int # Squares are implemented as integer indexes. Direction = int # Directions are implemented as integer increments class Board(list): """A Board is a (linear) list of squares, each a single character. Note that board[s + down] is directly below board[s].""" def __init__(self, squares): list.__init__(self, squares) down = int(len(squares)**0.5) self.down = down self.directions = (ACROSS, down, -ACROSS, -down) def _repr_html_(self) -> str: return board_html(self) # We'll define `WWF` as the standard board for Words with Friends®. # In[14]: WWF = Board(""" # # # # # # # # # # # # # # # # # # . . . = . . ; . ; . . = . . . # # . . : . . - . . . - . . : . . # # . : . . : . . . . . : . . : . # # = . . ; . . . - . . . ; . . = # # . . : . . . : . : . . . : . . # # . - . . . ; . . . ; . . . - . # # ; . . . : . . . . . : . . . ; # # . . . - . . . * . . 
. - . . . #
# ; . . . : . . . . . : . . . ; #
# . - . . . ; . . . ; . . . - . #
# . . : . . . : . : . . . : . . #
# = . . ; . . . - . . . ; . . = #
# . : . . : . . . . . : . . : . #
# . . : . . - . . . - . . : . . #
# . . . = . . ; . ; . . = . . . #
# # # # # # # # # # # # # # # # #
""".split())

# In[15]:

assert len(WWF) == 17 * 17

# # Displaying the Board in HTML
#
# I want to display the board in HTML, as a table with different background colors for the bonus squares, and gold-colored letter tiles. I also want to display the point values for each letter on the tiles; I'll use a `defaultdict` of `{letter: int}` named `POINTS` for that.

# In[60]:

def board_html(board) -> str:
    "An HTML representation of the board."
    size = board.down - 2  # Number of playable squares per row (border excluded)
    squares = [square_html(sq) for sq in board if sq != OFF]
    row = ('<tr>' + '{}' * size)  # One table row holding `size` cells
    return ('<table>' + row * size + '</table>').format(*squares)

board_colors = {
    DL: ('lightblue', 66, 'DL'),
    TL: ('lightgreen', 66, 'TL'),
    DW: ('lightcoral', 66, 'DW'),
    TW: ('orange', 66, 'TW'),
    SL: ('whitesmoke', 66, ''),
    STAR: ('violet', 100, '✭')}

def square_html(sq) -> str:
    "An HTML representation of a square."
    color, size, text = board_colors.get(sq, ('gold', 120, sq))
    if text.isupper():
        # Show the point value as a superscript next to the letter
        text = '{}<sup>{}</sup>'.format(text, POINTS.get(text, ''))
    style = "background-color:{}; font-size:{}%; width:25px; height:25px; text-align:center; padding:0px"
    return ('<td style="' + style + '">{}</td>').format(color, size, text)

POINTS = defaultdict(int,
    A=1, B=3, C=3, D=2, E=1, F=4, G=2, H=4, I=1, J=8, K=5, L=1, M=3,
    N=1, O=1, P=3, Q=10, R=1, S=1, T=1, U=1, V=4, W=4, X=8, Y=4, Z=10)

# In[61]:

WWF

# # Plays
#
# A `Play` describes the placement of tiles on the board. We will implement `Play` as a named tuple of four components:
# - `start`: the index number of the square that holds the first letter in the word.
# - `dir`: the direction, with 1 indicating `ACROSS` and `board.down` (normally, 17) indicating `DOWN`.
# - `letters`: the letters of the word, in order, as a `str`. Blanks are lowercase. Some letters are from the rack; some may have been on the board.
# - `rack`: the letters that would remain in the player's rack after making this play. Not strictly necessary as part of the play, but useful information.
#
# The function `make_play` returns a new board with the play made on it. It does not do any checking to see if the play follows the rules.

# In[18]:

Play = namedtuple('Play', 'start, dir, letters, rack')

def make_play(board, play) -> Board:
    "Make the play on a copy of board and return the copy."
    copy = Board(board)
    end = play.start + len(play.letters) * play.dir
    copy[play.start:end:play.dir] = play.letters
    return copy

# # Example Board
#
# Let's test out what we've done so far.
I'll put some words on a board, which I will call `board`: # In[19]: DOWN = WWF.down plays = {Play(145, DOWN, 'ENTER', ''), Play(144, ACROSS, 'BE', ''), Play(138, DOWN, 'GAVE', ''), Play(158, DOWN, 'MUSES', ''), Play(172, ACROSS, 'VIRULeNT', ''), Play(213, ACROSS, 'RED', ''), Play(198, ACROSS, 'LYTHE', ''), Play(147, DOWN, 'CHILDREN', ''), Play(164, ACROSS, 'HEARD', ''), Play(117, DOWN, 'BRIDLES', ''), Play(131, ACROSS, 'TOUR', '')} board = Board(WWF) for play in plays: board = make_play(board, play) board # # Strategy for Finding Legal Plays # # This is our strategy for finding all possible legal plays on a board: # # - Find all *anchor squares* on the board. An anchor square is a square that is adjacent to a letter on the board—every legal move must place a letter on an anchor square. (On the game's first play, there are no letters on the board, and the `STAR` square in the center counts as the only anchor square.) # - Using just the letters in the rack, find all *prefixes* of words in the dictionary. For example, with the rack `ABC`, we find that `B`, `BA`, and `BAC` are all prefixes of the word `BACK` (and the rack contains other prefixes of other words as well). # - For each anchor square and for both directions (across and down): # - Try each prefix before the anchor (that is, to the left or above the anchor). Don't allow a prefix to extend to another anchor or off the board. That means we won't have to worry about *cross words* for the prefix. If there are already letters on the board before the anchor point, use them as the prefix rather than prefixes from the rack. # - Starting at the anchor, march forward one square at a time, trying to fill empty squares with each possible letter from the rack that forms a valid word prefix. If the march forward hits letters that are already on the board, make sure they form a valid prefix too. Also check that any cross words are valid words. 
When we make a complete word (with an empty or `OFF` square ahead), yield the play that made the word. # # So, each legal play will have a prefix of zero or more letters, followed by one letter from the rack covering an anchor square, followed by zero or more additional letters, which can be from the rack or already on the board. # # Prefixes # # Here we define the set of all prefixes of all words in the dictionary: # In[20]: def dict_prefixes(dictionary) -> set: "The set of all prefixes of each word in a dictionary." return {word[:i] for word in dictionary for i in range(len(word))} PREFIXES = dict_prefixes(DICTIONARY) # In[21]: len(PREFIXES) # That's too many prefixes to look at; let's try a smaller example below. Note that the empty string is a prefix, and we include `HELP` because it is a prefix of `HELPER`, but we don't include `HELPER`, because there is nothing we can add to it to make a word in this dictionary: # In[22]: dict_prefixes({'HELLO', 'HELP', 'HELPER'}) # The function `rack_prefixes` gives the set of prefixes that can be made just from the letters in the rack. Most of the work is done by `extend_prefixes`, which accumulates a set of prefixes into `results`: # In[23]: def rack_prefixes(rack) -> set: "All word prefixes that can be made by the rack." return extend_prefixes('', rack, set()) def extend_prefixes(prefix, rack, results) -> set: if prefix.upper() in PREFIXES: results.add(prefix) for L in letters(rack): extend_prefixes(prefix+L, remove(L, rack), results) return results # In[24]: rack = 'ABC' rack_prefixes(rack) # The number of prefixes in a rack is usually on the order of a hundred, unless there is a blank in the rack: # In[25]: len(rack_prefixes('LETTERS')) # In[26]: len(rack_prefixes('LETTER_')) # # Anchor Squares # # An anchor square is either the star in the middle of the board, or an empty square that is adjacent to a letter: # In[27]: def is_anchor(board, s) -> bool: "Is this square next to a letter already on the board? 
(Or is it a '*')?"
    return (board[s] == STAR or
            board[s] in EMPTY and any(board[s + d].isalpha() for d in board.directions))

def all_anchors(board) -> list:
    "A list of all anchor squares on the board."
    return [s for s in range(len(board)) if is_anchor(board, s)]

# In[28]:

all_anchors(WWF)

# # Plays on Example Board
#
# Let's work through the process of finding plays on the example `board`. First, we'll find all the anchors:

# In[29]:

anchors = all_anchors(board)
len(anchors)

# To visualize these anchors, we'll make each one be a star, on a copy of `board`:

# In[62]:

board2 = Board(board)
for a in anchors:
    board2[a] = STAR
board2

# Now we'll define a rack, and find all the prefixes for the rack:

# In[31]:

rack = 'ABCHKNQ'
prefixes = rack_prefixes(rack)
len(prefixes)

# In[32]:

' '.join(prefixes)

# We won't go through all the anchor/prefix combinations; we'll just pick one: the anchor above the `M` in `MUSES`:

# In[63]:

board3 = Board(board)
anchor = 141
board3[anchor] = STAR
board3

# There's only room for prefixes of length 0 or 1, because anything longer than that would hit the anchor to the right of the `G` in `GAVE`; to avoid duplication of effort, we only allow words to run into other anchors on the right, not the left. Let's try the 1-letter prefix `B` first:

# In[64]:

board3[140] = 'B'
board3

# Now we can start to march forward. On the anchor square we can place any letter from the rack that makes a valid prefix, and that also turns `.MUSES` into a valid word. There's only one such letter, `A`:

# In[35]:

board3[141] = 'A'
assert 'BA' in PREFIXES and is_word('A' + 'MUSES')
board3

# We can continue marching forward, trying letters from the rack that form valid prefixes. Let's try the combination `CK`:

# In[36]:

board3[142:144] = 'CK'
assert 'BACKBE' in PREFIXES
board3

# We've spelled the word `BACK`, but we can't count it as a legal play, because we've hit two adjacent letters, `BE`, that are already on the board.
We check that `BACKBE` forms a valid prefix, and continue to the next empty square, where we can choose an `N`:

# In[37]:

board3[146] = 'N'
assert 'BACKBENC' in PREFIXES
board3

# We continue to the next square (a double word square), and place an `H`, which completes a word, `BACKBENCH`, and simultaneously makes a cross word, `THE`:

# In[38]:

board3[148] = 'H'
assert is_word('BACKBENCH') and is_word('THE')
board3

# We would record this play, and backtrack to consider other letters for this and other prefix/anchor combinations. Now let's code this up!
#
# # Code for Finding All Plays
#
# The function `all_plays` generates all legal plays by first trying all prefix plays, and then trying to extend each one, one letter at a time. (Note that it also generates the empty play, because a player always has the option of passing.)

# In[39]:

def all_plays(board, rack):
    """Generate all plays that can be played on board with this rack.
    Try placing every possible prefix before every anchor point;
    then extend one letter at a time, looking for valid plays."""
    anchors = all_anchors(board)
    prefixes = rack_prefixes(rack)
    yield Play(0, 1, '', rack)  # The empty play (no letters, no points)
    for anchor in anchors:
        for dir in (ACROSS, board.down):
            for play in prefix_plays(prefixes, board, anchor, dir, rack):
                yield from extend_play(board, play)

# Note the syntax `yield from`, new in Python 3.3: "`yield from c`" is the same as "`for x in c: yield x`".
#
# Now for the function `prefix_plays`, which returns a list of all partial plays consisting of a prefix placed before the anchor. Note that these are not *legal* plays; they are *partial* plays, some of which will end up being extended into legal plays.
#
# There are two cases: if there are letters on the board immediately before the anchor, then those letters form the only allowable prefix.
If not, we can use any prefix from the rack up to `maxlen`, which is the number of empty squares that do not run into another anchor, nor off the board. # In[40]: def prefix_plays(prefixes, board, anchor, dir, rack) -> list: "Return all Plays of a prefix to the left/above anchor." if board[anchor-dir].isalpha(): # Prefix already on the board; only 1 prefix start = scan_letters(board, anchor, -dir) return [Play(start, dir, cat(board[start:anchor:dir]), rack)] else: # Prefixes from rack fit in space before anchor maxlen = (anchor - scan_to_anchor(board, anchor, -dir)) // dir return [Play(anchor - len(prefix) * dir, dir, prefix, remove(prefix, rack)) for prefix in prefixes if len(prefix) <= maxlen] # Now `extend_play` takes a partial play, determines the square, `s`, that is one square past the end of the play, and tries all possible letters there. If adding a letter forms a valid prefix (and also does not form an invalid cross word), then we continue on (by calling `extend_play` recursively). If adding the letter forms a valid word, we yield the play. # In[41]: def extend_play(board, play): "Explore all ways of adding to end of play; return ones that form full words." s = play.start + play.dir * len(play.letters) if board[s] == OFF: return cword = crossword(board, s, play.dir) possible_letters = board[s].upper() if board[s].isalpha() else letters(play.rack) for L in possible_letters: prefix2 = play.letters + L if prefix2.upper() in PREFIXES and valid_crossword(cword, L): rack2 = play.rack if board[s].isalpha() else remove(L, play.rack) play2 = Play(play.start, play.dir, prefix2, rack2) if is_word(prefix2) and not board[s + play.dir].isalpha(): yield play2 yield from extend_play(board, play2) def scan_letters(board, s, dir) -> Square: "Return the last square number going from s in dir that is a letter." 
    while board[s + dir].isalpha():
        s += dir
    return s

def scan_to_anchor(board, s, dir) -> Square:
    "Return the last square number going from s in dir that is not an anchor nor off board."
    while board[s + dir] != OFF and not is_anchor(board, s + dir):
        s += dir
    return s

# # Crosswords
#
# If adding a letter in, say, the `ACROSS` direction also adds on to a word in the `DOWN` direction, then we need to make sure that this *cross word* is also valid. The function `crossword` finds the cross word at square `s` and returns it with a `'.'` indicating the empty square where the new letter will be placed, so we would get `'.MUSES'` and `'T.E'` for the two crosswords in the `'BACKBENCH'` play.

# In[42]:

def crossword(board, s, dir) -> str:
    """The word that intersects s in the other direction from dir.
    Use '.' for the one square that is missing a letter."""
    def canonical(L): return L if L.isalpha() else '.'
    d = other(dir, board)
    start = scan_letters(board, s, -d)
    end = scan_letters(board, s, d)
    return cat(canonical(board[s]) for s in range(start, end+d, d))

def valid_crossword(cword, L) -> bool:
    "Is placing letter L valid (with respect to the crossword)?"
    return len(cword) == 1 or cword.replace('.', L).upper() in DICTIONARY

def other(dir, board) -> Direction:
    "The other direction (across/down) on the board."
    return board.down if dir == ACROSS else ACROSS

# In[43]:

crossword(board, 141, ACROSS)

# In[44]:

crossword(board, 148, ACROSS)

# The function `valid_crossword` checks if replacing the empty square with a specific letter will form a valid word:

# In[45]:

valid_crossword('.MUSES', 'A')

# We can now see all the prefix plays for the anchor at 141 (just above `MUSES`):

# In[46]:

prefix_plays(rack_prefixes(rack), board, 141, 1, rack)

# And we can see all the ways to extend the play of `'B'` there:

# In[47]:

set(extend_play(board, Play(start=140, dir=1, letters='B', rack='ACHKNQ')))

# # Scoring
#
# Now we'll show how to count up the points made by a play.
The score is the sum of the word score for the play, plus a bingo score if all seven letters are used, plus the sum of the word scores for any cross words. The word score is the sum of the letter scores (where each letter score may be doubled or tripled by a bonus square when the letter is first played on the square), all multiplied by any word bonus(es) encountered by the newly-placed letters. # In[48]: def score(board, play) -> int: "The number of points scored by making this play on the board." return (word_score(board, play) + bingo(board, play) + sum(word_score(board, cplay) for cplay in cross_plays(board, play))) def word_score(board, play) -> int: "Points for a single word, counting word- and letter-bonuses." total, word_bonus = 0, 1 for (s, L) in enumerate_play(play): sq = board[s] word_bonus *= (3 if sq == TW else 2 if sq == DW else 1) total += POINTS[L] * (3 if sq == TL else 2 if sq == DL else 1) return word_bonus * total def bingo(board, play) -> int: "A bonus for using 7 letters from the rack." return BINGO if (play.rack == '' and letters_played(board, play) == 7) else 0 BINGO = 35 # Here are the various helper functions: # In[49]: def letters_played(board, play) -> int: "The number of letters played from the rack." return sum(board[s] in EMPTY for (s, L) in enumerate_play(play)) def enumerate_play(play) -> list: "List (square_number, letter) pairs for each tile in the play." return [(play.start + i * play.dir, L) for (i, L) in enumerate(play.letters)] def cross_plays(board, play): "Generate all plays for words that cross this play." cross = other(play.dir, board) for (s, L) in enumerate_play(play): if board[s] in EMPTY and (board[s-cross].isalpha() or board[s+cross].isalpha()): start, end = scan_letters(board, s, -cross), scan_letters(board, s, cross) before, after = cat(board[start:s:cross]), cat(board[s+cross:end+cross:cross]) yield Play(start, cross, before + L + after, play.rack) # What should the `BACKBENCH` play score? 
The word covers two double-word bonuses, but no letter bonuses. The sum of the letter point values is 3+1+3+5+3+1+1+3+4 = 24, and 24×2×2 = 96. The cross word `AMUSES` scores 8, and `THE` is on a double word bonus, so it scores 6×2 = 12. There is one letter remaining in the rack, so no bingo, just a total score of 96 + 8 + 12 = 116. # In[50]: score(board, Play(start=140, dir=1, letters='BACKBENCH', rack='Q')) # We can find the highest scoring play by enumerating all plays and taking the one with the maximum score: # In[51]: def highest_scoring_play(board, rack) -> Play: "Return the Play that gives the most points." return max(all_plays(board, rack), key=lambda play: score(board, play)) # In[52]: highest_scoring_play(board, rack) # In[53]: make_play(board, Play(start=140, dir=1, letters='BACKBENCH', rack='Q')) # # Playing a Game # # Now let's play a complete game. We start with a bag of tiles: # In[54]: BAG = 'AAAAAAAAABBCCDDDDEEEEEEEEEEEEFFGGGHHIIIIIIIIIJKLLLLMMNNNNNNOOOOOOOOPPQRRRRRRSSSSTTTTTTUUUUVVWWXYYZ__' len(BAG) # Then the function `play_game` will take a list of *player strategies* as input, and play those strategies against each other over the course of a game. A strategy is a function that takes a board and a rack as input and returns a play. For example, `highest_scoring_play` is a strategy. If the optional argument `verbose` is true, then the board is displayed after each play. # In[55]: def play_game(strategies=[highest_scoring_play, highest_scoring_play], verbose=True) -> list: "A number of players play a game; return a list of their scores." 
    board = Board(WWF)
    bag = list(BAG)
    random.shuffle(bag)
    scores = [0 for _ in strategies]
    racks = [replenish('', bag) for _ in strategies]
    while True:
        old_board = board
        for (p, strategy) in enumerate(strategies):
            board = make_one_play(board, p, strategy, scores, racks, bag, verbose)
            if racks[p] == '':
                # Player p has gone out; game over
                return subtract_remaining_tiles(racks, scores, p)
        if old_board == board:
            # No player has a move; game over
            return scores

def make_one_play(board, p, strategy, scores, racks, bag, verbose) -> Board:
    """One player, player p, chooses a move according to the strategy.
    We make the move, replenish the rack, update scores, and return the new Board."""
    rack = racks[p]
    play = strategy(board, racks[p])
    racks[p] = replenish(play.rack, bag)
    points = score(board, play)
    scores[p] += points
    board = make_play(board, play)
    if verbose:
        display(HTML('Player {} with rack {} makes {}<br>for {} points; draws: {}; scores: {}'
                     .format(p, rack, play, points, racks[p], scores)),
                board)
    return board

def subtract_remaining_tiles(racks, scores, p) -> list:
    "Subtract point values from each player and give them to player p."
    for i in range(len(racks)):
        points = sum(POINTS[L] for L in racks[i])
        scores[i] -= points
        scores[p] += points
    return scores

def replenish(rack, bag) -> str:
    "Fill rack with 7 letters (as long as there are letters left in the bag)."
    while len(rack) < 7 and bag:
        rack += bag.pop()
    return rack

# In[56]:

get_ipython().run_cell_magic('javascript', '', 'IPython.OutputArea.auto_scroll_threshold = 9999;\n')

# In[57]:

play_game()

# That was an exciting game, with four bingos: TRICORNE, DINOSAUR, FISTULA, and IRRITATE. The FISTULA play garnered 99 points, from the bingo, the triple word score, and a sextuple bonus from the 4-point letter F (triple letter score in both directions).
#
# But that was just one game; let's get statistics for both players over, say, 10 games:

# In[58]:

get_ipython().run_cell_magic('time', '', "\ngames = 10\n\nscores = sorted(score for game in range(games) \n for score in play_game(verbose=False))\n\nprint('min: {}, median: {}, mean: {}, max: {}'.format(\n min(scores), scores[games], sum(scores)/(2*games), max(scores)))\n")

# # Tests
#
# I *should* have a complete test suite. Instead, all I have is this minimal suite, plus the confidence I gained from seeing the game play.

# In[59]:

def sames(A, B): return sorted(A) == sorted(B)

def test():
    "Unit tests."
assert is_word('WORD') assert is_word('LETTERs') assert is_word('ETHyLENEDIAMINETETRAACETATES') assert not is_word('ALFABET') rack = 'ABCHKNQ' assert sames(letters(rack), rack) assert sames(letters('ABAC_'), 'ABCabcdefghijklmnopqrstuvwxyz') assert dict_prefixes({'HELLO', 'HELP', 'HELPER'}) == { '', 'H', 'HE', 'HEL', 'HELL', 'HELP', 'HELPE'} assert rack_prefixes('ABC') == {'', 'A', 'AB', 'AC', 'B', 'BA', 'BAC', 'C', 'CA', 'CAB'} assert len(rack_prefixes('LETTERS')) == 155 assert len(rack_prefixes('LETTER_')) == 1590 DOWN = WWF.down plays = { Play(145, DOWN, 'ENTER', ''), Play(144, ACROSS, 'BE', ''), Play(138, DOWN, 'GAVE', ''), Play(158, DOWN, 'MUSES', ''), Play(172, ACROSS, 'VIRULeNT', ''), Play(213, ACROSS, 'RED', ''), Play(198, ACROSS, 'LYTHE', ''), Play(147, DOWN, 'CHILDREN', ''), Play(164, ACROSS, 'HEARD', ''), Play(117, DOWN, 'BRIDLES', ''), Play(131, ACROSS, 'TOUR', '')} board = Board(WWF) for play in plays: board = make_play(board, play) assert len(WWF) == len(board) == 17 * 17 assert all_anchors(WWF) == [144] assert all_anchors(board) == [ 100, 114, 115, 116, 121, 127, 128, 130, 137, 139, 141, 143, 146, 148, 149, 150, 154, 156, 157, 159, 160, 161, 163, 171, 180, 182, 183, 184, 188, 190, 191, 193, 194, 195, 197, 206, 208, 210, 212, 216, 217, 218, 225, 227, 230, 231, 233, 236, 243, 248, 250, 265, 267] assert crossword(board, 141, ACROSS) == '.MUSES' assert crossword(board, 148, ACROSS) == 'T.E' assert valid_crossword('.MUSES', 'A') assert not valid_crossword('.MUSES', 'B') assert sames(prefix_plays(rack_prefixes(rack), board, 141, 1, rack), [Play(start=141, dir=1, letters='', rack='ABCHKNQ'), Play(start=140, dir=1, letters='C', rack='ABHKNQ'), Play(start=140, dir=1, letters='K', rack='ABCHNQ'), Play(start=140, dir=1, letters='B', rack='ACHKNQ'), Play(start=140, dir=1, letters='A', rack='BCHKNQ'), Play(start=140, dir=1, letters='H', rack='ABCKNQ'), Play(start=140, dir=1, letters='N', rack='ABCHKQ'), Play(start=140, dir=1, letters='Q', rack='ABCHKN')]) assert 
sames(extend_play(board, Play(start=140, dir=1, letters='B', rack='ACHKNQ')), {Play(start=140, dir=1, letters='BA', rack='CHKNQ'), Play(start=140, dir=1, letters='BACKBENCH', rack='Q'), Play(start=140, dir=1, letters='BAH', rack='CKNQ'), Play(start=140, dir=1, letters='BAN', rack='CHKQ')}) assert len(BAG) == 100 assert replenish('RACK', ['X', 'B', 'A', 'G']) == 'RACKGAB' assert replenish('RACK', []) == 'RACK' assert replenish('RACK', ['A', 'B']) == 'RACKBA' assert score(WWF, Play(144, ACROSS, 'BE', '')) == (3 + 1) assert score(board, Play(140, ACROSS, 'BACKBENCH', 'Q')) == 116 return 'ok' test() # # Conclusion: How Did We Do? # # We can break that into three questions: # # - **Is the code easy to follow?**
I'm biased, but I think this code is easy to understand, test, and modify. # - **Does the strategy score well?**
Yes: the mean and median are both well over 350, which is enough for [the elite club](https://www.facebook.com/WWF350Club) of high scorers. No: this is not quite world-champion caliber. # - **Is the code fast enough?**
It takes less than 4 seconds to play a complete game for both players; that's fast enough for me. If desired, the code could be made about 100 times faster, by using multiprocessing, by caching more information, by not building explicit lists for intermediate results (although those results make the code easier to test), by using PyPy or Cython, or by porting to another language. # # We can also ask: What's left to do? # # - We could modify the program to play on a Scrabble® board. # - We could give players the option of trading in tiles. # - We could explore better strategies. A better strategy might: # - Plan ahead to use high-scoring letters only with bonuses. # - Manage letters to increase the chance of a bingo. # - Use blank tiles strategically. # - Play defensively to avoid giving the opponent good chances at bonus squares. # - Think ahead in the end game to go out before the opponent (or at least avoid being stuck with high-scoring letters in the rack). # - In the end game, know which tiles have not been played and thus which ones the opponent could have. # - The display could be prettier. # - The game could be interfaced to an online game server. # - More complete unit tests would be appreciated. # - We could compare this program to those of the [giants](https://www.cs.cmu.edu/afs/cs/academic/class/15451-s06/www/lectures/scrabble.pdf) whose [shoulders](http://ericsink.com/downloads/faster-scrabble-gordon.pdf) we [stood](http://web.archive.org/web/20040116175427/http://www.math-info.univ-paris5.fr/~bouzy/ProjetUE3/Scrabble.pdf) [upon](http://www.gtoal.com/wordgames/scrabble.html). # # Thanks to Markus Dobler for correcting one bug and making another useful suggestion. # #
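# One of the ideas listed above, knowing in the end game which tiles have not been played, follows almost directly from the representations used in this notebook. The sketch below is not part of the program: `unseen_tiles` is a hypothetical helper, and it assumes the same conventions as the rest of the code (the 100-tile `BAG` string, racks as strings, and lowercase letters on the board standing for played blanks).

```python
from collections import Counter

# Same tile distribution as the BAG defined earlier in the notebook.
BAG = 'AAAAAAAAABBCCDDDDEEEEEEEEEEEEFFGGGHHIIIIIIIIIJKLLLLMMNNNNNNOOOOOOOOPPQRRRRRRSSSSTTTTTTUUUUVVWWXYYZ__'
BLANK = '_'

def unseen_tiles(board, rack) -> Counter:
    """Hypothetical helper: count the tiles that are neither on the board nor in
    our rack, so they must be in the bag or in an opponent's rack."""
    unseen = Counter(BAG)
    for sq in board:
        if sq.isalpha():
            # A lowercase letter on the board is a blank that has been played.
            unseen[BLANK if sq.islower() else sq] -= 1
    unseen.subtract(rack)   # our own tiles are not available to anyone else
    return +unseen          # unary plus drops entries whose count is zero or less

# A toy "board": any iterable of squares works, since only letters are counted.
toy_board = list('..QUIz..#')
remaining = unseen_tiles(toy_board, 'AEINRT_')
```

# Here the lowercase `z` uses up a blank rather than the `Z` tile, so `remaining['Z']` is still 1 while both blanks are accounted for. In a two-player game with an empty bag, the result would be exactly the opponent's rack, which is the information an end-game strategy would need.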

# *[Peter Norvig](http://norvig.com)*