Cross word puzzles. Jumbles. Cryptograms. Word games have been popular since
ancient times. Newspapers (aka. web pages on paper) devote entire pages for
to these puzzles. If you are great with words, they can be a lot of fun.
But if you have trouble unscrambling words, or simply do not know the word the
puzzle is using, they can be really frustrating.
Fortunately, with a little bit of coding, word puzzles become really easy.
While this might seem like cheating, this actually
a matter of debate,
particularly when the puzzle uses obscure words like "variegate".
But whether or not you think it is fair, that is exactly what you will do
in this assignment: cheat at word puzzles.
Of course, cheating is no fun unless you have something to cheat at. In
addition to the programming assignment, we have provided you with a simple
word game (also written in Python). We want you to solve the word puzzles
and submit your saved game as part of the assignment. You are free to
solve the word puzzles without your cheat program, but you will be graded
on whether your solution is correct. So cheat away!
Update (October 26, 2014): Several students are doing clever (but ugly
and slow) ways to get around the recursion in parts C and D. So we are forced
to introduce new rules:
You must use recursion in Parts C and D
You may not make any helper functions beyond the ones we have given you.
A Mild Warning
This assignment is brand new. It has never been used in this class before. We have
beta-tested this assignment on other students, but that will not catch all of the errors
and misunderstandings that will inevitably pop up. Furthermore, while we have some idea
of how long the assignment should take, that is different than how long it will actually
take you. The survey for this assignment is incredibly important.
First of all, we highly recommend that you have a partner, even if you have never
had a partner for this class before. That will make things a lot easier. In addition,
we recommend that you start early, and work on a little bit each day. In particular,
we have indicated several suggested mini-deadlines throughout the assignment. These
are not enforced, but if you follow them, this assignment should be very manageable.
Finally, make sure that you read all of the instructions before starting. Most
of your questions will have answers here somewhere. If not, you should check this site
and Piazza regularly for updates and clarifications. If we hear enough complaints that
this assignment (or parts of this assignment) are too hard, we will make modifications
to adjust the difficulty.
Learning Objectives
This assignment serves several important roles.
It gives you experience with writing for-loops.
It gives you experience with writing recursive functions.
It gives you experience working with text files.
It introduces the notion of a prefix map
It gives you (more) experience with using asserts to enforce your preconditions.
This is a harder assignment. It is also brand new. The upperclassmen in your dorm
have never seen it before, and they may or may not be able to help you. You might
be tempted to talk to students in other groups. We would like to remind you of our
Academic Integrity Policy.
If you talk to other students, please cite the collaboration in your source code.
Depending on the amount of collaboration, we may or may not take off points. We generally
reserve point deductions for sharing large amounts of code, and not just a few lines.
However, we are not going to explain our deductions ahead of time. The only way to
guarantee that you do not lose points is if you do not look at anyone else's code.
The bigger problem is if you do not cite your collaborators at all. If we discover
large similarities between two submissions, and neither of you cited the other,
we will be forced to hold an Academic Integrity Hearing. Neither of us wants that.
Collaboration Policy
You may do this assignment with one other person. If you are going to work
together, then form your group on CMS as soon as possible. If you do this
assignment with another person, you must work together. It is against the
rules for one person to do some programming on this assignment without the
other person sitting nearby and helping.
With the exception of your CMS-registered partner, is not a good idea to look
at anyone else's code or show your code to anyone else, in any form what-so-ever.
If you do see someone else's code, and it affects how you write your own, then
you must cite the source in your submission.
Working with Word Puzzles
The key to solving word puzzles is to have a good dictionary. We mean is a traditional
dictionary like Websters, not the dict type built into Python (though that is
useful too). On a computer, a dictionary is often just a list of words, one on each line.
For example, the file complete.txt provided as part of
this assignment is a simple text file with over 130,000 words.
As part of this assignment, you are going to have to read a text file like
complete.txt and load it into Python. And after you do load into Python,
you are going to have to arrange the words in some special way so that it is easy
to search them. In this section, we cover how to do both of these.
Reading from a Dictionary
Reading a file in Python is (relatively) easy. When you open a file, Python creates
a file object (e.g. a folder) that behaves sort-of like a sequence. You can use it
as the loop sequence in a for-loop, just like a list or a string. However, unlike a
list or string, you cannot slice a file.
To access a file, you use the built-in function open.
This function takes a string argument, which is the name of the file you want to open.
This function creates an object for the file, and returns the name of this object when
evaluated. For example, suppose you had a file called 'myfile.txt'.
You would process it with the following code:
file = open('myfile.txt')
for line in file:
# Do something with the line
file.close()
The loop variable line contains a string. Python coverts each line (up
to the '\n' character) into a string for you automatically.
You are now free to slice up the string further, or put it in an accumulator variable.
Note that we close the file when we are done with it. The file is an object and
close() is a method. This is not completely necessary -- Python will
work without it -- but it is a very good idea to do this. If you do not close
the file when you are done, there is a small chance that you might corrupt the
file, making it useless for later word puzzles.
If you have more questions about working with files, read Chapter 14 in your text.
Storing Words in Prefix Maps
Once you have loaded the file complete.txt, you now have a list with
130,000 strings in it. Assuming that you stored the name of this list in a variable
called wordlist, you can check if a string s is a valid word
with the boolean expression
s in wordlist
This suggests one way to unscramble a string: try all possible orders of the letters,
until you find one that is in the word list. However, this approach is going to be
very slow. The number of ways to reorder a string with n distinct letters is
factorial(n). That means that a word with 10 letters has over 3 million
possible orders! There has got to be a better way.
Fortunately, there is. The secret is based on the same principle of autocompletion
that you use on your phone. Type a letter into the box on the right. This web page
has a very small word list associated with it, and it will suggest all of the words
in its list that start with that letter. When you type a second letter, it will narrow
this list down to words starting with the first two letters, and so on. This process
makes sure that we are always working with strings that are prefixes to a
valid word. If you type in a string that is not a prefix to a word, you will get
nothing back.
For example, type y into the box above. You will get a small list of words.
You will notice that all the words have either a, e, or o.
Typing one of these three letters will eliminate the other possibilities from the
list. Typing anything other than these three letters causes the list to disappear,
as there are no valid words (in our dictionary).
It is clear that this can help us to unscramble words faster. Once we have the first
few letters, the number of possible words drops tremendously, so we can limit the
number of reorderings to look at.
What we need is some way to take a prefix and list the words that start with this
prefix. There are many different ways to do this. In this assignment, you will use
what is known as a prefix map. This is not the most efficient way to handle
words (in CS 2110 you may learn about an alternative called a trie), but it
is straight-forward enough for beginning programmers.
A prefix map is a dict where the keys are strings and the values are lists
of single characters. The key is a prefix, and the value is the list of all valid
"next" letters. For example, suppose we have a dictionary with the four words
'at', 'ate',
'are', and 'by'.
In that case, the prefix map is
The words 'at' and 'ate'
illustrate an unusual problem. Sometimes a string is both a prefix of another
word, and a word in its own right. To take care of this, we sometimes add the
empty string '' to the list of next letters.
If the empty string is there (as in the care of 'at'),
that prefix is a word. If the empty string is not there (as in the case of
'ar'), then it is not.
Word Puzzles and Recursion
We are going to work with several different types of word puzzles in this assignment,
but the solutions will all have the same flavor. They are all essentially variations
on "how many words can you make in Scrabble?" Once you get this idea down, you should
be able to work through the whole assignment.
To understand what we mean, suppose you have a tile rank in Scrabble, like the one
below. We want to figure out all of the words that that you can make from this rack.
What do we do?
One way to approach it is set aside some space next to the rack that we will call the
"prefix space". We will slowly add letters to prefix space. When we form a word, we
add it to the list. We then check to see if there are any other words with this prefix
(which we can do with a prefix map). If there are none, we start from the beginning.
Otherwise, we pick the next valid letter and look at the prefix again.
This is perhaps a bit confusing. As we said in class, you should always write a clear
specification before writing code. If we try to specify what we are doing with the
scrabble rack, it sounds something like this:
Find all words that start with the letters in the prefix space, and whose additional
letters all come from the tile rack, and only use each element in the rack once.
We will call this process of finding Scrabble words to extend the prefix the "prefix search".
To illustrate this process, suppose we have started making words with our tile rack
and we have 'st' in the prefix space. This will
look something like the picture below.
The only letters left to use are 'o', 'r',
'd', and 'w'. In
addition, if we use the dictionary provided in complete.txt, we see that the
only possible "next letters" after 'st' are
'a', 'e', 'h',
'i', 'o', 'r',
'u', and 'y'. The
only letter both of those have in common is 'o'. So we
add this letter to our prefix and start a new prefix search. As you can see, this is
a recursive process.
Now suppose we have 'wor' in the prefix space.
This will look something like the picture below.
Again, if we use the dictionary provided in complete.txt, we will see that
'd', 's', and
't' are all valid ways to add a next letter. This means
that we have to try out three more prefix searches, namely 'word''wors', and 'wort'.
So while the process is recursive (at each step we get a new prefix search to try),
we also have to repeat the search for all of the acceptable next letters. That means
that many of our functions will combine a for-loop and a recursive call.
The Word Puzzle App
When you write code, it is always good to have some motivation for the code you are
writing. Therefore, we have provided you with your own personal word puzzle that you
can cheat at. In fact, we want you to cheat, as you will be submitting a save game file
as part of the assignment.
Starting the App
The word puzzle app is the most complex Python application we have seen so far in
this class. It is made up of several modules. It also includes its own sound effects.
To make everything nice and neat, we have stored everything inside of a folder, called
wordapp. More importantly, you launch the app by running the folder
as a script, not by running any of the modules inside the folder.
To start up the app, navigate the command line to the folder that contains
the folder wordapp; do not go inside the wordapp folder.
Then type
python wordapp
This tells Python to run the modules inside of this folder as a script. It will launch
a GUI window with several buttons at the top and a lot of white boxes, as shown below.
Starting a New Puzzle
Before you can do anything, you must start in a new puzzle. Press the New button
to do this. Once you press the button, you will see a box asking you for your netid,
as shown below.
The purpose of asking for your netid is to ensure that every single person has their own
unique puzzle to cheat at. You will get the same puzzle every time you type in the same
netid, but a different puzzle when you type in a different netid.
Once you have entered your netid, the app will create the puzzle for you. For example,
if you enter your instructor's netid (wmw2), you will see the following puzzle.
Solving the Puzzle
We call this type of a puzzle a seven-cross The purpose of the puzzle is to
unscramble the letters so that you get six seven-letter words that match on all of
the letters where they intersect. You will solve one word at a time, alternating
vertical and horizontal, until the entire cross is solved.
When you start the puzzle, you will notice that only the rightmost column of boxes
is white, while all of the other boxes are greyed out. This is a hint that you
are only supposed to unscramble the letters in those boxes, and nothing else.
To unscramble the word, click on two different letters. They app will then swap
the two letters. If you click on a letter by mistake, you can always unselect it
by clicking again. Once the word is successfully unscrambled, the column will turn
blue, as shown below.
The puzzle will now make a new set of boxes white, directing you to unscramble the next
word. Notice that the last letter of this word is in a blue box. This letter is now locked
and you can no longer swap it with other letters.
You complete this process until the puzzle is completely solved. When the puzzle is finished,
all of the boxes will be blue and the app will say "SOLVED!" in the top right corner,
as shown below.
Other Features
The puzzle app has several other features. First of all, you can save your game at any
time by pressing Save. This will create a file called data.save inside
your assignment directory (not inside of wordapp). To restore your saved
game, press the Load button. This will reset the boxes to where they were when you
saved.
Finally, you can always undo all of your progress in the puzzle. If you press the
Reset, it will rescramble all of the words, and remove any blue boxes.
Working on the Assignment
As with the previous assignment, this assignment provides you with a lot of code already
written for you. We have both an primary module for the assignment and a script for
testing it. You also have a graphical application to try out your module. The only
difference between this and the previous assignment is the difficulty of the functions.
Assignment Source Code
Before you do anything else, you should download the zip archive A4.zip
from this link. You should unzip it and put the contents in a new directory.
As with the previous assignment, this directory will contain a few files:
This is a unit test module to verify that the module a4 is working properly.
We have gotten you started on some tests, but there are many tests still missing.
This is a folder that contains a lot of Python modules in it. These files are uses
to launch the word puzzle app described above. You will not modify
this file. However, you will use it to generate a save file that you will submit.
This dictionary is a list of the 10 most
common English words.
This is the dictionary that a4test.py uses
for most of its testing.
Pacing Yourself
You should start as this assignment soon as possible. This is a new assignment and we
are not sure how long it will take. We have tested it, and believe that it is feasible.
However, the earlier you start, the more help we can give you on the assignment.
At the end of each part, we have a suggested completion date. While this is not enforced,
we recommend that you try to hit these deadlines. If you cannot make this deadlines, it
might be a sign that you are having difficulty and need some extra help.
Testing and Debugging
You should continue using the testing and debugging methodologies that you used in the
previous assignments. We have already started several of the test procedures in the
unit test with a4test.py. Once again, will be graded on the completeness of
your test cases. Remember our comments from the previous assignments.
At the top of a4test.py you will see a new function called
assert_lists_equal. You cannot use assert_equals to compare
two lists. Just like the colors in the previous assignment, lists are different folders,
so we need to compare them element by element. This is a new testing function that does
that for you automatically. You do not need to understand this function, but if you
read it over, you can see how to make your own testing functions.
Debugging recursive functions can be hard, so you will probably add a lot of print
statements to a4.py. This is encouraged, but remember to remove them before
you submit the assignment.
Asserting Preconditions
As with the last assignment, we would like you to continue the policy of enforcing the
preconditions with assert statements. However, this is a little more difficult this time.
For example, note that one of the preconditions of the function autocomplete is
that pmap a prefix map. To enforce this precondition, we have to verify the
following:
pmap is a dictionary.
The keys of pmap are strings, and the values are lists.
The contents of all the value lists are either single characters or the empty string.
For every key of pmap, the prefixes of that key are also keys.
For every key k and character m in pmap[k], k+m is also a key.
Checking all of these will take more time than executing any of the functions that we are writing for this
assignment. And if we have to check this at every recursive call, the entire assignment
grinds to a halt.
Remember that enforcing preconditions is a courtesy to other programmers. We are not
required to do it, though it can help with debugging later on. In a situation like
this, we might choose to enforce (1) and (2) above, or maybe just (1), since it is
quick to do that. But we certainly would not enforce all 5.
Choosing which parts of the precondition to enforce and which not to enforce are a bit
of art form. To make this easier, we have spelled it out for you. In each specification,
after the precondition, there is a paragraph called "enforced preconditions". This
explains what part of the precondition the function is expected to enforce with an assert
statement. Your implementation should enforce these parts of the precondition, and
nothing more.
Assignment Instructions
This assignment is broken up into five tasks. The first four are programming
tasks, while the last task is using your assignment to solve the puzzles in
the Word Puzzle App.
The programming tasks all correspond to function stubs in a4.py.
The tasks are clearly labeled, and the functions are fully specified. You will
find this assignment to be a lot easier if you complete and fully test one task
before moving on to the next.
Task 1. Word Lists
The first task is a collection of functions working with word lists. There are three
functions to implement.
def build_word_list(file):
"""Returns a list of words from the given file"""
def word_list_by_size(words, size):
"""Returns the elements of words that have length size"""
def word_list_extend(words, prefix):
"""Returns the word list that is the result of adding prefix
to the start of every word in the list words."""
The complete specifications for these functions are given in a4.py. There are
examples there to help you understand the functions. Remember to enforce (part of) the
preconditions. The function specifications clearly outline what you should and should
not enforce.
You do not need recursion for any of the functions in this task. You can easily
implement these with for-loops.
We have provided you with completed test procedures for these three functions in
a4test.py. These tests use the small dictionary
short.txt. You might want to test your functions
on the longer dictionaries. However, this is not required.
Try to finish this part by Tuesday, October 21.
Task 2. Prefix Maps
In our overview of word puzzles above, we introduced the
notion of a prefix map. Your second task is to implement several functions for working
with prefix maps. In particular, you need to be able to create them and test whether
or not a word is in a prefix maps. This task requires that you implement four functions.
def pmap_add_word(pmap,word):
"""Adds a single word to a prefix map."""
def word_list_to_pmap(words):
"""Returns the prefix map for the given word list."""
def pmap_to_word_list(pmap):
"""Returns the word list for the given prefix map."""
def pmap_has_word(pmap,word):
"""Returns True if word is in the prefix map."""
The complete specifications for these functions are given in a4.py. There are
examples there to help you understand the functions. Again, remember to enforce (part of) the
preconditions.
Once again, you do not need recursion for any of the functions in this task.
You can easily implement these with for-loops. One of these functions will not need
loops at all.
We have provided you with completed test procedures for these four functions in
a4test.py. These tests use very short word lists of no more than a few words.
You might want to test your functions out on longer words lists, such
as the list in short.txt or
common.txt. Once again, this is not
required.
Try to finish this part by Friday, October 24.
Task 3. Autocomplete
The third task is to implement a single function:
def autocomplete(prefix, pmap):
"""Returns the list of all words that complete prefix in pmap"""
Once again, the complete specification is given in a4.py. You should reread
our discussion of prefix maps, as it explains how to use
a prefix map to help with autocompletion. Remember to enforce the appropriate preconditions.
You must use recursion to implement this function. However, we will tell you
that you also need to combine recursion with a for-loop. You need the for-loop to build
the list of words (using a list as an accumulator); you need recursion to compute the
words that go into the list.
This time, we have not provided any tests in a4test.py. It is up to
you to provide the tests this time. Hopefully you can figure out what tests you need
to write from looking at the tests for the previous tasks. Off the top of our head,
we can think of at least four tests that would be good to try.
Doesn't This Break the Rules?
When you write the autocomplete function, you will discover that prefix
gets larger and larger at each recursive call. This is completely the opposite
of what we said in class, that the argument must get smaller at each recursive call in
order for it to terminate.
While it is true that the prefix is getting larger, the number of words that can
complete the prefix is always getting smaller. That is why we are not breaking
the rules we laid out in class.
Update (October 26, 2014)
We are adding new rules for this function.
You must use recursion to implement this function.
You may not make any helper functions beyond the ones we have given you.
Try to finish this part by Tuesday, October 28.
Task 4. Puzzle Functions
The final task is to implement the functions that will help you with the puzzle game.
There are two functions to implement.
def scrabble(rack,size,pmap):
"""Returns the list of all valid words that you can form
from the tile rack using EXACTLY size letters."""
def match(template,pmap):
"""Returns the list of all valid words that match the given
template."""
Once again, the complete specifications are given in a4.py. The function
scrabble is useful for unscrambling words. The function match is
useful for solving crossword style puzzles where you know some, but not all, of the
letters. You can see how both of these functions are useful for the word puzzle game.
You will notice that both of these functions already have implementations. Both of
them call a helper function (scrabble_helper and match_helper,
respectively). The helper functions change the specification of the original functions
so they look like the Scrabble strategy we described above.
You are to implement these helper functions using recursion together with a for-loop.
These helper functions show off a common strategy with recursive functions. Many times,
it is easier to make a function recursive if we rewrite the specification. However, as
we have said many times, you are never allowed to change specifications. On the other hand,
you are always allowed to make a new function and give it a different specification. That
is what we have done here.
Once again, we have not provided any tests in a4test.py for these two
functions. It is up to you to test them. You should have at least four tests for each,
and possibly more.
Update (October 26, 2014)
We are adding new rules for these two functions.
You must use recursion to implement these two functions.
You may not make any helper functions beyond the ones we have given you.
Try to finish this part by Saturday, November 1.
Task 5. Solve Your Personal Puzzle
You now have all the tools that you need to cheat at the
word puzzle app. Start a new puzzle with your netid. If you are
working with a partner, you only need one puzzle, so pick one of your two netids.
Once you have your puzzle, solve it.
We do not care how you solve the puzzle. If you know enough words to solve
it without help, then great. However, we have picked some pretty obscure words, so
you will probably want to cheat.
When you are done, save your game. We want you to submit data.save as part
of this assignment.
Finishing the Assignment
Once you have everything working you should go back and make sure that your program
meets the class coding conventions. In particular, you should check that the
following are all true:
There are no tabs in the file, only spaces (recall, tabs look like arrows if whitespace is visible)
Functions are each separated by two blank lines.
Lines are short enough (80 chars) that horizontal scrolling is not necessary.
The specifications for all of the functions are complete and are docstrings.
Specifications are immediately after the function header and indented.
The enforced preconditions are all fully checked and asserted.
Furthermore, at the top of both a4.py and a4test.py you should
have three single line comments with (1) the module name, (2) your name(s) and netid(s),
and (3) the date you finished the assignment.
To finish the assignment, you will need to upload three files: a4.py,
a4test.py, and data.save. The last file is your saved game
from your solved puzzle. Upload these files to CMS by the due date: Sunday, November 2nd at 11:59 pm.
Do not submit any files with the extension/suffix .pyc. It will
help to set the preferences in your operating system so that extensions always appear.
Survey
In addition to turning in the assignment, we ask that you complete the survey
posted in CMS. Once again, the surveys will ask about things such as how long
you spent on the assignment, your impression of the difficulty, and what could
be done to improve it. The survey is particularly important this time because
it is the first time that we have ever given this assignment. We want to know
whether we should keep it, or look for a new assignment next year.
Please try to complete the survey within a few days of turning in this assignment.
Remember that participation in surveys comprise 1% of your final grade.
Remember that individuals must complete the survey, even if done as
a group.
The Origin of this Assignment
CS 1110 first switched to Python in 2012. Believe it or not, I (your instructor)
learned Python that same year so I could teach this class. Before that year, I had taught this
class in Java. I came to like Python a lot, and I am glad that we made the switch.
At the same time that I was learning Python, Cliff Johnson released the puzzle game
A Fool and His Money.
Among puzzle gamers, this was a game that we had been expecting for a very, very long time.
It is a sequel to the famous puzzle game
The Fool's Errand,
which I played when I was an college freshman, just like many of you.
A Fool and His Money has a lot of word puzzles in it, including the seven-cross
that you played as part of this assignment. It has a lot more word puzzles than that,
many of them much more difficult. I was convinced by
Andrew Plotkin's blog post
(himself a famous puzzle designer) that writing a program to solve a word puzzle was not
actually cheating. The interesting part is coming up with the program anyway.
So I decided to use this language that I had just learned -- Python -- to solve the
word puzzles in the game. As part of this process I discovered that Python is great
for working with strings and word puzzles. I wrote a lot of functions to solve the
game. And all of them were recursive. So when we decided to make changes to the recursion assignment,
I knew exactly what to do.
The functions that I wrote for the seven-cross were some of the simplest.
However, there are a lot more functions that I wrote that are
not part of this assignment. I am not going to tell you what those functions are.
If you want a challenge, get your own copy of
A Fool and His Money
and figure them out on your own. Have fun!
Course Material Authors: D. Gries, L. Lee, S. Marschner, & W. White (over the years)