Due: March 24, 2020 at 11:59pm. No late assignments will be accepted.
Submit your solution using CMS.
Some websites indicate to users whether a password they choose is strong or weak.
Your task in this assignment is to write a program that implements such a password classifier.
Given a password as input, it should classify that password as either strong or weak.
Learn about prior work by studying the research literature. Use a search engine to find recent and/or influential research papers that talk about this topic. Such papers are likely to have appeared in conferences like "IEEE Symposium on Security and Privacy", "USENIX Security", and "ACM Conference on Computer and Communications Security".
There will be many papers -- too many for you to read in the limited time available. By reading the abstract and introduction of a paper, though, you should be able to ascertain whether that paper will be useful for this assignment. A paper will be useful because it suggests an approach for evaluating password strength or because it explains why some previously proposed approach is flawed. Skip papers that talk about machine learning, because that approach has been ruled out.
Document your investigations in this part by listing, for between 3 and 5 papers, the following.
The write-up for each paper (typeset with 10 point font and in a document called priorwork.pdf) will likely be between 1/2 and a full page --- certainly no more than 2 pages.
Write a design document (to be called design.pdf) that discusses your classifier's implementation and explains how any findings from the literature (documented in Part 1) influenced the design. One sensible way to structure this document would be to use the following sections.
Implement your classifier based on the description you gave in design.pdf from Part 2. We will execute this implementation on some test files. We may also inspect the code.
classify.sh. This launch script will run your classifier source code, assuming you are not using bash to implement your classifier.
$ ./classify.sh path/to/input.txt path/to/output.txt
strong or weak, followed by a newline, to the output file (according to the strength of the password it just read).
The output file should end with a newline.
An automated grading script will be used to test your classifier,
and this grading script considers other outputs as incorrect.
As an example, for a given input file passwords.txt stored in directory ~/inputs/ with the following contents:
i love you
1234
2984borawQ!
your program would be called as follows:
$ ./classify.sh ~/inputs/passwords.txt ~/outputs/classified_passwords.txt
and would produce a corresponding output file classified_passwords.txt stored in directory ~/outputs/ with the following contents:
weak
weak
strong
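The input and output contract described above can be sketched as follows. This is only an illustration, assuming the classifier is written in Python and that each line of the input file holds exactly one password. The names `is_strong` and `classify_file` are hypothetical, and the rule inside `is_strong` is a throwaway placeholder, not a recommended design; the real rule should come from your design.pdf.

```python
import sys


def is_strong(password: str) -> bool:
    """Placeholder rule (an assumption for illustration only).

    Replace this with the approach described in your design document.
    """
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(not c.isalnum() for c in password),
    ]
    return len(password) >= 10 and sum(classes) >= 3


def classify_file(input_path: str, output_path: str) -> None:
    """Write one verdict per input line; every verdict ends with a newline."""
    with open(input_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Strip only the line terminator so spaces inside a password survive.
            password = line.rstrip("\n")
            dst.write("strong\n" if is_strong(password) else "weak\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    classify_file(sys.argv[1], sys.argv[2])
```

If this were saved as a file such as classify.py (a hypothetical name), classify.sh would only need to forward its two arguments to it. Because each verdict is written with a trailing newline, the output file ends with a newline as required.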
A setup script is allowed.
You may provide a setup script setup.sh to perform any initial compilation or configuration needed prior to execution of your program on test data. This script may download data files (e.g. wordlists). But note that any wordlists we use in generating test cases will be those we could freely download, not wordlists for which payment is required. So you have no motivation to pay for wordlists.
There are many password classifiers available for download on the web. Needless to say, you shouldn't be consulting them to write your system and you shouldn't be downloading them as part of the files that your system downloads prior to execution. To do otherwise would be a serious violation of academic integrity.
Target Environment. You may develop your system anywhere. But we will grade your system by running it on the Linux hosts in UGCLab (ugclinux.cs.cornell.edu; see here for more information). So use a programming or scripting language available within this environment, and use Linux hosts in UGCLab to test what you will submit.
Submissions that do not run on the Linux hosts in UGCLab will receive no credit for executing correctly. Visit the UGCLab and test your system before you submit it, leaving plenty of time to make changes that may be needed.
What to submit. CMS will be set up for submissions of various elements, as follows.
classifier.zip containing:
setup.sh and classify.sh for preparing and running your classifier, respectively.
README.txt that documents how these scripts install, configure, and run your classifier. This document must be sufficiently clear that we can get your classifier installed and running within a couple of minutes. Instructions that are unclear will be penalized.
strong.txt and weak.txt that should contain 10 strong and 10 weak passwords, respectively. Each password in these files should be separated by a newline, and the files must end with a newline as well. We will use our classifier to check the passwords in these files (among others), and we may also use these files as input for testing other submissions.
There will be a CMS-enforced limit of 10MB on the size of the archive; design your classifier with this constraint in mind.
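The format rules for strong.txt and weak.txt can be checked mechanically before submitting. The sketch below is one possible check, assuming Python is available on your development machine. The function name `check_password_file` and its messages are hypothetical and are not part of any grading tool.

```python
def check_password_file(path: str, expected_count: int = 10) -> list:
    """Return a list of format problems; an empty list means the file looks fine."""
    with open(path, "rb") as f:
        data = f.read()

    problems = []
    if not data.endswith(b"\n"):
        problems.append("file does not end with a newline")

    lines = data.split(b"\n")
    # A trailing newline leaves one empty element after the split; drop it.
    if lines and lines[-1] == b"":
        lines.pop()

    if len(lines) != expected_count:
        problems.append(
            "expected %d passwords, found %d" % (expected_count, len(lines))
        )
    if any(line == b"" for line in lines):
        problems.append("file contains an empty line")
    return problems
```

Calling `check_password_file("strong.txt")` and `check_password_file("weak.txt")` should each return an empty list for a correctly formatted submission.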
Notes on Grading. Here is a rough breakdown of the relative importance of your submission.
strong.txt, and weak.txt.
Note, a particularly "bad" classifier could get penalized both because the prior work was poorly chosen and because those methods led to bad performance.
Resources. For a refresher on bash scripting, see these CS 2043 lectures: here and here.