CS212
Spring 1998
Problem Set 2
Public Key Cryptography

Issued: Monday, February 9
Due: Tuesday, February 24

About Partners and Academic Integrity

You may work with a (at most one) partner on this problem set. If you do choose to work with a partner, you must hand in one copy of your solutions with both of your names on it. If you do work with a partner, you must work with the same partner on the entire problem set. (You may not work with one person for questions 1 and 2 and another for the rest of the problem set.)

You may not share your code with anyone other than your partner, if you have one. You may not look at anyone else's code.

Many of the problems in this (and later) problem sets require output. The output that you hand in MUST have been produced by the code that you hand in. If your code doesn't work, then you most definitely should not hand in output showing it working. If you write some function (that accepts arguments) and submit its output, please specify which arguments were given to the function to produce the output you are submitting.

You would be amazed at how easy it is to figure out that someone has cheated, either by copying code or handing in spoofed output. Doing so risks failure in the class or worse, including possibly expulsion. See the Cornell Code of Academic Integrity. If you have any questions about what you are or are not permitted to do, feel free to contact the course staff.

Public key encryption and digital signatures play an important part in achieving private communication in a world that relies increasingly upon digital information. The fact that there are fast algorithms for exponentiation and for testing prime numbers lies at the root of RSA, a popular method for implementing public key encryption. In this problem set you will implement a version of the RSA system. By doing so, you will gain experience with some algorithms that, although simple, have a deep mathematical foundation and great practical importance.

This problem set consists of two written exercises and five programming exercises.

Written Exercises

Written Exercise 1

Principle: Induction.

Prove by induction (on n) that

Written Exercise 2

Principle: Substitution model.

Consider the following function:

   (define (iter <function>)
     (method ((f <function>) (i <integer>) (a <object>))
       (if (= i 0)
           a
           (f (iter f (- i 1) a)))))

Determine what the value of

   (iter plus-one 3 0)

is, where

   (define (plus-one <function>)
     (method ((x <number>))
       (+ x 1)))

The RSA Cryptosystem

Cryptographic systems typically use keys for encryption and decryption. An encryption key is used to convert the original message (the plaintext) to coded form (the ciphertext). A corresponding decryption key is used to convert the ciphertext back to the original plaintext.

In traditional cryptographic systems, the same key is used for both encryption and decryption. Two parties can exchange coded messages only if they share a secret key. Since anyone who learns that key would be able to decode the messages, keys must be carefully guarded and transmitted only under tight security.

Diffie and Hellman (1976) discovered a new approach to encryption and decryption: public key cryptography. In this approach, the encryption and decryption keys are different, and knowing the encryption key cannot help you find the decryption key. Thus you can tell your encryption key to anyone who wants to send you a message. They can then use it to encode a message to send to you. You do not have to worry about key security at all, for even if everyone in the world knew your encryption key, no one could decrypt messages sent to you without knowing your decryption key, which you keep private to yourself.

A popular method for implementing this scheme is due to Rivest, Shamir, and Adelman and is known as the RSA system.

How RSA Works

RSA does not work with characters, but with integers. The ASCII standard representation of a single character is a 7-bit integer. We will represent each block of two characters as a 14-bit integer obtained by concatenating the ASCII representations of the two characters. Code will be provided to convert character strings to lists of 14-bit integers and vice versa.

In the RSA scheme, you select two large prime numbers p and q. Recall that a prime number is a positive integer with no divisors other than itself and 1. You then define


  n = pq
  m = (p-1)(q-1)

The significance of these definitions is that

with very few exceptions, almost all numbers less than n, when raised to the power m modulo n, give 1;
m is difficult to compute, even if you know n.

Note: the notation [a=b] mod m means that a mod m = b mod m.

Now you pick a number e < m relatively prime to m, i.e., such that e and m have no factors in common except 1; then compute a number d such that [de = 1] mod m, i.e. de = km+1 for some k. We'll see how to do this below. Your public key, which you can advertise to the world, is the pair (n,e). Your private key is (n,d).

Anyone who wants to send you a secret message s (represented by an integer) encrypts it by computing


  E(s) = s^e mod n

That is, if the plaintext is represented by the number s, then the ciphertext E(s) is obtained by raising s to the power e, then taking the remainder modulo n.

The decryption process is exactly the same, but using d instead of e:


  D(s) = s^d mod n

The operations E and D are inverses:


D(E(s)) = (s^e)^d mod n
        = s^de mod n
        = s^1+km mod n
        = s(s^m)^k mod n
        = s(1)^k mod n
        = s mod n
        = s

The integer s representing the plaintext must be less than n. That's why we break the message up into 2-character chunks. Also, this only works if s is relatively prime to n, i.e. has no factors in common with n other than 1. If n is the product of two large primes, then all but negligibly few messages s < n satisfy this property. If by some freak chance s and n turned out not to be relatively prime, then the code would be broken; but the chances of this happening by accident are insignificantly small. It is easy to compute d efficiently from e and m. We show how below. Thus, to use the RSA system: pick large primes p and q; compute n = pq and m = (p-1)(q-1); choose e and use this to compute d such that [de = 1] mod m; publish the pair (n,e) as your public key, but keep d secret to yourself.

How secure is RSA? At present, the only known way to obtain d from e and n is to factor n into its prime factors p and q, then compute m and proceed as above. But no one knows how to factor large integers efficiently, despite centuries of effort by number theorists. Using known methods, factoring n = pq where p and q are 1000-digit primes would require years on today's fastest supercomputers. Until someone comes up with an efficient way to factor, or discovers some other way to compute d from e and n, the system is secure for all practical purposes.

Digital signatures

Public key encryption can be used to solve another problem of secure communication. Suppose you wish to send a message by electronic mail, and sign it so that the recipient can be sure that it really came from you. What is required is some scheme for signing a message in a way that cannot be forged. This is called a digital signature.

This can be done as follows. Apply a compression function (also called a hash function) that transforms the message to a single, relatively small number h. In general, there will be many messages that produce the same hash value. Now transform the hash value using your private key to give D(h). This is your digital signature, which you transmit along with the message. Anyone who receives the message can authenticate the signature by transforming it using your public key, giving E(D(h)) = h, and checking that this gives the same result as applying the compression function to the message.

Anyone who wanted to forge a message claiming to be from you must produce the number D(h). Anyone can compute the hash value h of the message, since the compression function is public. But as long as you are the only one who knows your private key, only you can produce D(h).

We can combine the two techniques as follows. Suppose Hillary wants to send Bill a message that only Bill can read, and sign the message so that Bill knows it could only be from her. She encrypts the message using Bill's public key. Then she signs the encrypted result using her own private key, using the signature method just described. When Bill receives a message, he first uses Hillary's public key to authenticate the signature, then decrypts the message using his own private key.

Bill can be sure that only someone with Hillary's private key could have sent the message. Hillary can be sure that only someone with Bill's private key can read the message. This is accomplished without exchanging any secret information between Bill and Hillary. It's this capacity for achieving secure communication without having to worry about exchanging secret keys that makes public key cryptography such an important technique.

Digital time stamping

One problem that needs to be solved before electronic documents can be considered legally binding is the need to verify the time at which the document was created. This is called a digital time stamp. It is much like using a public notary, but in fact can be easier, cheaper, and more secure. Typically, a digital time stamper should be a neutral party such as the government (hmmm). One means of digital time stamping is to have the owner of the document calculate the hash value of the document that they want stamped and send this to the time-stamper. The time-stamper encrypts a message that contains the hash value and the time that the request was received, using its own private-key. This time-stamp message is then returned to the sender, who can append it to their document. If at a later date someone is curious about or challenges the date when the document was written, they can decrypt the time stamp using the public key of the time-stamper and verify that the hash value of the document matches the hash value included in the time stamp. The security of the systems rests in the security of the time-stamper's private-key, a person who gains access to the private key of the time-stamper would be able to forge another date.

Added security can be built in by getting a document time-stamped by multiple time-stamping authorities, then a person would have to obtain the private key of all the authorities to be able to forge a date. This feature will probably be implemented by a hierarchy of time-stampers; that is, when one sends a request for a time-stamp, the time-stamper will include a time-stamp from another authority, and so on. This chain can only be forged if the forger know the private keys of all the authorities on the chain. Digital time-stamping also has the added advantage that the requestor does not need to reveal the whole document to the time-stamper, as one does with a public notary, instead one only sends the hash-value. Finally, it makes altering very difficult, since one would invalidate the time-stamp if they changed the document to any other that didn't have the same hash-value.

To illustrate a use for digital time-stamping, let's continue with the example of Bill and Hillary. After exchanging a series of torrid secret email messages for a couple of hours, Hillary and Bill decide to get married the next day. Hillary wants to have a prenuptial agreement because she makes much more money than Bill. Hillary's lawyer is vacationing in Europe, so he is unable to give them a physical copy of the prenuptial agreement to sign; instead he emails them a copy of the document. After they both read the document, they sign it using the method discussed in the previous section. The calculate the hash value of the document with signatures and have it time-stamped. A few years later when they file for divorce, Bill claims that Hillary got access to his private key and forged the document after they got married. Hillary produces the time-stamp, proving that the prenuptial agreement was signed before they were married. Bill's only recourse now is to try to claim that Hillary knew his digital signature back before they were married.

Implementing RSA

We'll assume that an RSA key is represented as a pair of integers, the modulus and the exponent (the pair (n,e) for a public key and (n,d) for a private key). We provided a very simple abstraction for this structure. Keys are implemented as <pair>s. We also defined creator and accessor functions for the new type, make-key, key-modulus and key-exponent. The creator make-key is a function that takes two integers, a modulus and an exponent, and returns an object of type <key> with the given modulus and exponent. The accessor key-modulus is a function that takes an object of type <key> and returns the modulus, and similarly for key-exponent.

The basic RSA transformation is then

 (define (RSA-transform <function>)
  (method ((number <integer>) (key <key>))
    (expmod number (key-exponent key) (key-modulus key))))

which implements both the encryption and decryption operations, depending on whether the key is public or private.

To generate RSA keys, we first need a way to generate primes. The most straightforward way is to pick a random number in some desired range and start testing successive numbers from there until we find a prime. The following procedure starts searching at a randomly chosen integer between smallest and (+ smallest range):

 (define (choose-prime <function>)
   (method ((smallest <integer>) (range <integer>))
     (bind (((start <integer>) (+ smallest (random range))))
                (search-for-prime
                 (if (even? start)
                     (inc start)
                     start)))))

 (define (search-for-prime <function>)
   (method ((guess <integer>))
         (if (fast-prime? guess)
            guess
            (search-for-prime (+ guess 2)))))

The test for primality is the Fermat test. This is a probabilistic method, which with high probability determines whether or not n is prime. It is based on Fermat's Theorem, which states that if n is prime, then all numbers a in the range 0 < a < n satisfy a^n-1 = 1; and if n is not prime, then at most half do (with a relatively small set of exceptions called Carmichael numbers, which are composite numbers that look prime as far as Fermat's test is concerned. There are so few of these we won't worry about hitting one).

(define (fermat-test <function>)
  (method ((n <integer>) (a <integer>))
        (= (expmod a n n) a)))

(define (fast-prime? <function>)
  (method ((n <integer>) )
    (and (fermat-test n 2)
         (fermat-test n 3)
         (fermat-test n 5)
         (fermat-test n 7))))

Now we can generate a public RSA key and matching private key. We'll represent this as a pair of keys, with the expected functions to create and access parts of a pair of keys:

(define <key-pair> <pair>)

(define (key-pair-public <function>) head)

(define (key-pair-private <function>) tail)

(define (make-key-pair <function>)
  (method ((public <key>) (private <key>))
          (pair public private)))

The following procedure generates an RSA key pair. It picks primes p and q that are in the range from 2^r to 2*2^r so that n = pq will be in the range 2^2r to 2^2r+2. In order to encode two characters per number, we need primes larger than 2¹⁴. In this problem set, we're using small values of n (2¹⁸ to 2²⁰) because we want you to play around with cracking an RSA system. By starting with larger random numbers, you can use the same method to produce a system that really is secure.

(define (generate-RSA-key-pair <function>)
  (method ((r <integer>))
          (bind (((size <integer>) (^ 2 r))
                 ((p <integer>) (choose-prime size size))
                 ((q <integer>) (choose-prime size size)))
              (if (= p q)   ; check that we haven't chosen the same prime twice
                  (generate-RSA-key-pair) ; VERY unlikely
                  (bind (((n <integer>) (* p q))
                         ((m <integer>) (* (- p 1) (- q 1)))
                         ((z <pair>) (select-keys m))
                         ((e <integer>) (head z))
                         ((d <integer>) (tail z)))
                       (bind ((a (make-key n e))
                              (b (make-key n d)))
                           (make-key-pair a b)))))))

Computing the Keys

The public and private keys e and d must satisfy


[de = 1] mod m

i.e. e and d must be multiplicative inverses mod m. One can show that a solution to this equation exists if and only if the greatest common divisor (gcd) of e and m is 1. We can pick e randomly in the range 0 < e < m, then calculate the gcd of e and m using the Euclidean algorithm. If the gcd is not 1, we pick again. The number d falls out as a by-product of this computation.

The Euclidean algorithm is based on the fact that for any positive integers k and m, there is a unique quotient q and remainder r such that


k = qm + r
0 <= r < m

Moreover, the gcd of k and m, call it g, is the same as the gcd of m and r. This gives us a recursive means of calculating g and integers s and t such that sk + tm = g: recursively compute u and v such that um + vr = g; then observe


g = um + vr
  = um + v(k - qm)
  = vk + (u - vq)m

so we can take s = v and t = u - vq.

(define (Euclid <function>)
;;; Euclidean algorithm for integers
;;; given m, n, calculates (s t g) where
;;; g = gcd(m,n) and sm + tn = g
  (method ((m <integer>) (n <integer>))
          (if (zero? n) (list 1 0 m)
              (bind ((q (/ m n))
                     (r (modulo m n))
                     (z (Euclid n r))
                     (u (first z))
                     (v (second z))
                     (g (third z)))
                    (list v (- u (* q v)) g)))))

Thus the call (Euclid e m) can be used to test whether the gcd of e and m is 1. If so, it also returns s and t such that se + tm = 1. The desired multiplicative inverse d of e is s mod m (we have to reduce mod m because the s returned by Euclid might be negative).

Encrypting and Decrypting Messages

Finally, to use RSA, we need some standard way to transform between strings of characters and numbers. We have provided procedures that convert a string to a list of numbers and vice versa.

The code for the problem set includes the conversion procedures string->intlist and intlist->string that convert between character strings and lists of integers. An integer between 0 and 2²⁸ encodes 4 successive characters from the message. If the total number of characters is not a multiple of 4, the message is padded with spaces.

 (string->intlist "This is a string.")
 (13396 14825 13472 4211 4193 14963 13554 13294 4142)
 (intlist->string '(13396 14825 13472 4211 4193 14963 13554 13294 4142))
 "This is a string. "

The code for these two procedures is included with the problem set code, but you are not responsible for it. You may want to look at it if you are interested in how character strings can be manipulated in Dylan.

To encrypt a message, we transform the message into a list of numbers and convert the list of numbers using the RSA process:

 (define (RSA-encrypt <function>)
  (method ((string <string>) (key <key>))
   (RSA-convert-list (string->intlist string) key)))

You might guess that the right way to encode the list of numbers would be to encode each number in the list separately. But this doesn't work well (it makes it much easier to crack the code). Instead, we encrypt the first number, add that to the second number (modulo n) and encrypt the result, add that to the next number and encrypt the result, and so on, so that each number in the resulting encrypted list will depend upon all the previous numbers.

Suppose that the unencrypted message is of the form

 A B C

and let [x] represent the RSA-transform of x. Then what all of this means is that the encryption of this message is

 [A] [[A]+B] [[[A]+B]+C]

Clearly, then, to unencrypt the message you have to remove the boxes and do some relatively simple arithmetic.

This is implemented as follows:

 (define (RSA-convert-list <function>)
   (method ((intlist <list>) (key <key>))
     (bind (((n <integer>) (key-modulus key)))
       (bind-methods
         ((convert ((l <list>) (sum <integer>))
             (if (null? l)
		 '()
                 (bind (((x <integer>)
                   (RSA-transform (modulo (+ (head l) sum) n) key)))
                     (pair x (convert (tail l) x))))))
       (convert intlist 0)))))

We'll leave it to you to implement the analogous RSA-unconvert-list procedure that reverses this transformation. Using that, we have:

 (define (RSA-decrypt <function>)
   (method ((intlist <list>) (key <key>))
     (intlist->string (RSA-unconvert-list intlist key))))

Finally, to generate digital signatures for encrypted messages, we need a standard compression function. In this problem set, we'll simply add the integers modulo 2¹⁸. (In practice, people use more complicated compression schemes than this.)

 (define (compress <function>)
   (method ((intlist <list>))
        (bind-methods 
           ((add-loop ((l ))
                      (if (null? l)
                          0
                          (+ (head l) (add-loop (tail l))))))
           (modulo (add-loop intlist) (^ 2 18)))))

Programming Exercises

You may load ps2.dyl by evaluating (load ps2).

To test the code, evaluate

 (define (test-public-key1 <key>) (key-pair-public test-key-pair1))
 (define (result1 <list>)
   (RSA-encrypt "This is a test message." test-public-key1))

Then result1 should be the list

(430373 143608 284252 756403 807616 440414 953872 178002 587685 776299 913714 615741)

We have generated a sample RSA key pair test-key-pair1 for you to test your code with. Keep in mind that punctuation and case matter.

Programming Exercise 1

The code is missing one of the procedures required to decrypt messages, RSA-unconvert-list. Implement this procedure, which takes as arguments a list of integers to decode and a decoding key, and returns a list of integers, undoing the transformation implemented by RSA-convert-list. (Hint. This procedure is similar in form to RSA-convert-list. If you find yourself doing something much more complicated, then you are barking up the wrong tree. Ask for help if necessary.)

To test your procedure, try

 (define (test-private-key1 <key>)
  (key-pair-private test-key-pair1))
 (RSA-unconvert-list result1 test-private-key1)

You should obtain the result

(13396 14825 13472 4211 4193 13044 14963 13984 14821 12531 13031 4142)

If that works, then you should be able to evaluate

 (RSA-decrypt result1 test-private-key1)

to obtain the original test message (except for some trailing spaces).

For this problem turn in a listing of your procedure, the sample encryption and decryption of the test message, and a sample encryption and decryption (using test-key-pair1 and test-key-pair2) of some messages of your choice.

Programming Exercise 2

In this exercise, you will implement the method for signing and encrypting messages described above. Part A asks you to create a simple data type to store signed messages. In part B, you will write a function to encrypt and sign a message, and also write a function to reverse that process (i.e. to authenticate a signed message and decrypt it.).

Part A

The first step in adding signed messages is to create the data type for signed messages. Later in the semester, you will learn how to create types using the define-class special form. For this question, however, you should instead use higher-order functions to create the type of signed messages. You need to write functions make-signed-message, signed-message-message, and signed-message-signature to fulfill the following contract:

	(signed-message-message (make-signed-message msg sig)) ==> msg
	(signed-message-signature (make-signed-message msg sig)) ==> sig

Note that make-signed-message is a function taking a message (which is a <list>) and a signature (which is just an <integer>) and returning a <signed-message>. Additionally, signed-message-message and signed-message-signature should take as input a <signed-message>.

Your function make-signed-message can be implemented in any way that you please ... as long as you can use it to write signed-message-message and signed-message-signature to fulfill the contract.

To get started, you should define the type of signed messages as follows:

	(define <signed-message> <function>)

For this problem, you should hand in your listings for the three signed-message functions, as well as output showing that your functions fulfill the above contract. (Hint: one of the written exercises is rather similar to this question.)

Now that you have defined signed messages, evaluate:

(define secret-message 
   (make-signed-message secret secret-signature))

to create the secret message. secret and secret-signature are both defined in the ps2.dyl file. Later on you will be cracking the RSA system and figuring out what the message is.

Part B

Define encrypt-and-sign, a procedure that takes as arguments a message to be encrypted and signed, the sender's private key, and the recipient's public key. The procedure should encrypt the message, compute a digital signature for it, and combine these to produce a signed message.

As a test, try

 (define (result <signed-message>)
   (encrypt-and-sign "Test message from user 1 to user 2"
          test-private-key1
          test-public-key2))

You should obtain a signed message whose message part is

(298372 294815 233763 477562 207235 481850 141975 330743 83249 524335 105034 270882 533959 492599 214913 121753 440135)

and whose signature part is 303892.

Now implement the inverse transformation authenticate-and-decrypt, which takes as arguments the received signed message, the sender's public key, and the recipient's private key. If the signature is authentic, the procedure should produce the decrypted message. If the signature is not authentic, the procedure should indicate this. Test your procedures by trying

	(authenticate-and-decrypt result test-public-key1 test-private-key2)

to recover the original message. For this problem turn in a listing of your procedures together with a demonstration that they work. Don't forget to demonstrate that they catch fraudulent signatures. You can use the two test keys for this.

Programming Exercise 3

For the next exercise we will need the procedure smallest-divisor, which given an integer n finds the smallest integer that divides n with no remainder. You should write a procedure that finds the smallest divisor by trying all the integers that could divide n. You can save time by checking various cases. For example, what is the largest integer that needs to be checked? What other numbers do not need to be considered (e.g. n is even)?

The procedure smallest-divisor should be completely stand-alone; that is to say, it should take n as an input parameter and not assume anything about n except that it is an integer.

For this problem turn in a listing of your smallest-divisor procedure and some examples of it working.

Programming Exercise 4

You now have a basic implementation of an RSA cryptosystem, complete with facilities for encryption, decryption, digital signatures and signature authentication, and generation of new keys. Since we have used such small primes to generate keys, you should also be able to crack the system. In order to crack an RSA system, recall you must factor the modulus n into its component prime factors p and q. You can do this using the procedure smallest-divisor. Note that you only need to find one prime divisor p; the other divisor is q = n/p. Write a procedure crack-RSA that, given a public key, returns the associated private key. Test your procedure using the pairs test-key-pair1 and test-key-pair2 to show that it generates the correct private keys, given the public keys.

For this problem turn in a listing of your procedure, together with demonstrations that it works.

Programming Exercise 5

Use your crack-RSA and authenticate-and-decrypt procedures to determine who sent the signed message in secret-message, who the message was intended for and what the content of the message is. The message and the public keys for various possible senders and receivers are at the end of the file ps2.dyl.

For this problem turn in the results of using these procedures to decrypt the message.

Source: W. Diffie and M. Hellman, ``New directions in cryptography,'' IEEE Transactions on Information Theory, IT-22:6, 1976, pp. 644-654.

Last Modified: 01/19/00 11:36 PM

CS212 Spring 1998 Problem Set 2 Public Key Cryptography

Issued: Monday, February 9 Due: Tuesday, February 24