Lecturer: Tom Roeder
Lecture notes by Tom Roeder
We have now defined two tasks that are hard to perform: inverting a one-way function and distinguishing the output of a pseudo-random function from the output of a truly random function. We then gave high-level definitions of more useful operations: cryptographic hash functions and encryption, which can be based on one-way functions and pseudo-random functions, respectively. But shared keys are inherently limiting: a key must be shared between each pair of principals, which complicates the process of adding new principals to the system. Similarly, shared-key operations are not easily applicable to cases where one principal performs an operation that affects many principals. An asymmetric-key setup would solve both of these problems: each principal has its own key information that it does not need to share in secret with other principals.
For an example of how problems arise in symmetric-key settings, consider how we might perform some of our shared-key operations in a context with, say, three principals A, B, and C. Principal A wants to send a message to B and C in such a way that both know that it came from A. If A and B share key k_AB and A and C share key k_AC, then it's not obvious how to send a bit string that guarantees this property (though such schemes exist); the naive solution of computing the pair (MAC(m, k_AB), MAC(m, k_AC)) and sending it as an authenticator doesn't work if B and C don't trust each other or don't trust A, since one element of the pair might pass the check for one principal while the other element fails the check for the other principal. If A, B, and C all share a single key instead, then B or C could create a MAC that appears to come from A.
So, shared keys between more than two principals lose some properties. First, they lose their binding to identities. Second, authentication for different principals cannot be guaranteed. Third, they complicate open systems, in which new principals can appear at any time, since new principals must be given a key shared with each other principal.
To get around this problem, recall the example of the stock broker. The client published a pair of numbers M1 and M2. It happened that the stock broker was the principal that used these numbers and checked them, but any principal could have performed the stock broker's actions, since M1 and M2 were published by the client. Key information published in this way, like M1 and M2, is called a public key, and the corresponding secret values m1 and m2 are called the private key.
Two-key or asymmetric cryptography relies on the existence of a computational primitive called a trapdoor function. A trapdoor function maps a domain to a range in such a way that it is easy to go from the domain to the range and hard to go from the range to the domain, except that going from the range to the domain becomes easy given a special string called the trapdoor.
We parametrize families of trapdoor functions on keys. So, the public key K (written as a capital letter) is used to go from the domain to the range and the private key k (written as a lower-case letter) is used to go from the range to the domain. As in our example, the private key should only be known to one principal. Further, the private key and the public key are usually related by some function f such that K = f(k); we would like f to be one-way so that it is hard to find the private key given the public key.
The existence of trapdoor functions implies the existence of one-way functions; since one-way functions are not known to exist, trapdoor functions are not known to exist either, but there are many candidate functions that are believed to be trapdoor functions. For an intuition as to why such functions might exist, consider a related problem: finding and checking witnesses for NP languages. The definition of NP requires that it be easy to check a witness, but it is not guaranteed to be easy to find a witness in the worst case.
Note further that unlike shared key operations, there can be no public-key security against adversaries with unlimited computational resources! Since K = f(k) and f is public, any such adversary can just search the space of keys to find k given K.
Now let's return to our examples from symmetric cryptography and see if we can generalize them to run in open systems using asymmetric cryptography.
In an open system, given any two principals A and B, A should be able to encrypt a message that can only be decrypted by B. If there is some binding established between principal identities and public keys, then these operations can easily be performed. A naive scheme might function as follows: principal A looks up public key K_B for principal B and uses it to compute an encryption for B using some trapdoor function, c = f_{K_B}(m). Then B, on receipt of this message, computes f_{k_B}^{-1}(c) = m.
But there's a significant problem with this scheme given our definitions of security for shared-key encryption: it doesn't satisfy Semantic Security, since it's trivial for an adversary to compute f_{K_B}(m) and f_{K_B}(m') and compare them against given ciphertexts in the different attack models. Once again we see that there is no Semantic Security without probabilistic encryption. This matters even more in the public-key setting, since, by definition, every principal has access to an encryption function for every other principal. Especially when the space of possible messages is small, it is easy to simply check all messages under the encryption function to figure out what has been encrypted.
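To make the last point concrete, here is a minimal sketch of the generic attack on any deterministic public-key scheme; the `encrypt` parameter stands in for whatever public encryption function the adversary can run (a hypothetical interface, not a specific library):

```python
def break_deterministic(ciphertext, candidates, encrypt):
    """Recover the plaintext of a deterministic encryption by guessing.

    Works against any deterministic scheme: the adversary simply re-encrypts
    each candidate message under the public key and compares the result to
    the observed ciphertext.
    """
    for m in candidates:
        if encrypt(m) == ciphertext:   # anyone can run the public function
            return m
    return None  # the plaintext was outside the guessed candidate set
```

If the plaintext is known to be, say, "buy" or "sell", this attack succeeds after at most two encryptions, which is why probabilistic encryption is essential in the public-key setting.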
Handwritten signatures on physical documents are of great practical value under the (rarely true) assumption that only the given individual could have produced the signature. By analogy with handwritten signatures, digital signature schemes compute signatures that are appended to documents and used for authentication. One-way functions are known to be both necessary and sufficient for digital signatures, but practical constructions of digital signature schemes built directly on one-way functions are not known.
Given a trapdoor function and a setup giving a binding from identities to public keys, a simplistic digital signature scheme might do exactly the opposite of encryption: principal B uses private key k_B on message m to compute s = f_{k_B}^{-1}(m). Then, to check s, any other principal A would use K_B to check that f_{K_B}(s) = m.
In 1976, Whitfield Diffie and Martin E. Hellman wrote a paper called "New Directions in Cryptography," in which they described the idea of asymmetric cryptography.
This operation allows two principals to set up a shared key given a public-key system. It relies on a computational assumption called, unsurprisingly, the Computational Diffie-Hellman (CDH) assumption. CDH states that, given a generator g of a cyclic group of size n (classically, the multiplicative group of a finite field) and randomly chosen values a and b, it is hard for an adversary to construct g^{ab} given only g, g^a, and g^b. If this assumption holds, then two principals A and B can exchange a shared key as follows: A chooses a random a and sends g^a to B; B chooses a random b and sends g^b to A; A computes (g^b)^a = g^{ab}, and B computes (g^a)^b = g^{ab}. Both now hold the shared value g^{ab}, which an eavesdropper who saw only g^a and g^b cannot compute if CDH holds.
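A minimal sketch of this exchange in Python, with the caveat that the parameters are toy choices (a Mersenne prime and a small generator), not a vetted group:

```python
import secrets

# Toy group parameters: p = 2^127 - 1 is prime, but real deployments use
# standardized groups with much larger, carefully chosen primes.
p = 2 ** 127 - 1
g = 3

a = secrets.randbelow(p - 2) + 1   # A's secret exponent
b = secrets.randbelow(p - 2) + 1   # B's secret exponent

A_msg = pow(g, a, p)               # A sends g^a mod p to B
B_msg = pow(g, b, p)               # B sends g^b mod p to A

# Each side raises the other's message to its own secret exponent.
shared_A = pow(B_msg, a, p)        # (g^b)^a = g^{ab}
shared_B = pow(A_msg, b, p)        # (g^a)^b = g^{ab}
assert shared_A == shared_B        # both principals now hold g^{ab}
```

An eavesdropper sees g, g^a mod p, and g^b mod p; computing g^{ab} from these is exactly the CDH problem.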
RSA was the first publicly known instantiation of a public-key cryptosystem. Rivest, Shamir, and Adleman shared the 2002 Turing Award for this development.
It was revealed in 1997, however, that the RSA function and algorithm had already been discovered in classified work before Rivest, Shamir, and Adleman invented it. Of course, Rivest, Shamir, and Adleman did their work entirely independently.
The definition of encryption in the public-key setting is very similar to the definition in the shared-key setting, but since public keys allow encryption and are known to all principals by assumption, every principal has access to an encryption machine as in the CPA attack model. In shared key encryption we can talk about security of schemes when an adversary has seen the encryption of only one message. But, since adversaries have access to encryption functions by default in the public-key setting, public-key encryption schemes must always be secure under CPA.
A public-key encryption scheme (E, D) has two properties:
1. Decryption inverts encryption: for every message m and key pair (K, k), D(E(m, K), k) = m.
2. Security: without the private key k, no efficient adversary can learn anything about m from E(m, K); as argued above, this must hold even under CPA.
The same attack models apply here. Although an encryption function is automatically given to each principal, nothing immediately guarantees that adversaries have access to a decryption function. So Chosen Ciphertext Attack (CCA) and Chosen Ciphertext Attack 2 (CCA2) still apply, and, as before, CCA2 implies Non-Malleability.
Public-key encryption functions operate on fixed-size inputs and produce fixed-size outputs, just like shared-key functions, so the same comments on encryption modes apply here.
Public-key operations are significantly slower than corresponding shared-key operations. There are two factors at play here. First, shared-key operations are often based on low-level bit-manipulation primitives (although these primitives might have algebraic interpretations); standard computer hardware is already optimized for such operations, so they can be performed very quickly. Public-key operations, by contrast, are often based on exponentiation of large integers; commodity hardware is not optimized for these operations, though special hardware exists to compute them more quickly.
Second, public keys must have many more bits than shared keys that achieve the same level of security. This is mostly because shared keys are kept secret between each pair of principals, so an adversary has little choice (barring clever attacks on the cryptosystem) but to search the space of keys to try to find the right one. A public key, on the other hand, uniquely determines the corresponding private key, so some structure can be exploited by an adversary trying to find the private key.
The original RSA algorithm was specified with keys of length less than 500 bits, but recently it has been claimed that even 1024-bit keys could be factored, so 1024-bit RSA should be deprecated in favor of larger key sizes like 2048-bit RSA. On fast modern processors as of 2007, it takes on the order of 1 ms to perform a 1024-bit RSA operation, whereas a hash function can be computed on the order of 1 μs. 2048-bit RSA takes on the order of 30 ms.
To contrast and compare the key sizes, although 1024-bit RSA is already suspect and systems are moving to 2048-bit RSA, 128-bit AES keys are expected to remain secure long into the future (barring new attacks on AES).
To get the speed of symmetric-key operations in open systems, key exchange protocols have been developed that initially use public-key operations to establish a shared key for a given communication session and then use that shared key (under, e.g., AES) for the remainder of the session. A simplistic example involves encrypting a large amount of data x. Given a secure public-key encryption scheme (E, D) with public key K_j for principal j, principal i can generate a new shared key k for AES and send AES_k(x) || E_{K_j}(k). Then j can decrypt to recover k and use k to decrypt x at high speed. Key k can then be used for a session of communication between i and j.
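A sketch of this hybrid pattern using the third-party `cryptography` package; the specific primitives here (RSA-OAEP to wrap the key, AES-GCM for the bulk data) are illustrative choices, not part of the scheme's definition:

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Principal j's long-term key pair (K_j, k_j).
k_j = rsa.generate_private_key(public_exponent=65537, key_size=2048)
K_j = k_j.public_key()

def hybrid_encrypt(x: bytes, public_key):
    k = AESGCM.generate_key(bit_length=128)       # fresh session key k
    nonce = os.urandom(12)
    body = AESGCM(k).encrypt(nonce, x, None)      # AES_k(x): the bulk data
    wrapped = public_key.encrypt(k, OAEP)         # E_{K_j}(k): the wrapped key
    return wrapped, nonce, body

def hybrid_decrypt(wrapped, nonce, body, private_key) -> bytes:
    k = private_key.decrypt(wrapped, OAEP)        # slow public-key step, once
    return AESGCM(k).decrypt(nonce, body, None)   # fast symmetric step, bulk

data = b"a large amount of data " * 1000
wrapped, nonce, body = hybrid_encrypt(data, K_j)
assert hybrid_decrypt(wrapped, nonce, body, k_j) == data
```

The design point is that the slow public-key operation runs once per session on a tiny input (the key), while all the bulk data is handled symmetrically.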
To generalize MACs to signatures requires definitions that defend against the equivalent of CPA. Instead of being able to see encryptions of its choice, an adversary should be able to see signatures of its choice, but still not be able to come up with a signature on a new message. A signature scheme consists of a signing algorithm S and a verification algorithm V. S takes a message and a private key and outputs a signature. V takes a message, a signature, and a public key and outputs 1 if the signature is valid for the message given the key. Otherwise, V outputs 0. A message m along with a signature s under key k is written <m>_k, which we call a signed message.
This attack model is called Chosen Message Attack (CMA). No adversary should be able to find a message m and signature s such that V(m, s, K) = 1, even given access to a machine that computes S, as long as the adversary never actually requested S(m, k).
It is instructive in this case to consider why there are no attack models corresponding to CCA and CCA2. In public-key cryptosystems, the verification function V is public, so all principals automatically have access to a verification function and can perform arbitrary verification requests. Moreover, those models of semantic security do not apply here in any case, since there is no notion of secrecy with respect to signature schemes; a signature attests to the fact that a message was signed by a given principal, but does not hide any information.
Note that given a binding between identities and public keys, signature algorithms satisfy a very useful sort of consistency: any principal that receives a message and signature pair and finds that V(m, s, K) = 1 knows that it can send m and s to any other principal and that principal will also find that V(m, s, K) = 1.
Note that a signature scheme is a fundamentally asymmetric operation: one principal attests to many principals about a given document.
Merkle's Puzzles was one of the first public-key cryptographic systems to be described. It allows principals A and B to agree on a secret key. Principal A invents a million keys and a million puzzles, where each puzzle encodes a different one of the keys. Each puzzle is assumed to take at least two minutes to solve and to fit into 96 bits. A sends these puzzles to B. B then picks a puzzle at random and solves it. B encrypts a pre-arranged string (say 0000) with the key from the puzzle it solved and sends this encrypted string back to A. A tries each of the million keys on the message it receives from B. The key that decrypts the message to the pre-arranged string is the shared key that A will use henceforth to communicate with B.
A wiretapper C could steal the million puzzles. However, C would need to crack half of the million puzzles on average in order to discover the shared key. (If the wiretapper didn't know the pre-arranged string, then it couldn't even mount a known-plaintext attack.) Since cracking each puzzle requires at least two minutes, the wiretapper would need on the order of a million minutes, or nearly two years, on average to find the key.
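A toy rendition of the protocol in Python may help; every detail here is an illustrative assumption (the puzzle is just a small XOR mask that is brute-forceable by design, the marker string "PUZL" and the 12-bit puzzle secrets are made up, and all sizes are scaled far down so the demo runs quickly):

```python
import hashlib
import secrets

N, BITS = 256, 12                       # tiny sizes; Merkle used ~a million

def mask(secret: int) -> bytes:
    return hashlib.sha256(secret.to_bytes(4, "big")).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# A builds the puzzles: each hides (marker, key index, key) behind a
# deliberately weak, brute-forceable secret.
keys = [secrets.token_bytes(16) for _ in range(N)]
puzzles = []
for i, k in enumerate(keys):
    secret = secrets.randbelow(2 ** BITS)
    puzzles.append(xor(b"PUZL" + i.to_bytes(2, "big") + k, mask(secret)))

# B picks a puzzle at random and solves it by brute force.
target = secrets.choice(puzzles)
for guess in range(2 ** BITS):
    plain = xor(target, mask(guess))
    if plain.startswith(b"PUZL"):       # recognizable structure = solved
        k_B = plain[6:]
        break

# B encrypts the pre-arranged string under the recovered key and sends it;
# A tries each of its keys until one yields the pre-arranged string.
probe = xor(b"0000", hashlib.sha256(k_B).digest())
shared = next(k for k in keys
              if xor(probe, hashlib.sha256(k).digest()) == b"0000")
assert shared == k_B                    # A and B now agree on a key
```

B does 2^BITS work for one puzzle; a wiretapper must do that work for half the puzzles on average, which is the quadratic gap the scheme relies on.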
The RSA cryptosystem is based on the assumption that factoring large integers is computationally hard. This assumption is not known to be true, but is widely believed. It is not even known if factoring is an NP-complete problem. RSA keys are often of length a power of two, like 512, 1024, or 2048 bits.
The RSA algorithm operates as follows:
1. Choose two large random primes p and q, and compute n = pq.
2. Compute φ(n) = (p-1)(q-1).
3. Choose an e relatively prime to φ(n), and compute d such that ed = 1 mod φ(n).
4. Publish (e, n) as the public key; the private key is d.
5. To encrypt message m, compute c = m^e mod n. To decrypt ciphertext c, compute m = c^d mod n.
RSA works because c = m^e mod n, so c^d = m^{ed} mod n = m mod n; since ed = 1 mod φ(n), this last step follows from Fermat's Little Theorem applied modulo p and modulo q, combined via the Chinese Remainder Theorem.
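The whole algorithm fits in a few lines of Python; the primes here are absurdly small so the numbers are easy to follow (a sketch of textbook RSA, not a usable implementation):

```python
p, q = 61, 53                 # toy primes; real keys use primes of ~1024 bits
n = p * q                     # modulus n = 3233
phi = (p - 1) * (q - 1)       # φ(n) = 3120
e = 17                        # public exponent, relatively prime to φ(n)
d = pow(e, -1, phi)           # private exponent: e*d = 1 mod φ(n) (Python 3.8+)

m = 42                        # a message, encoded as a number less than n
c = pow(m, e, n)              # encryption: c = m^e mod n
assert pow(c, d, n) == m      # decryption: c^d = m^{ed} = m mod n
```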
RSA as described above suffers from several problems given our definitions. First, it is deterministic, since a given message always encrypts to the same ciphertext. Further, it does not satisfy Non-Malleability, since two encryptions can, for example, be multiplied to get a new encryption: m^e * (m')^e mod n = (m * m')^e mod n. In some contexts, this kind of malleability can be useful, but it should be taken into account when designing systems that use RSA.
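Continuing the toy example above, the malleability is easy to demonstrate:

```python
# Multiplying two ciphertexts yields a valid encryption of the product
# of the plaintexts, with no knowledge of the private key required.
m1, m2 = 6, 7
c1, c2 = pow(m1, e, n), pow(m2, e, n)
c3 = (c1 * c2) % n                      # forged ciphertext
assert pow(c3, d, n) == (m1 * m2) % n   # decrypts to m1*m2 = 42
```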
One simple way to solve malleability problems is to add some sort of Message Integrity Check (MIC) using hash functions. For instance, if instead of computing m^e mod n we computed (m || h(m))^e mod n for some h chosen from a family H of collision-resistant hash functions, then the product of two such encryptions would be ((m || h(m)) * (m' || h(m')))^e mod n; for the attack to produce a valid encryption, this would have to equal (m'' || h(m''))^e mod n for some m'', and the properties of h make it hard for the adversary to find an m'' for which this holds.
As noted in the discussion on trapdoor functions, signatures can sometimes be constructed by running encryption in reverse. RSA signatures function in exactly this manner: the signature on m is s = m^d mod n, and verification checks that s^e mod n = m.
RSA signatures suffer from malleability problems similar to those of RSA encryption, but here the problems violate the fundamental CMA property of signature schemes: since any two signatures can be multiplied to get a signature on a new message (s * s' = (m * m')^d mod n), an attacker in the CMA model can request signatures on m and m' and output their product as a forgery on m * m'.
The common solution to this problem is to hash the message first: compute the signature as s = h(m)^d mod n. Then, even though the above multiplication can still be performed, the adversary can't exhibit a message that has been signed; to do so would require, given h(m) * h(m') mod n, finding an m'' such that h(m'') = h(m) * h(m') mod n, which is prevented by the pre-image resistance of the hash function.
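Using the toy RSA key (n, e, d) from the encryption example above, hash-then-sign looks like this; note that reducing a SHA-256 digest modulo a tiny n destroys the hash's security, so this only illustrates the shape of the computation:

```python
import hashlib

def h(m: bytes) -> int:
    # Hash to an integer mod n; with only 3233 possible values this is NOT
    # collision resistant, it merely shows the structure of the scheme.
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

msg = b"pay Bob five dollars"
s = pow(h(msg), d, n)            # sign: s = h(m)^d mod n
assert pow(s, e, n) == h(msg)    # verify: s^e mod n == h(m)
```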
ElGamal is an encryption scheme that, like RSA, depends on computational assumptions to guarantee security. Unlike the RSA assumption, however, ElGamal depends on a type of assumption called the discrete-logarithm assumption. Roughly, this assumption claims that it is hard in some groups to find x given g^x mod n. The name comes from the fact that x log(g) mod n = log(g^x) mod n and division is easy to compute in a group, so x would be easy to compute given log(g^x) mod n. ElGamal operates as follows:
1. Choose a group with generator g. The private key is a random exponent x; the public key is y = g^x.
2. To encrypt message M for the holder of x, choose a fresh random r and send the pair (a, b) = (g^r, y^r * M).
3. To decrypt (a, b), compute b / a^x = M.
ElGamal works because a^x = g^{rx} and b = y^r * M = g^{xr} * M, so b / a^x = M. Note that this is probabilistic encryption (a fresh r is chosen for every message), and it has been proven secure in some models.
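The same computation in Python, again with toy parameters (the prime 467 and generator 2 are illustrative choices, far too small for real use):

```python
import secrets

p, g = 467, 2                        # toy group: Z_p^* with generator g
x = secrets.randbelow(p - 2) + 1     # private key
y = pow(g, x, p)                     # public key y = g^x mod p

M = 123                              # message, an element of the group
r = secrets.randbelow(p - 2) + 1     # fresh randomness for this encryption
a = pow(g, r, p)                     # a = g^r mod p
b = (pow(y, r, p) * M) % p           # b = y^r * M mod p

# Decryption: b / a^x = g^{xr} * M / g^{rx} = M. Division is multiplication
# by the inverse, and a^{-x} = a^{p-1-x} mod p by Fermat's Little Theorem.
recovered = (b * pow(a, p - 1 - x, p)) % p
assert recovered == M
```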
ElGamal as described here also suffers from malleability attacks: an adversary who sees (a, b) can, for example, produce (a, 2b), a valid encryption of 2M. Of course, these "attacks" can be useful in some systems that require manipulating encrypted values without revealing them. There also exist related signature schemes, called Schnorr signatures, which can be used to obtain non-malleable ElGamal-style constructions.
Most public-key cryptosystems are built over arithmetic in finite fields (algebraic structures that have addition and multiplication operations, each with inverses). Elliptic Curve Cryptography (ECC) instead builds a group out of the set of solutions to an elliptic curve equation y^2 = x^3 + ax + b over a finite field, along with an additive identity element (which corresponds to the point at infinity).
We won't go into the details of constructing ECC versions of common cryptosystems, but several such constructions exist. It should be noted that the addition operation in the group is not simply coordinatewise addition of solutions in the underlying field, as the sketch below shows.
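Here is the standard affine group law in Python, the chord-and-tangent construction (the curve parameters a and p are whatever curve one fixes; None stands in for the point at infinity, and the coefficient b never appears in the addition formulas):

```python
def ec_add(P, Q, a, p):
    """Add points P and Q on y^2 = x^3 + ax + b over F_p (affine form)."""
    if P is None:
        return Q                        # identity + Q = Q
    if Q is None:
        return P                        # P + identity = P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                     # P + (-P) = point at infinity
    if P == Q:                          # doubling: tangent-line slope at P
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:                               # chord slope through P and Q
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p      # the line meets the curve again here
    y3 = (lam * (x1 - x3) - y1) % p     # reflect across the x-axis
    return (x3, y3)
```

Repeated addition (scalar multiplication) then plays the role that exponentiation plays in the systems above.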
ECC is valuable because it is believed to be harder to compute, e.g., discrete logs in elliptic-curve groups than in the multiplicative groups of the underlying finite fields. This means that key sizes in ECC can be smaller than the corresponding key sizes in cryptosystems based on other structures. The ECC versions of these problems are not, however, known to be harder than in any other system.