CS513 Homework 4: Authorization Policies
General Instructions. You are expected to work alone on this
assignment.
Due November 14, 10 am. No late assignments will be accepted.
Submit your solution using CMS. Prepare your solution using Word (.doc) or
an ASCII editor (.txt), as follows:
- Use 10 point or larger font (single-spaced).
- Start each problem's solution on a new page.
- Use at most 1 page per problem.
Note: If your browser does not support Unicode, you will not be able to
view this page correctly. Your browser must be able to render the
following symbols: ⊆ (subset or equal), ≤ (less than or equal).
Change log:
- [October 30] Problem 1, part 4a, was missing the definition of the
labelling function L for datasets C, D, and E. This definition has
been added and is color-coded green.
Also, a typo in the appendix has been fixed. You might consider: What principle, discussed during the access
control lecture, does this change log exemplify?
1. Mandatory Access Control
Sometimes individual data are less sensitive than their aggregate.
For example:
- Each share in a (3,2) secret-sharing scheme is indistinguishable from
random. But any two shares together reveal the entire secret.
- The budgets of individual departments of a company may not reveal much
information. But collectively, they reveal where the company is
concentrating its resources, and thus telegraph its business strategy.
- In the movie Mission: Impossible (1996), the recovery of a NOC
(non-official cover) list is a focus of Agent Ethan Hunt. One half of
the list contains the code names of secret agents, and the other half
contains the agents' real names. Each half individually reveals sensitive
information, and their combination reveals even more information.
Aggregation is particularly relevant in the context of databases. For the
purpose of this problem, suppose that a database comprises a number of datasets. (A dataset might be a table or a view.) Further, suppose that each
dataset is assigned a sensitivity label such as Unclassified, Secret, or Top
Secret. (We ignore compartments in this problem.) Then it might be the case that datasets A and B are both Unclassified,
but that their aggregation is Secret. To model this, let the function L(R),
where R is a set of datasets---for example, R = {A,B}---denote the sensitivity of
the aggregation of all the datasets in R. As healthiness conditions on L, we
require that:
- For all A, L({A}) = sensitivity of A.
- If R ⊆ R' then L(R) ≤ L(R').
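These healthiness conditions can be made concrete with a small sketch. The dataset names and the particular labelling below are hypothetical, chosen only to illustrate the monotonicity check; they are not part of the assignment.

```python
# Sketch (hypothetical example, not part of the assignment): a labelling
# function L over sets of datasets, plus a check of the healthiness
# condition that R ⊆ R' implies L(R) ≤ L(R').
UNCLASSIFIED, SECRET, TOP_SECRET = 0, 1, 2

# L maps a frozenset of datasets to the sensitivity of their aggregation.
L = {
    frozenset():     UNCLASSIFIED,
    frozenset("A"):  UNCLASSIFIED,
    frozenset("B"):  UNCLASSIFIED,
    frozenset("AB"): SECRET,        # the aggregate is more sensitive
}

def monotone(labelling):
    """Check: for every R ⊆ R' in the table, L(R) ≤ L(R')."""
    return all(labelling[r] <= labelling[rp]
               for r in labelling for rp in labelling
               if r <= rp)          # <= on frozensets is subset-or-equal

print(monotone(L))  # True
```

The subset test `r <= rp` on `frozenset`s corresponds directly to the ⊆ in the healthiness condition, so the check is a line-by-line transcription of the requirement.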
Our goal in this problem is to develop a MAC model for this scenario. Suppose
that an object is a document containing information derived from the
database---e.g., the result of queries on datasets. A subject, as usual,
is a process executing on behalf of a user. An entity is either a subject
or an object.
- Construct your own real-world example, using the database model above, of aggregate data that are more sensitive than
their constituents. Your example should include at least three
datasets. Identify what L(R) is for each possible subset R of your
datasets. (If you need inspiration, begin by supposing that one of the
datasets is a set of photographs.)
- Suppose that each object (and subject) is labelled with its sensitivity
(or clearance). We could then attempt to employ the Bell and LaPadula
security conditions ("no read up, no write down"). However, we
claim that these conditions are insufficient to guarantee the
following policy:
P1: An object never contains information whose sensitivity is
higher than the object's label.
Using your example database from part 1, prove this claim by exhibiting a series
of read and write operations that effect such an information flow. You may
freely invent entities and their labels.
- Instead of sensitivity, suppose that each entity is
labelled with a set of datasets. Give new conditions for reading and writing.
Your conditions should guarantee the following policy:
P2: If X is labelled with R, then the information in datasets R should
be allowed to flow to X, and information from datasets other than those in R should not be allowed to flow to X.
- Suppose that entities are again labelled with sensitivity, not datasets. But,
further suppose that the history of each entity is recorded. Let h(X) denote
the history of entity X, which is the set of datasets from which information has
flowed to X. For example, if h(O) = {A,B}, then information from A and B has
been written to object O. And if h(S) = {B,C}, then information from B and C has been
read by subject S. Note that while the sensitivity of an entity is constant, the history
of an entity can be changed by read and write operations.
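One natural reading of how operations change history (this update rule is an assumption for illustration only; specifying and defending the exact rule, and when to deny an operation, is part of the problem) is that a read merges the object's history into the subject's, and a write merges the subject's history into the object's:

```python
# Sketch (an assumed update rule, not a prescribed answer): how read and
# write could update the history function h(X).
h = {"S": set(), "O": {"A", "B"}}   # hypothetical entities and histories

def read(subject, obj):
    # Information flows object -> subject.
    h[subject] |= h[obj]

def write(subject, obj):
    # Information flows subject -> object.
    h[obj] |= h[subject]

read("S", "O")
print(sorted(h["S"]))   # ['A', 'B']
```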
We seek to enforce the following policy:
P3: Information may flow from
entity X to entity Y only if the sensitivity of the resulting information in Y
would not exceed L(Y).
- Consider the following series of operations. Let L(O1) = L(O2) =
L(O3) =
Secret, h(O1) = {E}, h(O2) = {D}, h(O3) = {C}. Let L(S) = Secret and h(S) = {}. Let
L({C}) = L({E}) = Secret, L({D}) = Unclassified, and L({C,E}) = Top Secret.
- S reads O1
- S writes O2
- S reads O3
- S writes O2
For each of these operations, state the new history function h() after the
operation completes. Also, explain whether each operation should be allowed by
arguing whether the operation satisfies policy P3.
- Give new conditions for reading and writing. Your conditions should
guarantee that policy P3 is enforced. As part of these, you will need to specify
how each operation affects history. Argue that your new conditions enforce
policy P3.
- Compare and contrast the model in part 4 with
the Chinese Wall policy discussed in class. Is one more general than the
other, or are they incomparable?
2. Noninterference
The security policy noninterference can be informally stated as,
"Changes in high inputs do not cause changes in low outputs."
Do the following systems satisfy this condition? Argue why or why not.
Your interpretation of this informal condition may affect your answer, so the
clarity and precision of your rationale are important. You may assume that
covert channels (e.g., timing and termination) are unobservable; or
equivalently, that these channels are allowed to leak information. All systems
have four channels: HI (high in), HO (high out), LI (low in), LO (low out).
- System U copies all input from HI to HO, from LI to LO, and additionally
echoes all data from LI to HO.
- After receiving an input x on HI and an input y on LI, system V outputs y
on LO if x is even, and yy on LO if x is odd.
- Upon receiving a PIN x on HI, system W outputs x on LO with probability
.01, or with probability .99 outputs a 4-digit number chosen uniformly at random.
(PINs are 4-digit numbers in the range [0000..9999].)
- Upon receiving an input x on HI and an RSA public key K on LI, system X
outputs the encryption of x with K on LO.
- When system Z receives x on HI, it echoes x to HO. It may delay
arbitrarily long before echoing, and until the echo occurs, it cannot receive
any additional data on HI. At any time after receiving at least one input on HI, Z may output "stop" on LO.
After this, it immediately outputs either "even" or "odd" on
LO, which represents the parity of the number of high inputs and outputs that
have been performed. The system then stops.
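To pin down one possible interpretation, a system's input/output behavior can be modeled as a function from its inputs to its LO output. The model below is a hypothetical rendering of system V only (treating y as a string, so that "yy" means y concatenated with itself); whether its behavior satisfies noninterference is what the problem asks you to argue.

```python
# Sketch (hypothetical model of system V, not a prescribed answer).
def system_v(x, y):
    """Receive integer x on HI and string y on LI; return the LO output."""
    return y if x % 2 == 0 else y + y   # "yy" read as y twice

print(system_v(2, "7"))   # 7
print(system_v(3, "7"))   # 77
```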
3. Static Analysis
In lecture, we gave the following inference rules to enforce noninterference:
      max(E, pc) ≤ x
  --------------------------------- [Assign]
      x = E;   sif pc = C

      D = B max C     S1 sif pc = D     S2 sif pc = D
  ----------------------------------------------------- [If]
      if (B) S1; else S2;   sif pc = C

      S1 sif pc = C     S2 sif pc = C
  --------------------------------- [Sequence]
      S1; S2   sif pc = C

      D = B max C     S sif pc = D
  --------------------------------- [While]
      while (B) S;   sif pc = C

Recall that these rules assumed that termination and timing channels were
unobservable; or equivalently, that these channels were allowed to leak
information.
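The rules above can be rendered as a small recursive checker. The AST encoding below is my own (an assumption): expression and guard labels are pre-computed as levels, with L(ow) = 0 and H(igh) = 1. This is an illustration of how the rules compose, not the required analysis.

```python
# Sketch: a recursive checker for the lecture's rules over a tiny AST.
# AST shapes (my own encoding, an assumption):
#   ("assign", x_level, e_level)
#   ("if", b_level, s1, s2)
#   ("seq", s1, s2)
#   ("while", b_level, s)
LOW, HIGH = 0, 1

def sif(stmt, pc):
    """Return True iff `stmt sif pc` is derivable from the rules."""
    tag = stmt[0]
    if tag == "assign":                 # [Assign]: max(E, pc) ≤ x
        _, x, e = stmt
        return max(e, pc) <= x
    if tag == "if":                     # [If]: check branches at B max C
        _, b, s1, s2 = stmt
        d = max(b, pc)
        return sif(s1, d) and sif(s2, d)
    if tag == "seq":                    # [Sequence]: both at the same pc
        _, s1, s2 = stmt
        return sif(s1, pc) and sif(s2, pc)
    if tag == "while":                  # [While]: check body at B max C
        _, b, s = stmt
        return sif(s, max(b, pc))
    raise ValueError(f"unknown statement: {tag}")

# if (h) l = 1; else l = 0;  -- rejected: low assignments under a high guard
leaky = ("if", HIGH, ("assign", LOW, LOW), ("assign", LOW, LOW))
print(sif(leaky, LOW))   # False
```

The `leaky` example shows why [If] raises the pc for the branches: even though each assignment writes a low constant, the choice of branch depends on a high guard.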
Here, we explore changing these rules and assumptions. In each part
below, the proposed change causes the analysis not to enforce
noninterference. For each proposed change, identify a program that leaks the
value of variable h. Show the proof tree (see the Appendix for an example
proof tree) that purports to verify that your program is noninterfering, and
explain why the program is actually interfering.
Each proposed change is independent: in the second change, for example, use the
original rule for [Assign] above, not the changed rule [AssignX].
- Change [Assign] to:

      max(E, pc) ≥ x
  --------------------------------- [AssignX]
      x = E;   sif pc = C

|
- Change [If] to:

      S1 sif pc = C     S2 sif pc = C
  --------------------------------- [IfX]
      if (B) S1; else S2;   sif pc = C

- Change the assumption about termination channels: assume that an attacker
can always correctly conclude whether a program has entered an infinite loop.
- Change the assumption about timing channels: assume that, if a
program terminates, then an attacker is informed of the number of steps the
program executed. The number of steps is equal to the number of times the guard
of any if or while statement was evaluated, plus the number of times any
assignment statement was executed. In addition, change [While] to:
      B = L     S sif pc = C
  --------------------------------- [WhileX]
      while (B) S;   sif pc = C