In this section we define a simple programming language that incorporates our model. The goal of this exposition is not to seriously propose a programming language, but to demonstrate that the information flow model can be applied practically to a rich computational model, providing sound and acceptably precise constraints on information flow. These annotations could be applied to other programming models, such as compiled code or JVM code [LY96].
The language supports some simple
types: integers, strings, records, and arrays of any legal type
including other array types.
Procedures may contain variable
declarations, assignments, if
statements, and while
statements;
they return results by assigning to special
return variables, as in Pascal.
Variables of record or array types are
references to storage on the heap, as in
Java [GJS96] and CLU [LAB+84], so that assignment of
a record or array (e.g., r1 := r2
or a1 := a2
) makes the variables
aliases for each other.
For simplicity, the language has no exceptions. Exceptions complicate the propagation of basic block labels, but can be handled using the basic-block propagation described in Section 4.2, assuming that procedures declare the exceptions they might raise. Not having exceptions makes some programs clumsy, but programs written with exceptions can be straightforwardly translated into programs without exceptions, so there is no loss of generality.
For simplicity of presentation, the language also lacks global variables. In our simple language, the first basic block in a procedure has label . Global variables could be supported by allowing procedures to accept a special parameter that defines the label of their first basic block.
The language is extended with a few unusual features to support information flow control:
declassify
'' operator allows the declassification of
information.if_acts_for
statement
whether it is able to act for some
principal, and if it is, it may use the authority of that principal.label
, which permits run-time label checking. Variables
of type label
and argument-label parameters may be used to construct
labels for types that are mentioned within the procedure body.labelcase
statement can be used to determine the
run-time labeled type of a value, and a special type protected
conveniently encapsulates values along with their run-time labels.
We begin with an example of a program to illustrate some of the features in the language. Then we define the language in more detail, including how to check the constructs using label-checking rules.
Figure 6 shows the
"check_password
" procedure, which accepts a database of passwords,
a password, and
a user name, and returns a boolean indicating whether the string is
the right password for that user. This example is simple, yet
it uses declassification to control information
flow in a fine-grained way that is not possible under any previous model
of which we are aware.
pinfo = record [ names, passwords: string{chkr: chkr} ] check_password (db: array[pinfo{}]{}, user: string {}, password: string{client: chkr}) returns (ret: bool{client: chkr}) % Return whether password is the password of user i: int {chkr: chkr} := 0 % match: bool {client: chkr; % chkr: chkr} := false % while i < db.length() do % if db[i].names = user & % db[i].passwords = password then % match := true % {client: chkr; end % chkr: chkr} i := i + 1 % end ret := false % if_acts_for(check_password, chkr) then % ret := declassify(match, {client: chkr}) % end end check_passwordFigure 6: Annotated password checker
Two principals are mentioned in this code:
chkr
represents the password checker role, and
client
represents the principal of the calling client.
The password database is an array of records giving the
passwords for users; it should be protected by encapsulation
(e.g., check_password
should be a method),
but this complexity is avoided here. In a real program,
client
would be a parameter to the routine
rather than a single fixed principal; a more general (and more concise)
version of check_password
is
shown later, in Figure 8.
In the figure, all labels are declared explicitly by annotating
types with label expressions in braces. The
type T{L} describes a value of type T that is restricted by the
label L. For example, the
password
argument is readable by the checker but owned by the client,
which ensures that the checker can examine the password but cannot
distribute it further.
The annotations are onerous in this case partly
because variable labels are not being inferred;
Section 6 shows how to infer labels for this procedure.
The comments (beginning with a ``%'') on the right-hand side of the example indicate the static value of the basic block label, B, and are not part of the code.
To perform its work, the procedure
uses the db
database to find the password of the user.
As it looks at the data, its basic block label picks up any
dependencies.
For example,
the if
predicate examines password
and the fields of db[i]
;
therefore, its body has the
basic block label {client: chkr; chkr: chkr}
.
This means that the label of match
must be at least this restrictive,
or the assignment to match
in the body of the if
statement would leak information (i.e., would result in a compile-time error).
However, the client requires a result with label {client: chkr}
.
To provide such a result, the checker must
explicitly declassify match
,
removing chkr
from the owner set.
The declassification can be carried out only if the procedure
has the proper authority (i.e., can act for chkr
).
Therefore, the procedure checks whether it runs with this authority
(in the if_acts_for
statement); in the then
clause, the code
runs with the authority of the password checker; otherwise, it does not.
Code receives authority by being granted it in the principal hierarchy;
each procedure has its own principal that
can participate in acts-for relations.
Of course, granting code the right to act for a particular principal can
only be done by a process that acts for that principal, so chkr
must
have explicitly granted check_password
the right to act for it.
The expectation of the author of the procedure check_password
is that
it has been given this right. However,
the procedure is constructed so that it does not leak information if
the right to act for chkr
is later revoked.
The need for explicit declassification is not just an artifact of our model. The return value of the procedure is owned by the client, which means that the client has complete control over the value. The procedure's result could conceivably be used to determine the contents of the password database through exhaustive search. In this case, the implementor of the procedure has made a conscious decision that the amount of information leaked by the boolean return value is small enough that declassification is acceptable. Note that the decision about declassification is made locally, by code that acts for the owner of the data; no appeal to an external trusted agent is required.
It is not necessary to run the entire procedure under the
chkr
authority. Instead this authority is used just where it is
needed, for the declassification. Also, note that the procedure
only needs to declassify its result. It is not required to declassify
everything extracted from the password database, and is therefore protected
against accidentally leaking the database contents (e.g., if it
called a helping procedure that acted on the database).
Now, we explain the constructs of the language and the corresponding label-checking rules. The process of verifying a program according to these rules involves two major steps: first, basic block labels are propagated; then, each individual statement is verified in the context of the basic block that contains it. Verifying a statement requires that we check for the satisfaction of corresponding label constraints, which is discussed in more detail in Section 6.
If the language contained exceptions or gotos,
B would need to be computed through the basic-block
label propagation rule of Section 4.2. For our simple
constructs, the rules for propagation of B can be stated directly; since
control flow is determined only by if
and while
statements, the
basic-block propagation rule takes particularly simple forms. Note that in each
statement rule, represents the label for the basic block
containing the statement.
Many of these rules are similar
in intent to those found in Denning and Denning [DD77],
though these rules are different in that
they are expressed without using the meet operator ().
As described, values and slots have labels that restrict information flow. In statically-typed languages, values and slots also have static types that can be said to restrict flow. These two restrictions can be combined, so that labels are considered to be part of the type. This interpretation allows us to use standard type-checking and notions of subtyping to describe information flows.
Every variable, argument, and return type in the language has a
labeled type,
which consists of a base type such as int
, plus a
static label. For example, the
labeled type int{L}
represents an integer restricted by L.
In general, label expressions, consisting of the
join of several labels, may appear within the braces. For example,
int{chkr: chkr
L}
is restricted by both {chkr: chkr}
and L.
The type and label parts of a labeled type act independently. For any two types S and T where S is a subtype of T (ST), and for any two labels L1 and L2 where L1L2, S{L1}T{L2} [VSI96]. This formula also implies that S{L1}T{L1 L3} for any other label L3.
Parametric types such as arrays and records explicitly mention
labels on their type parameters. For example, we can form the type
array[int{L}]{L2}
, which is an array of labeled integers, where
the integers have label L
, and the array reference has label L2
.
In record types, similarly, the individual fields have labeled
types.
Given an assignment of the form v := e
, where v
is a variable
with type T{v
} and
e
is an expression with type S{e
},
the assignment is legal if
e
B v
,
where B is the label for the basic block containing the statement. This
condition guarantees both that the information in e
is not leaked by placing
it in the variable v
, and that performing
the store operation does not leak information because it happens
in this particular basic block.
Record and array assignments are similar to variable assignment, except
that they may convey information by virtue of the particular record or
array being assigned to. Consider record assignment of the form
r.f := e
. In this statement, r
is a record expression with label r,
f
is a field of the record type with declared type
T{f
}, and e
is an expression with type S{e
},
where ST. The assignment is legal
if
e
r
B
f
.
This rule is equivalent
to the one in Denning and Denning [DD77].
The rule for assignment to an
array element is similar, except that the label on the array index is
included on the left-hand side. Because
r
appears on the
left-hand side of the rule, fields and
array elements become immutable if the variable referring to the
record or array becomes more protected than the field or element.
For example, a record field with
r
f
is immutable,
since otherwise information could be leaked by assigning to it.
if
and while
The rules for if
and while
are similar to each other.
Assume that e
is a legal boolean expression, and
that S is an arbitrary statement. The statement ``if e then S end
'' is
legal if S
is legal given the basic block label
Be
. The
same condition guarantees the legality of ``while e then S end
''. The
label e
does not need to be part of the basic block label after the
execution of the if
or while
statement, because we are
not considering covert timing channels or covert channels arising from
non-termination, as discussed in Section 4.2.
A procedure executes with some authority that has been granted to it. Authority may be granted through the principal hierarchy or because the procedure's caller grants the procedure the right to act for other principals.
At any given point within a program, the compiler understands the code to be running with the ability to act for some set of principals, which we call the effective authority of the code at that point. The effective authority can never exceed the true authority at any point during execution.
When a procedure starts running, it has no effective authority.
It may increase its effective authority by testing whether
it has the authority to act for a principal. If the test succeeds, the
effective authority is increased to include that principal.
This test is accomplished by using the if_acts_for
statement:
if_acts_for(P1, P2) then
S1 else
S2
(The brackets around the else
clause indicate that it is
optional.)
In this statement, P2 names a principal in the principal
hierarchy; P1
names the current procedure or the special keyword
caller
. If it names the current procedure, it means the
procedure's principal, as discussed in Section 5.1.
If it names caller
, it denotes the principal(s) that the procedure's
caller has granted it the right to act for, as discussed later in
Section 5.11.
The effect of if_acts_for
is to execute the then
block if the
specified acts-for relationship exists. If the if_acts_for
test fails,
the else
block, if any, is executed with no additional authority. If the
test succeeds, the effective authority of the then
block
is increased to include P2
It is possible that while a procedure is executing the then
part
of an if_acts_for
statement, the principal hierarchy may change
in a way that would cause the test in the statement to fail.
In this case, it may be desirable to revoke the code's
permission to run with that authority, and we assume the underlying
system can do this, by
halting the code's process, at some point after the hierarchy changes.
If a running program is killed by a revocation, information may be leaked about what part of the program was being executed. This constitutes a timing channel, and one that can be made slow enough that it is impractical to use.
A program can explicitly declassify a value. The operation
declassify
(e, L)
relabels the result of an expression e with the label L,
using relabeling Rules 1 and 2 as needed.
Declassification is checked statically, using the effective
authority at the point of declassification; the authorization for
declassification must derive
from a containing if_acts_for
control structure.
Declassification is legal as long as e permits declassification
to L, which implies the following rule.
Let
LA
be a label in which every principal in the effective authority
is an owner, but with an empty reader set.
The most restrictive label that e could
have and still be declassifiable to L is
L LA, so the
declassify()
expression is legal if
e L LA.
For example, if the principal A is part of the effective authority,
the label {A: B,C; D: E}
can be declassified to {A: C; D: E},
since
{A: C; D:E} {A: Ø}
= {A: Ø; D: E},
which is more restrictive than {A: B,C; D: E}.
Consider a library routine such as the cosine function (cos
). It would
be infeasible to declare separate versions of cos
for every label in the
system. Therefore, we allow procedures to be generic with respect to the
labels on their arguments, which means that only one cos
function need
exist.
If a label is omitted on the type of a procedure argument a
,
the argument label becomes an implicit parameter to the
procedure, and may be referred to as a elsewhere in the procedure
signature and body. For example, the cosine function is declared as follows:
cos(x: float)
returns (y: float{x})
cos
is generic with
respect to the label on the argument x
,
and x
is an implicit argument to the routine.
This signature allows cos
to be used on any argument,
and the label
on the return value is always the same as the label on the argument.
Since the code of cos
does not depend on what x really is,
its code need not access the label, so there is no
need either to recompile the
code for each distinct label or to pass the label at runtime.
Therefore, implicit label polymorphism has no run-time overhead.
Implicit labels allow code to be written that is generic with respect to labels on arguments. However, sometimes more power is needed: for example, to model the accounts database of the bank example, where every customer account has a distinct label. To allow such code to be written, we support run-time labels.
A variable of type label
may be used
both as a first-class value and as a label for other values. For
example, procedures can accept arguments with unknown labels, as
in the following procedure declaration:
compute(x: int, lb: label) returns (float{xlb})
To simplify static analysis, first-class label variables are immutable
after initialization.
When a label variable is used as a label, it represents an unknown but
fixed label. Because labels form a lattice and obey the simple rules
of a lattice, static reasoning about this label is straightforward. For
example, if the procedure compute
contains the assignment z := x + y
,
where y
has type int{lb}
,
the assignment is valid as long as it can be statically determined
that
zlbx
.
This condition can be checked
even when z
and y
do not declare their labels explicitly,
as discussed in Section 6.
Since label
is a type, it too must be labeled wherever it is used. Constant
labels and implicit label parameters
have type label
{}.
Declarations of run-time labels
can indicate the label's label explicitly; if the label's label is
omitted, it treated like any other type: it is inferred
in the case of a local variable, and it is implicit
in the case of an argument (this is the situation for
the lb
argument of compute
).
In principle, code that is written in terms of implicit labels
can be expressed in terms of run-time labels.
We provide implicit labels because when they
provide adequate power, they are easier and cheaper to use
than run-time labels. For example, without implicit labels the
signature of the cos
function would be the following:
cos(x: float{lx}, lx: label{}) returns (y: float{lx})
The labelcase
statement is used to determine the run-time
label on a value. It effectively allows a program to examine
variables of type label
. For example, compute
might use
labelcase
to match the label lx
against other labels it has
available. A labelcase statement has the following form:
labelcase e as v when L1 do S1 when L2 do S2 ... [ else S3 ] end
The effect of this statement is to execute the first statement Si such that e Li, introducing a new variable v containing the same value as e, but with the label Li.
The block label in the arm of the labelcase
does not reflect the label of
e
, but it does reflect the label of e
's label.
Suppose the type of e
is T{Le} where Le
has the type label
{Le}. Similarly, suppose
the labels Li
have respective types Li.
The labelcase statement is
valid only if each of the statements Si
is valid given a basic block label
Bi:
This formula says that selecting which statement to
execute reveals information about the labels, but does not reveal anything
about information protected by the labels.
Therefore, the basic block labels Bi depend
on Lj
but not on Lj.
The reason for joining all the
Lj
up to i is that the arms of the labelcase
are checked
sequentially, so each arm that fails conveys some information to
its successors.
In a labelcase
, neither the tested
expression e
nor the labels Li
on any of the arms may
mention the implicit labels on procedure arguments; rather, the
labelcase
is limited to
constant labels and run-time labels.
Implicit procedure argument labels are only intended to
help write code that doesn't care what the labels are on the arguments.
This restriction on the labelcase
avoids the need to pass implicit labels to the generic code that
uses them, as discussed in Section 5.8.
Since
labels may be passed as explicit arguments, and values of type label
may be used to label types, no power is lost through this
restriction.
A procedure definition has the following syntax:
The optional authority clause indicates whether the procedure
may be granted some authority by its caller.
A procedure with an authority clause may try to claim this authority by
using caller
in an if_acts_for
statement.
The ai are the names of the arguments to the procedure.
The arguments have
types Ti and optional
labels Li.
As in variable declarations, labels in the arguments and
results of a procedure signature may be simple label expressions,
including joins of other labels.
A call to a procedure is legal only if each actual argument to the
procedure is assignable to the corresponding formal argument.
This means that the formals must have labels that are more restrictive than
the block label of the caller at the time of the
call, i.e., the normal rule for assignment
given in Section 5.4
applies here.
Additionally, bindings
must be found for all the implicit label parameters such that for
each explicit or implicit formal argument label Li and actual
argument label Lai,
the constraint LaiB Li holds, where B
denotes the basic block label of the call site. Determining
the bindings for the implicit labels that will satisfy all these constraints
is not trivial, since the formal argument labels may be
join expressions that mention other implicit argument labels, as in
the signature f(a: int, b:int {aX})
, where X
is
some
other label. The efficient solution of
such constraint systems is considered in Section 6.
Furthermore, if the authority clause is present, the caller
may provide one or more principals that it acts for.
Such a call can occur only within the scope of an if_acts_for
statement, since otherwise the effective authority of the caller is
nil.
For example, the following call from the procedure p
grants the authority of the principal
my_principal
to the procedure doit
:
if_acts_for(p, my_principal) then doit<<my_principal>>(...) end
This model for granting authority protects the caller of a procedure because it can select what part of its authority to grant to the procedure. The implementor of the called procedure is also protected, because the procedure uses only the authority its implementor wishes to claim, and only where needed. (This is similar to the CACL model [RSC92], but provides more protection for the implementor.)
protected[T]
Run-time label checking is conveniently accomplished by using the
special type protected[T]
. A protected[T]
is an immutable object
that contains two things: a value of type T
, and a label that
protects the value.
The type protected[T]
is particularly useful for implementing structures
like file systems, where the information flow policies of
individual objects cannot be deduced statically. For example, a file system
directory is essentially an array of protected
file objects. Navigating
the file system requires a series of run-time label checks, but this
work is unavoidable.
protected[T]
has two methods: get
, which extracts the contained value,
and get_label
, which extracts the contained label.
It also has a constructor,
protect
, which creates a new protected value. The get
method requires an expected label as an argument.
The value is returned only if the expected label is
at least as restrictive as the contained label, which is determined at run time.
The expected label must be either a constant label or a variable label;
implicit labels are not allowed.
The procedural signatures of these operations are as follows:
protect(T: type, lb: label, v: T{lb}) returns (p: protected[T]{lb}) % Create a new "protected[T]" containing % the value "v" and label "lb". get_label(p: protected[T]) returns (lb: label{p}) % Return the label that is contained in "p" get (p: protected[T], expect: label) returns (success: bool{p expect}, v: T{expect expect p}) % Return the value that is contained in "p" if % the label "expect" matches the contained label. % Set "success" accordingly.
If the language is extended to support some form of data abstraction
with encapsulation, the type protected
can be
implemented in a natural manner by using the labelcase
statement. Without
these extensions, the closest approximation to protected[T]{lb}
is
the following type:
record[lb: label, x: T{lb}]{lb}
Like this type, the type protected[T]
has the special
property that the label on the reference to a protected[T]
(the label lb
)
is assumed to be the label on the contained label, lb
. This
constraint can be maintained
because protected[T]
is immutable, and because
L1L2
implies
protected[T]
{L1}
protected[T]
{L2}.
A protected[T]
allows information flow via its contained label:
one could pass information by storing different labels
in protected[T]
objects, and then seeing whether get
operations
succeed.
However, the signature of get
prevents an information leak
because the success
result is labeled both
by the label on the label being passed in (lb
), and by p
's label
(p
), which is the same as the label on the contained label.
Therefore, this information flow does not create a leak.
The bank example from Section 2 is a good example of the
need for run-time labels. Each customer account is owned by the individual
customer, rather than by the bank, which gives the customer more confidence
in the privacy offered by the bank. This example can be conveniently
written using protected[T]
to protect each customer's account with a
label specific to the customer, as shown in Figure 7.
account = record [ name: string{bank_label}, balance: protected[float]{bank_label} ] get_balance (name: string, customer: label, accts: array[account{}]{}) returns (success: bool {name customer customer }, balance: float {name customer customer}) ... % find element "i" containing the right customer s: bool{name customer customer bank_label} b: bool{name customer customer bank_label} s, b := get(accts[i].balance, customer) if_acts_for (get_balance, Bank) then success := declassify(s, {name customer customer}) balance := declassify(b, {name customer customer}) end end get_balance
For example, the customer's account can be a simple record type, where
the customer's name is protected by (and accessible to) the bank, but
balance is protected by both the bank label and a customer label that is
stored inside the protected[float]
. This design gives the
customers protection against dissemination of their balance information.
In this code, Bank
represents a principal with the
ability to declassify the bank_label
label.
The current balance of an account is obtained by the procedure
get_balance
, which accepts a first-class label as an argument. If
the label customer
that is passed to get_balance
is at least as restrictive as the
label in the account, the balance is returned as a float labeled by
customer
. The procedure fails either if no customer exists by that
name, or if the label passed in is insufficiently protected.
Output channels show up as the special opaque type ``channel
'' in the
language. The type channel
denotes an output channel with a hidden
reader set, whose members denote the principals who are reading from the
channel. Information can be written to a channel using the ``write
''
procedure:
write(c: channel{lb}, s: string{lb}, lb: label{lb})
When a write is attempted, the label lb
is compared against
the hidden reader set within the channel. If it passes the
tests of the effective reader set that are described in
Section 3.5, the write is successful. Otherwise,
it silently fails.
It is important that lb
capture all the possible information flows that
the write will cause, since otherwise write
would not perform a sufficiently
stringent test against the channel's reader set.
Because lb
is used to label s
,
the channel cannot leak information through the contents of the data
that is sent out. Because lb
is used to label the argument c
as
well, the channel cannot be used to
leak information by choosing among a set of
channels to write to. Finally, because lb
labels itself, the channel
cannot be used to leak information
by changing the label that is passed in as the
lb
argument, and transmitting information by the fact of the write's
success or failure.
Next: 6 Verification and Label Up: A Model for Decentralized Previous: 4 Checking Labels
Andrew C. Myers, Barbara Liskov