10 Varargs

C functions that take a variable number of arguments (vararg functions) are syntactically convenient for the caller, but C makes it very difficult to ensure safety. The callee has no fool-proof way to determine the number of arguments or even their types. Also, there is no type information for the compiler to use at call-sites to reject bad calls.
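To make the problem concrete, here is a minimal C sketch of a typical vararg callee. The function name sum_ints and its count-first convention are illustrative assumptions, not part of any library discussed here.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments. The callee has to trust `count`: nothing
   lets it check how many arguments were really passed, or that each
   one is actually an int. */
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; i++) {
        /* The type named here is a claim by the callee; it is never verified. */
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}

/* A correct call: sum_ints(3, 10, 20, 30).
   These calls also compile, but their behavior is undefined:
     sum_ints(3, 10, 20);        fewer arguments than `count` claims
     sum_ints(2, 10, "twenty");  a pointer read back as an int       */
```

The compiler accepts all three calls because the prototype says nothing about what follows the first parameter.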

Cyclone provides three styles of vararg functions that provide different trade-offs for safety, efficiency, and convenience.

First, you can call C vararg functions just as you would in C:

extern "C" void foo(int x, ...);
void g() { 
  foo(3, 7, "hi", 'x');
}

However, for the reasons described above, foo is almost surely unsafe. All the Cyclone compiler will do is ensure that the vararg arguments at the call site have some legal Cyclone type.

Actually, you can declare a Cyclone function to take C-style varargs, but Cyclone provides no way to access the vararg arguments for this style. That is why the example refers to a C function. (In the future, function subtyping could make this style less than completely silly for Cyclone functions.)

The second style is for a variable number of arguments of one type:

void foo(int x, ...string_t args);
void g() {
  foo(17, "hi", "mom");
}

The syntax is a type and identifier after the ``...''. (The identifier is optional in prototypes, as with other parameters.) You can use any identifier; args is not special. At the call-site, Cyclone will ensure that each vararg has the correct type, in this case string_t.

Accessing the varargs is simpler than in C. Continuing our example, args has type string_t *@fat `foo in the body of foo. You retrieve the first argument ("hi") with args[0], the second argument ("mom") with args[1], and so on. Of course, numelts(args) tells you how many arguments there are.

This style is implemented as follows: At the call-site, the compiler generates a stack-allocated array holding the vararg values. It then passes a ``fat pointer'' to the callee with bounds indicating the number of elements in the array. Compared to C-style varargs, this style is less efficient because there is a bounds-check and an extra level of indirection for each vararg access. In exchange, we get safety, and calling a vararg function is just as convenient as in C. No heap allocation occurs.
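The following C sketch models that implementation strategy. It is only an approximation under stated assumptions: the struct fat_strings, the helper fat_get, and the function names are hypothetical stand-ins, and the code the Cyclone compiler really emits may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fat pointer: base address plus element count. */
typedef struct {
    const char *const *base;
    size_t count;
} fat_strings;

/* Bounds-checked element access, playing the role of args[i]. */
static const char *fat_get(fat_strings args, size_t i) {
    assert(i < args.count);
    return args.base[i];
}

/* Plays the role of a callee declared with "...string_t args".
   It returns the total length of its varargs so the result is observable. */
size_t total_length(int x, fat_strings args) {
    (void)x;  /* the fixed parameter is unused in this sketch */
    size_t total = 0;
    for (size_t i = 0; i < args.count; i++)   /* args.count plays numelts(args) */
        total += strlen(fat_get(args, i));
    return total;
}

/* Roughly what a call such as total_length(17, "hi", "mom") could lower to. */
size_t g(void) {
    const char *tmp[] = { "hi", "mom" };      /* stack-allocated vararg array */
    fat_strings args = { tmp, sizeof tmp / sizeof tmp[0] };
    return total_length(17, args);
}
```

The extra cost relative to C varargs is visible here: every access goes through fat_get, which checks the index and then follows the base pointer.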

A useful example of this style is in the list library:

list_t<`a> list(... `a argv) {
  list_t result = NULL;
  for (int i = numelts(argv) - 1; i >= 0; i--) 
    result = new List{argv[i],result};
  return result;
}

Callers can now write list(1,2,3,4,5) and get a list of 5 elements.

The third style addresses the problem that it's often desirable to have a function take a variable number of arguments of different types. For example, printf works this way. In Cyclone, we could use a datatype in conjunction with the second style. The callee then uses an array subscript to access a vararg and a switch statement to determine its datatype variant. But this would not be very convenient for the caller---it would have to explicitly ``wrap'' each vararg in the datatype type. The third style makes this wrapping implicit. For example, the type of printf in Cyclone is:

extern datatype PrintArg<`r::R> {
  String_pa(const char ? *@notnull @nozeroterm`r);
  Int_pa(unsigned long);
  Double_pa(double);
  LongDouble_pa(long double);
  ShortPtr_pa(short *@notnull `r);
  IntPtr_pa(unsigned long *@notnull `r);
};
typedef datatype PrintArg<`r> *@notnull `r parg_t<`r>;
int printf(const char *@fat fmt, ... inject parg_t);

The keyword ``inject'' is what syntactically distinguishes the third style. The type must be a datatype type. In the body of the vararg function, the elements of the array holding the varargs have this datatype type, with the function's region. (That is, the wrappers are stack-allocated just as the vararg array is.)
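As a rough model of what the callee sees, the C sketch below represents each wrapped vararg as a tagged union and has the callee switch on the tag. All names (parg, show_args, demo) are hypothetical, only three of the variants are modeled, and the wrappers are stored directly in the array rather than behind pointers, which simplifies the real representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for a datatype with three variants. */
typedef enum { STRING_PA, INT_PA, DOUBLE_PA } parg_tag;

typedef struct {
    parg_tag tag;
    union {
        const char *s;
        unsigned long i;
        double d;
    } u;
} parg;

/* Callee: unwraps each vararg by switching on its tag and appends it to
   `out`, followed by ';'. Returns the number of characters written, or
   -1 if the buffer is too small. */
int show_args(char *out, size_t cap, const parg *args, size_t count) {
    assert(out != NULL);
    if (cap == 0) return -1;
    size_t used = 0;
    out[0] = '\0';
    for (size_t k = 0; k < count; k++) {
        int n = 0;
        switch (args[k].tag) {
        case STRING_PA: n = snprintf(out + used, cap - used, "%s;", args[k].u.s); break;
        case INT_PA:    n = snprintf(out + used, cap - used, "%lu;", args[k].u.i); break;
        case DOUBLE_PA: n = snprintf(out + used, cap - used, "%.1f;", args[k].u.d); break;
        }
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}

/* Call site: the wrapping that the third style performs implicitly is
   written out by hand here. Everything lives on the stack. */
int demo(char *out, size_t cap) {
    parg wrapped[] = {
        { STRING_PA, { .s = "hi" } },
        { INT_PA,    { .i = 42UL } },
        { DOUBLE_PA, { .d = 2.5 } },
    };
    return show_args(out, cap, wrapped, sizeof wrapped / sizeof wrapped[0]);
}
```

With the third style, the caller would simply pass "hi", 42, and 2.5; the array named wrapped in demo corresponds to what the compiler builds on the caller's behalf.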

At the call-site, the compiler implicitly wraps each vararg by finding a datatype variant that has the expression's type and using it. The exact rules for finding the variant are as follows: Look in order for a variant that carries exactly the type of the expression. Use the first variant that matches. If none, make a second pass and find the first variant that carries a type to which the expression can be coerced. If none, it is a compile-time error.
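The two-pass lookup can be sketched as a small C function. This is an illustration of the rule as stated above, not the compiler's code: the type_id enumeration, the can_coerce relation (here, any arithmetic type coerces to any other), and the variant table are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, tiny universe of expression types. */
typedef enum { TY_INT, TY_ULONG, TY_DOUBLE, TY_LONG_DOUBLE, TY_STRING, TY_STRUCT } type_id;

static int is_arith(type_id t) {
    return t == TY_INT || t == TY_ULONG || t == TY_DOUBLE || t == TY_LONG_DOUBLE;
}

/* Assumed coercion relation: arithmetic types coerce to one another. */
static int can_coerce(type_id from, type_id to) {
    return is_arith(from) && is_arith(to);
}

/* Types carried by the variants, in declaration order (modeled loosely on
   the non-pointer variants of PrintArg). */
static const type_id PRINT_ARG_VARIANTS[] = {
    TY_STRING, TY_ULONG, TY_DOUBLE, TY_LONG_DOUBLE
};
static const size_t PRINT_ARG_COUNT =
    sizeof PRINT_ARG_VARIANTS / sizeof PRINT_ARG_VARIANTS[0];

/* Returns the index of the chosen variant, or -1 where the compiler
   would report a compile-time error. */
int pick_variant(const type_id *carried, size_t n, type_id expr) {
    assert(carried != NULL || n == 0);
    /* Pass 1: first variant carrying exactly the expression's type. */
    for (size_t k = 0; k < n; k++)
        if (carried[k] == expr) return (int)k;
    /* Pass 2: first variant carrying a type the expression coerces to. */
    for (size_t k = 0; k < n; k++)
        if (can_coerce(expr, carried[k])) return (int)k;
    return -1;
}
```

Note the consequence of doing the exact pass first: under these assumptions a double expression selects the double variant even though the unsigned long variant is declared earlier and would also accept it by coercion, whereas an int expression, which has no exact match, falls through to the first coercible variant.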

In practice, the datatype types used for this style of vararg tend to be quite specialized and used only for vararg purposes.

Compared to the other styles, the third style is less efficient because the caller must wrap and the callee unwrap each argument. But everything is allocated on the stack and call sites do everything implicitly. A testament to the style's power is the library's implementation of printf and scanf entirely in Cyclone (except for the actual I/O system calls, of course).