===========================================================================
ICFP 2017 Review #13A
---------------------------------------------------------------------------
Paper #13: Revisiting Parametricity: Inductives and Uniformity of Propositions
---------------------------------------------------------------------------
Overall merit: C. Weak paper, though I will not fight strongly against it
Confidence: Y. I am knowledgeable in this area, but not an expert
===== Paper summary =====
It is a well-known fact that parametricity for System F gives us theorems for free. What is less known is how to extend this result to richer languages. This paper presents a parametricity translation for a large fragment of Coq, outlines the considerations and limitations of such a translation, and even fixes a bug in an already published result.
===== Comments to authors =====
My main concern about this work is that it occupies an uncomfortable middle ground: from a theoretical standpoint, it does not provide a proof of the abstraction theorem (even for a restricted set of terms); from a practical standpoint, it does not provide enough information to quickly test the ideas presented in another setting.
On the second point, the paper mentions a translation that is not otherwise outlined in the paper, and is not included in the supplementary material. Moreover, in the one example included (Section 5), it is not clear, at least to me, what free theorem is being used and how.
As a last relevant point, I found the presentation hard to follow. The reader is asked to care about problems that are not yet in scope.
Minor observations:
===========================================================================
ICFP 2017 Review #13B
---------------------------------------------------------------------------
Paper #13: Revisiting Parametricity: Inductives and Uniformity of Propositions
---------------------------------------------------------------------------
Overall merit: C. Weak paper, though I will not fight strongly against it
Confidence: X. I am an expert in this area
===== Paper summary =====
Various translations exist in the literature that take judgements of dependent type theory and translate them to their relationally parametric interpretation in the same dependent type theory.
These translations split every variable x : T of the original judgement, producing two variables (for binary parametricity; or n, for n-ary) x : T and x_2 : T_2, plus a third variable x_r : [[T]] x x_2, which witnesses that x and x_2 are related according to the relation [[T]], the interpretation of the type T.
Terms t : T are interpreted as terms [[t]] : [[T]] t t_2. In particular, types T : Set are interpreted as [[T]] : T -> T_2 -> Sêt, where Sêt is some universe.
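To make the shape of the translation concrete, here is a minimal Coq sketch of the standard binary interpretation at a base type and at arrow types (the names bool_R and arrow_R are my own, not the paper's):

```coq
(* Binary relational interpretation, sketched by hand.
   At a base type, related booleans are simply equal booleans. *)
Definition bool_R (b1 b2 : bool) : Prop := b1 = b2.

(* At arrow types, related functions map related arguments
   to related results. *)
Definition arrow_R {A1 A2 B1 B2 : Type}
           (A_R : A1 -> A2 -> Prop) (B_R : B1 -> B2 -> Prop)
           (f1 : A1 -> B1) (f2 : A2 -> B2) : Prop :=
  forall (a1 : A1) (a2 : A2), A_R a1 a2 -> B_R (f1 a1) (f2 a2).
```

The abstraction theorem then states that every term t of closed type T is related to itself by [[T]].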
The parametric interpretation can then be used as a shortcut to prove theorems that would otherwise be tedious to prove, such as (see §5 of the paper) alpha-equivalence.
A disappointing property of existing translations is that the translation of a proposition P : Prop is useless: it takes the form [[P]] : P -> P_2 -> Prop, which is a rather uninformative (proof-relevant) theorem to have.
Indeed, regardless of what P and P_2 are, we have λ p p_2. True : P -> P_2 -> Prop.
Of course, we would prefer to have instead a translation [[P]] witnessing that P and P_2 are in fact isomorphic (i.e. logically equivalent).
The authors demonstrate with a simple counterexample that this is quite simply not the case without additional assumptions, and so they investigate which additional assumptions allow one to automatically prove logical equivalence of P and P_2.
They introduce two additional properties that we can assume for the relational interpretations [[T]] of the types T that P depends on: [[T]] may be assumed to be one-to-one (essentially an injective partial function T -> T_2) or total (every element of T and T_2 is related to some element of T_2 / T respectively).
For Pi-types, inductive propositions, and inductive types, the paper investigates which of these assumptions one needs to impose on the input types.
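For reference, one plausible rendering of the two properties in Coq (hypothetical names and formulations; the paper's may differ):

```coq
(* A relation is total when every element on either side is related
   to some element on the other side. *)
Definition Total {A1 A2 : Type} (R : A1 -> A2 -> Prop) : Prop :=
  (forall a1, exists a2, R a1 a2) /\ (forall a2, exists a1, R a1 a2).

(* A relation is one-to-one when it is functional and injective,
   i.e. essentially an injective partial function A1 -> A2. *)
Definition OneToOne {A1 A2 : Type} (R : A1 -> A2 -> Prop) : Prop :=
  (forall a1 a2 a2', R a1 a2 -> R a1 a2' -> a2 = a2') /\
  (forall a1 a1' a2, R a1 a2 -> R a1' a2 -> a1 = a1').
```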
===== Comments to authors =====
For this paper, I requested a sub-review, included below (except for its summary, which is included above).
I also read the paper myself and include my own assessment first, above the sub-review.
Strong points:
Interesting topic: the problem being addressed is real and important
Practical applicability of the presented solution is convincingly demonstrated with a useful application in a real (and quite typical) Coq codebase.
Weak points:
Writing is unstructured and often too technical (e.g. I found many parts of section 3 impossible to follow)
Technical choices are often motivated in a "x doesn't work, so we have to do y"-style, without much reflection about the big picture (see more about this below).
No formalisation of the translation, no proof of well-typedness of translated terms. Not even a comment on whether such a proof has been attempted or why it failed.
Some aspects of the approach feel hacky, particularly the post-processing "let's now remove every assumption that our generated proof isn't actually using" phase.
Missing related work. See below.
Summary assessment:
This paper presents a solution to an important problem, and the solution is already practically useful.
However, I think there is still some work left on (1) finding ways to better motivate and explain the design, (2) developing more insights into the big picture of what you are doing and (3) improving the structure of the paper and the comparison to related work.
As such, I think this paper should not be accepted, although I believe that if the authors address the above-listed weaknesses, a future version of the paper will stand a much better chance.
Major remarks:
You don't present a proof of well-typedness of translated terms, or even a formalisation of the translation.
Some of the work you build on does come with such a formalisation and proof (Bernardy et al., ICFP 2010), although IIUC you claim they missed a case.
The fact that they missed a case shows the limits of on-paper proofs, but it can also be seen as an argument that the exercise is worth doing, to reduce the odds of this happening again.
Anyway, have you attempted such a proof?
How hard do you expect it to be?
Neelakantan R. Krishnaswami and Derek Dreyer. 2013. Internalizing Relational Parametricity in the Extensional Calculus of Constructions. In Computer Science Logic (CSL).
You only discuss the last of the above related papers in one paragraph in the related work.
You say "Also, they enforced the property in the metatheory, whereas we build internally expressed proofs as a program translation.", but it is not clear whether you think this approach (metatheoretic parametricity proofs, sometimes with parametricity axioms available internally) is better or worse, or whether each has its own advantages.
Requiring relationships that are both total and one-to-one (i.e. isomorphisms) is a very strong requirement.
For example, one-to-one-ness doesn't seem to be satisfied in the Bessel-Descartes fable, right?
In some sense, this suggests to me that you're no longer proving parametricity but rather a form of univalence.
It's not clear to me from your paper whether you think these requirements are just sufficient or also necessary.
In other words: are they needed because of how your particular translation works or is there a reason why (for some example?) they're required in any possible translation?
General remarks:
I dislike your use of "x, x_2, x_r" in translated terms and would much prefer to see "x_1, x_2 and x_r".
This would distinguish "x_1" from "x" (as used in untranslated terms) and would be nicely symmetric.
p.3,l.22: "polymorphic propositions": what does this mean?
p.3, l.41-p.4,l.2: I find it weird that you seem to be saying certain sorts of proofs don't need higher universes.
In my opinion, whether you need higher universes depends on how you are proving something, not what you are proving.
p.4,l.11: "well-suited for defining logical relations": where have you talked about this before?
p.5, l.34: "show that having ... break the abstraction theorem for...": I think it's worthwhile to explain better why this is the case.
p.9, discussion of the index-equality proofs required: I wonder whether this requirement turns up if you do the regular translation after you desugar the indexed data type to a parametrised data type with embedded equality proofs for the indices?
I believe this is a way to try and make more sense of your requirements, similar to what my sub-reviewer suggested about Church encodings.
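To illustrate the desugaring I have in mind: an indexed inductive such as Vec can be rephrased so that each constructor carries an equation pinning down the index (a sketch; the constructor names are my own):

```coq
(* Henry-Ford-style desugaring: the index n is kept, but every
   constructor embeds an equality proof constraining it. *)
Inductive Vec' (A : Set) : nat -> Set :=
| nilV'  : forall n, n = 0 -> Vec' A n
| consV' : forall n m, n = S m -> A -> Vec' A m -> Vec' A n.
```

The question is then whether your index-equality requirements fall out of the regular translation applied to this form.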
p.12: what do you actually generate: Coq or Gallina?
p.14, l.7: a trivial consequence of the proof-irrelevance axiom: I thought you previously said your translation doesn't rely on axioms?
p.16, l.7-8: There's a reason Coq allows you to leave some arguments implicit.
ICFP papers are also allowed to do this, you know...
p.20, l.4: "our reifier": what's a reifier?
p.23, l.39-..: I'm skeptical about some of the things you say about HoTT, such as the claim that it doesn't support proof irrelevance.
IIRC, HoTT defines a notion of h-Props as precisely those sets that satisfy proof irrelevance and the HoTT book has quite a bit of discussion about this?
Also the suggestion that it's hard to use dependent pairs without inj_pair2 strikes me as a bit strong.
Typos and general nitpicking:
"typehood judgements": never heard this term before. You seem to mean well-typedness judgements?
p.7, l.11: " to only create proofs": "only to create proofs"
p.7, l.27: "resultant": resulting
p.8, l.38: the second O on this line should be O_2, I think.
p.9, l.34: break line before "If an inductive ...": here, but also elsewhere, you often put a lot of stuff into one paragraph, but I recommend you apply the principle of "one paragraph, one idea"
p.11, l.12: "3rd" is typeset weird here. You should use "3\textsuperscript{rd}" instead of "3$^{rd}$".
p.12: "Using reification and reflection (Malecha and Sozeau)": this citation reads as if these authors invented reification and reflection, which IIUC they did not.
In principle I am enthusiastic about this direction of research. The authors make the correct and important observation that the current interpretation of propositions is useless. (Also for multi-element types, we will sometimes want to find an isomorphism, rather than a mere relation, but it is legitimate not to consider that in a first approach to the problem.) This has to my knowledge not been adequately addressed so far in any publication. Homotopy type theory (HoTT) does automatically produce equivalences, but also requires equivalences as input, which is more than is sometimes necessary. The authors show that additional assumptions are needed and choose, I believe, the right direction when they investigate the structure of P : Prop to find out precisely what assumptions are needed. I find it surprising that considering the one-to-one and totality properties yields good results here; this approach is to my knowledge new not only in published work but also in the WIP that I am aware of.
The reasons why I give a C rating nonetheless are:
In short, I think this is a relatively weak presentation of a valuable piece of work.
While reading the paper, I repeatedly found myself wondering in puzzlement: what are we doing now and why? So I made an outline of the structure and found out that some section titles do not match the content of the section. I include the outline, with structure-related and other remarks, below.
This is related to point 1, of course. My primary issue was that it is next to impossible to quickly reread something that you recall having read earlier.
The text on p7, ln 41-42, is very important. Yet after having read the full paper, I had much trouble retrieving it, and even more trouble retrieving its argumentation. There are two claims here: (1) the deductive-style translation is necessary for inductive types if Sêt = Prop, and (2) the inductive-style translation is necessary for inductive propositions. Each claim has its own argumentation. If you would, e.g., state these claims as two lemmas (\begin{lemma} ... \end{lemma}), each with a proof, then they would stand out in the page layout and allow me to quickly find them again.
In order to prove logical equivalence of related propositions P and P_2, the authors investigate the structure of P and make assumptions based on it. So the structure of P gives rise to the assumptions that are made. The specifics of this interaction are the central theoretical contribution of this paper, yet they are scattered throughout, in subclauses in the middle of proofs that do not have a corresponding theorem featured as such. If I want to check the assumptions made for the universal quantifier, I have to reread the entire reasoning to find e.g. the clause 'Using the CompleteRel property of B_r, ...' (p14, ln 36).
I suggest instead announcing every proof in the paper with some theorem environment, in which the assumptions are clearly listed. Since the interaction between proposition structure and assumptions is so central to the paper, I would moreover recommend adding a clear table or (if the results turn out to be too capricious to fit a table) a dedicated section that reviews the interaction and discusses it.
Some key recommendations:
In general, a proof serves two purposes: to establish that a claim is true, and to allow the reader to understand why. I think an important virtue of proof assistants is that they allow one to separate those concerns: you can convince Coq that your results are true, and dedicate your paper to teaching the reader a feel for why they are true. This may mean that instead of giving a fully general and technical proof, you give an illuminating example that makes the general theorem credible.
It especially means that you should not talk to your reader the way you talk to Coq. A good example where you do this is on p16, ln 9-10. Here, you explicitly 'introduce' a variable and explicitly invoke the substitution rule. In Coq, in order to prove a universally quantified theorem, you use Intro. But when explaining to a human that something holds for all y, you say 'pick any y', not 'we introduce y'. Similarly, you need not tell a human reader when to substitute one thing for another; instead you can just explain why indices are equal and perhaps remind us of the equality when the substitution needs to happen.
The entire proof about IWP (which, by the way, could use a corresponding theorem) - as well as many proofs in the paper - reads like a direct English-language transcript of a Coq proof. I think instead of giving a solid but extremely technical proof, you should seize the opportunity to help your reader understand why you need all the assumptions, and only the assumptions, that you make.
The HoTT-book may serve as a guide here: it does an excellent job of doing type theory without treating its reader like a proof assistant.
As mentioned before, I think you need to gather somewhere the specifics of the interaction between proposition structure and assumptions. But I suggest you moreover discuss these gathered results: what do they mean, how do they arise, what are the patterns, and how can we understand this? On rare occasions, truth is just ugly and there is nothing to say, but I am not convinced that this is the case here. In fact I think interesting further structure can be identified; see also my comments on §4.
One thing that I think may be illuminating is to consider Church encodings. For types, this is problematic as Set is predicative, but indexed inductive propositions may (if I'm not mistaken) be Church-encoded using only Pi-types and Prop. E.g. we can encode le : nat -> nat -> Prop as
le m n = (P : nat -> nat -> Prop) -> ((k : nat) -> P k k) -> ((k l : nat) -> (P k l) -> (P k (S l))) -> P m n
This may allow you to explain the assumptions made for indexed inductive propositions, in terms of the ones made for Pi-types. For inductive Sets, the same would be possible if Set were impredicative, see e.g. Robert Atkey, Neil Ghani, Patricia Johann, POPL 2014, 'A relationally parametric model of dependent type theory'.
Ok.
My summary of the section:
- you review Keller&Lasson's and Bernardy et al.'s translations
- you show that deductive-style is more suitable for inductive types (and necessary with Sêt = Prop)
- you show that inductive style is necessary for inductive props
My summary of the section:
you fix the subtle flaw and add an additional equality constraint
p9, ln 40: This example is (1) proof-irrelevant, which is not a requirement here, and (2) contrived and practically useless, preventing the reader from appealing to their intuition. I suggest using, for example (forgive my flawed Coq grammar):
Inductive Contains (A : Set) : forall (n : nat), Vec A n -> A -> Set :=
| hdContains : forall (n : nat) (a : A) (v : Vec A n), Contains A (S n) (consV a v) a
| tlContains : forall (n : nat) (a b : A) (v : Vec A n), Contains A n v b -> Contains A (S n) (consV a v) b.
p10, ln 11: So 'singleton' is Coq terminology for a type with at most one element?
As for the depIndRec example, I would suggest the following (corresponding to the Contains example):
Fixpoint getIndex (A : Set) (n : nat) (v : Vec A n) (a : A) (c : Contains A n v a) : Fin n :=
  match c in Contains _ n' _ _ return Fin n' with
  | hdContains _ _ _ => fzero _
  | tlContains _ _ _ _ c' => fsucc _ (getIndex _ _ _ _ c')
  end.
Figure 1 could be made more readable by using underscores or implicit arguments.
My summary: how to assert IffProps and CompleteRel for universally quantified propositions
My summary: We will require props to be total & one-to-one rather than equivalent & complete
In fact, you explicitly say so: "In this section, we consider the subtyping rule"
My summary: When is a Pi-type one-to-one or total?
My summary: When is an indexed inductive type one-to-one or total?
I'll admit that I didn't work through the details here.
My summary: This section explains the implementation approach you took.
See my remarks above
This can be omitted in my opinion.
===========================================================================
ICFP 2017 Review #13C
---------------------------------------------------------------------------
Paper #13: Revisiting Parametricity: Inductives and Uniformity of Propositions
---------------------------------------------------------------------------
Overall merit: B. OK paper, but I will not champion it
Confidence: X. I am an expert in this area
===== Paper summary =====
Free theorems for functional programs are all very well, but one might
expect that they would really be useful when generating proofs in a
proof assistant. This paper continues a line of work initiated by
Bernardy et al. in syntactic generation of proofs of Reynolds'
abstraction theorem for dependently typed proofs and programs. The
core of this paper is a discussion of a Coq tactic that generates
proofs given terms. The major contributions of this paper are:
Extension of previous translations to handle a larger range of
Coq's indexed inductive types. A detailed comparison between two
methods of relational translation is made, with discussion of what
works and what does not. A special, but representative case (W types)
is formalised in Coq to show that the translation should always work
(the actual tactic may fail during proof generation).
A new translation (the "IsoRel" translation) that has the desirable
property that two propositions (i.e. objects of the Coq sort 'Prop')
are related only when they are equivalent propositions. This is in
contrast to the original translation (Bernardy et al, Lasson and
Keller), which stated that all propositions were related. This is a
much weaker property, and makes the translation effectively useless
when translating propositions. Unfortunately, the cost of this extra
strength is that relations between normal types are restricted to
isomorphisms. A slightly ad-hoc claim is made that some of these
assumptions of isomorphism can often be dropped in practice.
An application of the generation of abstraction proofs is given,
demonstrating its use in a compiler correctness project. It is claimed
that this use (would have) saved many hours of work.
===== Comments to authors =====
Things I liked:
The new translation properly handles (indexed) inductive types, and
there is a detailed discussion of the issues involved.
The discussion in the paper is based on a working tool for
constructing Coq proofs, which has been shown to work in practice.
The ability to transfer proofs between situations is really
powerful, and gives, concretely, some of the promised abilities of
homotopy type theory. This is illustrated by an example in Section 5.
The paper is generally quite clear, though I found the language a
little too roundabout in places.
Things that could be improved:
The custom equality types in section 2.3 (on page 10) seem to be
solving the same problem as McBride's heterogeneous (or "John Major")
equality:
https://coq.inria.fr/distrib/current/stdlib/Coq.Logic.JMeq.html
JMeq allows the statement of equalities between terms whose types are
not yet known to be definitionally equal. It is specifically designed
for this kind of case of telescoping dependent indexes in inductive
types. Would using JMeq simplify your translation? JMeq supposes
proof irrelevance, but you are already assuming this.
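For instance, JMeq makes an equation between vectors of different lengths statable before the lengths are known to coincide (Vec here is a stand-in for the paper's indexed types):

```coq
Require Import Coq.Logic.JMeq.

Section JMeqSketch.
  Variable Vec : nat -> Set.

  (* With ordinary eq, "v1 = v2" is ill-typed unless n and m are
     convertible; JMeq lets us state the equality anyway. *)
  Definition vecEq (n m : nat) (v1 : Vec n) (v2 : Vec m) : Prop :=
    JMeq v1 v2.
End JMeqSketch.
```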
Some possibly related work:
Edmund Robinson: Parametricity as
Isomorphism. Theor. Comput. Sci. 136(1): 163-181 (1994)
The introduction keeps referring to "undecidable relations" when
talking about relations that are expressed in Coq as functions into
Prop. I understand that the point is that the 'AnyRel' translation
does the right thing for relations with codomain 'bool', and that
these are decidable. But the fact that the relations in Prop might be
undecidable isn't what makes the translation difficult, right?
The "Eliminating Unused Hypotheses" process described in Section
4.2 seems very ad-hoc.
The authors appear to have made everything fit the page limit by
removing all vertical space around displayed pieces of Coq code. I
found this quite difficult to read, despite the syntax highlighting.
In general, the writing is a bit wordy in places and could really
do with some editing. For example, page 2, lines 39 to 41 has a
sentence that starts with "As a result, .." and ends with "... true
and false" and contains two colons and several clauses. It could be
simplified by (a) deleting the word "much"; (b) deleting "which is not
a universe, and"; (c) deleting "they are mere ...".
I found that the prolix writing style made it quite difficult to
determine what was going on in some places. The constant back and
forth references to "explain below" (where?) and so on were also
distracting.
Also, I think the sectioning could be improved. The discussion of the
translation of inductive types (Sections 2.2-2.6) should be split out
from the basic 'AnyRel' translation in 2.1 in order to cleanly
separate what is new from what is not.
===========================================================================
Comment
---------------------------------------------------------------------------
PAGE LIMIT EXCEEDED: The submitted version of this paper exceeds the limit of 24 pages (not including references) that was specified in the call for papers. However, the main body of the paper does fit within the specified limit. For this reason, reviewers are instructed to treat the appendix to this paper as if it had been submitted as supplementary material. In particular, this means that reviewers may choose not to look at the material in the appendix.