MPRI PRFA — Proof Assistants
Go back to the course homepage.
Inductive types — part 1
This week, we will talk about inductive types. At their basis, inductive types are similar to algebraic data types from OCaml or Haskell. In Rocq, they are however more expressive, and can for instance also be used to express certain logical predicates.
We first make sure that the use of bullets is enforced:
Set Default Goal Selector "!".
Inductive types can be defined using the Inductive keyword.
Below, we give the definition of natural numbers nat. They are exactly the same as the ones from the standard library which we have seen in the first lecture.
The type nat is made of two constructors: O and S. We have O : nat and S n : nat if n : nat. (Notice how this is O and not 0 as names cannot start with a numeral. The use of 0 is allowed thanks to a notation mechanism.) Rocq uses BNF (Backus-Naur-form) notation for inductive types:
If you look carefully, you will see that Rocq says that nat is of type Set when we wrote nat : Type. This is mostly for historical reasons and you can roughly conflate the two. We encourage you to use Type though.
Elements of the type nat now consist exactly of O, S O, S (S O), and so on:
The Inductive command allows syntactic flexibility. We could also have written the following:
Inductive nat := | O : nat | S (n : nat) : nat.
or
Inductive nat := O | S (n : nat).
In practice you get to pick the variant that suits you the most as they are anyway equivalent: they merely differ in how you write them, but produce the same definition.
As the word inductive type hints, one can do proofs by induction about them. Mathematically, an inductive type is the smallest type closed under the type's constructors. You are already familiar with induction on natural numbers from week 1. However, make sure that you are also familiar with the exact formulation of the induction principle:
As you have seen in week 1, one can define recursive functions operating on inductive arguments with the Fixpoint keyword:
Rocq ensures that recursive functions are terminating, for reasons we will discuss later.
To ensure termination, recursion has to be performed on a structurally smaller argument. In the case of double from last week, there was only one argument so it was obvious which one had to decrease. This time, there is an ambiguity: it could be either n or m. The {struct n} annotation above tells Rocq that n is the one that is structurally decreasing.
Of course, it won't blindly trust us, there is a complex termination checker (sometimes called guard checker) which will make sure this is indeed the case. For instance, if you were to suggest that m was structurally decreasing, Rocq would rightfully complain:
The important bit of the error message is the one saying:
Recursive call to add' has principal argument equal to "m" instead of a subterm of "m".
In other words it's complaining about the recursive call being performed on m again.
Note that there are other ways to define functions using more complex recursion patterns — you will learn about them soon.
Evaluating functions
It is possible to evaluate a function, such as the add function above by using the Eval command.
Note how the following evaluates to fun m => m (the identity function):
However, the following expression cannot be simplified using computation.
Instead, we can prove this equation:
n: natadd n O = nn: natadd n O = nadd O O = On: nat
IHn: add n O = nadd (S n) O = S nreflexivity.add O O = On: nat
IHn: add n O = nadd (S n) O = S nn: nat
IHn: add n O = nS (add n O) = S nreflexivity.n: nat
IHn: add n O = nS n = S n
We can introduce infix notations as follows:
Infix "+" := add.
See how Rocq prints the lemma statement using + now!
Now, even if we still use add, Rocq will print the goal using +.
n, m: natn + S m = S (n + m)n, m: natn + S m = S (n + m)m: natO + S m = S (O + m)n, m: nat
IHn: n + S m = S (n + m)S n + S m = S (S n + m)reflexivity.m: natO + S m = S (O + m)n, m: nat
IHn: n + S m = S (n + m)S n + S m = S (S n + m)n, m: nat
IHn: n + S m = S (n + m)S (n + S m) = S (S (n + m))reflexivity.n, m: nat
IHn: n + S m = S (n + m)S (S (n + m)) = S (S (n + m))
We define (truncating) subtraction on natural numbers:
Note that we did not provide a {struct _} argument here because Rocq can infer it automatically. On paper, we will however often ask you to tell us which argument is structurally decreasing, so it is good to think for yourself about which argument decreases when writing a function.
Instead of using Infix to give a notation to sub we can also use the Notation command. It asks for more information and is sometimes tricky to use but it is very powerful. We will see more examples over the course of the lecture so don't worry too much about it.
Notation "n - m" := (sub n m).
An alternative to Eval simpl in e is Compute e. See what it does to the following two expressions.
We can now prove a lemma involving addition and subtraction that we just defined.
n, m: natn + m - n = mn, m: natn + m - n = mm: natO + m - O = mn, m: nat
IHn: n + m - n = mS n + m - S n = mm: natO + m - O = mm: natm - O = mall: reflexivity.O - O = Om: natS m - O = S massumption.n, m: nat
IHn: n + m - n = mS n + m - S n = m
Predicates, propositions and injectivity of constructors
In Rocq, everything is computational, including propositions.
We can define the following Boolean predicate testing whether a natural number is 0 or not.
But we can also produce directly a proposition asserting asserting whether the natural number is 0 or not. Pay attention to the use of Prop instead of bool and of capital True and False.
Of course, a proposition stating that n is 0 could equivalently be written as n = 0 as we have seen already. Similarly, we can express what it means to be different from 0 by writing n <> 0, which is Rocq's way of writing n ≠ 0 (in fact, if you use Unicode with Rocq you can also write n ≠ 0 directly, it is mostly a matter of taste and getting used to, see our FAQ to know more). This notation is nothing more than the negation of n = 0, in other words it unfolds to (n = 0) -> False.
We can use this to prove the fact that, in general, constructors are distinct. For nat this means that 0 and S n can always be distinguished.
n: natS n <> On: natS n <> O
This can be proved manually
n: nat
E: S n = OFalse
We use the change tactic to change the goal to something equivalent w.r.t. computation.
n: nat
E: S n = Ois_zero (S n)
We can simplify again to get the old goal. This is in fact the reason why change succeeded in the first place.
n: nat
E: S n = OFalsen: nat
E: S n = Ois_zero (S n)
Now that our goal is is_zero (S n) we want to use our hypothesis E : S n = 0 to get is_zero 0 instead. We can do this using the rewrite tactic which will replace occurrence of S n in the goal by the right-hand side of the equation, namely 0.
n: nat
E: S n = Ois_zero O
Computation is now handy again, because is_zero 0 is much easier to prove than False. We simplify and then conclude.
split.n: nat
E: S n = OTrue
There is also a tactic to automatically discharge such goals: discriminate.
n: natS n <> Odiscriminate.n: natS n <> O
Say now we want to prove that S is injective. We can achieve this by using an auxiliary function pred such that pred (S n) = n (in other words, it is a left inverse to S). By congruence (f_equal) we can thus go from S n = S m to pred (S n) = pred (S m) which is then the same as n = m.
You can see here that it doesn't matter what value we choose for pred 0. We're just going to pick 0.
We can now prove that S is injective. It is in fact true for all constructors.
n, m: natS n = S m -> n = mn, m: natS n = S m -> n = mn, m: nat
E: S n = S mn = mn, m: nat
E: S n = S mpred (S n) = mreflexivity.n, m: nat
E: S n = S mpred (S m) = m
We could in fact do it without pred.
n, m: natS n = S m -> n = mn, m: natS n = S m -> n = mn, m: nat
E: S n = S mn = mn, m: nat
E: S n = S mmatch S n with | O => True | S k => k = m endreflexivity.n, m: nat
E: S n = S mm = m
We can also use the f_equal lemma (not the tactic!).
n, m: natS n = S m -> n = mn, m: natS n = S m -> n = mn, m: natS n = S m -> n = mn, m: nat
E: S n = S mn = massumption.n, m: nat
E: pred (S n) = pred (S m)n = m
Finally, we have tactics that do this automatically such as inversion.
n, m: natS n = S m -> n = mn, m: natS n = S m -> n = mn, m: nat
E: S n = S mn = mreflexivity.n, m: nat
E: S n = S m
H0: n = mm = m
Generalising induction hypotheses
Let us now try and prove things about functions operating on inductive types. We're going to start by defining a function for deciding equality between natural numbers.
We're going to prove that it indeed decides equality of natural numbers. Below, we make use of the notation P <-> Q which corresponds to logical equivalence, in other words the conjunction of P -> Q and Q -> P.
n, m: nateq_nat n m = true <-> n = mn, m: nateq_nat n m = true <-> n = mm: nateq_nat O m = true <-> O = mn', m: nat
IH: eq_nat n' m = true <-> n' = meq_nat (S n') m = true <-> S n' = mm: natmatch m with | O => true | S _ => false end = true <-> O = mn', m: nat
IH: eq_nat n' m = true <-> n' = mmatch m with | O => false | S y' => eq_nat n' y' end = true <-> S n' = m
In case you are wondering, we made use of all: simpl. here to apply the simpl tactic to all subgoals, before even focusing them. all is called a goal selector and can be useful in certain situations.
m: natmatch m with | O => true | S _ => false end = true <-> O = mtrue = true <-> O = Om: natfalse = true <-> O = S mtrue = true <-> O = Om: natfalse = true <-> O = S mtrue = true <-> O = Oall: reflexivity.true = true -> O = OO = O -> true = true
Notice reflexivity is able to conclude A -> 0 = 0 for any A.
m: natfalse = true <-> O = S mm: natfalse = true -> O = S mm: natO = S m -> false = trueall: discriminate E.m: nat
E: false = trueO = S mm: nat
E: O = S mfalse = true
Here we provided E to discriminate to tell it which assumption was inconsistent (equating two distinct constructors).
n', m: nat
IH: eq_nat n' m = true <-> n' = mmatch m with | O => false | S y' => eq_nat n' y' end = true <-> S n' = mn': nat
IH: eq_nat n' O = true <-> n' = Ofalse = true <-> S n' = On', m: nat
IH: eq_nat n' (S m) = true <-> n' = S meq_nat n' m = true <-> S n' = S mn': nat
IH: eq_nat n' O = true <-> n' = Ofalse = true <-> S n' = Oall: discriminate.n': nat
IH: eq_nat n' O = true <-> n' = Ofalse = true -> S n' = On': nat
IH: eq_nat n' O = true <-> n' = OS n' = O -> false = truen', m: nat
IH: eq_nat n' (S m) = true <-> n' = S meq_nat n' m = true <-> S n' = S m
We are now stuck. The IH talks about S m, but we need it for m. Note that this always happens when a recursive function changes other arguments apart from its structural argument.
Abort.
The proper way to deal with this is by generalising the induction hypothesis. We have to do this before we start perfoming induction. The reason we have to do this is because Rocq is trying to infer the property we prove by induction on its own: we only say we want to perform induction on n and it then tries to infer the property from the goal. Fortunately, the induction tactic also supports generalisation with the keyword in, as follows.
n, m: nateq_nat n m = true <-> n = mn, m: nateq_nat n m = true <-> n = mm: nateq_nat O m = true <-> O = mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat (S n') m = true <-> S n' = m
The syntax is bit heavy: m |- * represents which information should Rocq take into account when generating the induction hypothesis. The turnstile |- is supposed to represent the whole proof goal, with the hypotheses on the left, and the proposition to prove on the right. We use * to indicate that we want to include the goal (which you always will want to do) and m on the left to indicate we generalise over it, and nothing else.
m: nateq_nat O m = true <-> O = meq_nat O O = true <-> O = Om: nateq_nat O (S m) = true <-> O = S mtrue = true <-> O = Om: natfalse = true <-> O = S mtrue = true <-> O = Oall: reflexivity.true = true -> O = OO = O -> true = truem: natfalse = true <-> O = S mall: discriminate.m: natfalse = true -> O = S mm: natO = S m -> false = truen', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat (S n') m = true <-> S n' = m
See how the IH is quantified for all m.
n': nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat (S n') O = true <-> S n' = On', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat (S n') (S m) = true <-> S n' = S mn': nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = mfalse = true <-> S n' = On', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat n' m = true <-> S n' = S mn': nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = mfalse = true <-> S n' = Oall: discriminate.n': nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = mfalse = true -> S n' = On': nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = mS n' = O -> false = truen', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat n' m = true <-> S n' = S mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = meq_nat n' m = true -> S n' = S mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = mS n' = S m -> eq_nat n' m = truen', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: eq_nat n' m = trueS n' = S mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: S n' = S meq_nat n' m = truen', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: eq_nat n' m = trueS n' = S mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: n' = mS n' = S mreflexivity.n', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: n' = mS m = S mn', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: S n' = S meq_nat n' m = truen', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: S n' = S mn' = mreflexivity.n', m: nat
IH:= forall m: nat, eq_nat n' m = true <-> n' = m
E: S n' = S m
H0: n' = mm = m
Booleans
Booleans are even simpler.
Since bool is not recursive, the induction principle is a lot simpler. In this case, it makes sense to talk about case anlysis instead.
Again, since it's an inductive type, the constructors are disjoint.
true <> falsediscriminate.true <> false
Cartesian product
We can define more types, such as the cartesian product.
We define product types inductively with one constructor taking two arguments.
Note how the constructor pair takes 4 arguments:
Namely types X and Y, and elements x : X and y : Y.
The first two arguments are real arguments that can be passed explicitly: types are first-class in Rocq.
Furthermore, types can appear at any position in arguments, unlike OCaml where the quantification is necessarily prenex.
Also, the type arguments have to be passed.
However, we can declare arguments as implicit using the Arguments command by marking implicit arguments with braces {} (sometimes called squigly brackets).
Arguments pair' {X} x {Y} y.
When an argument is implicit, Rocq tries to infer what it has to be, based on the other passed arguments. If we build the pair (O,O) then both types have to be nat and Rocq is able to figure it out.
We can use the following command to force Rocq to print implicit arguments.
Set Printing Implicit.
There we can see the nat that have been inferred.
We go back to not printing implicit arguments.
Unset Printing Implicit.
We can define the first projection by pattern matching.
Note that we mark arguments as implicit directly.
Remark: In the pattern matching above, we had to also talk about the type parameters of pair even though they are not necessary. By making them implicit in the constructor pair, we can write nicer pattern matching.
Arguments pair {X Y} x y.
Lists
The type of list is both recursive and has a type parameter.
We declare arguments as implicit. Note here we can specify a prefix if we want.
Arguments nil {A}. Arguments cons {A}. Set Printing Implicit.Unset Printing Implicit.
Check now the induction principle we get for lists. It's saying that to prove P l for some list l, you only need to prove
P nil
P (cons a l') for any a and l' such that P l'
Have a look at its type:
Proofs by induction
We reimport all these datatypes from Rocq, to be able to use functions from the standard library.
From Stdlib Require Import Datatypes List. Open Scope nat.
Like for nat, list in the standard library comes with nice notations to be able to write x :: l or [ a ; b ; c ; d ]. We can load them as follows.
Import ListNotations.
Below we check the types of concatenation and reversal of lists.
Lemmas about lists we typically proven by induction as we did for natural numbers.
A: Type
l: list Al ++ [] = lA: Type
l: list Al ++ [] = lA: Type[] ++ [] = []A: Type
a: A
l: list A
IHl: l ++ [] = l(a :: l) ++ [] = a :: lA: Type[] = []A: Type
a: A
l: list A
IHl: l ++ [] = la :: l ++ [] = a :: lreflexivity.A: Type[] = []A: Type
a: A
l: list A
IHl: l ++ [] = la :: l ++ [] = a :: lassumption.A: Type
a: A
l: list A
IHl: l ++ [] = ll ++ [] = lA: Type
l1, l2: list Arev (l1 ++ l2) = rev l2 ++ rev l1A: Type
l1, l2: list Arev (l1 ++ l2) = rev l2 ++ rev l1A: Type
l2: list Arev ([] ++ l2) = rev l2 ++ rev []A: Type
a: A
l, l2: list A
IHl: rev (l ++ l2) = rev l2 ++ rev lrev ((a :: l) ++ l2) = rev l2 ++ rev (a :: l)A: Type
l2: list Arev l2 = rev l2 ++ []A: Type
a: A
l, l2: list A
IHl: rev (l ++ l2) = rev l2 ++ rev lrev (l ++ l2) ++ [a] = rev l2 ++ rev l ++ [a]A: Type
l2: list Arev l2 = rev l2 ++ []reflexivity.A: Type
l2: list Arev l2 = rev l2A: Type
a: A
l, l2: list A
IHl: rev (l ++ l2) = rev l2 ++ rev lrev (l ++ l2) ++ [a] = rev l2 ++ rev l ++ [a]
For rewrite we can separate two lemmas to rewrite by a comma.
reflexivity.A: Type
a: A
l, l2: list A
IHl: rev (l ++ l2) = rev l2 ++ rev l(rev l2 ++ rev l) ++ [a] = (rev l2 ++ rev l) ++ [a]A: Type
l: list Arev (rev l) = lA: Type
l: list Arev (rev l) = lA: Typerev (rev []) = []A: Type
a: A
l: list A
IHl: rev (rev l) = lrev (rev (a :: l)) = a :: lA: Type[] = []A: Type
a: A
l: list A
IHl: rev (rev l) = lrev (rev l ++ [a]) = a :: lreflexivity.A: Type[] = []A: Type
a: A
l: list A
IHl: rev (rev l) = lrev (rev l ++ [a]) = a :: lA: Type
a: A
l: list A
IHl: rev (rev l) = lrev [a] ++ rev (rev l) = a :: lA: Type
a: A
l: list A
IHl: rev (rev l) = la :: rev (rev l) = a :: lreflexivity.A: Type
a: A
l: list A
IHl: rev (rev l) = la :: l = a :: l
The rev defined above is not very smart because it uses concatenation on the right which is expensive. However, it works as a specification for a smarter version using an accumulator.
We can show the two behave the same.
A: Type
l: list Afast_rev l [] = rev lA: Type
l: list Afast_rev l [] = rev lA: Typefast_rev [] [] = rev []A: Type
a: A
l: list A
IHl: fast_rev l [] = rev lfast_rev (a :: l) [] = rev (a :: l)A: Type[] = []A: Type
a: A
l: list A
IHl: fast_rev l [] = rev lfast_rev l [a] = rev l ++ [a]reflexivity.A: Type[] = []A: Type
a: A
l: list A
IHl: fast_rev l [] = rev lfast_rev l [a] = rev l ++ [a]
We are stuck again, because the acc argument is changed.
Abort.A: Type
l: list Afast_rev l [] = rev lA: Type
l: list Afast_rev l [] = rev l
We prove a more general lemma locally. We're using forall here to introduce a parameter.
A: Type
l: list Aforall acc : list A, fast_rev l acc = rev l ++ accA: Type
l: list A
H:= forall acc: list A, fast_rev l acc = rev l ++ accfast_rev l [] = rev lA: Type
l: list Aforall acc : list A, fast_rev l acc = rev l ++ accA: Typeforall acc : list A, fast_rev [] acc = rev [] ++ accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ accforall acc : list A, fast_rev (a :: l) acc = rev (a :: l) ++ accA: Typeforall acc : list A, acc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ accforall acc : list A, fast_rev l (a :: acc) = (rev l ++ [a]) ++ accA: Type
acc: list Aacc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Afast_rev l (a :: acc) = (rev l ++ [a]) ++ accreflexivity.A: Type
acc: list Aacc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Afast_rev l (a :: acc) = (rev l ++ [a]) ++ accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Arev l ++ a :: acc = (rev l ++ [a]) ++ accreflexivity.A: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Arev l ++ a :: acc = rev l ++ [a] ++ accA: Type
l: list A
H:= forall acc: list A, fast_rev l acc = rev l ++ accfast_rev l [] = rev lA: Type
l: list A
H:= forall acc: list A, fast_rev l acc = rev l ++ accrev l ++ [] = rev lA: Type
l: list A
H:= forall acc: list A, fast_rev l acc = rev l ++ accrev l ++ [] = rev lreflexivity.A: Type
l: list A
H:= forall acc: list A, fast_rev l acc = rev l ++ accrev l = rev l
We can prove things in a different way.
A: Type
l: list Afast_rev l [] = rev lA: Type
l: list Afast_rev l [] = rev l
We are going to generalise the empty list to any acc. We write @nil A to provide the type argument A explicitly.
A: Type
l: list Afast_rev l [] = rev l ++ []A: Type
l: list Aforall acc : list A, fast_rev l acc = rev l ++ accA: Typeforall acc : list A, fast_rev [] acc = rev [] ++ accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ accforall acc : list A, fast_rev (a :: l) acc = rev (a :: l) ++ accA: Typeforall acc : list A, acc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ accforall acc : list A, fast_rev l (a :: acc) = (rev l ++ [a]) ++ accA: Type
acc: list Aacc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Afast_rev l (a :: acc) = (rev l ++ [a]) ++ accreflexivity.A: Type
acc: list Aacc = accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Afast_rev l (a :: acc) = (rev l ++ [a]) ++ accA: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Arev l ++ a :: acc = (rev l ++ [a]) ++ accreflexivity.A: Type
a: A
l: list A
IHl:= forall acc: list A, fast_rev l acc = rev l ++ acc
acc: list Arev l ++ a :: acc = rev l ++ [a] ++ acc