Vouivre Digital

Dependent type infrastructure for machine learning

2024/04/04 – Type System Update

It has been a while since the last update, but this is a good one. First, you can visit this news section to stay up-to-date on the latest developments. Second, use the Git repository to access the latest source code. Clone with git://git.vouivredigital.com/vouivre.git. Third, the main topic of this post, progress towards a dependent-type system worthy of its name.

Your Type, My Type, Everyone's Type

The type system presented in the last release is just a placeholder for something better. Starting today, the wip branch enables cool new features:

There are two pieces: system.scm and rules.scm. The former contains meta stuff to create type systems, while the latter defines the type system presented in E. Rijke, Introduction to Homotopy Type Theory. 2022. We only go over some examples here; look at the code for more. Here are five rules presented in the book and defined with our system that we will use next:

> (define-rule formation-3 () ((Γ ⊢ (type B)) (Γ ⊢ (≐ A B))))
> (define-rule type-symmetric () ((Γ ⊢ (≐ B A)) (Γ ⊢ (≐ A B))))
> (define-rule variable-conversion () ((Γ (: x A') Δ ⊢ J) (Γ ⊢ (≐ A A')) (Γ (: x A) Δ ⊢ J)))
> (define-rule substitution () ((Γ (/ Δ a x) ⊢ (/ J a x)) (Γ ⊢ (: a A)) (Γ (: x A) Δ ⊢ J)))
> (define-rule generic-element () ((Γ (: x A) ⊢ (: x A)) (Γ ⊢ (type A))))

The syntax is simple. To define a rule, we must give it a name, a list of literals (ignored at the moment), and a s-expression with the conclusion at the root and the hypotheses as leaves. The four kinds of judgments are expressed using the reserved symbols `:', `≐', `⊢', and `type'. Substitution uses the reserved symbol `/', and the generic judgment thesis (`J' here) can be any unreserved symbol as long as it is used as above. For example, the formation-3 rule says that, given two judgmentally equal types `A' and `B', we conclude that `B' is a type.

Now, let's see how to derive the "element conversion rule" that says: "In some context, if you have an element of a type and that type is judgmentally equal to another, then, in conclusion, the element must also be an element of that other type." We start by creating a "derivation" with the conclusion:

> (d '(Γ ⊢ (: a A')))
* = #<derivation tree: ((Γ ⊢ (: a A'))) points: (((Γ ⊢ (: a A'))))>

A derivation consists of a tree and a list of derivation points, i.e., incomplete points of the derivation. Then, we apply various rules, thus expanding the derivation tree and adding new points. When we achieve a hypothesis that suits our needs, we invoke `exact' to terminate the current derivation point and move on to the next one:

> (apply-rule * 'substitution 'a 'a 'x 'a' 'A 'A)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A))) ((Γ (: a' A) ⊢ (: a' A')))) points: (((Γ ⊢ (: a A))) ((Γ (: a' A) ⊢ (: a' A'))))>
> (apply-rule * 'exact)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')))) points: (((Γ (: a' A) ⊢ (: a' A'))))>
> (apply-rule * 'variable-conversion #:index 1 'A 'A')
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A))) ((Γ (: a' A') ⊢ (: a' A'))))) points: (((Γ ⊢ (≐ A' A))) ((Γ (: a' A') ⊢ (: a' A'))))>
> (apply-rule * 'type-symmetric)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')))) ((Γ (: a' A') ⊢ (: a' A'))))) points: (((Γ ⊢ (≐ A A'))) ((Γ (: a' A') ⊢ (: a' A'))))>
> (apply-rule * 'exact)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#)) ((Γ (: a' A') ⊢ (: a' A'))))) points: (((Γ (: a' A') ⊢ (: a' A'))))>
> (apply-rule * 'generic-element #:index 1)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#)) ((Γ (: a' A') ⊢ (: a' A')) ((Γ ⊢ (type A')))))) points: (((Γ ⊢ (type A'))))>
> (apply-rule * 'formation-3 'A 'A)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#)) ((Γ (: a' A') ⊢ (: a' A')) ((Γ ⊢ (type A')) ((Γ ⊢ (≐ A A'))))))) points: (((Γ ⊢ (≐ A A'))))>
> (apply-rule * 'exact)
* = #<derivation tree: ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) . #{.}#) ((Γ (: a' A) ⊢ (: a' A')) ((Γ ⊢ (≐ A' A)) ((Γ ⊢ (≐ A A')) . #{.}#)) ((Γ (: a' A') ⊢ (: a' A')) ((Γ ⊢ (type A')) ((Γ ⊢ (≐ A A')) . #{.}#))))) points: ()>

When no points remain, we are done. We can summarize, using the same format as `define-rule', the hypotheses that lead to the conclusion:

> (list (conclusion *) (list-hypotheses *))
* = ((Γ ⊢ (: a A')) ((Γ ⊢ (: a A)) (Γ ⊢ (≐ A A'))))

Finally, we can internalize a derivation as a new rule with a name that can then be used in other derivations:

> (define-derivation element-conversion ((Γ ⊢ (: a A')) (Γ ⊢ (: a A)) (Γ ⊢ (≐ A A'))) (substitution a a x a' A A) (exact) (variable-conversion #:index 1 A A') (type-symmetric) (exact) (generic-element #:index 1) (formation-3 A A) (exact))

The difference between `define-rule' and `define-derivation' is that the former interns a fundamental rule that cannot be derived from anything else, whereas the latter checks, prior to adding a new rule, that the provided conclusion is derivable from the listed hypotheses.

Bools and Whistles

In this section, we show how to create common non-dependent programming constructs (these will be generalized once dependent types are fully working to the corresponding type indicated in between parenthesis): ordinary functions (dependent product type), booleans, pairs (dependent sum type), and records (dependent records). To define these, we need four kinds of rules: formation, introduction, elimination, and computation. Often, these come associated with some congruence rules, ignored in this short exposé.

Ordinary Functions

Ordinary functions are your usual function type in languages like Haskell. It has one rule of formation, introduction, and elimination and two computation rules:

> (define-rule → () ((Γ ⊢ (type (→ A B))) (Γ ⊢ (type A)) (Γ ⊢ (type B))))
> (define-rule λ () ((Γ ⊢ (: (λ x (b x)) (→ A B))) (Γ ⊢ (type B)) (Γ (: x A) ⊢ (: (b x) B))))
> (define-rule ev () ((Γ (: x A) ⊢ (: (f x) B)) (Γ ⊢ (: f (→ A B)))))
> (define-rule β () ((Γ (: x A) ⊢ (≐ ((λ y (b y)) x) (b x) B)) (Γ ⊢ (type B)) (Γ (: x A) ⊢ (: (b x) B))))
> (define-rule η () ((Γ ⊢ (≐ (λ x (f x)) f (→ A B))) (Γ ⊢ (: f (→ A B)))))

These (or their dependent generalizations) are used ubiquitously in type theory. See the source code for a derivation of the ordinary identity function.

Booleans

> (define-rule bool-formation () ((⊢ (type Bool))))
> (define-rule bool-introduction-1 () ((⊢ (: true Bool))))
> (define-rule bool-introduction-2 () ((⊢ (: false Bool))))
> (define-rule bool-elimination () ((⊢ (: if (→ Bool (→ A (→ A A))))) (⊢ (type A))))
> (define-rule bool-computation-1 () ((⊢ (≐ (((if true) x) y) x Bool))))
> (define-rule bool-computation-2 () ((⊢ (≐ (((if false) x) y) y Bool))))

Now we can derive the `not' function!

> (define-derivation not ((⊢ (: (λ x (((if x) false) true)) (→ Bool Bool)))) (λ) (bool-formation) (substitution a true x y A Bool) (weakening #:index 0) (bool-formation) (bool-introduction-1) (ev #:index 1) (substitution a false x y A Bool) (weakening #:index 0) (bool-formation) (bool-introduction-2) (ev #:index 1) (ev #:index 0) (bool-elimination) (bool-formation))

Pairs

> (define-rule pair-formation () ((⊢ (type (Pair A B))) (⊢ (type A)) (⊢ (type B))))
> (define-rule pair-introduction () ((⊢ (: (pair a b) (Pair A B))) (⊢ (: a A)) (⊢ (: b B))))
> (define-rule pair-elimination-1 () ((⊢ (: first (→ (Pair A B) A))) (⊢ (type A)) (⊢ (type B))))
> (define-rule pair-elimination-2 () ((⊢ (: second (→ (Pair A B) B))) (⊢ (type A)) (⊢ (type B))))
> (define-rule pair-computation-1 () ((⊢ (≐ (first (pair a b)) a A)) (⊢ (: a A)) (⊢ (: b B))))
> (define-rule pair-computation-2 () ((⊢ (≐ (second (pair a b)) b B)) (⊢ (: a A)) (⊢ (: b B))))

Records

Record types can be created in the same way as pairs, with the intended types and accessors for each slot.