Skip to content

deemp/arbitrary-rank-tutorial

Repository files navigation

CI

A Tutorial Implementation of a Lambda Calculus with Parametric Predicative Arbitrary-Rank Polymorphism

Arralac - Arbitrary-rank + lambda calculus.

Interactive features

  • Language server
    • Supports showing types of variables on hover and error diagnostics when something goes wrong.
    • It is implemented via the lsp Haskell package 1.
  • Command-line interface
    • Supports running a language server, typechecking programs, and evaluating them.
    • It is implemented via the optparse-applicative Haskell package 2.

Examples

Initial program

arralac/test/data/Program1.arralac

let
  applyMyShow =
    (\x. \y. x y)
      :: forall a. (forall b. b -> String) -> a -> String
in
let
  myShow = \x. "Hello"
in
applyMyShow myShow

VS Code extension

TypeHover+TermVarError

Typecheck

nix run .#arralac -- typecheck arralac/test/data/Program1.arralac
(
  let
    applyMyShow_0 =
      (
        (
          (
            (
              \(x_1 :: forall b_4. b_4 -> String).
                (
                  (
                    \(y_2 :: a_9).
                      (
                        (
                          (
                            (x_1 :: a_9 -> String)
                          )
                            (y_2 :: a_9)
                        ) :: String
                      )
                  ) :: a_9 -> String
                )
            ) :: (forall b_4. b_4 -> String) -> a_9 -> String
          ) :: {forall a_3. forall b_4. b_4 -> String -> a_3 -> String}
        ) :: (forall b_4. b_4 -> String) -> a_Unsolved_11 -> String
      ) :: (forall b_4. b_4 -> String) -> a_Unsolved_11 -> String
  in
    (
      let
        myShow_7 =
          (
            (
              \(x_8 :: b_13).
                (
                  "Hello"
                )
            ) :: b_13 -> String
          ) :: b_13 -> String
      in
        (
          (
            (applyMyShow_0 :: (forall b_4. b_4 -> String) -> a_Unsolved_11 -> String)
          )
            (myShow_7 :: b_13 -> String)
        ) :: a_Unsolved_11 -> String
    ) :: a_Unsolved_11 -> String
) :: a_Unsolved_11 -> String

Evaluate

nix run .#arralac -- evaluate whnf arralac/test/data/Program1.arralac
\y_2. (\x_8. "Hello") (y_2)

AST representations

Each step of the pipeline works with at most two different AST representations.

Here are these representations.

Parser.Generated.Abs

Parser.Generated.Abs is generated by BNFC 3.

Syntax

Trees that Grow

The core idea of the Trees that Grow 4 approach is to parameterise a data type with a type variable and use type-level expressions involving type families and that type parameter in the fields of the data type constructors.

Then, it will be possible to produce different variants of the data type by instantiating the parameter and the type families.

On the one hand, data types become extremely flexible and reusable.

On the other hand, there appears a significant overhead in terms of lines of code written to define type family instances.

Syntax.TTG

Syntax.TTG provides Trees that Grow 4 (TTG) representations of the AST and the Type (see Type.TTG.Type for a discussion of why use TTG for Type).

GHC also uses the TTG representation for its syntax 5 but not for its Type 6.

Pass.Types

Pass.Types provides types that are used for instantiating TTG definitions to concrete types. These types denote the following passes:

  1. Renamed - after the renamer ran on the intial AST.
  2. Typechecked - after the typechecker ran on a renamed term.
  3. Zonked - after the zonker ran on a fully typechecked term.

Syntax.Local

Syntax.Local instantiates type families defined in Syntax.TTG depending on one of the passes.

Core

  • After typechecking, Haskell programs are to desugared to Core 7, a small language based on System FC 89 that is relatively easy to typecheck and manipulate.

  • The Arralac implementation also uses a Core language based on untyped lambda calculus 10. It only has lambda abstraction, application, let-bindings, and literals.

  • GHC implements capture-avoding substitution based on the Rapier approach 11 12 13.

  • Core.Scoped provides the SCore type that has scoped lambda and let-bindings. This type is constructed via the free-foil library 14. The Free Foil approach 15 makes capture-avoiding substitution type-safe and efficient. It is a significant improvement in terms of type safety over the Foil approach 16, which is an attempt to make the untyped Rapier approach type-safe.

  • Limitations of the Core AST:

    • let-bindings are not recursive. Hence, terms like let binder = <right-hand side contatining the binder> in <body> are not supported.

      There is a way to make the right-hand side and the body scoped under a binder:

      • In the AST type, create a constructor for a let-binding that contains a term scoped under a binder. It will be used in a Node with a ScopedAST.
      • Create another constructor for unscoped (AST) right-hand side and body. It will be wrapped in that ScopedAST that introduced the let-binder.
      • Use the PatternSynonyms extension of GHC to construct and deconstruct let-bindings in the AST.
      • Limitations:
        • Since a let-binding is represented with two constructors now, it will be possible to construct non-well-formed terms.

Pipeline: reading a file to evaluating Core

The system performs several steps to transform an input text into an evaluated term.

Here are these steps and associated system components.

Reading an input file

  • Reader
    • Reads the input file as plain text.

Parsing

  • Parser
    • Produces an initial abstract syntax tree (AST) from the input text.
    • The tree nodes get annotated with positions in the source code.
    • Types for the AST in Parser.Generated.Abs were generated by BNFC 3 from an LBNF 17 description of the language syntax defined in Arralac.cf.
    • The lexer and parser as Haskell code were generated using BNFC 3, happy 18, and alex 19 from the same description.

Renaming

  • Renamer
    • Performs $\alpha$ conversion 20 of the parsed term by assigning a unique identifier to each binder. All occurences of a binder have the same identifier but different information about their position in the source code (see Syntax.Local.Name).
    • Supports shadowing.

Typechecking

I used the bidirectional typechecking algorithm described in the paper Practical type inference for arbitrary-rank types 21 and modified it to gather constraints without immediately solving them.

This approach was suggested in 21 and explained in detail by The Glasgow Haskell Compiler 22 author Simon Peyton Jones at WITS'24 7 where he talked about solving constraints during and after type inference.

Gathering constraints on types

  • Typechecker
    • Gathers "wanted" (constraints that should be solved) 7 equality constraints (where a metavariable equals a monotype).
    • Sets the level for each variable to enable skolem escape checking 7 23.
    • Variables are created at an ambient level.
    • At each skolemization of a lambda abstraction where the binder has a user-given $\sigma$-type 7 21 22:
      • Creates an implication constraint with the ambient level.
      • Converts binders from the forall part of the type into skolems at the ambient level and adds them into the implication constraint.
      • Increments the ambient level and gathers constraints for the body.
      • Writes these constraints into the implication constraint and emits it.
    • Limitations:
      • Does not support let-generalization unlike the implementation in the paper 21.

        The let-generalization feature is relatively easy to implement by running the solver during typechecking at each implication and quantifying over unsolved variables with a greater than ambient level 7.

        GHC does support let-generalization, including that of local let-bindings. These bindings can appear nested in function bodies and other let-bindings 24.

      • Does not yet support "given" 7 constraints (e.g., type class instances).

        The implemented language doesn't have features like type equality constraints, type classes, and GADTs that could produce such constraint 7.

Solving constraints on types

  • Solver
    • Iteratively solves equality constraints.
    • Performs an occurs check for each constraint so that the metavariable on the left-hand side of the constraint doesn't appear in the type on the right-hand sind of the constraint.
    • Identifies skolem escape using variable levels. A metavariable that has the level $n$ may not be unified 21 with another variable or a type containing a variable that has the level $m$ if $m &gt; n$ 7 23.
    • Unsolved metavariables remain unchanged.
    • Limitations:
      • Does not try to float equality constraints outside of implication constraints 25 22.

Getting rid of metavariables

  • Zonker
    • Runs after the Solver.
    • Converts typed terms to a representation where no metavariables may occur.
      • Replaces all solved metavariables with their types.
      • Marks unsolved metavariables.

Enabling safe term transformations

  • Core.ConvertZonked
    • Provides functions for converting a fully zonked term into the Core representation.

      Some functions from the free-foil 14 library were copied and modified locally:

      1. to support binders with richer information such as Core.CoreNameBinder;
      2. to use variable identifiers from the zonked term.

Evaluating terms

  • Evaluator
    • Uses free-foil 14 functions for calculating the weak head normal form 26 of a Core term.

Get arralac

Install arralac

nix profile install github:deemp/arbitrary-rank-tutorial#arralac

Remove arralac

nix profile remove arralac

Use arralac temporarily

Run devshell.

nix develop github:deemp/arbitrary-rank-tutorial#demo

Run arralac in that devshell.

arralac
Usage: arralac COMMAND [--stdio]

  Work with Arralac programs.

Available options:
  -h,--help                Show this help text
  --version                Show version information
  --stdio                  Use stdio. This flag is required by the
                           `vscode-languageclient' library.

Available commands:
  language-server          Run Arralac language server.
  typecheck                Typecheck an Arralac program.
  evaluate                 Typecheck and evaluate an Arralac program.
arralac <TAB>
evaluate         --help           --stdio          --version        
-h               language-server  typecheck

Use VS Code extension

Supported platforms: Linux, MacOS with Nix installed (NixOS isn't necessary).

The following instructions were tested on Ubuntu 24.04 with Nix installed.

  1. Install arralac (see Install arralac).

  2. Clone and enter the repository.

    git clone https://github.com/deemp/arbitrary-rank-tutorial
    cd arbitrary-rank-tutorial
  3. Open the directory containing the VS Code extension code.

    code vscode-extension
  4. Type Fn + F5 to start debugging the extension.

  5. In the new VS Code window that opens automatically, find and open the arbitrary-rank-tutorial/vscode-extension/demo/Program.arralac file.

  6. Hover over an identifier. You should see the type of that identifier.

  7. Try to edit the code. You should see error messages when the program cannot be parsed or some identifiers dont exist where you mention them (unbound variables).

Develop the project

Clone the repository

git clone https://github.com/deemp/arbitrary-rank-tutorial
cd arbitrary-rank-tutorial

Set up direnv (optional)

Direnv caches flake devshell evaluation results.

Install direnv 27.

Run in the repo:

direnv allow

Set up VS Code (optional)

Install recommended extensions (listed in .vscode/extensions.json).

Build and run with Nix

Build arralac.

nix build .#arralac

Run arralac.

nix run .#arralac -- typecheck arralac/test/data/Program1.arralac

nix run .#arralac -- evaluate whnf arralac/test/data/Program1.arralac

Build and run with Cabal

Start a Nix devShell.

nix develop

Update the Hackage index.

nix run .#cabalUpdate

Build arralac.

cabal build arralac

Run arralac.

cabal run arralac -- typecheck arralac/test/data/Program1.arralac

cabal run arralac -- evaluate whnf arralac/test/data/Program1.arralac

Build and run with Stack

Start a Nix devShell.

nix develop

Build arralac.

stack build arralac

Run arralac.

stack run -- arralac typecheck arralac/test/data/Program1.arralac

Conventions

  1. Use implicit parameters for passing context.
  2. In monadic code, use let .. do when setting implicit parameters. This approach makes functions in the do block see implicit parameters that are closest to this block, i.e. defined right before that block.

Example:

runSolver ::
  (CtxDebug, CtxPrettyVerbosity) =>
  Int -> WantedConstraints -> IO WantedConstraints
runSolver iterations constraints = do
  let ?metaTvScope = Set.empty
  do
    constraints' <- go iterations constraints
    pure constraints'

Statistics

nix develop -c cloc arralac/src/ --exclude-dir Generated
      86 text files.
      86 unique files.                              
       1 file ignored.

github.com/AlDanial/cloc v 2.04  T=0.02 s (4017.4 files/s, 266126.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Haskell                         86            705           1085           3907
-------------------------------------------------------------------------------
SUM:                            86            705           1085           3907
-------------------------------------------------------------------------------

Footnotes

  1. https://hackage.haskell.org/package/lsp

  2. https://hackage.haskell.org/package/optparse-applicative

  3. https://bnfc.readthedocs.io/en/latest/# 2 3

  4. https://www.cs.tufts.edu/comp/150FP/archive/simon-peyton-jones/trees-that-grow.pdf 2

  5. https://gitlab.haskell.org/ghc/ghc/-/wikis/implementing-trees-that-grow

  6. https://github.com/ghc/ghc/blob/ed38c09bd89307a7d3f219e1965a0d9743d0ca73/compiler/GHC/Core/TyCo/Rep.hs#L124

  7. https://www.youtube.com/watch?v=OISat1b2-4k 2 3 4 5 6 7 8 9

  8. https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/fc

  9. https://gitlab.haskell.org/ghc/ghc/blob/master/docs/core-spec/core-spec.pdfs

  10. https://arxiv.org/abs/1503.09060

  11. https://www.microsoft.com/en-us/research/wp-content/uploads/2002/07/inline.pdf

  12. https://github.com/ghc/ghc/blob/ed38c09bd89307a7d3f219e1965a0d9743d0ca73/compiler/GHC/Core/Subst.hs#L332

  13. https://github.com/ghc/ghc/blob/ed38c09bd89307a7d3f219e1965a0d9743d0ca73/compiler/GHC/Types/Var/Env.hs#L214

  14. https://hackage.haskell.org/package/free-foil 2 3

  15. https://github.com/fizruk/free-foil

  16. https://doi.org/10.1145/3587216.3587224

  17. https://bnfc.readthedocs.io/en/latest/lbnf.html

  18. https://haskell-happy.readthedocs.io/en/latest/

  19. https://haskell-alex.readthedocs.io/en/latest/

  20. https://wiki.haskell.org/Alpha_conversion

  21. https://www.cambridge.org/core/journals/journal-of-functional-programming/article/practical-type-inference-for-arbitraryrank-types/5339FB9DAB968768874D4C20FA6F8CB6 2 3 4 5

  22. https://gitlab.haskell.org/ghc/ghc 2 3

  23. https://dl.acm.org/doi/10.1145/3729338 2

  24. https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/let_generalisation.html

  25. https://www.cambridge.org/core/journals/journal-of-functional-programming/article/outsideinx-modular-type-inference-with-local-assumptions/65110D74CF75563F91F9C68010604329

  26. https://wiki.haskell.org/Weak_head_normal_form

  27. https://direnv.net/#basic-installation

About

A Tutorial Implementation of a Lambda Calculus with Parametric Predicative Arbitrary-Rank Polymorphism

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •