This page reproduces the content of http://www.slideshare.net/ekmett/bound-making-de-bruijn-succ-less.

Uploaded 2012/07/26 in the Technology category.

This talk covers a novel approach to "name binding" in syntax trees for programming languages that makes it much easier to write compilers and interpreters with a higher degree of assurance.

- Bound: Making de Bruijn Succ Less

Edward Kmett

- What is in a Name?

We use names in lots of contexts, but any program that deals with names has to deal with a number of issues such as

capture avoidance

deciding alpha equivalence

… and others that will come up as we go.

- An Untyped Lambda Calculus

The dumbest thing that could possibly work:

type Name = String

data Exp
  = Var Name
  | Exp :@ Exp
  | Lam Name Exp

Var "x"
Lam "x" (Var "x")
Lam "x" (Lam "y" (Var "x"))

- Capture Avoidance

Blindly substituting Lam "x" (Var "y") into

Lam "y" (Var "z")

for "z" would yield

Lam "y" (Lam "x" (Var "y"))

which now causes the free variable "y" to reference the "y" bound by the outer lambda.

- Alpha Equivalence

Lam "x" (Var "x")

and

Lam "y" (Var "y")

both mean the same thing and it’d be nice to be able to check this easily, make them hash the same way for CSE, etc.

- Common Solutions

There is a cottage industry of solutions to the naming problem.

Naïve substitution

Barendregt Convention

HOAS

Weak HOAS / PHOAS

“I am not a Number: I am a Free Variable!”

Locally Nameless Syntax with de Bruijn Indices

Unbound, mixing Barendregt and Locally Nameless.

etc.

I will not be addressing all of these here, just a few.

- Naïve Substitution

Just go look for names that avoid capture.

Pros:

Pretty syntax trees

Easy to get started with

Cons:

Easy even for experts to make mistakes!

Alpha Equivalence checking is tedious.

REALLY SLOW

- Naïve Substitution

subst :: Name -> Exp -> Exp -> Exp
subst x s b = sub b where
  sub e@(Var v)
    | v == x = s
    | otherwise = e
  sub e@(Lam v e')
    | v == x = e
    | v `elem` fvs = Lam v' (sub e'')
    | otherwise = Lam v (sub e')
    where v' = newId vs
          e'' = subst v (Var v') e'
  sub (f :@ a) = sub f :@ sub a
  fvs = freeVars s
  vs = fvs `union` allVars b

newId :: [Name] -> Name
newId vs = head (someEnormousPoolOfNames \\ vs)
-- go find a name that isn’t taken!
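The slide leaves freeVars, allVars and the name pool undefined. A minimal runnable sketch of the same function follows, assuming straightforward list-based helpers and a hypothetical pool of names x0, x1, x2, ... (neither is from the talk):

```haskell
import Data.List (union, (\\))

type Name = String

data Exp
  = Var Name
  | Exp :@ Exp
  | Lam Name Exp
  deriving (Eq, Show)

-- Free variables of a term (assumed helper, not shown on the slide).
freeVars :: Exp -> [Name]
freeVars (Var v)   = [v]
freeVars (f :@ a)  = freeVars f `union` freeVars a
freeVars (Lam v e) = freeVars e \\ [v]

-- Every name mentioned in a term, bound or free (assumed helper).
allVars :: Exp -> [Name]
allVars (Var v)   = [v]
allVars (f :@ a)  = allVars f `union` allVars a
allVars (Lam v e) = [v] `union` allVars e

-- subst x s b replaces the free occurrences of x in b with s.
subst :: Name -> Exp -> Exp -> Exp
subst x s b = sub b
  where
    sub e@(Var v)
      | v == x    = s
      | otherwise = e
    sub e@(Lam v e')
      | v == x       = e                  -- x is shadowed, stop here
      | v `elem` fvs = Lam v' (sub e'')   -- rename the binder first
      | otherwise    = Lam v (sub e')
      where
        v'  = newId vs
        e'' = subst v (Var v') e'
    sub (f :@ a) = sub f :@ sub a
    fvs = freeVars s
    vs  = fvs `union` allVars b

-- Hypothetical name pool: the first of x0, x1, x2, ... not already taken.
newId :: [Name] -> Name
newId vs = head ([ 'x' : show n | n <- [0 :: Int ..] ] \\ vs)
```

With these definitions, substituting Lam "x" (Var "y") for "z" in Lam "y" (Var "z") renames the outer binder to x0 instead of capturing the free "y". Every substitution under a binder may trigger another full traversal for the renaming, which is where the "REALLY SLOW" comes from.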

(based on code by Lennart Augustsson)

- Barendregt Convention

Make sure that every binder binds a globally unique name.

Pros:

“Secrets of the GHC Inliner” describes ‘the Rapier’ which can make this fast.

Cons:

Easy even for experts to screw up

Alpha equivalence is tedious

Need a globally unique variable supply (e.g. my concurrent-supply)

The obvious implementation technique chews through a scarily large number of variable IDs.

- Higher Order Abstract Syntax

Borrow substitution from the host language!

data Exp a
  = Var a
  | Lam (Exp a -> Exp a)
  | Exp a :@ Exp a

- Higher Order Abstract Syntax
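To make "borrow substitution from the host language" concrete, here is a small sketch using the Exp type from the previous slide. The whnf and pretty functions and the v0, v1, ... naming scheme are illustrative assumptions, not code from the talk:

```haskell
data Exp a
  = Var a
  | Lam (Exp a -> Exp a)
  | Exp a :@ Exp a

-- Beta reduction is just application of the host-language function.
whnf :: Exp a -> Exp a
whnf (f :@ a) = case whnf f of
  Lam body -> whnf (body a)
  f'       -> f' :@ a
whnf e = e

-- Printing has to go under binders, so it must invent names (v0, v1, ...)
-- and feed them to the function stored in Lam.
pretty :: Exp String -> String
pretty = go 0
  where
    go :: Int -> Exp String -> String
    go _ (Var v)  = v
    go n (Lam f)  = "(\\" ++ name ++ ". " ++ go (n + 1) (f (Var name)) ++ ")"
      where name = "v" ++ show n
    go n (f :@ a) = "(" ++ go n f ++ " " ++ go n a ++ ")"

-- \x -> \y -> x
constE :: Exp a
constE = Lam (\x -> Lam (\_ -> x))
```

Substitution costs nothing here, but even pretty printing needs a supply of names to get under a Lam, which is the "hard to work under binders" point in the list that follows.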

Pros:

Provides _really_ fast substitution

Cons:

Doesn’t work in theorem provers

(Exp occurs in negative position)

Hard to work under Binders!

Exotic terms

Alpha equivalence checking is tedious

Variants such as Weak HOAS/PHOAS exist to address some of these issues at the expense of other problems.

- The Cylon Detector

M’colleague Bob Atkey once memorably described the capacity to put up with de Bruijn indices as a Cylon detector, the kind of reverse Turing Test that the humans in Battlestar Galactica invent, the better to recognize one another by their common inadequacies. He had a point.

Conor McBride, “I am not a number, I am a classy hack”

- The Cylon Detector

Split variables into Bound and Free.

data Exp a
  = Free a
  | Bound !Int
  | Exp a :@ Exp a
  | Lam (Exp a)

Bound variables reference the variable being bound by the lambda n lambdas out. Substitution has to renumber all the variables.

abstract :: Eq a => a -> Exp a -> Exp a
instantiate :: Exp a -> Exp a -> Exp a

- The Cylon Detector

Split variables into Bound and Free.

newtype Scope f a = Scope (f a)

data Exp a
  = Free a
  | Bound !Int
  | Exp a :@ Exp a
  | Lam (Scope Exp a)

Bound variables reference the variable being bound by the lambda n lambdas out. Substitution has to renumber all the variables.

abstract :: Eq a => a -> Exp a -> Scope Exp a
instantiate :: Exp a -> Scope Exp a -> Exp a

- The Cylon Detector

abstract :: Eq a => a -> Exp a -> Scope Exp a
abstract me expr = Scope (letmeB 0 expr) where
  letmeB this (Free you)
    | you == me = Bound this
    | otherwise = Free you
  letmeB this (Bound that) = Bound that
  letmeB this (fun :@ arg) =
    letmeB this fun :@ letmeB this arg
  letmeB this (Lam (Scope body)) =
    Lam (Scope (letmeB (succ this) body))

(Based on code by Conor McBride from “I am not a number: I am a free variable”)

- The Cylon Detector

instantiate :: Exp a -> Scope Exp a -> Exp a
instantiate what (Scope body) = what'sB 0 body
  where
    what'sB this (Bound that)
      | this == that = what
      | otherwise = Bound that
    what'sB this (Free you) = Free you
    what'sB this (fun :@ arg) =
      what'sB this fun :@ what'sB this arg
    what'sB this (Lam (Scope body)) =
      Lam (Scope (what'sB (succ this) body))
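The two functions can be tried out together in one file. The sketch below makes a few simplifying assumptions that are not in the slides: Scope is specialised to Exp so that Eq and Show can be derived directly, the constructors are abbreviated to F and B, and lam is a hypothetical smart constructor.

```haskell
data Exp a
  = F a            -- free variable
  | B !Int         -- bound variable: number of lambdas to skip outwards
  | Exp a :@ Exp a
  | Lam (Scope a)
  deriving (Eq, Show)

-- Specialised to Exp here so the instances above can simply be derived.
newtype Scope a = Scope (Exp a)
  deriving (Eq, Show)

-- Turn the free variable `me` into the variable bound by a new binder.
abstract :: Eq a => a -> Exp a -> Scope a
abstract me expr = Scope (letmeB 0 expr)
  where
    letmeB this (F you)
      | you == me = B this
      | otherwise = F you
    letmeB _ (B that) = B that
    letmeB this (fun :@ arg) = letmeB this fun :@ letmeB this arg
    letmeB this (Lam (Scope body)) = Lam (Scope (letmeB (succ this) body))

-- Replace the variable bound by the outermost binder with `what`.
instantiate :: Exp a -> Scope a -> Exp a
instantiate what (Scope body) = what'sB 0 body
  where
    what'sB this (B that)
      | this == that = what
      | otherwise    = B that
    what'sB _ (F you) = F you
    what'sB this (fun :@ arg) = what'sB this fun :@ what'sB this arg
    what'sB this (Lam (Scope b)) = Lam (Scope (what'sB (succ this) b))

-- Hypothetical smart constructor: bind a name and wrap the result in Lam.
lam :: Eq a => a -> Exp a -> Exp a
lam v body = Lam (abstract v body)
```

Because bound names disappear, lam "x" (F "x") and lam "y" (F "y") are the same value, so the derived (==) already decides alpha equivalence. The succ in the Lam cases is the per-binder index bookkeeping that the rest of the talk sets out to reduce.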

(Based on code by Conor McBride from “I am not a number: I am a free variable”)

- The Cylon Detector

newtype Scope f a = Scope (f a)

data Exp a
  = Free a
  | Bound !Int
  | Exp a :@ Exp a
  | Lam (Scope Exp a)
  deriving (Functor, Foldable, Traversable)

We can make an instance of Monad for Exp, but it is an awkward one-off experience.

- The Cylon Detector

Pros:

Scope, abstract, and instantiate make it harder to screw up walking under binders.

Alpha equivalence is just (==)

We can make a Monad for Exp.

We can use Traversable to find free variables, close terms, etc.

Cons:

This succ’s a lot. (Slow)

Illegal terms such as Lam (Scope (Bound 2))

Have to define abstract/instantiate for each type.

The Monad for Exp is a one-off deal.

- Just: don’t succ

data Exp a
  = Var a
  | Exp a :@ Exp a
  | Lam (Exp (Maybe a))

(based on Bird and Paterson)

- Just: don’t succ

data Incr a = Z | S a

data Exp a
  = Var a
  | Exp a :@ Exp a
  | Lam (Exp (Incr a))

(based on Bird and Paterson)

- Just: don’t succ

data Incr a = Z | S a

newtype Scope f a = Scope (f (Incr a))

data Exp a
  = Var a
  | Exp a :@ Exp a
  | Lam (Scope Exp a)

instance MonadTrans Scope where
  lift = Scope . fmap S

-- Scope is just MaybeT, a monad transformer in its own right, but lift is slow.

- Just: don’t succ

instance Monad Exp where
  Var a >>= f = f a
  x :@ y >>= f = (x >>= f) :@ (y >>= f)
  Lam b >>= f = Lam (b >>= lift . f)
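A self-contained version of this instance can be written against the plain Maybe encoding from the first "Just: don’t succ" slide, which avoids the Scope wrapper. In the sketch below the local helper lift', the smart constructor lam, and substitute are illustrative names, and the Applicative instance is only there because current GHC requires it:

```haskell
{-# LANGUAGE DeriveTraversable #-}
import Control.Monad (ap)
import Data.Foldable (toList)

data Exp a
  = Var a
  | Exp a :@ Exp a
  | Lam (Exp (Maybe a))   -- Nothing is the variable bound by this Lam
  deriving (Eq, Show, Functor, Foldable, Traversable)

instance Applicative Exp where
  pure  = Var
  (<*>) = ap

instance Monad Exp where
  Var a    >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam b    >>= f = Lam (b >>= lift')
    where
      -- The bound variable stays put; anything substituted in is pushed
      -- under the binder with fmap Just, the O(n) step the talk complains about.
      lift' Nothing  = Var Nothing
      lift' (Just a) = fmap Just (f a)

-- Hypothetical smart constructor: occurrences of v become Nothing.
lam :: Eq a => a -> Exp a -> Exp a
lam v body = Lam (fmap (\x -> if x == v then Nothing else Just x) body)

-- Substitute s for the free variable x using nothing but (>>=).
substitute :: Eq a => a -> Exp a -> Exp a -> Exp a
substitute x s e = e >>= \v -> if v == x then s else Var v
```

Under these definitions, toList (lam "x" (Var "x" :@ Var "y")) lists only "y", and substituting Var "y" for "z" in lam "y" (Var "z") leaves the new "y" free instead of capturing it.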

You can derive Foldable and Traversable.

Then Data.Foldable.toList can obtain the free variables in a term, and (>>=) does capture avoiding substitution!

- Just: don’t succ

Pros:

The Monad is easy to define

Foldable/Traversable for free variables

Capture avoiding substitution for free

Cons:

It still succs a lot. lift is O(n).

- Succ the whole thing

If we could succ an entire expression instead of each individual variable we would succ less.

Instantiation wouldn’t have to walk into that expression at all, and we could lift an Exp into Scope in O(1) instead of O(n).

This requires polymorphic recursion, but we support that. Go Haskell!

This is the ‘generalized de Bruijn’ as described by Bird and Paterson without the rank-2 types mucking up the description and abstracted into a monad transformer.

- Succ the whole thing

data Incr a = Z | S a

newtype Scope f a = Scope { unscope :: f (Incr (f a)) }

instance Monad f => Monad (Scope f) where
  return = Scope . return . S . return
  Scope e >>= f = Scope $ e >>= \v -> case v of
    Z -> return Z
    S ea -> ea >>= unscope . f

instance MonadTrans Scope where
  lift = Scope . return . S

- Succ the whole thing

Pros:

The Monad is easy to define

Foldable/Traversable for free variables

Capture avoiding substitution for free

Cons:

Alpha equivalence is slightly harder, because you have to quotient out the position of the ‘Succ’s.

- Abstracting abstraction

abstract :: (Monad f, Eq a) => a -> f a -> Scope f a
abstract x e = Scope (liftM k e) where
  k y | x == y = Z
      | otherwise = S (return y)

instantiate :: Monad f => f a -> Scope f a -> f a
instantiate r (Scope e) = e >>= \v -> case v of
  Z -> r
  S a -> a
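Putting the Scope monad and these two functions into one file gives something that can be loaded and tried. The sketch below targets current GHC, so it adds the Functor and Applicative instances the slides predate. It also uses a local lift' in place of MonadTrans to stay within base, and the small Exp, lam and whnf are illustrative assumptions rather than code from the talk:

```haskell
import Control.Monad (ap, liftM)

data Incr a = Z | S a

-- Z is the bound variable; S wraps a whole expression that needs no renumbering.
newtype Scope f a = Scope { unscope :: f (Incr (f a)) }

instance Monad f => Functor (Scope f) where
  fmap = liftM

instance Monad f => Applicative (Scope f) where
  pure  = Scope . return . S . return
  (<*>) = ap

instance Monad f => Monad (Scope f) where
  Scope e >>= f = Scope $ e >>= \v -> case v of
    Z    -> return Z
    S ea -> ea >>= unscope . f

-- O(1) lift: the whole expression goes under a single S.
lift' :: Monad f => f a -> Scope f a
lift' = Scope . return . S

abstract :: (Monad f, Eq a) => a -> f a -> Scope f a
abstract x e = Scope (liftM k e)
  where
    k y | x == y    = Z
        | otherwise = S (return y)

instantiate :: Monad f => f a -> Scope f a -> f a
instantiate r (Scope e) = e >>= \v -> case v of
  Z   -> r
  S a -> a

data Exp a = Var a | Exp a :@ Exp a | Lam (Scope Exp a)

instance Functor Exp where
  fmap = liftM

instance Applicative Exp where
  pure  = Var
  (<*>) = ap

instance Monad Exp where
  Var a    >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam b    >>= f = Lam (b >>= lift' . f)

lam :: Eq a => a -> Exp a -> Exp a
lam v b = Lam (abstract v b)

whnf :: Exp a -> Exp a
whnf (f :@ a) = case whnf f of
  Lam b -> whnf (instantiate a b)
  f'    -> f' :@ a
whnf e = e
```

For example, whnf (lam "x" (lam "y" (Var "x")) :@ Var "a" :@ Var "b") reduces to Var "a". Note that abstract and instantiate only mention Monad f, never Exp.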

We can define these operations once and for all, independent of our expression type!

- Complex Binders

Not every language is the untyped lambda calculus. Sometimes you want to bind multiple variables at the same time, say for a pattern or recursive let binding, or to represent all the variables bound by a single quantifier in a single pass.

So let’s go back and enrich our binders so they can bind multiple variables by generalizing generalized de Bruijn.

- Bound

data Var b a = B b | F a

data Scope b f a = Scope { unscope :: f (Var b (f a)) }

instance Monad f => Monad (Scope b f)
instance MonadTrans (Scope b)

abstract :: Monad f => (a -> Maybe b) -> f a -> Scope b f a
instantiate :: Monad f => (b -> f a) -> Scope b f a -> f a

fromScope :: Monad f => Scope b f a -> f (Var b a)
toScope :: Monad f => f (Var b a) -> Scope b f a

substitute :: (Monad f, Eq a) => a -> f a -> f a -> f a

class Bound t where
  (>>>=) :: Monad m => t m a -> (a -> m b) -> t m b

instance Bound (Scope b)

- Using Bound

data Exp a
  = V a
  | Exp a :@ Exp a
  | Lam (Scope () Exp a)
  | Let [Scope Int Exp a] (Scope Int Exp a)
  deriving (Eq,Ord,Show,Read,Functor,Foldable,Traversable)

instance Monad Exp where
  V a >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam e >>= f = Lam (e >>>= f)
  Let bs b >>= f = Let (map (>>>= f) bs) (b >>>= f)

- Smart Constructors with Abstract

abstract1 :: (Monad f, Eq a) => a -> f a -> Scope () f a
abstract :: Monad f => (a -> Maybe b) -> f a -> Scope b f a

lam :: Eq a => a -> Exp a -> Exp a
lam v b = Lam (abstract1 v b)

let_ :: Eq a => [(a,Exp a)] -> Exp a -> Exp a
let_ bs b = Let (map (abstr . snd) bs) (abstr b)
  where abstr = abstract (`elemIndex` map fst bs)
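The elemIndex trick in let_ binds a whole list of names in one pass, and instantiate (es !!) later fills them all back in at once. The sketch below shows that round trip in isolation. It reimplements Scope, abstract and instantiate locally with the signatures from the Bound slide rather than importing the library, and Term, abstractNames and instantiateList are hypothetical names used only for illustration:

```haskell
import Control.Monad (ap, liftM)
import Data.List (elemIndex)

data Var b a = B b | F a

newtype Scope b f a = Scope { unscope :: f (Var b (f a)) }

abstract :: Monad f => (a -> Maybe b) -> f a -> Scope b f a
abstract f e = Scope (liftM k e)
  where
    k y = case f y of
      Just z  -> B z             -- captured: remember which binding it is
      Nothing -> F (return y)    -- left free

instantiate :: Monad f => (b -> f a) -> Scope b f a -> f a
instantiate k (Scope e) = e >>= \v -> case v of
  B b -> k b
  F a -> a

-- A tiny binder-free term monad, just enough to exercise the two functions.
data Term a = V a | Term a :@ Term a
  deriving (Eq, Show)

instance Functor Term where
  fmap = liftM

instance Applicative Term where
  pure  = V
  (<*>) = ap

instance Monad Term where
  V a      >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)

-- Bind every listed name at once; each becomes its position in the list.
abstractNames :: Eq a => [a] -> Term a -> Scope Int Term a
abstractNames names = abstract (`elemIndex` names)

-- Fill every bound position at once from a list of replacement terms.
instantiateList :: [Term a] -> Scope Int Term a -> Term a
instantiateList es = instantiate (es !!)
```

For instance, instantiateList [V "a", V "b"] (abstractNames ["x", "y"] (V "y" :@ V "x" :@ V "z")) gives (V "b" :@ V "a") :@ V "z": both bound names are replaced simultaneously and "z" is untouched.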

infixr 0 !
(!) :: Eq a => a -> Exp a -> Exp a
(!) = lam

- Evaluation with Instantiate

instantiate :: Monad f => (b -> f a) -> Scope b f a -> f a
instantiate1 :: Monad f => f a -> Scope () f a -> f a

whnf :: Exp a -> Exp a
whnf e@V{} = e
whnf e@Lam{} = e
whnf (f :@ a) = case whnf f of
  Lam b -> whnf (instantiate1 a b)
  f' -> f' :@ a
whnf (Let bs b) = whnf (inst b)
  where es = map inst bs
        inst = instantiate (es !!)

- Walking Under Binders

fromScope :: Monad f => Scope b f a -> f (Var b a)
toScope :: Monad f => f (Var b a) -> Scope b f a

nf :: Exp a -> Exp a
nf e@V{} = e
nf (Lam b) = Lam $ toScope $ nf $ fromScope b
nf (f :@ a) = case whnf f of
  Lam b -> nf (instantiate1 a b)
  f' -> nf f' :@ nf a
nf (Let bs b) = nf (inst b)
  where es = map inst bs
        inst = instantiate (es !!)

- Closed Terms

closed :: Traversable f => f a -> Maybe (f b)
closed = traverse (const Nothing)
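Since closed only needs Traversable, it can be tried on any term type with a derived instance. The sketch below uses the simple Maybe-based Exp from the earlier Bird and Paterson slide rather than the Scope-based one, and the two sample terms are illustrative:

```haskell
{-# LANGUAGE DeriveTraversable #-}

data Exp a
  = Var a
  | Exp a :@ Exp a
  | Lam (Exp (Maybe a))   -- Nothing is the variable bound by this Lam
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- Succeeds only if the traversal never meets a free variable.
closed :: Traversable f => f a -> Maybe (f b)
closed = traverse (const Nothing)

-- \x -> x, which has no free variables.
identity :: Exp String
identity = Lam (Var Nothing)

-- \x -> y, where "y" is free.
open :: Exp String
open = Lam (Var (Just "y"))
```

Here closed identity is a Just whose free-variable type can be chosen freely (Exp Int, say), while closed open is Nothing because the traversal reaches "y".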

A closed term has no free variables, so you can treat the free variable type as anything you want.

- Working with Exp as a DSL

cooked :: Exp a
cooked = fromJust $ closed $ let_
  [ ("False", "f" ! "t" ! V"f")
  , ("True", "f" ! "t" ! V"t")
  , ("if", "b" ! "t" ! "f" ! V"b" :@ V"f" :@ V"t")
  , ("Zero", "z" ! "s" ! V"z")
  , ("Succ", "n" ! "z" ! "s" ! V"s" :@ V"n")
  , ("one", V"Succ" :@ V"Zero")
  , ("two", V"Succ" :@ V"one")
  , ("three", V"Succ" :@ V"two")
  , ("isZero", "n" ! V"n" :@ V"True" :@ ("m" ! V"False"))
  , ("const", "x" ! "y" ! V"x")
  , ("Pair", "a" ! "b" ! "p" ! V"p" :@ V"a" :@ V"b")
  , ("fst", "ab" ! V"ab" :@ ("a" ! "b" ! V"a"))
  , ("snd", "ab" ! V"ab" :@ ("a" ! "b" ! V"b"))
  , ("add", "x" ! "y" ! V"x" :@ V"y" :@ ("n" ! V"Succ" :@ (V"add" :@ V"n" :@ V"y")))
  , ("mul", "x" ! "y" ! V"x" :@ V"Zero" :@ ("n" ! V"add" :@ V"y" :@ (V"mul" :@ V"n" :@ V"y")))
  , ("fac", "x" ! V"x" :@ V"one" :@ ("n" ! V"mul" :@ V"x" :@ (V"fac" :@ V"n")))
  , ("eqnat", "x" ! "y" ! V"x" :@ (V"y" :@ V"True" :@ (V"const" :@ V"False")) :@ ("x1" ! V"y" :@ V"False" :@ ("y1" ! V"eqnat" :@ V"x1" :@ V"y1")))
  , ("sumto", "x" ! V"x" :@ V"Zero" :@ ("n" ! V"add" :@ V"x" :@ (V"sumto" :@ V"n")))
  , ("n5", V"add" :@ V"two" :@ V"three")
  , ("n6", V"add" :@ V"three" :@ V"three")
  , ("n17", V"add" :@ V"n6" :@ (V"add" :@ V"n6" :@ V"n5"))
  , ("n37", V"Succ" :@ (V"mul" :@ V"n6" :@ V"n6"))
  , ("n703", V"sumto" :@ V"n37")
  , ("n720", V"fac" :@ V"n6")
  ] (V"eqnat" :@ V"n720" :@ (V"add" :@ V"n703" :@ V"n17"))

- Working with Exp as a DSL

ghci> nf cooked == ("F" ! "T" ! "T")
True

- Complex Binders

data Exp a
  = V a
  | Exp a :@ Exp a
  | Lam !Int (Pat Exp a) (Scope Int Exp a)
  | Let !Int [Scope Int Exp a] (Scope Int Exp a)
  | Case (Exp a) [Alt Exp a]
  deriving (Eq,Ord,Show,Read,Functor,Foldable,Traversable)

data Pat f a
  = VarP
  | WildP
  | AsP (Pat f a)
  | ConP String [Pat f a]
  | ViewP (Scope Int f a) (Pat f a)
  deriving (Eq,Ord,Show,Read,Functor,Foldable,Traversable)

data Alt f a = Alt !Int (Pat f a) (Scope Int f a)
  deriving (Eq,Ord,Show,Read,Functor,Foldable,Traversable)

- Complex Binders

instance Monad Exp where
  return = V
  V a >>= f = f a
  (x :@ y) >>= f = (x >>= f) :@ (y >>= f)
  Lam n p e >>= f = Lam n (p >>>= f) (e >>>= f)
  Let n bs e >>= f = Let n (map (>>>= f) bs) (e >>>= f)
  Case e as >>= f = Case (e >>= f) (map (>>>= f) as)

instance Bound Pat where
  VarP >>>= _ = VarP
  WildP >>>= _ = WildP
  AsP p >>>= f = AsP (p >>>= f)
  ConP g ps >>>= f = ConP g (map (>>>= f) ps)
  ViewP e p >>>= f = ViewP (e >>>= f) (p >>>= f)

instance Bound Alt where
  Alt n p b >>>= f = Alt n (p >>>= f) (b >>>= f)

- Smart Patterns

data P a = P { pattern :: [a] -> Pat Exp a, bindings :: [a] }

varp :: a -> P a
varp a = P (const VarP) [a]

wildp :: P a
wildp = P (const WildP) []

conp :: String -> [P a] -> P a
conp g ps = P (ConP g . go ps) (ps >>= bindings)
  where
    go (P p as:ps) bs = p bs : go ps (bs ++ as)
    go [] _ = []

lam :: Eq a => P a -> Exp a -> Exp a
lam (P p as) t = Lam (length as) (p []) (abstract (`elemIndex` as) t)

ghci> lam (varp "x") (V "x")
Lam 1 VarP (Scope (V (B 0)))

ghci> lam (conp "Hello" [varp "x", wildp]) (V "y")
Lam 1 (ConP "Hello" [VarP,WildP]) (Scope (V (F (V "y"))))

- What Did I Sweep Under the Rug?

Deriving Eq, Ord, Show and Read requires some tomfoolery. The issue is that Scope uses polymorphic recursion.

So the most direct way of implementing Eq (Scope b f a) would require

instance (Eq (f (Var b (f a))), Eq (Var b (f a)), Eq (f a), Eq a) => Eq (Scope b f a)

And then Exp would require:

instance (Eq a, Eq (Pat Exp a), Eq (Scope Int Exp a), Eq (Alt Exp a)) => Eq (Exp a)

Plus all the things required by Alt, Pat, and Scope!

Moreover, these would require flexible contexts, taking us out of Haskell 98/2010.

Blech!

- Prelude.Extras

My prelude-extras package defines a number of boring typeclasses like:

class Eq1 f where
  (==#) :: Eq a => f a -> f a -> Bool
  (/=#) :: Eq a => f a -> f a -> Bool

class Eq1 f => Ord1 f where
  compare1 :: Ord a => f a -> f a -> Ordering

class Show1 f where
  showsPrec1 :: Show a => Int -> f a -> ShowS

class Read1 f where
  readsPrec1 :: Read a => Int -> ReadS (f a)
  readList1 :: Read a => ReadS [f a]

- Hidden polymorphic recursion

Bound defines:

instance (Functor f, Show b, Show1 f, Show a) => Show (Scope b f a)
instance (Functor f, Read b, Read1 f, Read a) => Read (Scope b f a)
instance (Monad f, Ord b, Ord1 f, Ord a) => Ord (Scope b f a)
instance (Monad f, Eq b, Eq1 f, Eq a) => Eq (Scope b f a)

So you just need to define

instance Eq1 Exp where (==#) = (==)
instance Ord1 Exp where compare1 = compare
instance Show1 Exp where showsPrec1 = showsPrec
instance Read1 Exp where readsPrec1 = readsPrec

Why do some use Monad? Ord and Eq perform a non-structural equality comparison so that (==) is alpha-equality!

- Future Directions

We can define languages that have strongly typed variables by moving to much scarier types. =)

type Nat f g = forall x. f x -> g x

class HFunctor t where
  hmap :: Nat f g -> Nat (t f) (t g)

class HFunctor t => HTraversable t where
  htraverse :: Applicative m => (forall x. f x -> m (g x)) -> t f a -> m (t g a)

class HFunctor t => HMonad t where
  hreturn :: Nat f (t f)
  (>>-) :: t f a -> Nat f (t g) -> t g a

- Future Directions - Higher Order

data Equal a b where
  Refl :: Equal a a

class EqF f where
  (==?) :: f a -> f b -> Maybe (Equal a b)

data Var b f a where
  B :: b a -> Var b f a
  F :: f a -> Var b f a

newtype Scope b t f a = Scope { unscope :: t (Var b (t f)) a }

abstract :: HMonad t =>
  (forall x. f x -> Maybe (b x)) -> Nat (t f) (Scope b t f)

instantiate :: HMonad t => Nat b (t f) -> Nat (Scope b t f) (t f)

class HBound s where
  (>>>-) :: HMonad t => s t f a -> Nat f (t g) -> s t g a

- Future Directions - HashConsing

Dependently typed languages build up a lot of crap in memory. It’d be nice to share memory for it, since most of it is very repetitive.

- Summary

Bound provides a small API for dealing with abstraction/instantiation for complex binders that combines the nice parts of “I am not a number: I am a free variable” with the “de Bruijn notation as a nested data type” while avoiding the complexities of either.

You just supply it a Monad and Traversable

No variable supply is needed, no pool of names

Substitution is very efficient

Introduces no exotic or illegal terms

Simultaneous substitution for complex binders

Your code never sees a de Bruijn index

- Any Questions?
- Extra Slides
- Future Directions - Higher Order

data Ix :: [*] -> * -> * where
  Z :: Ix (a ': as) a
  S :: Ix as b -> Ix (a ': as) b

data Vec :: (* -> *) -> [*] -> * where
  HNil :: Vec f '[]
  (:::) :: f b -> Vec f bs -> Vec f (b ': bs)

data Lit t where
  Integer :: Integer -> Lit Integer
  Double :: Double -> Lit Double
  String :: String -> Lit String

data Remote :: (* -> *) -> * -> * where
  Var :: f a -> Remote f a
  Lit :: Lit a -> Remote f a
  Lam :: Scope (Equal b) Remote f a -> Remote f (b -> a)
  Let :: Vec (Scope (Ix bs) Remote f) bs -> Scope (Ix bs) Remote f a -> Remote f a
  Ap :: Remote f (a -> b) -> Remote f a -> Remote f b

- Future Directions - Higher Order

lam_ :: EqF f => f a -> Remote f b -> Remote f (a -> b)
lam_ v f = Lam (abstract (v ==?) f)

-- let_ actually winds up becoming much trickier to define,
-- requiring a MonadFix and a helper monad.

two12121212 = let_ $ mdo
  x <- def (cons 1 z)
  z <- def (cons 2 x)
  return z