Lambda-calculus (λ-calculus)


Church-Turing thesis is about the
  expressive completeness of models of computation

A *computable function* is (intuitively)
  a function that can be implemented by
   an algorithm that is:
   - correct (never gives the wrong result)
   - terminating (never runs forever)

By "implement an algorithm" we usually understand
  "define as a Turing machine"

Church-Turing thesis (intuitively) says that
  the functions that are implementable with Turing machines,
  are the same as those that are implementable in λ-calculus (this is a thm),
  and are also the same as *any other* model of computation you might invent
  
A problem is *decidable* if it can be solved by
 an algorithm that terminates with the correct result
 for every possible instance of the problem.

- Syntax

*Formal grammar* of the λ-calculus:

                     
  t ::=          -- A *term* t is either...
         v       -- .. a variable name,
      |  c       -- .. or a constant,
      |  (t t)   -- or an application (t t) of one term to another
      |  (λx. t) -- or an abstraction (λx. t), where x is a variable name,
                                               and t is a term

^ This is a definition of the well-formed terms
  in the λ-calculus

It comprises a number of *production rules* separated
  by |. Read the | as *either*

Arithmetic expressions:

 exp ::=   v
       |   n
       |   exp + exp
       |   exp * exp
       |   ...


Associativity and bracketing:

- Function application is left associative

     f x y   =   (f x) y


     (λx. λy. x + y) 4 5 =
     ((λx. λy. x + y) 4) 5 ↦β (beta-reduction, aka calling a function)
     (λy. 4 + y) 5 ↦
     4 + 5

   ^ the λ-calculus only has single argument functions,
     but we can use nested lambdas to encode functions that
     take multiple arguments. (currying, Haskell Curry)

- When we have several lambdas after each other,
  we truncate them like this:

    (λx. (λy. (λz. t))) =
    λx. λy. λz. t = (sugar this to:)
    λx y z. t

- The scope of lambdas extend as far to the right
  as possible:

    λx. x y   =    (λx. (x y))
        
- Free variables, substitution

  We distinguish between *free* and *bound* variable occurences
  a *bound* variable is a variable that in the scope of a binder
  a *binder* is, e.g. λ, ∀, ∃

      ∀x.                      x       >     y
       ^binding occurence      ^ bound       ^ free

      (λx. x y) (λy. y x)
        1  2 3    4  5 6

         1: Binding
         2: Bound
         3: Free
         4: Binding
         5. Bound
         6: Free

Note: different occurences of the same variable in a term,
  can have different functions

We can define a *recursive function* that computes the
  set of free variables in a λ-term.   FV(t)

    FV(v)     = {v}
    FV(c)     = ∅
    FV(t t')  = FV(t) ∪ FV(t')
    FV(λx. t) = FV(t) - {x}

FV(λx. x y) =
FV(x y) - {x} =
(FV(x) ∪ FV(y)) - {x} =
({x} ∪ {y}) - {x} =
{y}

Substitution:

    t[x:=t']
    t{t'/x}
    ^ same thing

    t, but with all free occurences of x replaced with t'

Variable capture

    (λx. y)[y:=z] = (λx. z)
         
    (λx. y)[y:=x] = (λx. x)??? - variable capture: our free x has been capture by a bounder!

   ^ this is undesirable, because the choice of names for placeholder should never matter

     f(x) = x + x             f(y) = y + y

      ^ these f:s should be *the same* function,
        and it shouldn't matter if one used x and one used y
        for a purely local name

Capture avoiding substitution:
  if a free name and a bound name collide,
  we rename the bound name before putting in the free in one

  (λx. y)[y:=x] = (λz. y)[y:=x] = (λz. x)

There are different variants of capture-avoiding substitutions.
We'll do one that invents new names as required.

  v[x:=t]        =  t          if v = x
                    v          otherwise
  c[x:=t]        =  c
  (t u)[x:=t']   =  (t[x:=t']) (t[x:=t'])
  (λy. t)[x:=t'] =  (λy. t)        if y = x
                    (λy. t[x:=t']) if y ≠ x and y ∉ FV(t')
                    (λz. t[y:=z][x:=t']) if y ≠ x and y ∈ FV(t')
                                            and z ∉ FV(t') ∪ FV(t) ∪ {y}
                                                ^ z is a "fresh name"
                                                  doesn't occur in any
                                                  term under consideration

- β-reduction -- how you compute in the lambda calculus: by calling functions

Inuitively, a β-reduction is a function call.

A *β-redex* is an expression of the following form:
  (λx. t) t'
    
A *β-reduction* rewrites a term by replacing a redex
with

  t[x:=t']

More formally, we define β-reduction by inference rules:

                                        
  ______________________
  (λx. t) t' ↦β t[x:=t']


       t' ↦β t''
  ______________________
     t t' ↦β t t''

       t ↦β t''
  ______________________
     t t' ↦β t'' t'

         t ↦β t'
  ______________________
     (λx. t) ↦β (λx. t')


   ((λx. λy. x + y) 3 4) ↦β
   ((λy. x + y)[x:=3]) 4 =
   (λy. 3 + y) 4 ↦β
   (3 + y)[y:=4] =
   3 + 4

   t ↦β t'  means  "t can be β-reduced to t'"

  Beta reduction: computation 

- α and η-conversion

  α-conversion: renaming
  β-reduction:  computation
  η-conversion: extensionality


  α-conversion means replace a bound variable with
                     a fresh name


   (λx. t)  ↦α   (λy. t[x:=y])       if y ∉ FV(t)

   Also structural rules:

    t ↦α t''
    _________
    t t' ↦α t'' t'

    ...and similar ones to what we saw for β

  η-conversion (only sometimes included)

  Should we consider two function *equal*
   if they always return the same value,
   but have different implementation?

  This is called *extensional* equality:
     f = g       iff   ∀x.  f x = g x

  In the λ-calculus this is captured
   by η-conversion:

    (λx. f x) ↦η f

    ^ a function that given an argument x,
      calls f with x;
      is the same thing as f

Equivalence:

   α-equivalence ≡α
     is the equivalence closure of ↦α

   Intuitively: two terms are α-equivalent, written
     t ≡α t', if we can apply a sequence of α-conversions
              to both t and t' that reduces them to a
              common term.

   ≡β-equivalence: the equivalence closure of ↦β

   ≡η-equivalence: the equivalence closure of ↦η

   (λx. f x) ≡α (λy. f y)
             ≡η f

Terms in Isabelle are λ-calculus terms,
  terms in Isabelle are considered equal
  module αβη-equivalence
             
- Confluence

 Real programming languages have a fixed evaluation order
 (this presentation of) the λ-calculus doesn't,


 (λx. (λy. y) x) ((λx. x) z)
  ^ there are several possible β-reductions here
    we could do the outermost function application first (lazy evaluation),
      (λx. (λy. y) x) ((λx. x) z) ↦β (λy. y) ((λx. x) z)
    we could reduce in the argument first (strict evaluation),
      (λx. (λy. y) x) ((λx. x) z) ↦β (λx. (λy. y) x) z
    we could reduce in the function first (insane evaluation)
      (λx. (λy. y) x) ((λx. x) z) ↦β (λx. x) ((λx. x) z)
 
 Q: does it matter which one we pick?
 A: no*

 β-reduction in the λ-calculus is *confluent*

 Suppose there are two different reductions for a term t:
    t ↦β t'

    t ↦β t''

 Then there always exists sequences of reductions from t' and t''
   that reduce to equal terms:

    t'  ↦β .. ↦β t'''
    t'' ↦β .. ↦β t'''

Intuitively: it doesn't matter what we do first, because we can
  get to the same place either way.

The consequence is that if a λ-term terminates,
  it always terminates in the same state.

We consider a λ-term *terminated* if it contains no β-redexes


   (λx. y) x   <--- not terminated
   y           <--- terminated, because we can't do more β-reductions

How on earth can I possibly write something that doesn't terminate?
 - we don't have loops
 - we *do* have function calls, but we don't have recursion (because functions don't have names)
Nonetheless:

  (λx. x x) (λx. x x)
  ↦β x x[x:=(λx. x x)]
  =  (λx. x x) (λx. x x)

^ the above is an example of a λ-term that β-reduces to itself (an infinite loop!)
                    
- Expressive power

How on earth do you program in the λ-calculus?
  All we have are functions and function calls, so how you *do* anything?

Church numerals: how to encode numbers as functions

  Intuitively, we encode a number as a function with two arguments.
  With Church encodings, we encode data as functions,
    so the data is represented as the function you'd typically run on it.
  What you typically do with a number (in programming) is
    you do something n times (as an iterator)
  A number is something that given a function, repeats that function n times


    0   ≡    λf x. x
    1   ≡    λf x. f x
    2   ≡    λf x. f(f x)
    ...

  Define addition:

    add n m ≡ λf x. n f (m f x)

  Exercise: see if you can use β reduction to show
    add 1 1 ≡β 2
  Exercise: can you use add to define multiplication?
        
Church booleans:

  The use of Booleans in programming languages is
   to choose between two alternatives (if-then-else)
                
  true    ≡  λx y. x
  false   ≡  λx y. y

  not     ≡  λb. λx y. b y x

  not true =  (λb. λx y. b y x) (λx y. x)
           ↦β (λx y. (λx y. x) y x)
           ↦β (λx y. (λz. y) x)
           ↦β (λx y. y)
           =  false

Fixpoints combinators:

   A fixpoint combinator in the lambda-calculus,
    is a lambda-term F with the property that

        F f   ≡αβη    f(F f)

   With fixpoint operators we can encode recursive
    functions and loops.

   C ≡ (λf. (λx. f (x x)) (λx. f (x x)))

   C f =
   (λf. (λx. f (x x)) (λx. f (x x))) f ↦β
   (λx. f (x x)) (λx. f (x x)) ↦β
   f ((λx. f (x x)) (λx. f (x x))) ≡η
   f ((λf. (λx. f (x x)) (λx. f (x x))) f) =
   f (C f)

   ^ I'm pretty sure this C is the Y-combinator,
     the most popular fixpoint combinator
 
- Termination

- Inconsistency :(

  If you want to use λ-calculus as the foundation of
  mathematics, it would be nice if it was a *consistent*
  foundation: one where you couldn't prove false.

  We could try to use λ-calculus as a unifying framework
  for all math, using (say) Church booleans as our truth
  values.

  We then run into the awkward situation that there exists
  terms that are equal to their own negation:

     Y not ≡αβη  not(Y not)

  Y not is a term that is equal to its own negation :(
  Bad news if you want your thing to be the foundation
  of mathematics.
                
Stretch goals:

- Simply typed λ-calculus

By bolting a type system onto the λ-calculus, we can
  rescue its usefulness as a foundation for logic

*And* arrive at precisely the foundation that
  Isabelle/HOL uses!