Lambda-calculus
Review: programming modelsMaterial adapted from CSCI 3136 Principles of Programming Languages at Dalhousie University
Imperative programming: Computation is a sequence of side effects.
Functional Programming: Computation by function evalution. There are no side effects.
This requires functions to be first-class objects: they can be passed as function arguments and returned as the results of computations.
Functions that operate on functions and/or return functions are called higher-order functions.
Iteration is impossible in functional languages because iteration inherently requires side effects. Iteration can be simulated using recursion.
Other common features of functional programming languages
-
Strong reliance on list types and strong support for manipulating them
-
Structured function returns
-
Structured data constructors
-
Garbage collection
Functional programming and the real world
Advantages of functional programming
-
The lack of side effects makes it easier to reason about program:
– Formal proofs of program correctness become easier.
– Tests always produce the same result.
-
Higher level of abstraction \(\to\) more expressive
Real-world concerns
-
The interaction with the real world (file I/O, user I/O, …) happens through side effects.
-
Most functional programming languages have support for side effects to accommodate this.
– Scheme: very flexible: supports mixing of different programming styles
– Haskell: Controlled side effects through monads
\(\lambda\)-calculus (Alonzo Church 1941)
. . . is the theoretical foundation of functional programming. It expresses computation entirely using function application.
A context-free grammar for lambda expressions
Numbers are not really part of this, but we introduce them here to make the exposition easier.
Examples
-
\(x\), times, plus, 7
-
\(\lambda x.\)x
-
\(\lambda y.\)times
-
\(\lambda x.x y\)z
-
\(\lambda y.\)times \(y y\)
-
\(\lambda x.(\)times \(x\lambda y.y)\)
Interpretation
A variable is bound if it is in the scope of a \(\lambda\) and free otherwise.
Examples
-
x is bound in \(\lambda x.x y\) but free in the subexpression x y
-
x is free in \(\lambda y . x y\)
\(\lambda x.Expr\) (function definition)
- is a function with formal parameter x, which becomes a bound variable in Expr
\((\lambda x.Expr)A\equiv Expr[A\setminus x]\) (function application)
- replaces every free occurrence of x in Expr with A
\((\lambda x.\lambda y.Expr)AB\equiv(\lambda y.Expr[A\setminus x])B\equiv Expr[A\setminus x\) and \(B\setminus y]\)
Computing using \(\lambda\)-calculus
Computation is done using three transformations:
\(\beta\)-reduction
-
\((\lambda x.E) M \to_\beta E[M \setminus x\)]
-
Permissible only if no free variable of M becomes bound in \(E[M \setminus x]\)
\(\alpha\)-conversion
-
\(\lambda x.E\to_\alpha\lambda y.E[y\setminus x]\), where y is a variable not occurring in E
-
Needed to make \(\beta\)-reduction possible
\(\eta\)-reduction
-
\(\lambda x.F x \to_\eta F\), where F contains no free occurrences of x
-
Eliminates “surplus” lambda abstractions
The need for \(\alpha\)-conversion
\[(\lambda x.\lambda y.x y)(\lambda z.y)c\]Substitution without \(\alpha\)-conversion
\[\begin{align} (\lambda x.\lambda y.x y)(\lambda z.y)c &\to_\beta (\lambda y.(\lambda z.y)y)c\\ &\to_\beta(\lambda z.c)c\\ &\to_\beta c \end{align}\]Substitution with \(\alpha\)-conversion
\[\begin{align} (\lambda x.\lambda y.x y)(\lambda z.y)c &\to_\alpha (\lambda x.\lambda w.xw)(\lambda z.y)c\\ &\to_\beta(\lambda w.(\lambda z.y)w)c\\ &\to_\eta(\lambda z.y)c\\ &\to_\beta y \end{align}\]Associativity in \(\lambda\)-expressions
Function definition is right-associative:
\[\lambda x.\lambda y.\lambda z.E\equiv \lambda x.(\lambda y.(\lambda z.E)))\]Function application is left-associative:
\[ABCD\equiv((AB)C)D\]\(\lambda\) extends as far to the right as possible (to the end of the expression or to the next parenthesis):
\[\lambda x.ABCD\equiv\lambda x.(ABCD)\]Labels
Labels can be created as shortcuts for long expressions. This is kind of like a global function definition in a programming language.
Examples
\[\begin{align} square&\equiv\lambda x.times\ x\ x\\ id&\equiv\lambda x.x\\ const&\equiv\lambda x.\lambda y.x\\ hypot&\equiv\lambda x.\lambda y.sqrt(plus(square\ x)(square\ y)) \end{align}\]Logic and control flow (1)
\[\begin{align} true&\equiv\lambda x.\lambda y.x\\ false&\equiv\lambda x.\lambda y. y\\ not&\equiv\lambda t.t\ false\ true\\ if&\equiv\lambda c.\lambda t.\lambda e.cte \end{align}\]Example 1
\[\begin{align} not\ true&\equiv(\lambda t.t(\lambda x.\lambda y.y)(\lambda x.\lambda y.x))(\lambda x.\lambda y.x)\\ &\to_\beta(\lambda x.\lambda y.x)(\lambda x.\lambda y.y)(\lambda x.\lambda y.x)\\ &\to_\beta(\lambda y.(\lambda x.\lambda y.y))(\lambda x.\lambda y.x)\\ &\to_\beta(\lambda x.\lambda y.y)\\ &\equiv false \end{align}\]Logic and control flow (2)
Example 2
\[\begin{align} if\ true\ 3\ 4&\equiv(\lambda c.\lambda t.\lambda e.cte)(\lambda x.\lambda y.x)\ 3\ 4\\ &\to_\beta(\lambda t.\lambda e.(\lambda x.\lambda y.x)te)\ 3\ 4\\ &\to_\beta(\lambda e.(\lambda x.\lambda y.x)3e) 4\\ &\to_\beta(\lambda x.\lambda y.x)\ 3\ 4\\ &\to_\beta(\lambda y.3)\ 4\\ &\to_\beta3 \end{align}\]What about “and” and “or”?
Representing natural numbers using functions

Building data structures: lists (1)
\[\begin{align} cons&\equiv\lambda h.\lambda t.\lambda x.xht&\textrm{(prepend single element h to list t)}\\ car&\equiv\lambda l.l(\lambda x.\lambda y.x)&\textrm{(return head of list l)}\\ cdr&\equiv\lambda l.l(\lambda x.\lambda y.y)&\textrm{(return tail of list l)}\\ nil&\equiv\lambda x.true&\textrm{(empty list)}\\ null?&\equiv\lambda l.l(\lambda x.\lambda y.false)&\textrm{(is list l empty?)} \end{align}\]Example 1
\[\begin{align} null?nil&\equiv(\lambda l.l(\lambda x.\lambda y.false))(\lambda x.true)\\ &\to_\beta(\lambda x.true)(\lambda x.\lambda y.false)\\ &\to_\beta true \end{align}\]Building data structures: lists (2)
\[\begin{align} cons&\equiv\lambda h.\lambda t.\lambda x.xht&\textrm{(prepend single element h to list t)}\\ car&\equiv\lambda l.l(\lambda x.\lambda y.x)&\textrm{(return head of list l)}\\ cdr&\equiv\lambda l.l(\lambda x.\lambda y.y)&\textrm{(return tail of list l)}\\ nil&\equiv\lambda x.true&\textrm{(empty list)}\\ null?&\equiv\lambda l.l(\lambda x.\lambda y.false)&\textrm{(is list l empty?)} \end{align}\]Example 2
\[\begin{align} null?(cons\ 1\ nil)&\equiv(\lambda l.l(\lambda x.\lambda y.false))((\lambda h.\lambda t.\lambda x.xht)1(\lambda x.true))\\ &\to_\beta(\lambda h.\lambda t.\lambda x.xht)1(\lambda x.true)(\lambda x.\lambda y.false)\\ &\to_\beta(\lambda t.\lambda x.x1t)(\lambda x.true)(\lambda x.\lambda y.false)\\ &\to_\beta(\lambda x.x1(\lambda x.true))(\lambda x.\lambda y.false)\\ &\to_\beta(\lambda x.\lambda y.false)1(\lambda x.true)\\ &\to_\beta(\lambda y.false)(\lambda x.true)\\ &\to_\beta false \end{align}\]Building data structures: lists (3)
Example 3
\[\begin{align} car(cdr(cons\ 1\ (cons\ 2\ (cons\ 3\ nil)))) &\equiv(\lambda l.l(\lambda x.\lambda y.x))(cdr(cons\ 1\ (cons\ 2\ (cons\ 3\ nil))))\\ &\to_\beta cdr(cons\ 1\ (cons\ 2\ (cons\ 3\ nil)))(\lambda x.\lambda y.x)\\ &\equiv(\lambda l.l(\lambda x.\lambda y.y))(cons\ 1\ (cons\ 2\ (cons\ 3\ nil)))(\lambda x.\lambda y.x)\\ &\to_\beta cons\ 1\ (cons\ 2\ (cons\ 3\ nil))(\lambda x.\lambda y.y)(\lambda x.\lambda y.x)\\ &\equiv(\lambda h.\lambda t.\lambda x.xht)\ 1\ (cons\ 2\ (cons\ 3\ nil))(\lambda x.\lambda y.y)(\lambda x.\lambda y.x)\\ &\to_\beta^* (\lambda x.\lambda y.y)\ 1\ (cons\ 2\ (cons\ 3\ nil))(\lambda x.\lambda y.x)\\ &\to_\beta^* (cons\ 2\ (cons\ 3\ nil))(\lambda x.\lambda y.x)\\ &\equiv(\lambda h.\lambda t.\lambda x.xht)\ 2\ (cons\ 3\ nil)(\lambda x.\lambda y.x)\\ &\to_\beta^* (\lambda x.\lambda y.x)\ 2\ (cons\ 3\ nil)\\ &\to_\beta^* 2 \end{align}\]Currying
(Haskell Curry, logician)
Currying: No need for multi-argument functions. Instead, represent such functions as functions returning functions returning functions …
Currying makes “partial function application” possible:
-
times\(\equiv\lambda x.\lambda y.x*y\) (a function with two arguments)
-
Full application: times 2 3 \(\equiv\) 6 (a number)
-
Partial application: times 2 \(\equiv \lambda y .2 * y\) (a function doubling its argument)
Currying is cool (1)

Currying Is Cool (2)

Currying is cool (3)
Scheme (Anonymous functions and higher-order functions for list processing, but no currying)
(if cond
(map (lambda (x) (* 10 x)) A)
(map (lambda (x) (+ 2 x)) A))
Haskell (Anonymous functions + currying)
if cond then map (* 10) A
else map (+ 2) A
Recursion in \(\lambda\)-calculus (1)
Example: Computing factorials
Attempt 1 (Does not work)
\[fac\equiv\lambda n.if(= n\ 0)\ 1\ (*n(fac(−\ n\ 1)))\]The reason this is not correct is because “fac” is just a shorthand for the \(\lambda\)-expression on its right-hand side. In order to obtain a valid \(\lambda\)-expression, all such shorthands need to be replaced with the \(\lambda\)-expressions they represent.
This leads to an infinite process here.
Towards a correct definition
\[fac\equiv(\lambda f .\lambda n.if(= n\ 0)\ 1\ (*n(f(−\ n\ 1)))fac\]In other words,
\[fac\equiv F\ fac\]where
\[F\equiv(\lambda f.\lambda n.if(= n\ 0)\ 1\ (*\ n(f (−\ n\ 1)))\]“fac” is a fixed point of F.
Recursion in \(\lambda\)-calculus (2)
The Y-combinator (fixed-point operator)
\[Y\equiv\lambda h.(\lambda x.h(x\ x))(\lambda x.h(x\ x))\]Whenever we would normally write a recursive function
\[f\equiv\lambda x.\ ... f\ ...\]we now write
\[F\equiv Y(\lambda f .\lambda x.\ ... f\ ...)\]and then evaluate F x instead of f x.
Back to factorials:
-
Incorrect: \(fac\equiv\lambda n.if(=\ n\ 0)\ 1\ (*\ n\ (fac(−\ n\ 1)))\)
-
Correct: \(Fac\equiv Y(\lambda f .\lambda n.if(= n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))\)
Recursion in \(\lambda\)-calculus (3)
\[\begin{align} Fac\ 3&\equiv&Y(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))3\\ &\equiv&(\lambda h.(\lambda x.h(x\ x))(\lambda x.h(x\ x)))\\ &&(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))3\\ &\to_\beta&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))(x\ x))3\\ &\to_\beta&(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n\ (f (−\ n\ 1))))\\ &&((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))(x\ x)))3\\ &\to_\beta^*&if(= 3\ 0)1\\ &&(*\ 3(((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x)))(−\ 3\ 1)))\\ &\to_\beta^*&3(((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)\ 1\ (*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x)))2)\\ &\to_\beta^*&3(*\ 2(*\ 1(((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x)))0)))\\ &\to_\beta^*&*3(*2(*1(((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(* n(f (−\ n\ 1))))(x\ x)))0)))\\ &\to_\beta^*&*3(*2(*1(if(=\ 0\ 0)1\\ &&(*0(((\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x))\\ &&(\lambda x.(\lambda f .\lambda n.if(=\ n\ 0)1(*\ n(f (−\ n\ 1))))(x\ x)))(−\ 0\ 1))))))\\ &\to_\beta^*&*3(*2(*11))\\ &\to_\beta^*&6 \end{align}\](Non-)termination of reductions
Some reductions never terminate:
\[(\lambda x.x\ x)(\lambda x.x\ x)\to_\beta (\lambda x.x\ x)(\lambda x.x\ x)\]Others may produce longer and longer expressions:
\[\begin{align} (\lambda x.x\ x\ x)(\lambda x.x\ x\ x)&\to_\beta (\lambda x.x\ x\ x)(\lambda x.x\ x\ x)(\lambda x.x\ x\ x)\\ &\to_\beta(\lambda x.x\ x\ x)(\lambda x.x\ x\ x)(\lambda x.x\ x\ x)(\lambda x.x\ x\ x)\\ &\to_\beta... \end{align}\]This is the same as infinite loops or infinite recursion in incorrect programs.
Normal-order vs applicative-order reduction (1)
Normal-order reduction: Subtitute the argument into the function unevaluated and continue reducing the resulting expression.
\[(\lambda x.h)((\lambda x.x\ x)(\lambda x.x\ x))\to_beta h\]Applicative-order reduction: Reduce both the function and the argument expressions to the simplest possible form before substituting the argument into the function. Then continue to reduce the resulting expression.
\[(\lambda x.h)((\lambda x.x\ x)(\lambda x.x\ x))\to_\beta^* ...\textrm{(infinite expansion)}\]Normal-order vs applicative-order reduction (2)
Why does this matter?
Normal-order reduction allows us to construct infinite objects and use them, as long as we only ever use a finite part of them.
Example: Fibonacci numbers in Haskell
fibonacci = 1:1:(zipWith (+) fibonacci (tail fibonacci))
> take 10 fibonacci
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
Church-Rosser theorem
If two reductions of the same \(\lambda\)-expression terminate, they terminate with the same result.
If some reduction of a \(\lambda\)-expression terminates, the normal-order reduction terminates.