Haskell - part 3
Higher-order functions, functional parsers, and interactive programsMaterial adapted from Erik Meijer’s Functional ProgrammingA Jupyter notebook version of this material is available online in Google Colab. Although Colab does not support Haskell, the notebook can be downloaded and run in Jupyter Notebook, if the IHaskell extension is installed.
Higher-order functions
Introduction
A function is called higher-order if it takes a function as an argument or returns a function as a result.
twice :: (a → a) → a → a
twice f x = f (f x)
twice
is higher-order because it takes a function as its first argument.
Why are they useful?
-
Common programming idioms can be encoded as functions within the language itself
-
Domain specific languages can be defined as collections of higher-order functions
-
Algebraic properties of higher-order functions can be used to reason about programs
The map
function
The higher-order library function called map
applies a function to every element of a list.
map :: (a → b) → [a] → [b]
For example:
> map (+1) [1,3,5,7]
[2,4,6,8]
map (+1) [1,3,5,7]
The map function can be defined in a particularly simple manner using a list comprehension:
map f xs = [f x | x ← xs]
Alternatively, for the purposes of proofs, the map function can also be defined using recursion:
map f [] = []
map f (x:xs) = f x : map f xs
The filter
function
The higher-order library function
filter selects every element from a list that satisfies a predicate.
filter :: (a → Bool) → [a] → [a]
For example:
> filter even [1..10]
[2,4,6,8,10]
filter even [1..10]
filter
can be defined using a list comprehension:
filter p xs = [x | x ← xs, p x]
Alternatively, it can be defined using recursion:
filter p [] = []
filter p (x:xs)
| p x = x : filter p xs
| otherwise = filter p xs
The foldr
function
A number of functions on lists can be defined using the following simple pattern of recursion:
f [] = v
f (x:xs) = x ⊕ f xs
f
maps the empty list to some value v
, and any non-empty list to some function ⊕
applied to its head
and f
of its tail
.
For example:
sum [] = 0 -- v = 0
sum (x:xs) = x + sum xs -- ⊕ = +
product [] = 1 -- v = 1
product (x:xs) = x * product xs -- ⊕ = *
and [] = True -- v = True
and (x:xs) = x && and xs -- ⊕ = &&
sum [] = 0 -- v = 0
sum (x:xs) = x + sum xs -- ⊕ = +
product [] = 1 -- v = 1
product (x:xs) = x * product xs -- ⊕ = *
and [] = True -- v = True
and (x:xs) = x && and xs -- ⊕ = &&
The higher-order library function foldr
(fold right) encapsulates this simple pattern of recursion, with the function ⊕
and the value v
as arguments.
For example:
sum = foldr (+) 0
product = foldr (*) 1
or = foldr (||) False
and = foldr (&&) True
foldr
itself can be defined using recursion:
foldr :: (a → b → b) → b → [a] → b
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)
However, it is best to think of foldr
non-recursively, as simultaneously replacing each (:)
in a list by a given function, and []
by a given value.
For example:
sum [1,2,3]
= foldr (+) 0 [1,2,3]
= foldr (+) 0 (1:(2:(3:[])))
= 1+(2+(3+0))
= 6
Replace each (:)
by (+)
and []
by 0
.
For example:
product [1,2,3]
= foldr (*) 1 [1,2,3]
= foldr (*) 1 (1:(2:(3:[])))
= 1*(2*(3*1))
= 6
Replace each (:)
by (*)
and []
by 1
.
Other foldr
examples
Even though foldr
encapsulates a simple pattern of recursion, it can be used to define many more functions than might first be expected.
Recall the length
function:
length :: [a] → Int
length [] = 0
length (_:xs) = 1 + length xs
For example:
length [1,2,3]
= length (1:(2:(3:[])))
= 1+(1+(1+0))
= 3
Replace each (:)
by λ_ n → 1+n
and []
by 0
.
Hence, we have:
length = foldr (λ_ n → 1+n) 0
Now recall the reverse
function:
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]
For example:
reverse [1,2,3]
= reverse (1:(2:(3:[])))
= (([] ++ [3]) ++ [2]) ++ [1]
= [3,2,1]
Replace each (:)
by λx xs → xs ++ [x]
and []
by []
.
Hence, we have:
reverse =
foldr (λx xs → xs ++ [x]) []
Finally, we note that the append function (++)
has a particularly compact definition using foldr
:
(++ ys) = foldr (:) ys
Replace each (:)
by (:)
and []
by ys
.
Why is foldr useful?
-
Some recursive functions on lists, such as
sum
, are simpler to define usingfoldr
. -
Properties of functions defined using
foldr
can be proved using algebraic properties offoldr
, such as fusion and the banana split rule. -
Advanced program optimizations can be simpler if
foldr
is used in place of explicit recursion.
Other library functions
The library function (.)
returns the composition of two functions as a single function.
(.) :: (b → c) → (a → b) → (a → c)
f . g = λx → f (g x)
For example:
odd :: Int → Bool
odd = not . even
The library function all
decides if every element of a list satisfies a given predicate.
all :: (a → Bool) → [a] → Bool
all p xs = and [p x | x ← xs]
For example:
> all even [2,4,6,8,10]
True
all even [2,4,6,8,10]
Dually, the library function any
decides if at least one element of a list satisfies a predicate.
any :: (a → Bool) → [a] → Bool
any p xs = or [p x | x ← xs]
For example:
> any isSpace "abc def"
True
-- needs to be fixed
any isSpace "abc def"
The library function takeWhile
selects elements from a list while a predicate holds of all the elements.
takeWhile :: (a → Bool) → [a] → [a]
takeWhile p [] = []
takeWhile p (x:xs)
| p x = x : takeWhile p xs
| otherwise = []
For example:
> takeWhile isAlpha "abc def"
"abc"
-- needs to be fixed
takeWhile isAlpha "abc def"
Dually, the function dropWhile
removes elements while a predicate holds of all the elements.
dropWhile :: (a → Bool) → [a] → [a]
dropWhile p [] = []
dropWhile p (x:xs)
| p x = dropWhile p xs
| otherwise = x:xs
For example:
> dropWhile isSpace " abc"
"abc"
-- needs to be fixed
dropWhile isSpace " abc"
Exercises
-
What are higher-order functions that return functions as results better known as?
-
Express the comprehension
[f x | x ← xs, p x]
using the functionsmap
andfilter
. -
Redefine
map f
andfilter p
usingfoldr
.
Functional parsers
What is a parser?
A parser is a program that analyses a piece of text to determine its syntactic structure, as shown below with the equivalent parse tree shown in Figure 1.
Figure 1. The parse tree for the arithmetic expression 2 * 3 + 4. The deeper the operation is in the parse tree, the higher its precedence.
2*3+4
Where are they used?
Almost every real life program uses some form of parser to pre-process its input.
- GHC parses Haskell programs
- Unix parses shell scripts
- Internet Explorer parses HTML documents
The parser type
In a functional language such as Haskell, parsers can naturally be viewed as functions.
type Parser = String → Tree
A parser is a function that takes a string and returns some form of tree.
type Parser = String -> Tree
However, a parser might not require all of its input strings, so we also return any unused input:
type Parser = String → (Tree,String)
A string might be parsable in many ways, including none, so we generalize to a list of results:
type Parser = String → [(Tree,String)]
type Parser = String -> (Tree, String)
type Parser = String -> [(Tree, String)]
Finally, a parser might not always produce a tree, so we generalize to a value of any type:
type Parser a = String → [(a,String)]
Note:
- For simplicity, we will only consider parsers that either fail and return the empty list of results, or succeed and return a singleton list.
type Parser a = String -> [(a, String)]
Basic parsers
- The parser
item
fails if the input is empty, and consumes the first character otherwise:
item :: Parser Char
item = λinp → case inp of
[] → []
(x:xs) → [(x,xs)]
item :: Parser Char
item = \inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)]
- The parser failure always fails:
failure :: Parser a
failure = λinp → []
- The parser
return v
always succeeds, returning the valuev
without consuming any input:
return :: a → Parser a
return v = λinp → [(v,inp)]
failure :: Parser a
failure = \inp -> []
return :: a -> Parser a
return v = \inp -> [(v,inp)]
- The parser
p +++ q
behaves as the parserp
if it succeeds, and as the parserq
otherwise:
(+++) :: Parser a → Parser a → Parser a
p +++ q = λinp → case p inp of
[] → parse q inp
[(v,out)] → [(v,out)]
The function parse
applies a parser to a string:
parse :: Parser a → String → [(a,String)]
parse p inp = p inp
(+++) :: Parser a -> Parser a -> Parser a
p +++ q = \inp -> case p inp of
[] -> parse q inp
[(v,out)] -> [(v,out)]
parse :: Parser a -> String -> [(a,String)]
parse p inp = p inp
Examples
The behavior of the five parsing primitives can be illustrated with some simple examples:
% ghci Parsing
> parse item ""
[]
> parse item "abc"
[('a',"bc")]
parse item ""
parse item "abc"
> parse failure "abc"
[]
> parse (return 1) "abc"
[(1,"abc")]
> parse (item +++ return 'd') "abc"
[('a',"bc")]
> parse (failure +++ return 'd') "abc"
[('d',"abc")]
parse failure "abc"
parse (return 1) "abc"
parse (item +++ return 'd') "abc"
parse (failure +++ return 'd') "abc"
Note:
-
The library file Parsing is available on the web from the Programming in Haskell home page.
-
For technical reasons, the first failure example actually gives an error concerning types, but this does not occur in non-trivial examples.
-
The Parser type is a monad, a mathematical structure that has proved useful for modeling many different kinds of computations.
Sequencing
A sequence of parsers can be combined as a single composite parser using the keyword do
.
For example:
p :: Parser (Char,Char)
p = do x ← item
item
y ← item
return (x,y)
p :: Parser (Char,Char)
p = do x <- item
item
y <- item
return (x,y)
Note:
-
Each parser must begin in precisely the same column. That is, the layout rule applies.
-
The values returned by intermediate parsers are discarded by default, but if required can be named using the
←
operator. -
The value returned by the last parser is the value returned by the sequence as a whole.
-
If any parser in a sequence of parsers fails, then the sequence as a whole fails. For example:
> parse p "abcdef"
[((’a’,’c’),"def")]
> parse p "ab"
[]
- The do notation is not specific to the Parser type, but can be used with any monadic type.
parse p "abcdef"
parse p "ab"
Derived primitives
- Parsing a character that satisfies a predicate:
sat :: (Char → Bool) → Parser Char
sat p = do x ← item
if p x then
return x
else
failure
sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
if p x then
return x
else
failure
- Parsing a digit and specific characters:
digit :: Parser Char
digit = sat isDigit
char :: Char → Parser Char
char x = sat (x ==)
- Applying a parser zero or more times:
many :: Parser a → Parser [a]
many p = many1 p +++ return []
digit :: Parser Char
digit = sat isDigit
char :: Char -> Parser Char
char x = sat (x ==)
many :: Parser a -> Parser [a]
many p = many1 p +++ return []
- Applying a parser one or more times:
many1 :: Parser a → Parser [a]
many1 p = do v ← p
vs ← many p
return (v:vs)
- Parsing a specific string of characters:
string :: String → Parser String
string [] = return []
string (x:xs) = do char x
string xs
return (x:xs)
many1 :: Parser a -> Parser [a]
many1 p = do v <- p
vs <- many p
return (v:vs)
string :: String -> Parser String
string [] = return []
string (x:xs) = do char x
string xs
return (x:xs)
Example
We can now define a parser that consumes a list of one or more digits from a string:
p :: Parser String
p = do char '['
d ← digit
ds ← many (do char ','
digit)
char ']'
return (d:ds)
p :: Parser String
p = do char '['
d <- digit
ds <- many (do char ','
digit)
char ']'
return (d:ds)
For example:
> parse p "[1,2,3,4]"
[("1234","")]
> parse p "[1,2,3,4"
[]
Note:
- More sophisticated parsing libraries can indicate and/or recover from errors in the input string.
parse p "[1,2,3,4]"
parse p "[1,2,3,4"
Arithmetic expressions
Consider a simple form of expressions built up from single digits using the operations of addition, +
, and multiplication, *
, together with parentheses.
We also assume that:
*
and+
associate to the right;*
has higher priority than+
.
Formally, the syntax of such expressions is defined by the following context free grammar:
expr → term '+' expr | term
term → factor '*' term | factor
factor → digit | '(' expr ')‘
digit → '0' | '1' | … | '9'
However, for reasons of efficiency, it is important to factorize the rules for expr
and term
:
expr → term ('+' expr | ε)
term → factor ('*' term | ε)
Note:
- The symbol
ε
denotes the empty string.
It is now easy to translate the grammar into a parser that evaluates expressions, by simply rewriting the grammar rules using the parsing primitives.
That is, we have:
expr :: Parser Int
expr = do t ← term
do char '+'
e ← expr
return (t + e)
+++ return t
term :: Parser Int
term = do f ← factor
do char '*'
t ← term
return (f * t)
+++ return f
factor :: Parser Int
factor = do d ← digit
return (digitToInt d)
+++ do char '('
e ← expr
char ')'
return e
Finally, if we define
eval :: String → Int
eval xs = fst (head (parse expr xs))
then we try out some examples:
> eval "2*3+4"
10
> eval "2*(3+4)"
14
Exercises
-
Why does factorizing the expression grammar make the resulting parser more efficient?
-
Extend the expression parser to allow the use of subtraction and division, based upon the following extensions to the grammar:
expr → term ('+' expr | '-' expr | ε)
term → factor ('*' term | '/' term | ε)
Interactive programs
Introduction
To date, we have seen how Haskell can be used to write batch programs that take all their inputs at the start and give all their outputs at the end, as shown in Figure 2.
Figure 2. Batch programs take their inputs, process them, and generate the corresponding outputs.
However, we would also like to use Haskell to write interactive programs that read from the keyboard and write to the screen, as they are running, as shown in Figure 3.
Figure 3. Interactive programs take their inputs, interact with the user through the keyboard and screen, process the data, and generate the corresponding outputs.
The problem
Haskell programs are pure mathematical functions:
- Haskell programs have no side effects.
However, reading from the keyboard and writing to the screen are side effects:
- Interactive programs have side effects.
The solution
Interactive programs can be written in Haskell by using types to distinguish pure expressions from impure actions that may involve side effects.
IO a
The type of actions that return a value of type a
.
For example:
IO Char
The type of actions that return a character.
IO ()
The type of purely side effecting actions that return no result value.
Note:
()
is the type of tuples with no components.
Basic actions
The standard library provides a number of actions, including the following three primitives:
- The action
getChar
reads a character from the keyboard, echoes it to the screen, and returns the character as its result value:
getChar :: IO Char
- The action
putChar c
writes the characterc
to the screen, and returns no result value:
putChar :: Char → IO ()
- The action
return v
simply returns the valuev
, without performing any interaction:
return :: a → IO a
Sequencing
A sequence of actions can be combined as a single composite action using the keyword do
.
For example:
a :: IO (Char,Char)
a = do x ← getChar
getChar
y ← getChar
return (x,y)
a :: IO (Char,Char)
a = do x <- getChar
getChar
y <- getChar
return (x,y)
Derived primitives
- Reading a string from the keyboard:
getLine :: IO String
getLine = do x ← getChar
if x == '\n' then
return []
else
do xs ← getLine
return (x:xs)
- Writing a string to the screen:
putStr :: String → IO ()
putStr [] = return ()
putStr (x:xs) = do putChar x
putStr xs
- Writing a string and moving to a new line:
putStrLn :: String → IO ()
putStrLn xs = do putStr xs
putChar '\n'
Example
We can now define an action that prompts for a string to be entered and displays its length:
strlen :: IO ()
strlen = do putStr "Enter a string: "
xs ← getLine
putStr "The string has "
putStr (show (length xs))
putStrLn " characters"
strlen :: IO ()
strlen = do putStr "Enter a string: "
xs <- getLine
putStr "The string has "
putStr (show (length xs))
putStrLn " characters"
For example:
> strlen
Enter a string: abcde
The string has 5 characters
Note:
- Evaluating an action executes its side effects, with the final result value being discarded.
strlen
Hangman
Consider the following version of hangman:
- One player secretly types in a word.
- The other player tries to deduce the word, by entering a sequence of guesses.
-
For each guess, the computer indicates which letters in the secret word occur in the guess.
- The game ends when the guess is correct.
We adopt a top down approach to implementing hangman in Haskell, starting as follows:
hangman :: IO ()
hangman =
do putStrLn "Think of a word: "
word ← sgetLine
putStrLn "Try to guess it:"
guess word
hangman :: IO ()
hangman =
do putStrLn "Think of a word: "
word <- sgetLine
putStrLn "Try to guess it:"
guess word
The action sgetLine
reads a line of text from the keyboard, echoing each character as a dash:
sgetLine :: IO String
sgetLine = do x ← getCh
if x == '\n' then
do putChar x
return []
else
do putChar '-'
xs ← sgetLine
return (x:xs)
sgetLine :: IO String
sgetLine = do x <- getCh
if x == '\n' then
do putChar x
return []
else
do putChar '-'
xs <- sgetLine
return (x:xs)
The action getCh
reads a single character from the keyboard, without echoing it to the screen:
import System.IO
getCh :: IO Char
getCh = do hSetEcho stdin False
c ← getChar
hSetEcho stdin True
return c
import System.IO
getCh :: IO Char
getCh = do hSetEcho stdin False
c <- getChar
hSetEcho stdin True
return c
The function guess
is the main loop, which requests and processes guesses until the game ends.
guess :: String → IO ()
guess word =
do putStr "> "
xs ← getLine
if xs == word then
putStrLn "You got it!"
else
do putStrLn (diff word xs)
guess word
guess :: String -> IO ()
guess word =
do putStr "> "
xs <- getLine
if xs == word then
putStrLn "You got it!"
else
do putStrLn (diff word xs)
guess word
The function diff
indicates which characters in one string occur in a second string:
diff :: String → String → String
diff xs ys =
[if elem x ys then x else '-' | x ← xs]
For example:
> diff "haskell" "pascal"
"-as--ll"
diff :: String -> String -> String
diff xs ys =
[if elem x ys then x else '-' | x <- xs]
diff "haskell" "pascal"
Exercise
Implement the game of nim in Haskell, where the rules of the game are as follows:
- The board comprises five rows of stars:
1: * * * * *
2: * * * *
3: * * *
4: * *
5: *
-
Two players take it turn about to remove one or more stars from the end of a single row.
-
The winner is the player who removes the last star or stars from the board.
Hint:
Represent the board as a list of five integers that give the number of stars remaining on each row. For example, the initial board is [5,4,3,2,1]
.