Haskell - part 3

Higher-order functions, functional parsers, and interactive programsMaterial adapted from Erik Meijer’s Functional ProgrammingA Jupyter notebook version of this material is available online in Google Colab. Although Colab does not support Haskell, the notebook can be downloaded and run in Jupyter Notebook, if the IHaskell extension is installed.

Higher-order functions

Introduction

A function is called higher-order if it takes a function as an argument or returns a function as a result.

twice :: (a → a) → a → a
twice f x = f (f x)

twice is higher-order because it takes a function as its first argument.

Why are they useful?

Common programming idioms can be encoded as functions within the language itself
Domain specific languages can be defined as collections of higher-order functions
Algebraic properties of higher-order functions can be used to reason about programs

The `map` function

The higher-order library function called map applies a function to every element of a list.

map :: (a → b) → [a] → [b]

For example:

> map (+1) [1,3,5,7]
[2,4,6,8]

map (+1) [1,3,5,7]

The map function can be defined in a particularly simple manner using a list comprehension:

map f xs = [f x | x ← xs]

Alternatively, for the purposes of proofs, the map function can also be defined using recursion:

map f []     = []
map f (x:xs) = f x : map f xs

The `filter` function

The higher-order library function filter selects every element from a list that satisfies a predicate.

filter :: (a → Bool) → [a] → [a]

For example:

> filter even [1..10]
[2,4,6,8,10]

filter even [1..10]

filter can be defined using a list comprehension:

filter p xs = [x | x ← xs, p x]

Alternatively, it can be defined using recursion:

filter p []     = []
filter p (x:xs)
    | p x        = x : filter p xs
    | otherwise  = filter p xs

The `foldr` function

A number of functions on lists can be defined using the following simple pattern of recursion:

f []     = v
f (x:xs) = x ⊕ f xs

f maps the empty list to some value v, and any non-empty list to some function ⊕ applied to its head and f of its tail.

For example:

sum []     = 0                   -- v = 0
sum (x:xs) = x + sum xs          -- ⊕ = +

product []     = 1               -- v = 1
product (x:xs) = x * product xs  -- ⊕ = *

and []     = True                -- v = True
and (x:xs) = x && and xs         -- ⊕ = &&

sum []     = 0                   -- v = 0
sum (x:xs) = x + sum xs          -- ⊕ = +

product []     = 1               -- v = 1
product (x:xs) = x * product xs  -- ⊕ = *

and []     = True                -- v = True
and (x:xs) = x && and xs         -- ⊕ = &&

The higher-order library function foldr (fold right) encapsulates this simple pattern of recursion, with the function ⊕ and the value v as arguments.

For example:

sum     = foldr (+) 0

product = foldr (*) 1

or      = foldr (||) False

and     = foldr (&&) True

foldr itself can be defined using recursion:

foldr :: (a → b → b) → b → [a] → b
foldr f v []     = v
foldr f v (x:xs) = f x (foldr f v xs)

However, it is best to think of foldr non-recursively, as simultaneously replacing each (:) in a list by a given function, and [] by a given value.

For example:

sum [1,2,3]
= foldr (+) 0 [1,2,3]
= foldr (+) 0 (1:(2:(3:[])))
= 1+(2+(3+0))
= 6

Replace each (:) by (+) and [] by 0.

For example:

product [1,2,3]
= foldr (*) 1 [1,2,3]
= foldr (*) 1 (1:(2:(3:[])))
= 1*(2*(3*1))
= 6

Replace each (:) by (*) and [] by 1.

Other `foldr` examples

Even though foldr encapsulates a simple pattern of recursion, it can be used to define many more functions than might first be expected.

Recall the length function:

length :: [a] → Int
length []     = 0
length (_:xs) = 1 + length xs

For example:

length [1,2,3]
= length (1:(2:(3:[])))
= 1+(1+(1+0))
= 3

Replace each (:) by λ_ n → 1+n and [] by 0.

Hence, we have:

length = foldr (λ_ n → 1+n) 0

Now recall the reverse function:

reverse []     = []
reverse (x:xs) = reverse xs ++ [x]

For example:

reverse [1,2,3]
= reverse (1:(2:(3:[])))
= (([] ++ [3]) ++ [2]) ++ [1]
= [3,2,1]

Replace each (:) by λx xs → xs ++ [x] and [] by [].

Hence, we have:

reverse =
    foldr (λx xs → xs ++ [x]) []

Finally, we note that the append function (++) has a particularly compact definition using foldr:

(++ ys) = foldr (:) ys

Replace each (:) by (:) and [] by ys.

Why is foldr useful?

Some recursive functions on lists, such as sum, are simpler to define using foldr.
Properties of functions defined using foldr can be proved using algebraic properties of foldr, such as fusion and the banana split rule.
Advanced program optimizations can be simpler if foldr is used in place of explicit recursion.

Other library functions

The library function (.) returns the composition of two functions as a single function.

(.) :: (b → c) → (a → b) → (a → c)
f . g  = λx → f (g x)

For example:

odd :: Int → Bool
odd  = not . even

The library function all decides if every element of a list satisfies a given predicate.

all :: (a → Bool) → [a] → Bool
all p xs = and [p x | x ← xs]

For example:

> all even [2,4,6,8,10]
True

all even [2,4,6,8,10]

Dually, the library function any decides if at least one element of a list satisfies a predicate.

any :: (a → Bool) → [a] → Bool
any p xs = or [p x | x ← xs]

For example:

> any isSpace "abc def"
True

-- needs to be fixed
any isSpace "abc def"

The library function takeWhile selects elements from a list while a predicate holds of all the elements.

takeWhile :: (a → Bool) → [a] → [a]
takeWhile p []     = []
takeWhile p (x:xs)
    | p x           = x : takeWhile p xs
    | otherwise     = []

For example:

> takeWhile isAlpha "abc def"
"abc"

-- needs to be fixed
takeWhile isAlpha "abc def"

Dually, the function dropWhile removes elements while a predicate holds of all the elements.

dropWhile :: (a → Bool) → [a] → [a]
dropWhile p []     = []
dropWhile p (x:xs)
    | p x           = dropWhile p xs
    | otherwise     = x:xs

For example:

> dropWhile isSpace "   abc"
"abc"

-- needs to be fixed
dropWhile isSpace "   abc"

Exercises

What are higher-order functions that return functions as results better known as?
Express the comprehension [f x | x ← xs, p x] using the functions map and filter.
Redefine map f and filter p using foldr.

Functional parsers

What is a parser?

A parser is a program that analyses a piece of text to determine its syntactic structure, as shown below with the equivalent parse tree shown in Figure 1.

⊕ <a name='figure1'>Figure 1</a>. The parse tree for the arithmetic expression 2 * 3 + 4. The deeper the operation is in the parse tree, the higher its precedence.
Figure 1. The parse tree for the arithmetic expression 2 * 3 + 4. The deeper the operation is in the parse tree, the higher its precedence.

2*3+4

Where are they used?

Almost every real life program uses some form of parser to pre-process its input.

GHC parses Haskell programs
Unix parses shell scripts
Internet Explorer parses HTML documents

The parser type

In a functional language such as Haskell, parsers can naturally be viewed as functions.

type Parser = String → Tree

A parser is a function that takes a string and returns some form of tree.

type Parser = String -> Tree

However, a parser might not require all of its input strings, so we also return any unused input:

type Parser = String → (Tree,String)

A string might be parsable in many ways, including none, so we generalize to a list of results:

type Parser = String → [(Tree,String)]

type Parser = String -> (Tree, String)
type Parser = String -> [(Tree, String)]

Finally, a parser might not always produce a tree, so we generalize to a value of any type:

type Parser a = String → [(a,String)]

Note:

For simplicity, we will only consider parsers that either fail and return the empty list of results, or succeed and return a singleton list.

type Parser a = String -> [(a, String)]

Basic parsers

The parser item fails if the input is empty, and consumes the first character otherwise:

item :: Parser Char
item  = λinp → case inp of
    []     → []
    (x:xs) → [(x,xs)]

item :: Parser Char
item  = \inp -> case inp of
    []     -> []
    (x:xs) -> [(x,xs)]

The parser failure always fails:

failure :: Parser a
failure  = λinp → []

The parser return v always succeeds, returning the value v without consuming any input:

return  :: a → Parser a
return v = λinp → [(v,inp)]

failure :: Parser a
failure  = \inp -> []

return  :: a -> Parser a
return v = \inp -> [(v,inp)]

The parser p +++ q behaves as the parser p if it succeeds, and as the parser q otherwise:

(+++) :: Parser a → Parser a → Parser a
p +++ q = λinp → case p inp of
    []        → parse q inp
    [(v,out)] → [(v,out)]

The function parse applies a parser to a string:

parse :: Parser a → String → [(a,String)]
parse p inp = p inp

(+++) :: Parser a -> Parser a -> Parser a
p +++ q = \inp -> case p inp of
    []        -> parse q inp
    [(v,out)] -> [(v,out)]

parse :: Parser a -> String -> [(a,String)]
parse p inp = p inp

Examples

The behavior of the five parsing primitives can be illustrated with some simple examples:

% ghci Parsing

> parse item ""
[]

> parse item "abc"
[('a',"bc")]

parse item ""

parse item "abc"

> parse failure "abc"
[]

> parse (return 1) "abc"
[(1,"abc")]

> parse (item +++ return 'd') "abc"
[('a',"bc")]

> parse (failure +++ return 'd') "abc"
[('d',"abc")]

parse failure "abc"

parse (return 1) "abc"

parse (item +++ return 'd') "abc"

parse (failure +++ return 'd') "abc"

Note:

The library file Parsing is available on the web from the Programming in Haskell home page.
For technical reasons, the first failure example actually gives an error concerning types, but this does not occur in non-trivial examples.
The Parser type is a monad, a mathematical structure that has proved useful for modeling many different kinds of computations.

Sequencing

A sequence of parsers can be combined as a single composite parser using the keyword do.

For example:

p :: Parser (Char,Char)
p  = do x ← item
    item
    y ← item
    return (x,y)

p :: Parser (Char,Char)
p  = do x <- item
    item
    y <- item
    return (x,y)

Note:

Each parser must begin in precisely the same column. That is, the layout rule applies.
The values returned by intermediate parsers are discarded by default, but if required can be named using the ← operator.
The value returned by the last parser is the value returned by the sequence as a whole.
If any parser in a sequence of parsers fails, then the sequence as a whole fails. For example:

> parse p "abcdef"
[((’a’,’c’),"def")]

> parse p "ab"
[]

The do notation is not specific to the Parser type, but can be used with any monadic type.

parse p "abcdef"

parse p "ab"

Derived primitives

Parsing a character that satisfies a predicate:

sat :: (Char → Bool) → Parser Char
sat p = do x ← item
            if p x then
                return x
            else
                failure

sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
            if p x then
                return x
            else
                failure

Parsing a digit and specific characters:

digit :: Parser Char
digit  = sat isDigit

char  :: Char → Parser Char
char x = sat (x ==)

Applying a parser zero or more times:

many  :: Parser a → Parser [a]
many p = many1 p +++ return []

digit :: Parser Char
digit  = sat isDigit

char  :: Char -> Parser Char
char x = sat (x ==)

many  :: Parser a -> Parser [a]
many p = many1 p +++ return []

Applying a parser one or more times:

many1  :: Parser a → Parser [a]
many1 p = do v  ← p
             vs ← many p
             return (v:vs)

Parsing a specific string of characters:

string       :: String → Parser String
string []     = return []
string (x:xs) = do char x
                   string xs
                   return (x:xs)

many1  :: Parser a -> Parser [a]
many1 p = do v  <- p
             vs <- many p
             return (v:vs)

string       :: String -> Parser String
string []     = return []
string (x:xs) = do char x
                   string xs
                   return (x:xs)

Example

We can now define a parser that consumes a list of one or more digits from a string:

p :: Parser String
p  = do char '['
        d  ← digit
        ds ← many (do char ','
                      digit)
        char ']'
        return (d:ds)

p :: Parser String
p  = do char '['
        d  <- digit
        ds <- many (do char ','
                      digit)
        char ']'
        return (d:ds)

For example:

> parse p "[1,2,3,4]"
[("1234","")]

> parse p "[1,2,3,4"
[]

Note:

More sophisticated parsing libraries can indicate and/or recover from errors in the input string.

parse p "[1,2,3,4]"

parse p "[1,2,3,4"

Arithmetic expressions

Consider a simple form of expressions built up from single digits using the operations of addition, +, and multiplication, *, together with parentheses.

We also assume that:

* and + associate to the right;
* has higher priority than +.

Formally, the syntax of such expressions is defined by the following context free grammar:

expr   → term '+' expr | term

term   → factor '*' term | factor

factor → digit | '(' expr ')‘

digit  → '0' | '1' | … | '9'

However, for reasons of efficiency, it is important to factorize the rules for expr and term:

expr → term ('+' expr | ε)

term → factor ('*' term | ε)

Note:

The symbol ε denotes the empty string.

It is now easy to translate the grammar into a parser that evaluates expressions, by simply rewriting the grammar rules using the parsing primitives.

That is, we have:

expr :: Parser Int
expr  = do t ← term
           do char '+'
              e ← expr
              return (t + e)
           +++ return t

term :: Parser Int
term  = do f ← factor
           do char '*'
              t ← term
              return (f * t)
           +++ return f

factor :: Parser Int
factor  = do d ← digit
             return (digitToInt d)
          +++ do char '('
                 e ← expr
                 char ')'
                 return e

Finally, if we define

eval   :: String → Int
eval xs = fst (head (parse expr xs))

then we try out some examples:

> eval "2*3+4"
10

> eval "2*(3+4)"
14

Exercises

Why does factorizing the expression grammar make the resulting parser more efficient?
Extend the expression parser to allow the use of subtraction and division, based upon the following extensions to the grammar:

expr → term ('+' expr | '-' expr | ε)

term → factor ('*' term | '/' term | ε)

Interactive programs

Introduction

To date, we have seen how Haskell can be used to write batch programs that take all their inputs at the start and give all their outputs at the end, as shown in Figure 2.

⊕ <a name='figure2'>Figure 2</a>. Batch programs take their inputs, process them, and generate the corresponding outputs.
Figure 2. Batch programs take their inputs, process them, and generate the corresponding outputs.

However, we would also like to use Haskell to write interactive programs that read from the keyboard and write to the screen, as they are running, as shown in Figure 3.

⊕ <a name='figure3'>Figure 3</a>. Interactive programs take their inputs, interact with the user through the keyboard and screen, process the data, and generate the corresponding outputs.
Figure 3. Interactive programs take their inputs, interact with the user through the keyboard and screen, process the data, and generate the corresponding outputs.

The problem

Haskell programs are pure mathematical functions:

Haskell programs have no side effects.

However, reading from the keyboard and writing to the screen are side effects:

Interactive programs have side effects.

The solution

Interactive programs can be written in Haskell by using types to distinguish pure expressions from impure actions that may involve side effects.

IO a

The type of actions that return a value of type a.

For example:

IO Char

The type of actions that return a character.

IO ()

The type of purely side effecting actions that return no result value.

Note:

() is the type of tuples with no components.

Basic actions

The standard library provides a number of actions, including the following three primitives:

The action getChar reads a character from the keyboard, echoes it to the screen, and returns the character as its result value:

getChar :: IO Char

The action putChar c writes the character c to the screen, and returns no result value:

putChar :: Char → IO ()

The action return v simply returns the value v, without performing any interaction:

return :: a → IO a

Sequencing

A sequence of actions can be combined as a single composite action using the keyword do.

For example:

a :: IO (Char,Char)
a  = do x ← getChar
        getChar
        y ← getChar
        return (x,y)

a :: IO (Char,Char)
a  = do x <- getChar
        getChar
        y <- getChar
        return (x,y)

Derived primitives

Reading a string from the keyboard:

getLine :: IO String
getLine  = do x ← getChar
              if x == '\n' then
                 return []
              else
                 do xs ← getLine
                    return (x:xs)

Writing a string to the screen:

putStr       :: String → IO ()
putStr []     = return ()
putStr (x:xs) = do putChar x
                   putStr xs

Writing a string and moving to a new line:

putStrLn   :: String → IO ()
putStrLn xs = do putStr xs
                 putChar '\n'

Example

We can now define an action that prompts for a string to be entered and displays its length:

strlen :: IO ()
strlen  = do putStr "Enter a string: "
             xs ← getLine
             putStr "The string has "
             putStr (show (length xs))
             putStrLn " characters"

strlen :: IO ()
strlen  = do putStr "Enter a string: "
             xs <- getLine
             putStr "The string has "
             putStr (show (length xs))
             putStrLn " characters"

For example:

> strlen

Enter a string: abcde
The string has 5 characters

Note:

Evaluating an action executes its side effects, with the final result value being discarded.

strlen

Hangman

Consider the following version of hangman:

One player secretly types in a word.
The other player tries to deduce the word, by entering a sequence of guesses.
For each guess, the computer indicates which letters in the secret word occur in the guess.
The game ends when the guess is correct.

We adopt a top down approach to implementing hangman in Haskell, starting as follows:

hangman :: IO ()
hangman  =
    do putStrLn "Think of a word: "
       word ← sgetLine
       putStrLn "Try to guess it:"
       guess word

hangman :: IO ()
hangman  =
    do putStrLn "Think of a word: "
       word <- sgetLine
       putStrLn "Try to guess it:"
       guess word

The action sgetLine reads a line of text from the keyboard, echoing each character as a dash:

sgetLine :: IO String
sgetLine  = do x ← getCh
               if x == '\n' then
                  do putChar x
                     return []
               else
                  do putChar '-'
                     xs ← sgetLine
                     return (x:xs)

sgetLine :: IO String
sgetLine  = do x <- getCh
               if x == '\n' then
                  do putChar x
                     return []
               else
                  do putChar '-'
                     xs <- sgetLine
                     return (x:xs)

The action getCh reads a single character from the keyboard, without echoing it to the screen:

import System.IO

getCh :: IO Char
getCh  = do hSetEcho stdin False
            c ← getChar
            hSetEcho stdin True
            return c

import System.IO

getCh :: IO Char
getCh  = do hSetEcho stdin False
            c <- getChar
            hSetEcho stdin True
            return c

The function guess is the main loop, which requests and processes guesses until the game ends.

guess     :: String → IO ()
guess word =
    do putStr "> "
       xs ← getLine
       if xs == word then
          putStrLn "You got it!"
       else
          do putStrLn (diff word xs)
             guess word

guess     :: String -> IO ()
guess word =
    do putStr "> "
       xs <- getLine
       if xs == word then
          putStrLn "You got it!"
       else
          do putStrLn (diff word xs)
             guess word

The function diff indicates which characters in one string occur in a second string:

diff      :: String → String → String
diff xs ys =
    [if elem x ys then x else '-' | x  ← xs]

For example:

> diff "haskell" "pascal"
"-as--ll"

diff      :: String -> String -> String
diff xs ys =
    [if elem x ys then x else '-' | x  <- xs]

diff "haskell" "pascal"

Exercise

Implement the game of nim in Haskell, where the rules of the game are as follows:

The board comprises five rows of stars:

* * * * *
* * * *
* * *
* *
*

Two players take it turn about to remove one or more stars from the end of a single row.
The winner is the player who removes the last star or stars from the board.

Hint:

Represent the board as a list of five integers that give the number of stars remaining on each row. For example, the initial board is [5,4,3,2,1].

Back to course home

Haskell - part 3

Higher-order functions

Introduction

Why are they useful?

The map function

The filter function

The foldr function

Other foldr examples

Why is foldr useful?

Other library functions

Exercises

Functional parsers

What is a parser?

Where are they used?

The parser type

Basic parsers

Examples

Sequencing

Derived primitives

Example

Arithmetic expressions

Exercises

Interactive programs

Introduction

The problem

The solution

Basic actions

Sequencing

Derived primitives

Example

Hangman

Exercise

The `map` function

The `filter` function

The `foldr` function

Other `foldr` examples