[This post is part of an ongoing challenge to understand 52 papers in 52 weeks. You can read previous entries, here, or subscribe to be notified of new posts by email]

Fold or foldr is a recursion operator with magical properties commonly used in functional programming. Sometimes known as reduce, inject, or compress it turns a list of values into a single value according to a combining function.

Graham Hutton’s 1999 A tutorial on the expressivenes nad universality of fold does a great job of explaining just how magical it is. In a language with first-order tuples and functions any function can be expressed as a fold, it lets you prove programs without using induction, and you can even generalise it to work on datatypes other than lists.

Great magic indeed. But with great power comes great responsibility. Take heed of the lesser programmer and remember that many functions you rewrite as a fold, you will not be able to read in six months.

The fold operator

Fold as a central concept in a programming language dates from APL‘s reduction operator in 1962 and later FP‘s insertion operator in 1978. In Haskell it’s defined like this:

Fold in Haskell

Fold in Haskell

The paper uses Haskell for all its examples so I’m going to stick to that as well.

This code tells us that fold takes a function f of type α → β → β, a value v of type β, and processes a list of type [α] to return a value of type β. It does this by replacing the nil constructor [] with v. And every cons operator : with the function f.

Just think of lists as long chains of cons followed by a nil. You can think of a list [1,2,3] as prepending each value to an existing list – 1:2:3:[].

When you apply a ‘fold (+) 0′ it becomes ’1+2+3+0′, which just so happens to be the sum function. But that’s essentially all that fold does – it replaces the cons in lists with a supplied function and the nil constructor with a base value.

There’s also a right fold, foldr, which does the same, but right-to-left. Using foldl vs. foldr changes the associativity of the supplied function:

Prelude> foldl (-) 0 [1,2,3] # 0-1-2-3
-6
Prelude> foldr (-) 0 [1,2,3] # 1-(2-(3-0))
2

We can implement all the standard functions using fold using more complex functions. Actually we can implement anything, but we’ll get to that. Here’s a length example that tells us the length of a list.

Length implemented using fold

Length implemented using fold

You see, instead of transforming the string, that lambda function just counts.

The universal property of fold

In 1990 Malcolm was the first to systematically use the universal property of fold in his generalisation of lists to arbitrary datatypes. For finite lists it can be stated as an equivalence between two definitions of g.

Universal property

Universal property

Looking left-to-right, substituting g with g = fold f v gives you the recursive definition of fold, and right-to-left the two equations for g are exactly the assumptions to inductively prove that g equals fold f v.

The universal property is particularly useful as a proof principle because it encapsulates a common pattern of inductive proof. Rather than mess about with induction, you can just appeal to the universal property.

Let’s prove a simple equivalence:

(+1) . sum = fold (+) 1

The left function is a composition of +1 and sum, which sums a list then increments the result. The right function replaces every cons in a list with a + and the empty list with a 1.

We want to prove they will always give the same result when applied to the same list.

The equation matches the right-hand side of the universal property of fold, g = fold f v, with g = (+1) . sum, f = (+), and v = 1. With some simplification, the equation is equivalent to these:

sum   []   + 1 = 1
sum (x:xs) + 1 = x + (sum xs + 1)

Which can be verified by simple calculation:

Proof of sum = fold

Proof of sum = fold

This concludes the proof. Much simpler than explicitly performing induction!

For a less haskelly example, here’s what happened:

var inc = function (x) { return x+1; }
var sum = function (x, xs) {
 return xs.length > 0 ? x + sum(xs[0], xs.slice(1)) : x;
}
 
var plus = function (a, b) { return a+b; }
 
// we try to prove this equivalence, this is NOT something you execute :)
function (xs) { return inc(sum(0, xs)); } == function (xs) { return xs.reduce(plus, 1); }
 
// and then the same proof applies

The fusion property of fold allows us to fuse the composition of an arbitrary function and fold into a single fold. This isn’t true in general, so we use the universal property to calculate the conditions.

h . fold g w = fold f v

Using the same process as above, we expand this equation into two equations.

h (fold g w   [])   = v
h (fold g w (x:xs)) = f x (h (fold g w xs))

Which we simplify with two calculations:

To get:

Conditions for equivalence

Conditions for equivalence

Which basically says that, if the conditions on the left are met, then our composition of an arbitrary function and fold behaves exactly as a fold would.

Now we can greatly simplify our proof of the sum equation:

(+1) . sum = fold (+) 1 
=>
(+1) . fold (+) 0 = fold (+) 1

Which matches the fusion property and we’re done. Simple :)

We can also use universality as a definition principle to help us transform explicit recursion into folds.

Let’s take a recursively defined sum:

Recursively defined sum

Recursively defined sum

We want to rewrite this into a fold. That is, we’re going to solve the equation sum = fold f v for function f and value v. Because it matches the right-hand side of the universal property, we turn it into two equations:

We can see from the first equation that v = 0. From the second equation we calculate a definition for f.

We calculated that f = (+) and sum = fold (+) 0, which is a contrived example because the definition using fold is obvious, but does a good job of showing the idea behind what we’re doing.

The key step (†) above is the generalisation of sum xs to a fresh variable y. This step will feature in most such calculations and isn’t specific to sum.

Using this process, you can even write map as a fold:

map f = fold (λx ys → f x : ys) []

Increasing the power of fold with tuples

We can make fold even more powerful by using it to generate tuples. Say you want a sumlength function that returns a list’s length as well as its sum:

sumlength   :: [Int](Int, Int)
sumlength xs = (sum xs, length xs)

Turning this function into a single fold is more efficient because it only makes a single pass over the list.

sumlength = fold (λn (x, y)(n+x, 1+y)) (0, 0)

In fact, any pair of applications of fold to the same list can be rewritten as a single fold. Even a function such as dropWhile, which drops elements from a list as long as a predicate holds true.

First we need a sister function, dropWhile’, that pairs the argument list together with the result:

dropWhile'

dropWhile’

Which, as usual, we decide is equivalent to these two equations according to the universal property:

It’s obvious from the first equation that v = ([], []). We get the definition for f from a calculation like this:

And finally our dropWhile' function looks like this:

Definition of dropWhile'

Definition of dropWhile’

Composing this with fst gives us the normal dropWhile because the changed list ends up in the first part of the resulting tuple. dropWhile p = fst . dropWhile' p

This tupling technique is so powerful, we can redefine any primitive recursion function in terms of fold. The approach is usually to make a sister function that tuples the result together with the original list and work on that.

As far as I understand it, the idea is to essentially carry an extra argument into the fold so things are easier to work with. Which makes me wonder whether there’s a monadic solution to this that produces cleaner code.

The paper says that if you have a function like this:

You can fold this, by first introducing an extra argument, y, that is processed by a new function f in the base case, but generally passed through without change. Then you redefine h y with fold and introduce xs as an extra argument to g.

Primitive recursion

Primitive recursion

This gives you primitive recursion, which isn’t foldable unless you use the tupling technique.

And after the same sort of shenanigans as before, our function turns into a fold:

Folded recursion

Folded recursion

Which lets us say that h y = fst . k y, which is supposedly better than the definition we had before. I guess it is because we wanted to use fold and now we are.

Using fold to generate functions

Using a language with first-order functions lets us take the power of fold further still – we can generate functions. Similar to how you’d normally replace conses with functions, you can replace them with compositions.

Compose as a fold

Compose as a fold

But that’s not very interesting. I’m going to skip over the paper’s definitions of various suml and foldl and such functions and go straight to the interesting example – Ackermann’s function.

Ackermann’s function is the first known example of a function that is computable, but isn’t primitively recursive.

Ackermann's function

Ackermann’s function

It’s been converted to operate on lists rather than natural numbers so that n is represented as a list of n elements. The only reason we can do this, is that we’re working with a higher order language.

By appealing to the universal property, we find that ack = fold f v is equivalent to these two equations:

A simple calculation tells us that v = (1:), but we can’t use the second equation the same way as before. We must first tackle ack (x:xs) on the left-hand side.

ack (x:xs)

ack (x:xs)

Now the first equation tells us that w = ack xs and we normally apply the universal property on the second equation to get this:

We can now calculate the definition of f, which gives us the final folded ackerman’s function:

It took using the universal property twice, but we did it. We defined ackerman’s function as a fold.

Fin

The paper finishes with a survey of references for even more magical things you can do with folds. Using them for datatypes other than lists, performing monadic folds, redefining fold on relations, and how helpful fold can be in automatic program transformation.

But all of that are topics for another day, today let’s just agree that fold is awesome and we should all be using it much more.

Enhanced by Zemanta

Comments are closed.