Lose the Loops
If you architect your programs in a functional (or even functional-ish) way, you can avoid loops entirely. How? You simply defer to
filter, and especially
It seems to be a common case that many programmers are familiar with the first two, but don’t know what to do with the last one. This is a shame, because
reduce is a very interesting and general construct. This chapter is more-or-less dedicated to how to use
reduce effectively, and how to write code that works with
One major benefit to working with
reduce is that it tends to produce more parallelizable software. You may have heard of Map-Reduce, the paradigm — software written in terms of these operations (considering
filter as a sort of
map) can be automatically parallelized across all
map operations, using
reduce to combine the results.
But I digress. Let’s start from the beginning, with some definitions.
Map accepts a function and a collection, and returns a list that is the result of applying the function to each element of the collection:
map(lambda x: x + 1, [0, 1, 2, 3, 4]) # => [1, 2, 3, 4, 5]
Filter accepts a function and a collection, and returns a list containing only those elements for which the function returned true:
filter(lambda x: x % 2 == 0, [0, 1, 2, 3, 4, 5, 6]) # => [0, 2, 4, 6]
For most use cases of map and filter, it is more concise and readable to use a list comprehension instead. We can write the preceding examples as:
[x + 1 for x in [0, 1, 2, 3, 4]] [x for x in [0, 1, 2, 3, 4, 5, 6] if x % 2 == 0]
However, list comprehensions can’t help us with the most interesting member of the trio, reduce.
Reduce accepts a function of two arguments (call it
f), a collection, and an initial argument. It calls
f with the initial argument and the first item of the collection, then calls
f again with the result of that and the second item, and so on until the collection runs out of items.
This is more easily explained as code:
def f(acc, item): return acc + item reduce(f, [1, 2, 3, 4, 5], 0) # => 15 # Which breaks down as: # f(f(f(f(f(0, 1), 2), 3), 4), 5) # f(f(f(f(1, 2), 3), 4), 5) # f(f(f(3, 3), 4), 5) # f(f(6, 4), 5) # f(10, 5) # 15
Reduce is sometimes called
fold (this is a “left fold”; a “right fold” has the order of arguments reversed). This seems complicated, but it’s probably something you do all the time, with a different syntax. Here’s a typical implementation using a for loop:
total = 0 for item in [1, 2, 3, 4, 5]: total = total += item
Notice any similarities? You can turn any aggregating loop, as above, into a reduce call by following a few simple steps:
1) Make the loop body a function:
total = 0 def inner(item): total += item for item in [1, 2, 3, 4, 5]: inner(item)
2) Move the aggregator variable (here,
total) into the function:
def inner(total, item): return total + item total = 0 for item in [1, 2, 3, 4, 5]: total = inner(total, item)
3) Ignore the rest and just use
reduce on your new aggregator function:
def inner(total, item): return total + item reduce(inner, [1, 2, 3, 4, 5])
Reduce represents a very general operation across a collection. In fact, it’s is so general that it’s straightforward to define
reduce in terms of it. Here’s
map, reimplemented using reduce:
def map(fn, coll): def _inner(out_coll, item): out_coll.append(fn(item)) return out_coll return reduce(_inner, coll, ())
Note that a purer functional implementation would return a modified collection in
_inner without changing
out_coll with append, but that variable only exists within the scope of the reduce, so it’s not a big deal. Here’s another with filter:
def filter(fn, coll): def _inner(out_coll, item): if fn(item): out_coll.append(item) return out_coll return reduce(_inner, coll, )
Of course, this example doesn’t really illustrate anything so great about reduce. What if we were writing a simple calculator, and we needed to implement operations to add OR multiply across a list of inputs?
def adder(coll) # The sum of all elements def multiplier(coll) # The product of all elements
With a traditional loop, we’d have to write two implementations:
def adder(coll): total = 0 for item in coll: total = total + item return total def multiplier(coll): total = 1 for item in coll: total = total * item return total
Each function has two responsibilities: iterate over a list, and perform its operation multiple times. We can also see a lot of repetition here; in fact, the only things that change are the initial value of
total and the operator used. Perhaps we could improve this using classes?
class Calculation(object): def apply(self, x, y): raise NotImplementedError("") def calculate(self, coll): total = 0 for item in coll: total = self.apply(total, item) return total class Addition(Calculation): def apply(self, x, y): return x + y class Multiplication(Calculation): def apply(self, x, y): return x * y adder = Addition().calculate multiplier = Multiplication().calculate
But now, we’ve added a lot of extra text, and more importantly many extra concepts. Let’s try the same using simple functions:
def add(x, y): return x + y def mul(x, y): return x * y def calc(fn, initial): def _inner(coll): return reduce(fn, coll, initial) return _inner adder = calc(add, 0) multiplier = calc(mul, 1)
Here, using a functional style with
reduce has allowed us to pull out just the things that change, the operator and the initial value, and created a sort of constructor function that creates a new function given only those parameters. We can easily add new arguments, too. How about a bitwise xor version?
def xor(x, y): return x ^ y xorer = calc(xor, 0)
We can do much more with reduce, though. Let’s imagine a function called
pipeline that accepts a list of single-argument functions, and returns a function that applies all of those functions. Essentially, a version of
compose that accepts multiple arguments.
Let’s first define some functions to test our pipeline with:
def f1(arg): return arg + "-f1" def f2(arg): return arg + "-f2" def f3(arg): return arg + "-f3"
We want our end result to look like this:
pipe = pipeline(f1, f2, f3, f2, f2) pipe("test") # => "test-f1-f2-f3-f2-f2"
How can we use reduce for this? We’ll need a function of two arguments, as always. The first argument should be a function, the second a function, and the return value a function that combines the two. Sound familiar? Let’s go back and grab the
compose function from earlier:
def compose(f1, f2): return lambda arg: f1(f2(arg))
Now, we just need to take our list of functions and reduce across them with compose:
def pipeline(*fns): return reduce(compose, fns)
Since we’ve provided no initial argument to
reduce, it will make its first iteration with the first two items in the collection. This matches our intuition that a
pipeline of two arguments is simply a
However, we’ll need to remember also that
compose applies its arguments from right to left, where we expected left-to-right application in our proposal. This is easily accomplished by reversing the argument list in
def pipeline(*fns): return reduce(compose, reversed(fns))
We’ll see many more examples of using
reduce in future chapters. For now, just try to imagine, whenever you see or write a loop, how you might accomplish the same thing with
reduce. You’ll be surprised how often a problem can be simplified to a single left fold.
Next article: You Just Need a Little Closure