Functional patterns for the Python maverick

Last updated: Dec 6, 2015

Multiple personalities

The “expression problem” refers to a particular difficulty in extending software. Essentially, software consists of data (or datatypes) and operations (functions) which act on them, and most additions to software consist of adding one or the other of these.

Usually, object-oriented programming style makes it easy to add types without increasing complexity (via new classes with their own implementations of functions). However, it is more difficult to add operations – all classes must be updated.

Functional languages make it easy to add new operations, by simply writing a new function. However, as one adds datatypes with the same operations, existing functions must be updated to handle these new types.

A dumb illustration

To illustrate, presume that we are writing some software to deal with simple geometry. We want to be able to calculate the perimeters of squares and circles respectively.

A naive Object-Oriented solution would probably look like this:

PI = 3.1415926

class Circle(self):
    def __init__(self, radius):
        self.radius = radius

    def perimeter(self):
        return PI * self.radius * 2

class Square(self):
    def __init__(self, side_length):
        self.side_length = side_length

    def perimeter(self):
        return self.side_length * 4

And a naive solution in functions looks like this:

PI = 3.1415926

def make_circle(radius):
    return {'type': 'circle', 'radius': radius}

def make_square(side_length):
    return {'type': 'square', 'side_length': side_length

def perimeter(shape):
    if shape['type'] == 'circle':
        return PI * shape['radius'] * 2
    elif shape['type'] == 'square':
        return shape['side_length'] * 4

It is clearly easier to add a shape to the OO version; just add a Rectangle class:

class Rectangle(object):
    def __init__(self, width, length):
        self.width = width
        self.length = length

    def perimeter(self):
        return (self.width * 2) + (self.length * 2)

While the functional version must go into each implementation and add a new case.

def make_rectangle(width, length):
    return {'type': 'rectangle', 'width': width, 'length': length}

def perimeter(shape):
    # ...
    elif shape['type'] == 'rectangle':
        return (shape['width'] * 2) + (shape['length'] * 2)

On the other hand, the functional version makes it easier to add new operations. Let’s add a way to get the area of a shape:

def area(shape):
    if shape['type'] == 'circle':
        return shape['radius'] ** 2 * PI
    elif shape['type'] == 'square':
        return shape['side_length'] ** 2
    elif shape['type'] == 'rectangle':
        return shape['width'] * shape['height']

The functional version just adds a new function, so we only have to change one place in the code. In the OO version, we have to add a new method to each class:

class Circle(object):
    # ...
    def area(self):
        return self['radius'] ** 2 * PI

class Square(object):
    # ...
    def area(self):
        return self['side_length'] ** 2

class Rectangle(object):
    def area(self):
        return self['width'] * self['height']

The OO code essentially groups by data type, while the functional code groups by operation. However, neither solves this problem in a really satisfactory way; what we would really like is to be able to provide a single implementation for a single datatype at a time.

Many languages (and many programmers) solve this in various ways. Some prefer mixins:

class CircleArea(object):
    def area(self):
        return self['radius'] ** 2 * PI

class CirclePerimeter(object):
    def perimeter(self):
        return self['radius'] * 2 * PI

class Circle(object, CircleArea, CirclePerimeter):
    def __init__(self, radius):
        self.radius = radius

However, this suffers from some excess syntax. More importantly, it relies implicitly on the implementation of the inheriting class; if Circle used r instead of radius, they would fail. Mixins require an interface to work safely in this way, which dynamic OO languages like Python don’t offer.

The contribution that Object-Oriented programming makes to this problem is dispatch — the ability to call a different function depending on its arguments. If you call area with a Circle as its first argument, as you do implicitly calling circle.area(), you are relying on the single-dispatch provided by Python’s classes. But this isn’t the only solution to dispatch. Let’s take a look at an alternative.

Introducing Multimethods

Multimethods are an alternative way to solve issues of dispatch in the face of added types or expressions. Let’s lead with an example implementation of our first problem. We’ll also throw in some namedtuple magic to help with that:

PI = 3.1415926

Circle = namedtuple('Circle', ['radius'])
Square = namedtuple('Square', ['side_length'])

def perimeter(circle):
    return circle.radius * 2 * PI

def perimeter(square):
    return square.radius * 2 * PI

With multimethods, we are free to implement our operations on a piecemeal basis. We can add operations easily:

def area(circle):
    return circle.radius ** 2 * PI

def area(square):
    return square.side_length ** 2

And we can add types easily as well:

Rectangle = namedtuple('Rectangle', ['width', 'height'])

def perimeter(rect):
    return rect.width * 2 + rect.height * 2

def perimeter(rect):
    return rect.width * rect.height

What multimethods accomplish is to annotate functions with dispatch instructions. However, we’re no longer limited to single-dispatch on the first object as we are with classes. Multimethods (or at least our implementation) will allow us to specify the function to be used for dispatch. For example, if we preferred using dicts as in our initial functional implementation, we can facilitate this with a custom dispatch function:

def area(shape):
    return shape['type']

@multimethod(area, 'circle'):
def area(circle):
    return circle['radius'] ** 2 * PI

@multimethod(area, 'square'):
def area(square):
    return square['side_length'] ** 2

There’s only one catch: multimethods are not included with Python, so we’ll have to write our own. Let’s examine how we might accomplish that. In the first multimethod example, we can see that the method decorator above is a special case of multimethod as used here, with the dispatch function simply being the application of type to its arguments. So we’ll start by defining multi and multimethod.

The responsibility of multi is to instantiate the dispatch function with everything it needs to route calls to it correctly. We’ll simply attach an attribute, __multi__, to the function itself, which multimethod will inspect to see if any functions match the signature it is provided. Here’s multi:

def multi(dispatch_fn):
    def _inner(*args, **kwargs):
        return _inner.__multi__.get(
            dispatch_fn(*args, **kwargs),
        )(*args, **kwargs)

    _inner.__multi__ = {}
    _inner.__multi_default__ = lambda *args, **kwargs: None  # Default default
    return _inner

Here, we create a closure called _inner. _inner will call dispatch_fn on whatever arguments it is passed, and attempt to look that up in its __multi__ dictionary containing any methods we add to it, or else a default if no function matches. It will then call that function with the provided arguments.

On the way to returning _inner, we instantiate the __multi__ and __multi_default__ parameters on the _inner function so we won’t have to worry about KeyErrors.

The multimethod function will simply need to add itself to the dispatch function’s __multi__ property. We’ll also say that it sets itself as the default if no dispatch values are provided:

def multimethod(dispatch_fn=None, dispatch_key=None):
    def apply_decorator(fn):
        if dispatch_key is None:
            # Default case
            dispatch_fn.__multi_default__ = fn
            dispatch_fn.__multi__[dispatch_key] = fn
        return dispatch_fn
    return apply_decorator

Note that since dispatch_key is used as a key in the __multi__ dict, we can’t use a list (or anything not hashable) for dispatch. Stick to tuples instead.

After all that, we can write our method shortcut in terms of multi and multimethod:

def method(dispatch_type, *args):
    def get_dispatch_type(t, *args, **kwargs):
        return type(t)

    def not_implemented(*args, **kwargs):
        raise Exception("Method not implemented for %s" % get_dispatch_type(*args, **kwargs))

    def _inner(fn):
        dispatch_fn = globals().get(fn.__name__, None)  # Search for existing multimethod
        if dispatch_fn is None or not hasattr(dispatch_fn, '__multi__'):
            dispatch_fn = multi(get_dispatch_type)
            dispatch_fn = multimethod(dispatch_fn)(not_implemented)  # default

        return multimethod(dispatch_fn, dispatch_type)(fn)
    return _inner

Here, we define our own dispatch function that just returns the type of the first argument, and then call multi on that to get an annotated dispatch function. We also go through the trouble of adding a default that throws an exception. Then, we call multimethod with the dispatch function and the desired dispatch type that we passed in, and apply that to the function that we’re decorating.

Note the need to reach into globals() to fetch an existing method if one exists; the usual multimethod gets around this by accepting the dispatch function as an argument, but we don’t want to make any more work for the user than necessary.

Et voila! We have a fully functioning multimethod implementation that we can use.

Next article coming soon.