Clickbait title aside, I recently had a lot of fun with Advent of Code, and since my weapon of choice is Python I spent a lot of time thinking about how to make it run fast.

Write less code

Every line of code you write is a line that you intended to execute.

This is true in every programming language, but more noticeable for interpreted languages like Python.

A "sufficiently smart compiler" may optimise away redundant code, but it is mostly a myth, and while there is an optimising compiler in the form of PyPy, it requires you write in a restricted subset of Python.

Instead of writing your own classes and functions, see if it already exists in the Python standard library.

There is some overhead involved in calling functions over in-line code, but for big operations it is faster to call outto a function written in C, which most of the Python standard library is.

Use iterators and generators

A lot of the Python standard library functions can be given an iterable, which means they can perform more operations in optimised C code, without having to resume into interpreting Python code.

So if your code can be translated into plugging various generators together, you can make it faster.

For example Advent of Code puzzle 6 was a problem involving a 1000x1000 grid of lights that were either on or off.

This can be thought of as a set of coordinates of lights that are on.

The instructions are of the form (turn on|turn off|toggle) x0,y0 through x1,y1.

Generating coordinates

If we parse that string and get a pair of coordinates, we can turn that into coordinates of lights as follows:

def lightsinrange(start, end):
    xaxis = xrange(start[0], end[0]+1)
    yaxis = xrange(start[1], end[1]+1)
    lights = product(xaxis, yaxis)
    return lights

This uses xrange (range in Python 3) to create an iterable which yields every integer in the range.

This gets turned into every coordinate in both dimensions with product.

If you wanted to support arbitrary dimensions, then you'd instead do:

def lightsinrange(start, end):
    axis_ranges = (xrange(startcoord, endcoord+1)
                   for (startcoord, endcoord) in izip(start, end))
    lights = product(*axis_ranges)
    return lights

This uses izip (just zip in Python 3) to create ranges for each pair of coordinates per dimension, and returns an iterable with a range of coordinates per dimension.

product(*axis_ranges) uses the * operator, which expands an iterable into function parameters.

Toggling lights

Given our lightsinrange function and a parse_operations function left to the reader, we can solve this problem as follows:

lights = set()
def turn_on(iterable):
    for light in iterable:
        lights.add(light)
def turn_off(iterable):
    for light in iterable:
        if light in lights:
            lights.remove(light)
def toggle(iterable):
    for light in iterable:
        if light in lights:
            lights.remove(light)
        else:
            lights.add(light)
ops = {'turn on': turn_on, 'turn off': turn_off, 'toggle': toggle}
for op, start, end in parse_operations(stdin):
    ops[op](lightsinrange(start, end))

Our implementations for turn_on, turn_off and toggle jump into the implementation of the set type at least once per coordinate.

This is sub-optimal, since it would be nice if we could just pass the iterable to the set.

Fortunately the set class has update, difference_update and symmetric_difference_update, so our definitions of turn_on, turn_off and toggle can be simplified.

def turn_on(iterable):
    lights.update(iterable)
def turn_off(iterable):
    lights.difference_update(iterable)
def toggle(iterable):
    lights.symmetric_difference_update(iterable)

We're not done yet, since these functions just call a method on an object, and Python's magic method binding means we can use the methods themselves, so we can throw away our functions and use bound methods.

ops = {'turn on': lights.update, 'turn off': lights.difference_update,
       'toggle': symmetric_difference_update}

So now, we've got a small amount of code mostly throwing iterables at data structures.

Iterate profiling and optimising

If you've followed the previous two tips, you should have a program that is written in a style that works well given constraints of the platform.

This is the low-hanging fruit of making it faster, since it doesn't require domain-specific knowledge of what the program is doing, or the algorithms used to solve the problem.

Now you need to go into a profiling and optimising loop, where you work out what the slowest part of the program is, and work out how to make it faster.

You should be able to use Google to find plenty of profiling guides, but in case your Google bubble is insufficiently developed to find them, here are some links: