Graph coloring is deceptively simple. The idea of coloring a graph is very straightforward, and it seems as if it should be relatively straightforward to find a coloring. It turns out to not be in fact, it’s extremely difficult. A simple algorithm for graph coloring is easy to describe, but potentially extremely expensive to run. In terms of computational complexity – the measure that we computer scientists use to describe how hard it is to compute a solution to a problem – it’s actually NP-hard. What that means that there is no known algorithm for optimal graph coloring which isn’t exponential; and that further, if there *were* a non-exponential algorithm for it, there would be a non-exponential solution for *all* NP-complete problems, but a non-exponential solution to another NP-complete problem wouldn’t necessarily produce a non-exponential time solution for graph coloring.

In fact, the only general solution to finding an optimal graph coloring is exhaustive search: start with one node, give it a color, assign non-conflicting colors to its neighbors, and so on. Try it with two colors, if you get no result, then try with three, and so on. There are a lot of fancy algorithms that try to improve on that – both by reducing the search space, and by using heuristics. With a combination of those two techniques, we can get colorings to be quite efficient in the average case – but if always want the optimal coloring of *any* graph, then there’s no way (or at least, no way that anyone knows about) to always get the optimal result quickly.

For an idea of what I mean by heuristics, here’s one very simple algorithm:

Given G=(V,E):

Compute Degree(v) for all v in V. Set uncolored = V sorted in decreasing order of Degree(v). Set current Color = 0.

While there are uncolored nodes:

set A=first element of uncolored

remove A from uncolored

set Color(A) = current Color

set colored With Current = {A}

for each v in uncolored:

if v is not adjacent to anything in colored With Current:

set Color(v)=current Color.

add v to current Color.

remove v from uncolored.

end if

end for

current Color = current Color + 1.

end while

This is, implicitly, trying out every possible combination of colors: you can read “for each v∈uncolored:” as “try assigning the current color to v, and see if it gives an invalid coloring”. But by stopping the instant it tries to assign an invalid coloring, it can prune that coloring out without completing it. That’s one of its tricks for being fast on average: quickly eliminating invalid colorings. The other trick in this algorithm is a heuristic: *on average*, searching for colorings starting with the highest degree uncolored nodes will find an optimal result faster than assigning colorings to nodes in random order. In the average case, this will perform well – and run in less that exponential time. But there are definitely graphs where this heuristic fails, and this algorithm will be a disaster.

The fortunate thing is that many of the most common applications of graph coloring are running on relatively small graphs, and often, the coloring doesn’t really need to be optimal – just *close*.For example, one of the uses of graph coloring that I’m particularly familiar with is in compilers. In a compiler, one of the tricks to make code run faster is to store values in CPU registers that are part of the CPU itself, rather than in memory which is out on the motherboard. Things stored in registers can be manipulated dramatically faster than things in memory: for example, on the Core Due processor in my MacBook, it takes 10 clock cycles to retrieve a single value from memory into the processor cache; it takes 2 cycles to get memory from the cache into the CPU. Values in a register can be accessed with *no* wait. So naturally, it makes a big difference in the performance of a program which values wind up in registers, and which don’t.