Fast Interstellar Travel with NumPy and C
I absolutely love Python (and NumPy!) for the flexibility it gives me in quickly building up complex data structures and programs to arrive at interesting solutions. Still, sometimes handling larger amounts of data is too slow even after profiling my code and subsequent performance tuning. In those cases, going more lowlevel by using the NumPy ctypes interface can help reduce the runtimes by orders of magnitude.
I recently came upon an example where this runtime improvement by going to native C code was especially drastical and the amount of work necessary to arrive at this improvement was extremely low. Since this example also involves internet spaceships, I thought it might be interesting for people to read about.
Internet spaceships and competitive advantage
In the MMORPG Eve Online, players pilot a vast variety of spaceships through a huge universe, looking for small to gigantic fights or to profit from the completely playerdriven economy. What I love especially about Eve is that the developers make a lot of data about the universe available through static database dumps and an API, so a vast ecosystem of playercreated tools has sprung up around the game. As a programmer playing the game, it's hard to resist the temptation to roll your own tools to give you an advantage, so I soon started working on my own projects.
Fast travel planning through space
A crucial functionality of many Eve programs that deal with the universe as a whole is to compute the shortest possible connections between different solar systems, e.g. to compute travel times. In computer science terms, we want to be able to compute the allpair shortest paths between all ~5000 solar systems in "known space" (i.e. every nonwormhole system).
A classical algorithm for solving this problem is the FloydWarshall algorithm which has a runtime complexity in O(N^3)
. A simple backoftheenvelope calculation of (5.0 * 10^3)^3 = 1.25 * 10^11
operations says that this is still in a very manageable order of magnitude (a Sandy Bridge Core i7 Quadcore processor can do up to 5.0 * 10^10
floating point operations per second), so I started implementing the FloydWarshall algorithm in Python.
Preparing the data
As input, I've processed the static database dumps to create two files:

systems.csv
 a list of all solar systems renumbered so that the first system has an index of 0. Additionally, all wormhole systems are filtered out since their connections change all the time and are not publically known. 
jumps.csv
 a list of all connections between solar systems, denoted as a tuple of two solar system indices.
Using the jumps list, we can initialize two twodimensional matrices of 16 bit unsigned integers distance
and traceback
. The matrix distance[i, j]
will hold the shortest distance (in jumps) between solar systems with index i
and j
while traceback[i, j]
will give us an index k
of a solar system that lies on the shortest path, information that can be used to reconstruct the shortest path after finishing the algorithm. First, we need to fill these matrices with the immediate jump information from jumps
:
import numpy as np
# the maximum value for an uint16 denotes that two solar systems are not connected
NO_ROUTE = 2 ** 16  1
def init_nav(n_systems, jumps):
"""Initialize the navigation matrices"""
# start out with all systems having no connection
distance = np.empty((n_systems, n_systems), dtype=np.uint16)
traceback = np.empty((n_systems, n_systems), dtype=np.uint16)
distance[:] = NO_ROUTE
traceback[:] = 1
# fill direct jumps into navigation matrices
for from_system, to_system in jumps:
distance[from_system, to_system] = 1
distance[to_system, from_system] = 1
traceback[from_system, to_system] = to_system
traceback[to_system, from_system] = from_system
return distance, tracelsback
Version 1: Python+NumPy
With the data in a form that is convenient for the actual FloydWarshall algorithm, we can now compute shortest paths between all of the systems by successively updating the distance
and traceback
matrices with paths that go over intermediate solar systems:
def floyd_warshall_v1(distance, traceback):
n_systems = distance.shape[0]
for k in range(n_systems):
for i in range(n_systems):
for j in range(n_systems):
if distance[i, k] == NO_ROUTE: continue
if distance[k, j] == NO_ROUTE: continue
dist_ikj = distance[i, k] + distance[k, j]
if distance[i, j] > dist_ikj:
# it is shorter to go i>k>j than directly from i>j
distance[i, j] = dist_ikj
traceback[i, j] = k
Looks nice, and we're even using NumPy data structures for good performance, right? Let's see how long this thing will need to compute the navigation data!
(Bioinformaticians might want to replace "compiling code" with "running analyses". Comic from xkcd)
After a while of doing different things to occupy myself, I've actually stopped the benchmark since less than 1% of the computation was done and I didn't feel like waiting any longer. Extrapolating the necessary runtimes, though, it would have taken more than 15 days to compute the optimal routes on my system.
You might say that it would be OK to leave the computer running for a few days and then only use the cached results from then on. By doing this, though, we're sacrificing a lot of flexibility since avoiding certain solar systems or connections would require a recomputation of the navigation network. Besides, even if we're just computing the navigation graph once it would still be worth the time to implement something faster since we can easily be done before the original, slow implementation finishes computing. So let's see if C can help us compute the navigation data more quickly!
Version 2: Python+Numpy+C
Since we've seen that version 1 is too slow to run in practice, let's see if we can get faster by implementing the FloydWarshall algorithm as a simple C program:
#include <stdio.h>;
#define NO_ROUTE 65535
void floydwarshall(int n, unsigned short* distance, unsigned short* traceback) {
for(int k = 0; k < n; k++) {
for(int i = 0; i < n; i++) {
for(int j = 0; j < n; j++) {
unsigned short ik = distance[i * n + k];
unsigned short kj = distance[k * n + j];
if(ik != NO_ROUTE && kj != NO_ROUTE) {
if(distance[i * n + j] > ik + kj) {
distance[i * n + j] = ik + kj;
traceback[i * n + j] = k;
}
}
}
}
}
}
18 lines of code. Pretty straightforward, right? Since NumPy uses C arrays internally to store the array contents, no conversion of data is necessary and we only need to do a supersimple translation of the main Python algorithm to C.
We compile the C code into a shared library, e.g. using gcc
:
gcc shared Wl,soname,libfloydwarshall.so o libfloydwarshall.so floydwarshall.c
Now we can use the NumPy ctypes interface to pass our distance
and traceback
matrices directly to the C code. We only need to let ctypes know the interface of our C method:
import numpy as np
import numpy.ctypeslib as npct
import ctypes
import os.path
# define a new type for a 2D array of uint16
array_2d_uint16 = npct.ndpointer(dtype=np.uint16, ndim=2, flags='C_CONTIGUOUS')
# load shared library
libfw = npct.load_library('libfloydwarshall', os.path.dirname(__file__))
# define result and argument types for the C floydwarshall function
libfw.floydwarshall.restype = ctypes.c_voidp
libfw.floydwarshall.argtypes = [
ctypes.c_int, # n
array_2d_uint16, # distance
array_2d_uint16 # traceback
]
# lightweight python wrapper for the C floydwarshall function
def floyd_warshall_v2(distance, traceback):
libfw.floydwarshall(distance.shape[0], distance, traceback)
The numpy arrays will be passed by reference directly to the C code and their contents will be updated in place. After the C code returns, all navigation data is calculated and stored in distance
and traceback
.
Computing the allpairs shortest paths for all of known space using this code takes 3 minutes on my system so it's approximately 7700 times faster than the Python code. Not bad :)
Note: there is still a lot of potential here for easy optimizations e.g. by using SIMD intrinsics (check out the awesome Intrinsics Guide by Intel and don't forget about AVX/AVX512!) or by running it on multiple CPU cores using e.g. OpenMP. For me, 3 minutes is good enough, though.
Calculating actual paths
In order to be able to use this as a complete Eve Online navigation system, a few more pieces of code are needed and I will give them here for the sake of completeness.
After the FloydWarshall algorithm, we already know the distance between all solar systems in jumps as given by the distance[i, j]
matrix. In many cases, however, the actual optimal route might be of interest. We can calculate it by using the traceback
matrix and the following recursive code:
def get_path(traceback, from_system, to_system):
def inner(frm, to):
mid = traceback[frm, to]
if mid == to:
return [to]
else:
return inner(frm, mid) + inner(mid, to)
return [from_system] + inner(from_system, to_system)
Giving this function the traceback
matrix and two indices for a start and end system will return a list of all systems on the way. Now we can open up our interstellar travel agency!
Summary
Python is an amazingly productive programming language, but for cases where fast runtimes are important, even libraries such as NumPy might not be fast enough. Instead of reimplementing the whole analysis in a faster lowerlevel language, however, it can help to replace only the performance bottlenecks with fast C code and we still get most of the speed benefits. Doing this is a lot easier as you might think when using the wonderful NumPy and ctypes interfaces.
Further Reading
 SciPy cookbook on NumPy/Ctypes
 EVE API Reference
 Try Eve Online for 21 days  If you're curious about the game and sign up with this invite link, you will get a free 21day trial instead of the regular 14day trial. If you end up liking the game and subscribe, my ingame character Zapp Mane will get some free stuff. In any case, feel free to say hi!