In this notebook, we'll take a look at how to implement a simple function using Cython. The operation we'll implement is the first-order diff, which takes in an array of length $n$:
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n\end{bmatrix}$$
and returns the following:
$$\mathbf{y} = \begin{bmatrix} x_2 - x_1 \\ x_3 - x_2 \\ \vdots \\ x_n - x_{n-1} \end{bmatrix}$$
First we'll import everything we'll need and generate some data to work with.
import numpy as np
x = np.random.randn(10000)
Below is a simple implementation using pure Python (no NumPy). The %timeit
magic command let's us see how long it takes the function to run on the 10,000-element array defined above.
def py_diff(x):
n = x.size
y = np.zeros(n-1)
for i in range(n-1):
y[i] = x[i+1] - x[i]
return y
%timeit py_diff(x)
Now use the exact same function body but add the %%cython
magic at the top of the code cell. How much of a difference does simply pre-compiling make?
%load_ext cython
%%cython
import numpy as np
def cy_diff_naive(x):
n = x.size
y = np.zeros(n-1)
for i in range(n-1):
y[i] = x[i+1] - x[i]
return y
%timeit cy_diff_naive(x)
So it didn't make much of a difference. That's because Cython really shines when you specify data types. We do this by annotating the variables used in the function with cdef <type> ...
. Let's see how much this improves things.
Note: array types (like for the input arg x
) can be declared using the memoryview syntax double[::1]
or using np.ndarray[cnp.float64_t, ndim=1]
.
%%cython
import numpy as np
def cy_diff(double[::1] x):
cdef int n = x.size
cdef double[::1] y = np.zeros(n-1)
cdef int i
for i in range(n-1):
y[i] = x[i+1] - x[i]
return y
%timeit cy_diff(x)
That made a huge difference! There are a couple more things we can do to speed up our diff implementation, including disabling some safety checks. The combination of disabling bounds checking (making sure you don't try access an index of an array that doesn't exist) and disabling wraparound (disabling use of negative indices) can really improve things when we are sure neither condition will occur. Let's try that.
%%cython
from cython import wraparound, boundscheck
import numpy as np
@boundscheck(False)
@wraparound(False)
def cy_diff2(double[::1] x):
cdef int n = x.size
cdef double[::1] y = np.zeros(n-1)
cdef int i
for i in range(n-1):
y[i] = x[i+1] - x[i]
return y
%timeit cy_diff2(x)
Finally, let's see how NumPy's diff
performs for comparison.
def np_diff(x):
return np.diff(x)
%timeit np_diff(x)
Why is NumPy's diff
implementation faster? Maybe use the --annotate
flag to peek at the C code generated by our latest Cython implementation.