Prototyping Compiled Functions with Inlined Callables
Introduction
When developing new numerical methods, it is often important to test them in a high-dimensional setting to properly evaluate their performance. However, such high-dimensional environments can lead to very slow evaluation times, especially if the expressions being tested cannot be easily vectorised. Pyccel is a useful tool for accelerating such cases.
A common challenge during development is that only a single line in the numerical method, for example the formula for an integrand, needs to change repeatedly between tests. Rewriting the entire function each time in order to recompile can be tedious.
Pyccel’s context functions allow you to pass in user-defined Python callables that get inlined directly into the compiled code for maximum performance. This allows you to continue developing in an interactive environment without rewriting the entire function repeatedly.
In this tutorial, we’ll show how to write a general-purpose integration function that can be tested with different kernels. We’ll use a simple Python lambda to represent the kernel, which will be embedded into the generated code to avoid any Python overhead during execution.
Integrating User-Defined Kernels with Pyccel
Here’s a simple example of how you can write a general-purpose 2D integrator in Python, then compile it with Pyccel while injecting an arbitrary kernel function at compile time.
General integration routine
from numpy import linspace

def midpoint_rule(nx : int, ny : int, x_start : float, x_end : float, y_start : float, y_end : float):
    """
    Integrate the function test_func using the midpoint rule.
    """
    dx = (x_end - x_start) / nx
    dy = (y_end - y_start) / ny
    # Midpoints of the nx and ny cells in each direction
    xs = linspace(x_start + 0.5*dx, x_end - 0.5*dx, nx)
    ys = linspace(y_start + 0.5*dy, y_end - 0.5*dy, ny)

    result = 0.0
    for i in range(nx):
        for j in range(ny):
            # result += exp(-(xs[i]**2 + ys[j]**2)) * dx * dy
            # result += exp(-(xs[i]**3 + ys[j]**2)) * dx * dy
            # result += exp(-(xs[i]**2 + ys[j]**3)) * dx * dy
            result += test_func(xs[i], ys[j]) * dx * dy
    return result
The commented-out lines show integrands that were tested earlier by editing the loop body directly. Especially in an interactive environment, it is simpler during a testing phase to describe the integrand with a Python function (here test_func) and to change only that function.
Compiling with Pyccel
Once a free function or a lambda function has been defined with the expected name (test_func), epyccel can be used to obtain a compiled version of the general integration routine that is specific to this test kernel:
from numpy import exp
from pyccel import epyccel

test_func = lambda x, y : exp(-(x**2 + y**2))
compiled_integrator = epyccel(midpoint_rule)
Usage
The compiled function is called with exactly the same arguments as the original Python function:
from math import isclose
import timeit

args = (1000, 1000, -5., 5., -5., 5.)
# The Python and compiled results may differ by rounding, so compare with a tolerance
assert isclose(midpoint_rule(*args), compiled_integrator(*args), rel_tol=1e-10)

midpoint_time = timeit.timeit('midpoint_rule(*args)', number=10, globals=globals())
accelerated_time = timeit.timeit('compiled_integrator(*args)', number=10, globals=globals())
speedup = midpoint_time / accelerated_time
print(f"Speed up : {speedup:.6g}")
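Agreement between the two versions shows that the translation is faithful, not that the quadrature itself is accurate. For the Gaussian kernel used here a closed form is available through the error function, so an independent check could be sketched as follows. The helper names gaussian_box_integral and midpoint_reference are hypothetical, and a vectorised NumPy midpoint sum stands in for the integrator so that the block runs without Pyccel; in practice midpoint_rule(*args) or compiled_integrator(*args) would be compared against the same reference value.

```python
from math import erf, isclose, pi, sqrt
import numpy as np

def gaussian_box_integral(x_start, x_end, y_start, y_end):
    """Closed form of the integral of exp(-(x**2 + y**2)) over a rectangle."""
    # The integrand separates, and the integral of exp(-t**2) from a to b
    # is sqrt(pi)/2 * (erf(b) - erf(a))
    ix = 0.5 * sqrt(pi) * (erf(x_end) - erf(x_start))
    iy = 0.5 * sqrt(pi) * (erf(y_end) - erf(y_start))
    return ix * iy

def midpoint_reference(nx, ny, x_start, x_end, y_start, y_end):
    """Vectorised stand-in for the integrator (same midpoints, NumPy sum)."""
    dx = (x_end - x_start) / nx
    dy = (y_end - y_start) / ny
    xs = np.linspace(x_start + 0.5*dx, x_end - 0.5*dx, nx)
    ys = np.linspace(y_start + 0.5*dy, y_end - 0.5*dy, ny)
    # Evaluate the kernel on every (x_i, y_j) midpoint pair and sum
    values = np.exp(-(xs[:, None]**2 + ys[None, :]**2))
    return float(values.sum() * dx * dy)

args = (1000, 1000, -5., 5., -5., 5.)
exact = gaussian_box_integral(*args[2:])
approx = midpoint_reference(*args)
assert isclose(approx, exact, rel_tol=1e-9)
```

The tolerance is an assumption chosen for this grid; a coarser grid or a less smooth kernel would need a looser one.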
Generated code
Using a lambda function (or a function with the [@inline decorator](./decorators#inline)) for the kernel ensures that the kernel is inlined into the integration loop rather than called as a separate function. For example, the loop in the Fortran translation that Pyccel generates for the call above is:
result_0001 = 0.0_f64
do i = 0_i64, nx - 1_i64
  do j = 0_i64, ny - 1_i64
    Dummy_0000 = exp(-(xs(i) ** 2_i64 + ys(j) ** 2_i64))
    result_0001 = result_0001 + Dummy_0000 * dx * dy
  end do
end do
Inlining makes the resulting code faster, but it also means that the kernel is fixed at compile time: if the lambda function test_func is modified, epyccel must be called again to obtain an updated version. On the other hand, it means that several versions of the accelerated function can co-exist and be used side by side:
test_func = lambda x, y : exp(-(x**3 + y**2))
integrate_test_1 = epyccel(midpoint_rule)
test_func = lambda x, y : exp(-(x**2 + y**3))
integrate_test_2 = epyccel(midpoint_rule)