Wrapper stage#
Python is written in C. In order to create a module which can be imported from Python, we therefore have to print a wrapper using the Python-C API. This code must call the Pyccel generated translation of the code. If that code was not generated in C then further wrappers are necessary in order to make the code compatible with C.
For example in Fortran generated code there are often array arguments. These do not exist in C, therefore three or more arguments must be passed to the function instead:
The data
The size(s) of the array in each dimension
The stride(s) between elements in each dimension
When defining a slice in Python it is usual to pass a fourth parameter, namely the start of the slice. Internally NumPy simplifies these three integer quantities (start
, end
, step
) into just a stride and a step. This is why the start is not stored or passed as an argument.
The wrapper stage takes the AST describing the generated code as an input and returns an AST which makes that code available to a target language (C or Python).
It should be noted that when code is translated to Python, no wrapper is required.
The entry point for the class Wrapper
is the function wrap
.
_wrap
#
The _wrap
function internally calls a function named _wrap_X
, where X
is the type of the object.
These functions must have the form:
def _wrap_ClassName(self, stmt):
...
return Y
Each of these _wrap_X
functions should internally call the _wrap
function on each of the elements relevant to the wrapper to obtain AST objects which describe the same information, but in a way that is accessible from the target language.
Name Collisions#
While creating the wrapper, it is often necessary to create multiple variables to fully describe something which was a single object in the original code. In order to facilitate reading the code these objects should have names similar to that of the original variable. As a result a lot of care must be taken to avoid name collisions.
In general the following rules should be respected:
Any variables which we wish to be able to retrieve from the scope using the
Scope.find
function must be added to the scope withScope.insert_symbol
. We cannot do this with 2 variables with the same name so the names must come from the AST where collisions have already been removed.Any names saved via
Scope.insert_symbol
must be accessed usingScope.get_expected_name
(as the name may have been changed to avoid other collisions).Any new names must be created using
Scope.get_new_name
Fortran To C#
The Fortran to C wrapper wraps Fortran code to make it callable from C. This module relies heavily on iso_c_binding
. It is used to ensure that the functions can be called correctly and that types are compatible. We do not use iso_fortran_binding
as it was only introduced in Fortran 2018 and we do not expect compiler support for this in the near future.
Scalar module variables#
Scalar module variables are already accessible from C so no extra work is done here.
Array module variables#
Arrays are not compatible with C. Instead the wrapper prints a function which returns a pointer to the module variable as well as its size information to make the information available from C.
Example#
When the following code is translated to Fortran:
import numpy as np
x = np.empty(6)
The following Fortran translation is obtained:
module tmp
use, intrinsic :: ISO_C_Binding, only : b4 => C_BOOL , f64 => C_DOUBLE &
, i64 => C_INT64_T
implicit none
integer(i64), bind(c) :: n
real(f64), allocatable, target :: x(:)
logical(b4), private :: initialised = .False._b4
contains
!........................................
subroutine tmp__init()
implicit none
if (.not. initialised) then
n = 6_i64
allocate(x(0:n - 1_i64))
initialised = .True._b4
end if
end subroutine tmp__init
!........................................
!........................................
subroutine tmp__free()
implicit none
if (initialised) then
if (allocated(x)) deallocate(x)
initialised = .False._b4
end if
end subroutine tmp__free
!........................................
end module tmp
The array x
is wrapped as follows:
subroutine bind_c_x(bound_x, x_shape_1) bind(c)
use tmp, only: x
implicit none
type(c_ptr), intent(out) :: bound_x
integer(i64), intent(out) :: x_shape_1
x_shape_1 = size(x, kind=i64)
bound_x = c_loc(x)
end subroutine bind_c_x
Functions#
In the simplest case a wrapper around a function takes the same arguments, calls the function and then returns the result. The only difference between this function and the original function is the body (where the original function is called) and the bind(c)
tag which ensures that the function name is accessible from C and is not mangled.
More complex cases include functions with array arguments or results, optional variables and functions as arguments.
In order to create all the nodes necessary to describe the unpacking of the arguments we use functions named _extract_X_FunctionDefArgument
where X
is the type of the object being extracted from the FunctionDefArgument
. This allows such functions to call each other recursively. This is notably useful for container types (tuples, lists, etc) whose elements may themselves be container types. The types of scalars are checked in the same way regardless of whether they are arguments or elements of a container so this also reduces code duplication.
Array arguments and results are handled similarly to array module variables, by adding additional variables for the shape and stride information. Strides are not used for results as pointers cannot be returned from functions.
Optional arguments are passed as C pointers. An if/else block then determines whether the pointer is assigned or not. This can be quite lengthy, however it is unavoidable for compilation with Intel, or NVIDIA. It is also unavoidable for arrays as it is important not to index an array (to access the strides) which is not present.
Finally the most complex cases such as functions as arguments are simply not printed. Instead these cases raise warnings or errors to alert the user that support is missing.
In order to create all the nodes necessary to describe the unpacking of the results we use functions named _extract_X_FunctionDefResult
where X
is the type of the object being extracted from the FunctionDefResult
. This helps with the readability of the code.
Example 1 : function with scalar arguments#
The following function:
def f(x : int):
return x + 3
is translated to the following Fortran code:
function f(x) result(Out_0001)
implicit none
integer(i64) :: Out_0001
integer(i64), value :: x
Out_0001 = x + 3_i64
return
end function f
which is then wrapped as follows:
function bind_c_f(x) bind(c) result(Out_0001)
implicit none
integer(i64) :: Out_0001
integer(i64), value :: x
Out_0001 = f(x = x)
end function bind_c_f
This function has the following prototype in C:
int64_t bind_c_f(int64_t);
Example 2 : function with array arguments#
The following function:
def f(x : 'int[:]'):
return x + 3
is translated to the following Fortran code:
subroutine f(x, Out_0001)
implicit none
integer(i64), allocatable, intent(out) :: Out_0001(:)
integer(i64), intent(in) :: x(0:)
allocate(Out_0001(0:size(x, kind=i64) - 1_i64))
Out_0001 = x + 3_i64
return
end subroutine f
which is then wrapped as follows:
subroutine bind_c_f(bound_x, bound_x_shape_1, bound_x_stride_1, &
bound_Out_0001, Out_0001_shape_1) bind(c)
implicit none
type(c_ptr), intent(out) :: bound_Out_0001
integer(i64), intent(out) :: Out_0001_shape_1
type(c_ptr), value :: bound_x
integer(i64), value :: bound_x_shape_1
integer(i64), value :: bound_x_stride_1
integer(i64), pointer :: x(:)
integer(i64), allocatable :: Out_0001(:)
integer(i64), pointer :: Out_0001_ptr(:)
call C_F_Pointer(bound_x, x, [bound_x_shape_1 * bound_x_stride_1])
call f(x = x(1_i64::bound_x_stride_1), Out_0001 = Out_0001)
Out_0001_shape_1 = size(Out_0001, kind=i64)
allocate(Out_0001_ptr(0:Out_0001_shape_1 - 1_i64))
Out_0001_ptr = Out_0001
bound_Out_0001 = c_loc(Out_0001_ptr)
end subroutine bind_c_f
This function has the following prototype in C:
int bind_c_f(void*, int64_t, int64_t, void*, int64_t*);
Example 3 : function with optional scalar arguments#
The following function:
def f(x : int = None):
if x is None:
return 2
else:
return x + 3
is translated to the following Fortran code:
function f(x) result(Out_0001)
implicit none
integer(i64) :: Out_0001
integer(i64), optional, value :: x
if (.not. present(x)) then
Out_0001 = 2_i64
return
else
Out_0001 = x + 3_i64
return
end if
end function f
which is then wrapped as follows:
function bind_c_f(bound_x) bind(c) result(Out_0001)
implicit none
integer(i64) :: Out_0001
type(c_ptr), value :: bound_x
integer(i64), pointer :: x
if (c_associated(bound_x)) then
call C_F_Pointer(bound_x, x)
Out_0001 = f(x = x)
else
Out_0001 = f()
end if
end function bind_c_f
This function has the following prototype in C:
int64_t bind_c_f(void*);
Example 4 : function with optional array arguments#
The following function:
def f(x : 'float[:]' = None):
import numpy as np
if x is None:
return np.ones(3)
else:
return x + 3
is translated to the following Fortran code:
subroutine f(x, Out_0001)
implicit none
real(f64), allocatable, intent(out) :: Out_0001(:)
real(f64), optional, intent(in) :: x(0:)
if (.not. present(x)) then
allocate(Out_0001(0:2_i64))
Out_0001 = 1.0_f64
return
else
allocate(Out_0001(0:size(x, kind=i64) - 1_i64))
Out_0001 = x + 3_i64
return
end if
if (allocated(Out_0001)) deallocate(Out_0001)
end subroutine f
which is then wrapped as follows:
subroutine bind_c_f(bound_x, bound_x_shape_1, bound_x_stride_1, &
bound_Out_0001, Out_0001_shape_1) bind(c)
implicit none
type(c_ptr), intent(out) :: bound_Out_0001
integer(i64), intent(out) :: Out_0001_shape_1
type(c_ptr), value :: bound_x
integer(i64), value :: bound_x_shape_1
integer(i64), value :: bound_x_stride_1
real(f64), pointer :: x(:)
real(f64), allocatable :: Out_0001(:)
real(f64), pointer :: Out_0001_ptr(:)
if (c_associated(bound_x)) then
call C_F_Pointer(bound_x, x, [bound_x_shape_1 * bound_x_stride_1])
call f(x = x(1_i64::bound_x_stride_1), Out_0001 = Out_0001)
else
call f(Out_0001 = Out_0001)
end if
Out_0001_shape_1 = size(Out_0001, kind=i64)
allocate(Out_0001_ptr(0:Out_0001_shape_1 - 1_i64))
Out_0001_ptr = Out_0001
bound_Out_0001 = c_loc(Out_0001_ptr)
end subroutine bind_c_f
This function has the following prototype in C:
int bind_c_f(void*, int64_t, int64_t, void*, int64_t*);
Class module variables#
Unlike Fortran, C does not have classes in the language. The wrapper therefore cannot pass the class to C via a description. Instead the wrapper prints a function which returns a pointer to the module variable.
Additionally the class method functions are be wrapped as described for functions. The attributes of the class are be exposed via getter and setter wrapper functions.
C To Python#
The C to Python wrapper wraps C code to make it callable from Python. This module relies heavily on the Python-C API.
Functions#
A function that can be called from Python must have the following prototype:
PyObject* func_name(PyObject* self, PyObject* args, PyObject* kwargs);
The arguments and keyword arguments are unpacked into individual PyObject
pointers.
Each of these objects is checked to verify the type. If the type does not match the expected type then an error is raised as described in the C-API documentation.
If the type does match then the value is unpacked into a C object. This is done using custom functions defined in pyccel/stdlib/cwrapper/
(see these files for more details) or using functions provided by Python.h
.
In order to create all the nodes necessary to describe the unpacking of the arguments we use functions named _extract_X_FunctionDefArgument
where X
is the type of the object being extracted from the FunctionDefArgument
. This allows such functions to call each other recursively. This is notably useful for container types (tuples, lists, etc) whose elements may themselves be container types. The types of scalars are checked in the same way regardless of whether they are arguments or elements of a container so this also reduces code duplication.
Similarly in order to create all the nodes necessary to describe the packing of the results we use functions named _extract_X_FunctionDefResult
where X
is the type of the object being packed from the FunctionDefResult
. This allows such functions to call each other recursively. This is notably useful for container types (tuples, lists, etc) whose elements may themselves be container types.
Once C objects have been retrieved the function is called normally.
The wrapper is attached to the module via a PyMethodDef
(see C-API docs).
Example 1#
The following Python code:
def f(x : 'float[:]', y : float = 3):
return x + y
leads to C code with the following prototype:
array_double_1d f(array_double_1d x, double y);
which is then wrapped as follows:
static PyObject* f_wrapper(PyObject* self, PyObject* args, PyObject* kwargs)
{
PyObject* x_obj;
PyObject* y_obj;
void* x_data = NULL;
int64_t x_shape[1];
int64_t x_strides[1];
array_double_1d x_0001 = {0};
array_double_1d x = {0};
double y;
array_double_1d Out_0001 = {0};
PyObject* Out_0001_obj;
y_obj = Py_None;
static char *kwlist[] = {
(char*)"x",
(char*)"y",
NULL
};
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|O", kwlist, &x_obj, &y_obj))
{
return NULL;
}
if (!pyarray_check("x", x_obj, NPY_DOUBLE, INT64_C(1), NO_ORDER_CHECK))
{
return NULL;
}
x_data = PyArray_DATA((PyArrayObject*)(x_obj));
get_strides_and_shape_from_numpy_array(x_obj, x_shape, x_strides);
x_0001 = (array_double_1d)cspan_md_layout(c_ROWMAJOR, x_data, x_shape[INT64_C(0)] * x_strides[INT64_C(0)]);
x = cspan_slice(array_double_1d, &x_0001, {INT64_C(0), c_END, x_strides[INT64_C(0)]});
y = 3.0;
if (y_obj != Py_None)
{
if (PyIs_NativeFloat(y_obj))
{
y = PyDouble_to_Double(y_obj);
}
else
{
PyErr_SetString(PyExc_TypeError, "Expected an argument of type float for argument y");
return NULL;
}
}
Out_0001 = f(x, y);
Out_0001_obj = to_pyarray(INT64_C(1), NPY_DOUBLE, Out_0001.data, Out_0001.shape, 1, 0);
return Out_0001_obj;
}
The function is linked to the module via a PyMethodDef
as follows:
static PyMethodDef tmp_methods[] = {
{
"f", // Function name
(PyCFunction)f_wrapper, // Function implementation
METH_VARARGS | METH_KEYWORDS, // Indicates that the function accepts args and kwargs
"" // function docstring
},
{ NULL, NULL, 0, NULL}
};
If the code was translated to Fortran the prototype is:
int bind_c_f(void*, int64_t, int64_t, double, void*, int64_t*);
which is then wrapped as follows:
static PyObject* bind_c_f_wrapper(PyObject* self, PyObject* args, PyObject* kwargs)
{
PyObject* x_obj;
PyObject* y_obj;
void* x_data = NULL;
int64_t x_shape[1];
int64_t x_strides[1];
double y;
void* Out_0001_data = NULL;
int64_t Out_0001_shape[1];
PyObject* Out_0001_obj;
y_obj = Py_None;
static char *kwlist[] = {
(char*)"x",
(char*)"y",
NULL
};
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O|O", kwlist, &x_obj, &y_obj))
{
return NULL;
}
if (!pyarray_check("x", x_obj, NPY_DOUBLE, INT64_C(1), NO_ORDER_CHECK))
{
return NULL;
}
x_data = PyArray_DATA((PyArrayObject*)(x_obj));
get_strides_and_shape_from_numpy_array(x_obj, x_shape, x_strides);
y = 3.0;
if (y_obj != Py_None)
{
if (PyIs_NativeFloat(y_obj))
{
y = PyDouble_to_Double(y_obj);
}
else
{
PyErr_SetString(PyExc_TypeError, "Expected an argument of type float for argument y");
return NULL;
}
}
bind_c_f(x_data, x_shape[INT64_C(0)], x_strides[INT64_C(0)], y, &Out_0001_data, &Out_0001_shape[INT64_C(0)]);
Out_0001_obj = to_pyarray(INT64_C(1), NPY_DOUBLE, Out_0001_data, Out_0001_shape, 1, 1);
return Out_0001_obj;
}
Example 2 : function with tuple arguments#
The following Python code:
def get_first_element_of_tuple(a : 'tuple[int,...]'):
return a[0]
leads to C code with the following prototype (as homogeneous tuples are treated like arrays):
int64_t get_first_element_of_tuple(array_int64_1d a);
which is then wrapped as follows:
static PyObject* get_first_element_of_tuple_wrapper(PyObject* self, PyObject* args, PyObject* kwargs)
{
PyObject* a_obj;
int64_t a_size_0001;
array_int64_1d a = {0};
int Dummy_0000;
PyObject* Dummy_0001;
bool is_homog_tuple;
int64_t size;
int Dummy_0002;
PyObject* Dummy_0003;
int64_t Out_0001;
int64_t* a_ptr;
PyObject* Out_0001_obj;
static char *kwlist[] = {
(char*)"a",
NULL
};
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O", kwlist, &a_obj))
{
return NULL;
}
if (PyTuple_Check(a_obj))
{
size = PyTuple_Size(a_obj);
is_homog_tuple = 1;
for (Dummy_0002 = INT64_C(0); Dummy_0002 < size; Dummy_0002 += INT64_C(1))
{
Dummy_0003 = PyTuple_GetItem(a_obj, Dummy_0002);
is_homog_tuple = is_homog_tuple && PyIs_NativeInt(Dummy_0003);
}
}
else
{
is_homog_tuple = 0;
}
if (!is_homog_tuple)
{
PyErr_SetString(PyExc_TypeError, "Expected an argument of type tuple[int, ...] for argument a");
return NULL;
}
a_size_0001 = PyTuple_Size(a_obj);
a_ptr = malloc(sizeof(int64_t) * (a_size_0001));
a = (array_int64_1d)cspan_md_layout(c_ROWMAJOR, a_ptr, a_size_0001);
for (Dummy_0000 = INT64_C(0); Dummy_0000 < a_size_0001; Dummy_0000 += INT64_C(1))
{
Dummy_0001 = PyTuple_GetItem(a_obj, Dummy_0000);
(*cspan_at(&a, Dummy_0000)) = PyInt64_to_Int64(Dummy_0001);
}
Out_0001 = get_first_element_of_tuple(a);
Out_0001_obj = Int64_to_PyLong(&Out_0001);
return Out_0001_obj;
}
Interfaces#
Interfaces are functions which accept more than one type. These functions are handled via multiple functions in the wrapper:
A function which can be called from Python with the prototype:
PyObject* func_name(PyObject* self, PyObject* args, PyObject* kwargs);
A function which determines which combination of types were used in the call
A function for each combination of types which calls the translated function
Example#
The following Python code:
def f(x : int | float):
return x + 2
leads to C code with the following prototypes:
double f_00(double x);
int64_t f_01(int64_t x);
which is then wrapped as follows:
/*........................................*/
PyObject* f_00_wrapper(PyObject* x_obj)
{
double x;
double Out_0001;
PyObject* Out_0001_obj;
x = PyDouble_to_Double(x_obj);
Out_0001 = f_00(x);
Out_0001_obj = Double_to_PyDouble(&Out_0001);
return Out_0001_obj;
}
/*........................................*/
/*........................................*/
PyObject* f_01_wrapper(PyObject* x_obj)
{
int64_t x;
int64_t Out_0001;
PyObject* Out_0001_obj;
x = PyInt64_to_Int64(x_obj);
Out_0001 = f_01(x);
Out_0001_obj = Int64_to_PyLong(&Out_0001);
return Out_0001_obj;
}
/*........................................*/
/*_________________________________CommentBlock_________________________________*/
/*Assess the types. Raise an error for unexpected types and calculate an integer */
/*which indicates which function should be called. */
/*_______________________________________________________________________________*/
int64_t f_type_check(PyObject* x_obj)
{
int64_t type_indicator;
type_indicator = INT64_C(0);
if (PyIs_NativeFloat(x_obj))
{
type_indicator += INT64_C(0);
}
else if (PyIs_NativeInt(x_obj))
{
type_indicator += INT64_C(1);
}
else
{
PyErr_SetString(PyExc_TypeError, "Unexpected type for argument x");
return -INT64_C(1);
}
return type_indicator;
}
/*........................................*/
/*........................................*/
PyObject* f_wrapper(PyObject* self, PyObject* args, PyObject* kwargs)
{
PyObject* x_obj;
int64_t type_indicator;
static char *kwlist[] = {
"x",
NULL
};
if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O", kwlist, &x_obj))
{
return NULL;
}
type_indicator = f_type_check(x_obj);
if (type_indicator == INT64_C(0))
{
return f_00_wrapper(x_obj);
}
else if (type_indicator == INT64_C(1))
{
return f_01_wrapper(x_obj);
}
else
{
PyErr_SetString(PyExc_TypeError, "Unexpected type combination");
return NULL;
}
}
/*........................................*/
The function is linked to the module via a PyMethodDef
as follows:
static PyMethodDef tmp_methods[] = {
{
"f", // Function name
(PyCFunction)f_wrapper, // Function implementation
METH_VARARGS | METH_KEYWORDS, // Indicates that the function accepts args and kwargs
"" // function docstring
},
{ NULL, NULL, 0, NULL}
};
Variables#
In order to import variables from a module, they must be attached to the module using the function PyModule_AddObject
.
In order to do this the value of the variable must be placed in a PyObject
.
This is done in a function which is executed when the module is imported for the first time.
Example#
The following Python code:
x = 3
leads to the following module initialisation function:
int32_t tmp_exec_func(PyObject* mod)
{
PyObject* Dummy_0001;
// Call the C function to initialise the module
tmp__init();
// Pack the module variable into a PyObject
Dummy_0001 = Int64_to_PyLong(&x);
// Add the PyObject to the Python module.
if (PyModule_AddObject(mod, "x", Dummy_0001) < INT64_C(0))
{
// Raise an error in case of problems
return -INT64_C(1);
}
return INT64_C(0);
}
Classes#
The wrapping of classes is described in detail at https://docs.python.org/3/extending/newtypes_tutorial.html.
The class methods are wrapped similarly to normal functions. The main difference is that the first argument (self
) is used and represents the class instance instead of the module instance.
In order to wrap class attributes, a getter and a setter function are created. These are then linked to the class as described in the tutorial.
Classes have an additional difficulty which is not present for module functions and variables. The difficulty is that they define a new type which may be imported and used in other modules. This difficulty is managed using capsules. Capsules are objects in the C-Python API which are designed to expose the API to other compiled Python libraries. The use of these capsules is described at https://docs.python.org/3/extending/extending.html#using-capsules.
Returning pointers#
Pyccel allows the user to return a pointer to an object which is an argument of a function (including an attribute of a bound argument). Pointers are created when pointing at non-trivial objects (e.g. classes, NumPy arrays, lists). When doing this it is very important to ensure that the reference counter on the target is incremented and will be decremented when the pointer goes out of scope.
Python and Pyccel handle this slightly differently. Python directly returns the object passed as argument so we have pointer is target
. The C to Python backend could theoretically do this, but they would have to compare the pointer for the results with the pointer for each of the target arguments to find the correct argument. This is unnecessarily complex and class attributes would have to be handled as a special case. Instead what is done is that a new object is created which operates on the same data.
For a NumPy array pointer, a numpy.ndarray
is created which operates on the same data. NumPy uses the base property to track the object whose reference counter should be decremented when this object is destroyed. If there is only 1 possible target then the base is set to this target (i.e. to the class for a class property, or to the argument for a pointer to an argument) so the data is not deleted and the decremented object can take care of destroying it when necessary. If there are multiple possible NumPy targets then a warning is raised advising the user to avoid using this function. In reality it can be used as long as one takes care with scoping to ensure dangling pointers cannot occur.
For a class pointer, a new class instance is created. This class has a referenced_objects
attribute which contains a Python list. All possible targets are added to this list. They are decremented by removing them from the Python list when the pointer class goes out of scope and is garbage collected.