Container types in Pyccel#
Pyccel supports some container types, subject to certain limitations. The currently supported types are:
NumPy arrays
Tuples
Lists
Sets
Dictionaries
NumPy arrays#
NumPy arrays are provided as part of the NumPy support. There is dedicated documentation which explains the limitations and implementation details. See N-dimensional array for more details.
Tuples#
In Pyccel, tuples are divided into two kinds: homogeneous and inhomogeneous. A homogeneous tuple is one whose elements all have the same type, while an inhomogeneous tuple can contain objects of different types. The two kinds are handled differently and therefore have very different restrictions.
Currently, Pyccel cannot wrap tuples: they can be used inside functions but cannot yet be exposed to Python.
Homogeneous tuples#
Homogeneous tuples are handled as though they were arrays. This means that they have the same restrictions and advantages as NumPy arrays. In particular, they can be indexed with an arbitrary index, including one that is only known at run time.
Elements of a homogeneous tuple should have the same type, the same number of dimensions, and (if relevant) the same NumPy ordering. If any of these constraints is not respected, you may unexpectedly find yourself using the less flexible inhomogeneous tuples. Furthermore, tuples containing pointers to other objects cannot always be stored in a homogeneous tuple.
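As a minimal sketch of the point above, the following function builds a tuple whose elements all have the same type and indexes it with a value that is only known at run time. The function name `pick` and the values are illustrative; based on the description above, such a tuple is expected to be treated as homogeneous, but this has not been checked against a specific Pyccel version.

```python
def pick(i: int) -> int:
    # All elements are integers, so the tuple is homogeneous.
    a = (10, 20, 30)
    # Homogeneous tuples are handled like arrays, so the index
    # does not need to be a compile-time constant.
    return a[i]
```

In plain Python, `pick(0)` returns the first element and `pick(2)` the last.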
Inhomogeneous tuples#
Inhomogeneous tuples are handled symbolically. This means that an inhomogeneous tuple is treated as a collection of translatable objects, each of which is then handled individually. In particular, this means that inhomogeneous tuples can only be indexed by compile-time constants.
For example the following code:
def f():
    a = (1, True, 3.0)
    print(a)
    b = a[0] + 2
    return a[2]
is translated to the following C code:
double f(void)
{
    int64_t a_0;
    bool a_1;
    double a_2;
    int64_t b;
    a_0 = INT64_C(1);
    a_1 = 1;
    a_2 = 3.0;
    printf("%s%"PRId64"%s%s%s%.15lf%s\n", "(", a_0, ", ", a_1 ? "True" : "False", ", ", a_2, ")");
    b = a_0 + INT64_C(2);
    return a_2;
}
and the following Fortran code:
function f() result(Out_0001)
  implicit none
  real(f64) :: Out_0001
  integer(i64) :: a_0
  logical(b1) :: a_1
  real(f64) :: a_2
  integer(i64) :: b
  a_0 = 1_i64
  a_1 = .True._b1
  a_2 = 3.0_f64
  write(stdout, '(A, I0, A, A, A, F0.15, A)', advance="no") '(' , a_0 &
        , ', ' , merge("True ", "False", a_1) , ', ' , a_2 , ')'
  write(stdout, '()', advance="yes")
  b = a_0 + 2_i64
  Out_0001 = a_2
  return
end function f
But the following code will raise an error:
def f():
    a = (1, True, 3.0)
    i = 2
    print(a[i])
ERROR at annotation (semantic) stage
pyccel:
|fatal [semantic]: foo.py [4,10]| Inhomogeneous tuples must be indexed with constant integers for the type inference to work (a)
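One possible way around this restriction is to branch on the run-time index so that every tuple access uses a literal constant. The sketch below is an assumption based on the rule stated above, not a documented Pyccel recipe. The function name `element_as_float` is hypothetical, and each element is converted to `float` so that the function has a single return type.

```python
def element_as_float(i: int) -> float:
    a = (1, True, 3.0)
    # Each access below uses a literal index, so the type of the
    # selected element is known without knowing the value of i.
    if i == 0:
        return float(a[0])
    elif i == 1:
        return float(a[1])
    else:
        return float(a[2])
```

The cost of this approach is one branch per element, so it is only practical for short tuples.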
Lists/Sets/Dictionaries#
Homogeneous lists, sets and dictionaries are implemented using external libraries. In C we rely on STC. In Fortran we rely on gFTL.
For example the following code:
def f():
    my_list = [1, 2, 3, 4]
    my_set = {1, 2, 3, 4}
    my_dict = {1: 1.0, 2: 2.0}
    b = my_list[0] + 2
    return b + my_set.pop() + my_dict[1]
is translated to the following C code:
double f(void)
{
    vec_int64_t my_list = {0};
    hset_int64_t my_set = {0};
    hmap_int64_t_double my_dict = {0};
    int64_t b;
    double result;
    my_list = c_make(vec_int64_t, {INT64_C(1),INT64_C(2),INT64_C(3),INT64_C(4)});
    my_set = c_make(hset_int64_t, {INT64_C(1),INT64_C(2),INT64_C(3),INT64_C(4)});
    my_dict = c_make(hmap_int64_t_double, {{INT64_C(1), 1.0}, {INT64_C(2), 2.0}});
    b = (*vec_int64_t_at(&my_list, INT64_C(0))) + INT64_C(2);
    result = b + hset_int64_t_pop(&my_set) + (*hmap_int64_t_double_at(&my_dict, INT64_C(1)));
    hset_int64_t_drop(&my_set);
    vec_int64_t_drop(&my_list);
    hmap_int64_t_double_drop(&my_dict);
    return result;
}
and the following Fortran code:
function f() result(result_0001)
  implicit none
  real(f64) :: result_0001
  type(Vector_integer8) :: my_list
  type(Set_integer8) :: my_set
  type(Map_integer8__real8) :: my_dict
  integer(i64) :: b
  my_list = Vector_integer8([1_i64, 2_i64, 3_i64, 4_i64])
  my_set = Set_integer8([1_i64, 2_i64, 3_i64, 4_i64])
  my_dict = Map_integer8__real8([Pair_integer8__real8(1_i64, 1.0_f64), &
        Pair_integer8__real8(2_i64, 2.0_f64)])
  b = my_list%of(1_i64) + 2_i64
  result_0001 = b + Set_integer8_pop(my_set) + my_dict % of( 1_i64 )
  return
end function f
Lists of lists and more#
Containers such as lists, sets and dictionaries can also contain other containers. In this case, careful memory management is needed to ensure that memory is shared as it would be in Python. Consider the following example:
def f():
    a = [1, 2, 3]
    b = [a, [4, 5, 6]]
    c = b[1]
    a[0] = 4  # This modifies b
    c[0] = 7  # This modifies b
Deallocating the memory is not trivial in this case, because several variables can refer to the same list. As a result, managed memory with reference counting is used. The example above is translated to the following C code:
void f(void)
{
    vec_vec_int64_t_mem b = {0};
    vec_int64_t_mem a_mem = vec_int64_t_mem_make(vec_int64_t_init());
    vec_int64_t_mem c_mem;
    (*a_mem.get) = c_make(vec_int64_t, {INT64_C(1),INT64_C(2),INT64_C(3)});
    b = c_make(vec_vec_int64_t_mem, {
            vec_int64_t_mem_clone(a_mem),
            vec_int64_t_mem_make(c_make(vec_int64_t, {INT64_C(4),INT64_C(5),INT64_C(6)}))
        });
    c_mem = vec_int64_t_mem_clone(*vec_vec_int64_t_mem_at(&b, INT64_C(1)));
    (*vec_int64_t_at_mut(a_mem.get, INT64_C(0))) = INT64_C(4);
    (*vec_int64_t_at_mut(c_mem.get, INT64_C(0))) = INT64_C(7);
    vec_vec_int64_t_mem_drop(&b);
    vec_int64_t_mem_drop(&a_mem);
    vec_int64_t_mem_drop(&c_mem);
}
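The sharing behaviour that the generated code has to reproduce can be checked in plain Python. The sketch below repeats the nested-list example with an added return statement so that the result can be inspected. The name `shared_lists` and the return value are additions for illustration and are not part of the original example.

```python
def shared_lists():
    a = [1, 2, 3]
    b = [a, [4, 5, 6]]
    c = b[1]
    a[0] = 4  # b[0] is the same list object as a
    c[0] = 7  # b[1] is the same list object as c
    return a, b, c


a, b, c = shared_lists()
# Both assignments are visible through b.
assert b == [[4, 2, 3], [7, 5, 6]]
# b stores references to the inner lists, not copies of them.
assert b[0] is a and b[1] is c
```

Because `a`, `c` and the elements of `b` refer to the same underlying lists, none of those lists can be freed until every reference to it has been released, which is why the C code clones and drops handles rather than copying the vectors.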