Syntactic Stage#
The syntactic stage is described by the file pyccel.parser.syntactic.
It serves 4 main purposes:
Navigation and AST Creation : Convert Python’s AST (abstract syntax tree) representation of the Python file to Pyccel’s AST representation (objects of the classes in the folder pyccel.ast)
Errors : Raise an error for any syntax used that is not yet supported by Pyccel
Headers : Convert header comments from strings to Pyccel’s AST representation
Scoping : Collect the names of all variables in each scope (see scope for more details) to ensure no name collisions can occur if Pyccel generates variable names
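As a small illustration of the input to this stage, the sketch below uses only Python’s standard ast module to show the kind of tree that is handed to the syntactic stage. The parsed snippet and the variable names are invented for illustration, and no Pyccel classes are involved.

```python
import ast

# Parse a one-line Python program into Python's own AST
tree = ast.parse("x = y + 1")

assign = tree.body[0]       # the single statement in the module
target = assign.targets[0]  # left-hand side of the assignment

print(type(assign).__name__)             # Assign
print(type(target).__name__, target.id)  # Name x
print(type(assign.value).__name__)       # BinOp
print(target.lineno, target.col_offset)  # 1 0
```

Each node carries its position in the file, which is the information used later when reporting errors.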
Errors#
Error handling uses the classes in the file pyccel.errors.errors.
Errors in the syntactic stage should raise a PyccelSyntaxError.
Where possible this should be done by accessing the Errors() singleton and calling the report function.
This function takes several arguments (see docstring for more details).
The most important arguments are:
message : Describe the issue that led to the error
symbol : The Python AST object should be passed here. This object contains information about its position in the file (line number, column) which ensures the user can more easily locate their error
severity : The severity level must be one of the following:
warning : A warning will be printed but Pyccel will continue executing
error : An error will be printed but Pyccel will continue executing the syntactic stage
fatal : An error will be printed and Pyccel will stop executing. This level should rarely be needed in the syntactic stage as a failure in one visitation function (_visit_X) should not affect the execution of another. It is preferable to show the users all errors at once
Although a failure in one visitation function (_visit_X) does not affect the other visitation functions, it is still important to ensure that each function provides a valid output.
In the SyntaxParser, all _visit_X functions should return a Pyccel AST object (an object which inherits from pyccel.ast.basic.PyccelAstNode). pyccel.ast.core.EmptyNode can be used to ensure this restriction is fulfilled.
This is important to avoid errors caused by the construction of the tree which relates the objects (for more details see the semantic stage).
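The reporting pattern described above can be sketched as follows. This is a self-contained toy, not Pyccel’s code: Errors, EmptyNode and ToyParser are stand-ins written for this example (the real classes live in pyccel.errors.errors and pyccel.ast.core), and the signature of report is assumed to accept message, symbol and severity as described above.

```python
import ast


class Errors:
    """Stand-in for the Errors singleton (not Pyccel's real class)."""
    _instance = None

    def __new__(cls):
        # Always hand back the same instance so all reports are collected together
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.messages = []
        return cls._instance

    def report(self, message, symbol=None, severity='error'):
        # The Python AST node provides the position of the problem
        line = getattr(symbol, 'lineno', None)
        col = getattr(symbol, 'col_offset', None)
        self.messages.append((severity, line, col, message))


class EmptyNode:
    """Stand-in for pyccel.ast.core.EmptyNode."""


class ToyParser:
    def _visit(self, stmt):
        visitor = getattr(self, '_visit_' + type(stmt).__name__, None)
        if visitor is None:
            # Unsupported syntax: report it, then still return a valid node
            Errors().report(f"{type(stmt).__name__} statements are not supported",
                            symbol=stmt, severity='error')
            return EmptyNode()
        return visitor(stmt)

    def _visit_Module(self, stmt):
        return [self._visit(s) for s in stmt.body]

    def _visit_Pass(self, stmt):
        return EmptyNode()


tree = ast.parse("pass\nglobal x\ndel y\n")
results = ToyParser()._visit(tree)

for severity, line, col, message in Errors().messages:
    print(f"{severity} at line {line}, column {col}: {message}")
```

Because the severity is error rather than fatal, both unsupported statements are reported in a single run, and every visitation still returns a node.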
Headers#
The headers (type declarations/OpenMP pragmas/etc.) also have their own syntax which cannot be parsed by Python’s ast module.
The module textx is used to parse these statements.
The files describing the textx grammar are found in the folder pyccel.parser.grammar.
From these files textx generates instances of the classes found in the folder [pyccel.parser.syntax](https://github.com/pyccel/pyccel/tree/devel/api/pyccel.parser.syntax.rst).
These instances can then be inserted into the abstract syntax tree.
Scoping#
The final purpose of the syntactic stage is to collect all names declared in the code.
This is important to avoid name collisions if Pyccel creates temporaries or requires additional names.
The names are saved in the scope (for more details see scope).
Whenever a symbol is encountered in a declaration context, it should be saved to the scope using the function self.scope.insert_symbol.
This is usually done in the _visit_Name function. However, this function is not aware of the context so it cannot determine whether it is a declaration.
To get round this the class has the attribute SyntaxParser._in_lhs_assign, which should be True in a declaration context, and False elsewhere.
Consider for example a for loop. Such a loop has 3 main parts (which are each members of ast.For):
target (ast.For.target)
iterable (ast.For.iter)
body (ast.For.body)
such that a for loop is defined as:
for target in iterable:
    body
Each of these 3 members must be visited individually, but the target additionally is in a declaration context.
The visitation section of _visit_For is therefore:
self._in_lhs_assign = True
iterator = self._visit(stmt.target)
self._in_lhs_assign = False
iterable = self._visit(stmt.iter)
body = self._visit(stmt.body)
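To see the flag in action, here is a minimal self-contained sketch built on Python’s ast module. ToyScope and ToyParser are hypothetical stand-ins written for this example; only the attribute name _in_lhs_assign, the function name insert_symbol and the three visitation lines are taken from the text above.

```python
import ast


class ToyScope:
    """Stand-in for the scope: it only records declared names."""
    def __init__(self):
        self.symbols = set()

    def insert_symbol(self, name):
        self.symbols.add(name)


class ToyParser:
    def __init__(self):
        self.scope = ToyScope()
        self._in_lhs_assign = False

    def _visit(self, stmt):
        if isinstance(stmt, list):
            return [self._visit(s) for s in stmt]
        return getattr(self, '_visit_' + type(stmt).__name__)(stmt)

    def _visit_Name(self, stmt):
        # Only names seen in a declaration context are saved to the scope
        if self._in_lhs_assign:
            self.scope.insert_symbol(stmt.id)
        return stmt.id

    def _visit_Pass(self, stmt):
        return 'pass'

    def _visit_For(self, stmt):
        self._in_lhs_assign = True
        iterator = self._visit(stmt.target)
        self._in_lhs_assign = False
        iterable = self._visit(stmt.iter)
        body = self._visit(stmt.body)
        return ('for', iterator, iterable, body)


loop = ast.parse("for i in items:\n    pass\n").body[0]
parser = ToyParser()
result = parser._visit(loop)

print(result)                        # ('for', 'i', 'items', ['pass'])
print(sorted(parser.scope.symbols))  # ['i']
```

The name items is visited by the same _visit_Name function but is not inserted into the scope, because the flag was reset before the iterable was visited.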
Scoped Node#
Any function visiting a class which inherits from ScopedAstNode must create a new scope before visiting objects and exit it after everything inside the scope has been visited.
The scope must then be passed to the class using the keyword argument scope.
Care should be taken here as this keyword is not compulsory[1].
A child scope can be created using one of the following functions (for more details see the docstrings in pyccel.parser.scope):
Scope.new_child_scope
Scope.create_new_loop_scope
Scope.create_product_loop_scope
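The enter, visit, exit and pass-on sequence might look like the sketch below. All classes here are hypothetical stand-ins; in particular the signature given to new_child_scope is an assumption made for the example, so the docstrings in pyccel.parser.scope remain the reference for the real functions.

```python
class ToyScope:
    """Stand-in for a scope; not Pyccel's Scope class."""
    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = set()

    def new_child_scope(self):
        # Assumed signature: the real function may take other arguments
        return ToyScope(parent=self)


class ToyFunction:
    """Stand-in for a node which inherits from ScopedAstNode."""
    def __init__(self, name, body, scope=None):
        self.name = name
        self.body = body
        self.scope = scope


class ToyParser:
    def __init__(self):
        self.scope = ToyScope()

    def visit_function(self, name, local_names):
        # The function name itself belongs to the enclosing scope
        self.scope.symbols.add(name)
        # Enter a new scope before visiting anything inside the function
        self.scope = self.scope.new_child_scope()
        body = []
        for local in local_names:
            self.scope.symbols.add(local)
            body.append(local)
        # Exit the scope once everything inside it has been visited
        inner = self.scope
        self.scope = inner.parent
        # Pass the scope explicitly, as nothing forces the caller to do so
        return ToyFunction(name, body, scope=inner)


parser = ToyParser()
func = parser.visit_function('f', ['a', 'b'])

print(sorted(parser.scope.symbols))  # ['f']
print(sorted(func.scope.symbols))    # ['a', 'b']
```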
Temporary Creation#
Occasionally it is necessary to create objects in the syntactic stage.
When this happens, the Scope functions should be used to avoid name collisions with the objects in the original code.
See the scope docs for more details.
In all cases it is preferable to delay the creation of new objects as long as possible to ensure that as much information is known about the scope as possible.
This is important as at this stage there may still be conflicting names which appear later in the file.
The Scope should prevent name collisions with these objects, but that will lead to them being renamed, which makes the translated code harder to recognise when compared with the original.
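The idea of collision-free name generation can be illustrated with a toy scope. The class, the function get_new_name as written here and the numeric suffix scheme are all assumptions made for this sketch; see the scope docs for the functions that Pyccel actually provides.

```python
class ToyScope:
    """Stand-in for a scope which knows every name used so far."""
    def __init__(self, used=()):
        self.used = set(used)

    def get_new_name(self, preferred):
        # Keep the preferred name if it is free, otherwise add a suffix
        candidate = preferred
        counter = 0
        while candidate in self.used:
            counter += 1
            candidate = f"{preferred}_{counter:04d}"
        self.used.add(candidate)
        return candidate


# Names collected from the user's code during the syntactic stage
scope = ToyScope(used={'i', 'tmp'})

first = scope.get_new_name('tmp')
second = scope.get_new_name('tmp')
free = scope.get_new_name('result')

print(first, second, free)  # tmp_0001 tmp_0002 result
```

The more names the scope already knows when a temporary is requested, the less likely it is that a later declaration in the user’s code will clash with it.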
History#
Originally the syntactic stage translated from RedBaron’s AST representation to Pyccel’s AST representation.
RedBaron parses Python, but makes no attempt to validate the code.
This made Pyccel’s job harder as there was no guarantee that the syntax was correct.
Since moving to Python’s ast module the syntactic stage has been massively simplified.
This is because Python’s ast module checks the validity of the syntax.