The Abstract Syntax of CIL.
Root of the AST
In Frama-C, the whole AST is accessible through Ast.get.
type file = {
mutable fileName : Filepath.Normalized.t;
mutable globals : global list;
List of globals as they will appear in the printed file
mutable globinit : fundec option;
An optional global initializer function. This is a function where you can put code that must be executed before the program starts. This function is conceptually at the end of the file, although it is not part of the globals list. Use Cil.getGlobInit to create/get one.
mutable globinitcalled : bool;
Whether the global initialization function is called in main. This should always be false if there is no global initializer. When you create a global initializer, CIL will try to insert code in main to call it.
}
and global =
| GType of typeinfo * location
A typedef. All uses of type names (through the TNamed
constructor) must be preceded in the file by a definition of the name. The string is the defined name and is never empty.
| GCompTag of compinfo * location
Defines a struct/union tag with some fields. There must be one of these for each struct/union tag that you use (through the TComp
constructor) since this is the only context in which the fields are printed. Consequently nested structure tag definitions must be broken into individual definitions with the innermost structure defined first.
| GCompTagDecl of compinfo * location
Declares a struct/union tag. Use as a forward declaration. This is printed without the fields.
| GEnumTag of enuminfo * location
Declares an enumeration tag with some fields. There must be one of these for each enumeration tag that you use (through the TEnum
constructor) since this is the only context in which the items are printed.
| GEnumTagDecl of enuminfo * location
Declares an enumeration tag. Use as a forward declaration. This is printed without the items.
| GVarDecl of varinfo * location
A variable declaration (not a definition) for a variable with object type. There can be several declarations and at most one definition for a given variable. If both forms appear then they must share the same varinfo structure. Either the variable has storage Extern or there must be a definition in this file.
| GFunDecl of funspec * varinfo * location
A variable declaration (not a definition) for a function, i.e. a prototype. There can be several declarations and at most one definition for a given function. If both forms appear then they must share the same varinfo structure. A prototype shares the varinfo with the fundec of the definition. Either the function has storage Extern or there must be a definition in this file.
| GVar of varinfo * initinfo * location
A variable definition. Can have an initializer. The initializer is updateable so that you can change it without requiring to recreate the list of globals. There can be at most one definition for a variable in an entire program. Cannot have storage Extern or function type.
| GFun of fundec * location
| GAsm of string * location
Global asm statement. These ones can contain only a template
| GPragma of attribute * location
Pragmas at top level. Use the same syntax as attributes
| GText of string
Some text (printed verbatim) at top level. E.g., this way you can put comments in the output.
| GAnnot of global_annotation * location
A global annotation. Can be
- an axiom or a lemma
- a predicate declaration or definition
- a global type invariant
- a global invariant
- a logic function declaration or definition.
The main type for representing global declarations and definitions. A list of these forms a CIL file. The order of globals in the file is generally important.
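Because a file is an ordered list of globals, most whole-program passes start as a fold or filter over that list. The sketch below illustrates the idea on simplified stand-in types that are defined locally; they are not the real Cil_types definitions, and `location`, `defined_functions` and `example` are hypothetical names introduced only for this illustration.

```ocaml
(* Simplified stand-ins for the real definitions: only the shape needed
   for the illustration is kept. *)
type location = string * int              (* hypothetical: file name, line *)
type varinfo = { vname : string }

type global =
  | GVarDecl of varinfo * location
  | GFunDecl of varinfo * location
  | GVar of varinfo * location
  | GFun of varinfo * location
  | GText of string

(* Collect the names of the functions that are defined (not merely
   declared), in the order in which they appear in the file. *)
let defined_functions (globals : global list) : string list =
  List.filter_map
    (function GFun (vi, _) -> Some vi.vname | _ -> None)
    globals

(* The prototype and the definition of f share one varinfo, as the
   documentation requires. *)
let f = { vname = "f" }

let example =
  [ GText "/* generated */";
    GFunDecl (f, ("a.c", 1));
    GVar ({ vname = "x" }, ("a.c", 2));
    GFun (f, ("a.c", 3));
    GFun ({ vname = "main" }, ("a.c", 9)) ]

let names = defined_functions example
```

With the real AST the same pattern would apply to the globals field of the file returned by Ast.get, with the actual constructor arguments.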
Types
A C type is represented in CIL using the type Cil_types.typ
. Among types we differentiate the integral types (with different kinds denoting the sign and precision), floating point types, enumeration types, array and pointer types, and function types. Every type is associated with a list of attributes, which are always kept in sorted order. Use Cil.addAttribute
and Cil.addAttributes
to construct lists of attributes. If you want to inspect a type, you should use Cil.unrollType
or Cil.unrollTypeDeep
to see through the uses of named types.
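To make "seeing through" named types concrete, here is a minimal sketch on a reduced type language. It assumes that unrolling merges the attributes attached to each use of a name into the underlying type; the types and the functions `type_add_attributes` and `unroll` are local, hypothetical stand-ins, not the library's implementation.

```ocaml
(* Simplified model: attributes are plain names kept in sorted order. *)
type attributes = string list

type typ =
  | TInt of attributes
  | TPtr of typ * attributes
  | TNamed of typeinfo * attributes
and typeinfo = { tname : string; ttype : typ }

(* Add attributes to the outermost constructor, keeping the list sorted
   and free of duplicates. *)
let type_add_attributes (extra : attributes) (t : typ) : typ =
  let merge a = List.sort_uniq compare (extra @ a) in
  match t with
  | TInt a -> TInt (merge a)
  | TPtr (p, a) -> TPtr (p, merge a)
  | TNamed (ti, a) -> TNamed (ti, merge a)

(* See through named types at the top level only: the pointed-to type of
   a TPtr is left untouched. *)
let rec unroll (t : typ) : typ =
  match t with
  | TNamed (ti, a) -> type_add_attributes a (unroll ti.ttype)
  | _ -> t

(* typedef volatile int myint; typedef myint alias; *)
let myint = { tname = "myint"; ttype = TInt ["volatile"] }
let alias = { tname = "alias"; ttype = TNamed (myint, []) }
let unrolled = unroll (TNamed (alias, ["const"]))
```

A deep variant would additionally recurse under pointers, arrays and function types, which is the distinction the two functions named above draw.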
CIL is configured at build-time with the sizes and alignments of the underlying compiler (GCC or MSVC). CIL contains functions that can compute the size of a type (in bits) Cil.bitsSizeOf
, the alignment of a type (in bytes) Cil.alignOf_int
, and can convert an offset into a start and width (both in bits) using the function Cil.bitsOffset
. At the moment these functions do not take into account the packed
attributes and pragmas.
and typ =
| TVoid of attributes
| TInt of ikind * attributes
| TFloat of fkind * attributes
A floating-point type. The kind specifies the precision. You can also use the predefined constant Cil.doubleType
.
| TPtr of typ * attributes
| TArray of typ * exp option * attributes
Array type. It indicates the base type and the array length.
| TFun of typ * (string * typ * attributes) list option * bool * attributes
Function type. Indicates the type of the result, the name, type and name attributes of the formal arguments (None
if no arguments were specified, as in a function whose definition or prototype we have not seen; Some []
means void). Use Cil.argsToList
to obtain a list of arguments. The boolean indicates if it is a variable-argument function. If this is the type of a varinfo for which we have a function declaration then the information for the formals must match that in the function's sformals. Use Cil.setFormals
, or Cil.setFunctionType
, or Cil.makeFormalVar
for this purpose.
| TNamed of typeinfo * attributes
The use of a named type. All uses of the same type name must share the typeinfo. Each such type name must be preceded in the file by a GType
global. This is printed as just the type name. The actual referred type is not printed here and is carried only to simplify processing. To see through a sequence of named type references, use Cil.unrollType
. The attributes are in addition to those given when the type name was defined.
| TComp of compinfo * attributes
A reference to a struct or a union type. All references to the same struct or union must share the same compinfo among them and with a GCompTag
global that precedes all uses (except maybe those that are pointers to the composite type). The attributes given are those pertaining to this use of the type and are in addition to the attributes that were given at the definition of the type and which are stored in the compinfo.
| TEnum of enuminfo * attributes
A reference to an enumeration type. All such references must share the enuminfo among them and with a GEnumTag
global that precedes all uses. The attributes refer to this use of the enumeration and are in addition to the attributes of the enumeration itself, which are stored inside the enuminfo
| TBuiltin_va_list of attributes
This is the same as GCC's type with the same name.
and ikind =
| IBool
| IChar
| ISChar
| IUChar
| IInt
| IUInt
| IShort
| IUShort
| ILong
| IULong
| ILongLong
long long
(or _int64
on Microsoft Visual C)
| IULongLong
unsigned long long
(or unsigned _int64
on Microsoft Visual C)
Various kinds of integers
and fkind =
| FFloat
| FDouble
| FLongDouble
Various kinds of floating-point numbers
Attributes
and attribute =
| Attr of string * attrparam list
An attribute has a name and some optional parameters. The name should not start or end with underscore. When CIL parses attribute names it will strip leading and ending underscores (to ensure that the multitude of GCC attributes such as const, __const and __const__ all mean the same thing.)
| AttrAnnot of string
Attributes are lists sorted by the attribute name. Use the functions Cil.addAttribute
and Cil.addAttributes
to insert attributes in an attribute list and maintain the sortedness.
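The sortedness invariant can be pictured as an ordered insertion keyed on the attribute name. The following is a sketch under simplifying assumptions (a reduced attrparam type, no stripping of underscores); `add_attribute` and `attribute_name` are hypothetical local functions, not the library's own.

```ocaml
(* Reduced versions of the attribute types. *)
type attrparam = AInt of int | AStr of string
type attribute = Attr of string * attrparam list

let attribute_name (Attr (n, _)) = n

(* Insert [a] into [al], keeping [al] sorted by attribute name.
   An attribute that is already present (same name and parameters) is
   not added a second time. *)
let rec add_attribute (a : attribute) (al : attribute list) : attribute list =
  match al with
  | [] -> [a]
  | x :: rest ->
    if attribute_name a < attribute_name x then a :: al
    else if x = a then al
    else x :: add_attribute a rest

let attrs =
  List.fold_right add_attribute
    [ Attr ("volatile", []);
      Attr ("aligned", [AInt 8]);
      Attr ("const", []);
      Attr ("const", []) ]
    []
```

Going through a single insertion function, rather than consing onto the list directly, is what keeps lookups and comparisons of attribute lists predictable.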
and attrparam =
| AInt of Integer.t
| AStr of string
| ACons of string * attrparam list
Constructed attributes. These are printed foo(a1,a2,...,an)
. The list of parameters can be empty and in that case the parentheses are not printed.
There are some Frama-C builtins that are used to account for OSX's peculiarities:
- __fc_assign takes two arguments and emulates a1=a2 syntax
- __fc_float takes one string argument and indicates a floating point constant, that will be printed as such.
See https://clang.llvm.org/docs/AttributeReference.html#availability for more information. Proper attribute nodes might be added if really needed, i.e. if some plug-in wants to interpret the availability attribute.
| ASizeOf of typ
A way to talk about types
| ASizeOfE of attrparam
| AAlignOf of typ
| AAlignOfE of attrparam
| AUnOp of unop * attrparam
| ABinOp of binop * attrparam * attrparam
| ADot of attrparam * string
| AStar of attrparam
| AAddrOf of attrparam
| AIndex of attrparam * attrparam
| AQuestion of attrparam * attrparam * attrparam
The type of parameters of attributes
Structures
The Cil_types.compinfo
describes the definition of a structure or union type. Each such Cil_types.compinfo
must be defined at the top-level using the GCompTag
constructor and must be shared by all references to this type (using either the TComp
type constructor or from the definition of the fields).
If all you need is to scan the definition of each composite type once, you can do that by scanning all top-level GCompTag
.
Constructing a Cil_types.compinfo
can be tricky since it must contain fields that might refer to the host Cil_types.compinfo
and furthermore the type of the field might need to refer to the Cil_types.compinfo
for recursive types. Use the Cil.mkCompInfo
function to create a Cil_types.compinfo
. You can easily fetch the Cil_types.fieldinfo
for a given field in a structure with Cil.getCompField
.
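The difficulty with recursive composite types is that the fields need the host before the host has its fields. One common way out, sketched below, is to create the host with no fields, build the fields through a callback that receives the host, and then patch the host. The types are simplified and `mk_comp_info` is a hypothetical helper loosely modelled on this idea; the real function's signature differs and depends on the Frama-C version.

```ocaml
(* Simplified composite-type model. *)
type typ =
  | TInt
  | TPtr of typ
  | TComp of compinfo
and compinfo = {
  cstruct : bool;
  cname : string;
  ckey : int;
  mutable cfields : fieldinfo list option;   (* None: incomplete type *)
}
and fieldinfo = { fcomp : compinfo; fname : string; ftype : typ }

(* Global counter used to hand out unique keys. *)
let next_key = ref 0

(* Hypothetical helper: [mk_fields] receives the compinfo being built so
   that field types can refer back to it. *)
let mk_comp_info (is_struct : bool) (name : string)
    (mk_fields : compinfo -> (string * typ) list) : compinfo =
  incr next_key;
  let comp =
    { cstruct = is_struct; cname = name; ckey = !next_key; cfields = None }
  in
  let fields =
    List.map
      (fun (fname, ftype) -> { fcomp = comp; fname; ftype })
      (mk_fields comp)
  in
  comp.cfields <- Some fields;
  comp

(* struct list { int v; struct list *next; }; *)
let list_comp =
  mk_comp_info true "list"
    (fun self -> [ ("v", TInt); ("next", TPtr (TComp self)) ])
```

Note that the resulting value is cyclic, so it should be compared with physical equality (`==`) or by key, not with structural equality.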
and compinfo = {
mutable cstruct : bool;
true
if struct, false
if union
corig_name : string;
Original name as found in C file. Will never be changed
mutable cname : string;
The name. Always non-empty. Use Cil.compFullName
to get the full name of a comp (along with the struct or union)
mutable ckey : int;
A unique integer. This is assigned by Cil_const.mkCompInfo
using a global variable in the Cil module. Thus two identical structs in two different files might have different keys. Use Cil_const.copyCompInfo
to copy structures so that a new key is assigned.
mutable cfields : fieldinfo list option;
Information about the fields. Notice that each fieldinfo has a pointer back to the host compinfo. This means that you should not share fieldinfo's between two compinfo's.
None value means that the type is incomplete.
mutable cattr : attributes;
The attributes that are defined at the same time as the composite type. These attributes can be supplemented individually at each reference to this compinfo
using the TComp
type constructor.
mutable creferenced : bool;
true
if used. Initially set to false
.
}
The definition of a structure or union type. Use Cil.mkCompInfo
to make one and use Cil.copyCompInfo
to copy one (this ensures that a new key is assigned and that the fields have the right pointers to parents).
Structure fields
The Cil_types.fieldinfo
structure is used to describe a structure or union field. Fields, just like variables, can have attributes associated with the field itself or associated with the type of the field (stored along with the type of the field).
and fieldinfo = {
mutable fcomp : compinfo;
The host structure that contains this field. There can be only one compinfo
that contains the field.
mutable forder : int;
The position in the host structure.
forig_name : string;
original name as found in C file.
mutable fname : string;
The name of the field. Might be the value of Cil.missingFieldName
in which case it must be a bitfield; such a field is not printed and does not participate in initialization.
mutable ftype : typ;
The type. If the field is a bitfield, a special attribute FRAMA_C_BITFIELD_SIZE
indicating the width of the bitfield is added.
mutable fbitfield : int option;
If a bitfield then ftype should be an integer type and the width of the bitfield must be 0 or a positive integer less than or equal to the width of the integer type. A field of width 0 is used in C to control the alignment of fields.
mutable fattr : attributes;
The attributes for this field (not for its type)
mutable floc : location;
The location where this field is defined
mutable faddrof : bool;
Adapted from CIL vaddrof field for variables. True if the address of this field is taken. Only set for non-array fields. A variable whose field has its address taken is no longer marked as having its own address taken. CIL will set these flags when it parses C, but you should make sure to set the flag whenever your transformation creates an AddrOf expression.
mutable fsize_in_bits : int option;
mutable foffset_in_bits : int option;
}
Information about a struct/union field.
Enumerations
Information about an enumeration. This is shared by all references to an enumeration. Make sure you have a GEnumTag
for each of these.
and enuminfo = {
eorig_name : string;
original name as found in C file.
mutable ename : string;
The name. Always non-empty.
mutable eitems : enumitem list;
Items. The list must be non-empty
mutable eattr : attributes;
The attributes that are defined at the same time as the enumeration type. These attributes can be supplemented individually at each reference to this enuminfo
using the TEnum
type constructor.
mutable ereferenced : bool;
true
if used. Initially set to false
.
mutable ekind : ikind;
The integer kind used to represent this enum. MSVC always assumes IInt but this is not the case for gcc. See ISO C 6.7.2.2
}
Information about an enumeration.
and enumitem = {
eiorig_name : string;
original name as found in C file.
mutable einame : string;
the name, always non-empty.
mutable eival : exp;
value of the item. Must be a compile-time constant
mutable eihost : enuminfo;
the host enumeration in which the item is declared.
eiloc : location;
}
and typeinfo = {
torig_name : string;
original name as found in C file.
mutable tname : string;
The name. Can be empty only in a GType
when introducing a composite or enumeration tag. If empty, it cannot be referred to from the file.
mutable ttype : typ;
The actual type. This includes the attributes that were present in the typedef
mutable treferenced : bool;
true
if used. Initially set to false
.
}
Information about a defined type.
Variables
Each local or global variable is represented by a unique Cil_types.varinfo
structure. A global Cil_types.varinfo
can be introduced with the GVarDecl
or GVar
, GFunDecl
or GFun
globals. A local varinfo can be introduced as part of a function definition Cil_types.fundec
.
All references to a given global or local variable must refer to the same copy of the varinfo
. Each varinfo
has a globally unique identifier that can be used to index maps and hashtables (the name can also be used for this purpose, except for locals from different functions). This identifier is constructed using a global counter.
It is very important that you construct varinfo
structures using only one of the following functions:
A varinfo
is also used in a function type to denote the list of formals.
and varinfo = {
mutable vname : string;
The name of the variable. Cannot be empty. It is primarily your responsibility to ensure the uniqueness of a variable name. For local variables Cil.makeTempVar
helps you ensure that the name is unique.
vorig_name : string;
the original name of the variable. Need not be unique.
mutable vtype : typ;
The declared type of the variable. For modifications of the field, Cil.update_var_type
helps in synchronizing the type of the C variable and the one of the associated logic variable.
mutable vattr : attributes;
A list of attributes associated with the variable.
mutable vstorage : storage;
mutable vglob : bool;
True if this is a global variable
mutable vdefined : bool;
- For global variables, true iff the variable or function is defined in the file.
- For local variables, true iff the variable is explicitly initialized at declaration time.
- Unused for formal variables and logic variables.
mutable vformal : bool;
True if the variable is a formal parameter of a function.
mutable vinline : bool;
Whether this varinfo is for an inline function.
mutable vdecl : location;
Location of variable declaration.
mutable vid : int;
mutable vaddrof : bool;
true
if the address of this variable is taken. CIL will set these flags when it parses C, but you should make sure to set the flag whenever your transformation creates an AddrOf expression.
mutable vreferenced : bool;
true
if this variable is ever referenced. This is computed by removeUnusedVars
. It is safe to just initialize this to false
.
vtemp : bool;
true
for temporary variables generated by CIL normalization. false
for all the other variables.
mutable vdescr : string option;
For most temporary variables, a description of what the var holds. (e.g. for temporaries used for function call results, this string is a representation of the function call.)
mutable vdescrpure : bool;
Indicates whether the vdescr above is a pure expression or call. True for all CIL expressions and Lvals, but false for e.g. function calls. Printing a non-pure vdescr more than once may yield incorrect results.
mutable vghost : bool;
Indicates if the variable is declared in ghost code
vsource : bool;
true
iff this variable appears in the source of the program, which is the case of all the variables in the initial AST. Plugins may create variables with vsource=false
, for example to handle dynamic allocation. Those variables do *not* have an associated GVar
or GVarDecl
.
mutable vlogic_var_assoc : logic_var option;
Logic variable representing this variable in the logic world. Do not access this field directly. Instead, call Cil.cvar_to_lvar
.
}
Information about a variable.
and storage =
| NoStorage
The default storage. Nothing is printed
| Static
| Register
| Extern
Storage-class information
Expressions
The CIL expression language contains only the side-effect free expressions of C. They are represented as the type Cil_types.exp
. There are several interesting aspects of CIL expressions:
Integer and floating point constants can carry their textual representation. This way the integer 15 can be printed as 0xF if that is how it occurred in the source.
CIL uses arbitrary precision integers to represent the integer constants and also stores the width of the integer type. Care must be taken to ensure that the constant is representable with the given width. Use the functions Cil.kinteger
, Cil.kinteger64
and Cil.integer
to construct constant expressions. CIL predefines the constants Cil.zero
, Cil.one
and Cil.mone
(for -1).
Use the functions Cil.isConstant
and Cil.isInteger
to test if an expression is a constant and a constant integer respectively.
CIL keeps the type of all unary and binary expressions. You can think of that type qualifying the operator. Furthermore there are different operators for arithmetic and comparisons on arithmetic types and on pointers.
Another unusual aspect of CIL is that the implicit conversion between an expression of array type and one of pointer type is made explicit, using the StartOf
expression constructor (which is not printed). If you apply the AddrOf
constructor to an lvalue of type T
then you will be getting an expression of type TPtr(T)
.
You can find the type of an expression with Cil.typeOf
.
You can perform constant folding on expressions using the function Cil.constFold
.
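As a rough picture of what constant folding does on a side-effect-free expression tree, the sketch below folds a tiny expression language bottom-up. It is deliberately simplified: constants are plain OCaml integers, so it ignores the integer kinds and widths that a real folder has to respect, and `const_fold` and `apply` are hypothetical local functions.

```ocaml
(* A tiny expression language: constants, named lvalues, and a few
   arithmetic operators. *)
type binop = PlusA | MinusA | Mult

type exp =
  | Const of int
  | Lval of string
  | BinOp of binop * exp * exp

let apply (op : binop) (x : int) (y : int) : int =
  match op with
  | PlusA -> x + y
  | MinusA -> x - y
  | Mult -> x * y

(* Fold bottom-up: an operator node becomes a constant only when both of
   its folded operands are constants. *)
let rec const_fold (e : exp) : exp =
  match e with
  | Const _ | Lval _ -> e
  | BinOp (op, a, b) ->
    (match const_fold a, const_fold b with
     | Const x, Const y -> Const (apply op x y)
     | a', b' -> BinOp (op, a', b'))

(* 2 + 3 * 4 *)
let folded = const_fold (BinOp (PlusA, Const 2, BinOp (Mult, Const 3, Const 4)))
```

Because expressions carry no side effects, replacing a subtree by its value in this way cannot change the meaning of the program, which is what makes the transformation safe to apply anywhere.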
and exp = {
eid : int;
enode : exp_node;
eloc : location;
location of the expression.
}
Expressions (Side-effect free)
and exp_node =
| Const of constant
| Lval of lval
| SizeOf of typ
sizeof(<type>). Has unsigned int
type (ISO 6.5.3.4). This is not turned into a constant because some transformations might want to change types
| SizeOfE of exp
| SizeOfStr of string
sizeof(string_literal). We separate this case out because this is the only instance in which a string literal should not be treated as having type pointer to character.
| AlignOf of typ
This corresponds to the GCC __alignof__. Has unsigned int
type
| AlignOfE of exp
| UnOp of unop * exp * typ
Unary operation. Includes the type of the result.
| BinOp of binop * exp * exp * typ
Binary operation. Includes the type of the result. The arithmetic conversions are made explicit for the arguments.
| CastE of typ * exp
| AddrOf of lval
Always use Cil.mkAddrOf
to construct one of these. Applying it to an lvalue of type T
yields an expression of type TPtr(T)
| StartOf of lval
Conversion from an array to a pointer to the beginning of the array. Given an lval of type TArray(T)
produces an expression of type TPtr(T)
. In C this operation is implicit, the StartOf
operator is not printed. We have it in CIL because it makes the typing rules simpler.
Constants
and constant =
| CInt64 of Integer.t * ikind * string option
Integer constant. Give the ikind (see ISO9899 6.1.3.2) and the textual representation. Textual representation is always set to Some s when it comes from user code. This allows us to print a constant as it was represented in the code, for example, 0xF instead of 15. It is usually None for constants generated by CIL itself. Use Cil.integer
or Cil.kinteger
to create these.
| CStr of string
String constant. The escape characters inside the string have been already interpreted. This constant has pointer to character type! The only case when you would like a string literal to have an array type is when it is an argument to sizeof. In that case you should use SizeOfStr.
| CWStr of int64 list
Wide character string constant. Note that the local interpretation of such a literal depends on Cil.theMachine.wcharType
and Cil.theMachine.wcharKind
. Such a constant has type pointer to Cil.theMachine.wcharType
. The escape characters in the string have not been "interpreted" in the sense that L"A\xabcd" remains "A\xabcd" rather than being represented as the wide character list with two elements: 65 and 43981. That "interpretation" depends on the underlying wide character type.
| CChr of char
Character constant. This has type int, so use charConstToInt to read the value in case sign-extension is needed.
| CReal of float * fkind * string option
Floating point constant. Give the fkind (see ISO 6.4.4.2) and also the textual representation, if available.
| CEnum of enumitem
An enumeration constant. Use Cillower.lowerEnumVisitor
to replace these with integer constants.
and unop =
| Neg
| BNot
| LNot
and binop =
| PlusA
| PlusPI
| MinusA
| MinusPI
| MinusPP
| Mult
| Div
| Mod
| Shiftlt
| Shiftrt
| Lt
< (arithmetic comparison)
| Gt
> (arithmetic comparison)
| Le
<= (arithmetic comparison)
| Ge
>= (arithmetic comparison)
| Eq
== (arithmetic comparison)
| Ne
!= (arithmetic comparison)
| BAnd
| BXor
| BOr
| LAnd
| LOr
Left values
Left values (aka Lvalues) are the sublanguage of expressions that can appear at the left of an assignment or as operand to the address-of operator. In C the syntax for lvalues is not always a good indication of the meaning of the lvalue. For example the C value a[0][1][2]
might involve 1, 2 or 3 memory reads when used in an expression context, depending on the declared type of the variable a
. If a
has type int [4][4][4]
then we have one memory read from somewhere inside the area that stores the array a
. On the other hand if a
has type int ***
then the expression really means *
( * ( * (a + 0) + 1) + 2)
, in which case it is clear that it involves three separate memory operations.
An lvalue denotes the contents of a range of memory addresses. This range is denoted as a host object along with an offset within the object. The host object can be of two kinds: a local or global variable, or an object whose address is in a pointer expression. We distinguish the two cases so that we can tell quickly whether we are accessing some component of a variable directly or we are accessing a memory location through a pointer. To make it easy to tell what an lvalue means CIL represents lvalues as a host object and an offset (see Cil_types.lval
). The host object (represented as Cil_types.lhost
) can be a local or global variable or can be the object pointed-to by a pointer expression. The offset (represented as Cil_types.offset
) is a sequence of field or array index designators.
Both the typing rules and the meaning of an lvalue are very precisely specified in CIL.
The following are a few useful functions for operating on lvalues:
The following equivalences hold
Mem(AddrOf(Mem a, aoff)), off = Mem a, aoff + off
Mem(AddrOf(Var v, aoff)), off = Var v, aoff + off
AddrOf (Mem a, NoOffset) = a
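These equivalences can be read as simplification rules applied by smart constructors. The sketch below encodes them on a reduced lvalue language in which field names are strings and indices are integers; `add_offset`, `mk_mem` and `mk_addr_of` are local re-implementations written for this illustration and should not be taken as the library's code.

```ocaml
type offset =
  | NoOffset
  | Field of string * offset
  | Index of int * offset

type lhost = Var of string | Mem of exp
and exp = Const of int | Lval of lval | AddrOf of lval
and lval = lhost * offset

(* Append [toadd] at the end of [off]; this is the "aoff + off" of the
   equivalences. *)
let rec add_offset (toadd : offset) (off : offset) : offset =
  match off with
  | NoOffset -> toadd
  | Field (f, rest) -> Field (f, add_offset toadd rest)
  | Index (i, rest) -> Index (i, add_offset toadd rest)

(* Mem(AddrOf(host, aoff)), off  =  host, aoff + off *)
let mk_mem ~(addr : exp) ~(off : offset) : lval =
  match addr with
  | AddrOf (host, aoff) -> (host, add_offset off aoff)
  | _ -> (Mem addr, off)

(* AddrOf (Mem a, NoOffset)  =  a *)
let mk_addr_of (lv : lval) : exp =
  match lv with
  | (Mem a, NoOffset) -> a
  | _ -> AddrOf lv

(* ( *&s.a )[2] simplifies to s.a[2] *)
let simplified =
  mk_mem ~addr:(AddrOf (Var "s", Field ("a", NoOffset)))
    ~off:(Index (2, NoOffset))
```

Building lvalues only through such constructors keeps them in a normal form, so that a direct access to a variable is never hidden behind a redundant pointer dereference.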
and lhost =
| Var of varinfo
| Mem of exp
The host is an object of type T
when the expression has pointer type TPtr(T)
.
and offset =
| NoOffset
No offset. Can be applied to any lvalue and does not change either the starting address or the type. This is used when the lval consists of just a host or as a terminator in a list of other kinds of offsets.
| Field of fieldinfo * offset
A field offset. Can be applied only to an lvalue that denotes a structure or a union that contains the mentioned field. This advances the offset to the beginning of the mentioned field and changes the type to the type of the mentioned field.
| Index of exp * offset
An array index offset. Can be applied only to an lvalue that denotes an array. This advances the starting address of the lval to the beginning of the mentioned array element and changes the denoted type to be the type of the array element
The offset part of a Cil_types.lval
. Each offset can be applied to certain kinds of lvalues and its effect is that it advances the starting address of the lvalue and changes the denoted type, essentially focusing on some smaller lvalue that is contained in the original one.
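The way an offset "changes the denoted type" can be sketched as a function from a base type and an offset to the type of the selected sub-object. The model below is an assumption-laden simplification: composite types are association lists of fields, indices are integers, and `type_offset` is a hypothetical local function that returns None where the real typing rules would reject the offset.

```ocaml
type typ =
  | TInt
  | TArray of typ * int                   (* element type, length *)
  | TComp of (string * typ) list          (* field name, field type *)

type offset =
  | NoOffset
  | Field of string * offset
  | Index of int * offset

(* Type denoted after applying [off] to an lvalue of type [base].
   A Field offset is only valid on a composite containing that field,
   and an Index offset only on an array. *)
let rec type_offset (base : typ) (off : offset) : typ option =
  match base, off with
  | t, NoOffset -> Some t
  | TArray (elt, _), Index (_, rest) -> type_offset elt rest
  | TComp fields, Field (f, rest) ->
    (match List.assoc_opt f fields with
     | Some ft -> type_offset ft rest
     | None -> None)
  | _ -> None

(* struct { int a; int b[4]; } *)
let s = TComp [ ("a", TInt); ("b", TArray (TInt, 4)) ]
let t_b1 = type_offset s (Field ("b", Index (1, NoOffset)))
```

Each step narrows the lvalue to a smaller region contained in the previous one, which mirrors the description of offsets given above.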
Initializers
A special kind of expressions are those that can appear as initializers for global variables (initialization of local variables is turned into assignments). The initializers are represented as type Cil_types.init
. You can create initializers with Cil.makeZeroInit
and you can conveniently scan compound initializers with Cil.foldLeftCompound
.
and init =
| SingleInit of exp
| CompoundInit of typ * (offset * init) list
Used only for initializers of structures, unions and arrays. The offsets are all of the form Field(f, NoOffset)
or Index(i, NoOffset)
and specify the field or the index being initialized. For structures all fields must have an initializer (except the unnamed bitfields), in the proper order. This is necessary since the offsets are not printed. For arrays the list must contain a prefix of the initializers; the rest are 0-initialized. For unions there must be exactly one initializer. If the initializer is not for the first field then a field designator is printed, so you had better use GCC, since MSVC does not understand this. You can scan an initializer list with Cil.foldLeftCompound
.
Initializers for global variables.
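Since every entry of a compound initializer carries a one-step offset, walking the tree and concatenating those steps yields the full path to each scalar being initialized. The sketch below does this on a simplified initializer type (no type annotation on CompoundInit, integer leaves instead of expressions); `append_offset` and `flatten` are hypothetical helpers introduced for the illustration.

```ocaml
type offset =
  | NoOffset
  | Field of string * offset
  | Index of int * offset

type init =
  | SingleInit of int
  | CompoundInit of (offset * init) list

(* Append [tail] at the end of [off]. *)
let rec append_offset (off : offset) (tail : offset) : offset =
  match off with
  | NoOffset -> tail
  | Field (f, o) -> Field (f, append_offset o tail)
  | Index (i, o) -> Index (i, append_offset o tail)

(* List every scalar initializer together with its full offset from the
   variable being initialized, in initialization order. *)
let rec flatten (prefix : offset) (i : init) : (offset * int) list =
  match i with
  | SingleInit v -> [ (prefix, v) ]
  | CompoundInit entries ->
    List.concat_map
      (fun (off, sub) -> flatten (append_offset prefix off) sub)
      entries

(* struct { int a; int b[2]; } x = { 1, { 2, 3 } }; *)
let x_init =
  CompoundInit
    [ (Field ("a", NoOffset), SingleInit 1);
      (Field ("b", NoOffset),
       CompoundInit
         [ (Index (0, NoOffset), SingleInit 2);
           (Index (1, NoOffset), SingleInit 3) ]) ]

let flat = flatten NoOffset x_init
```

A fold over the entries of one CompoundInit level, as the scanning function mentioned above provides, is the building block of such a traversal.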
and initinfo = {
mutable init : init option;
}
We want to be able to update an initializer in a global variable, so we define it as a mutable field.
Kind of constructor for initializing a local variable through a function call.
and constructor_kind =
| Plain_func
plain function call, whose result is used for initializing the variable.
| Constructor
C++-like constructor: the function takes as first argument the address of the variable to be initialized, and returns void
.
and local_init =
| AssignInit of init
| ConsInit of varinfo * exp list * constructor_kind
ConsInit(f,args,kind)
indicates that the corresponding local is initialized via a call to f
, of kind kind
with the given args
.
Initializers for local variables.
Function definitions
A function definition is always introduced with a GFun
constructor at the top level. All the information about the function is stored into a Cil_types.fundec
. Some of the information (e.g. its name, type, storage, attributes) is stored as a Cil_types.varinfo
that is a field of the fundec
. To refer to the function from the expression language you must use the varinfo
.
The function definition contains, in addition to the body, a list of all the local variables and separately a list of the formals. Both kinds of variables can be referred to in the body of the function. The formals must also be shared with the formals that appear in the function type. For that reason, to manipulate formals you should use the provided functions Cil.makeFormalVar
and Cil.setFormals
.
and fundec = {
mutable svar : varinfo;
Holds the name and type as a variable, so we can refer to it easily from the program. All references to this function either in a function call or in a prototype must point to the same varinfo
.
mutable sformals : varinfo list;
Formals. These must be in the same order and with the same information as the formal information in the type of the function. Use Cil.setFormals
or Cil.setFunctionType
to set these formals and ensure that they are reflected in the function type. Do not make copies of these because the body refers to them.
mutable slocals : varinfo list;
Locals. Does NOT include the sformals. Do not make copies of these because the body refers to them.
mutable smaxid : int;
Max local id. Starts at 0. Used for creating the names of new temporary variables. Updated by Cil.makeLocalVar
and Cil.makeTempVar
. You can also use Cil.setMaxId
to set it after you have added the formals and locals.
mutable sbody : block;
mutable smaxstmtid : int option;
max id of a (reachable) statement in this function, if we have computed it. range = 0 ... (smaxstmtid-1). This is computed by Cfg.computeCFGInfo
.
mutable sallstmts : stmt list;
After you call Cfg.computeCFGInfo
this field is set to contain all statements in the function.
mutable sspec : funspec;
}
and block = {
mutable battrs : attributes;
mutable bscoping : bool;
Whether the block is used to determine the scope of local variables.
mutable blocals : varinfo list;
variables that are local to the block. It is a subset of the slocals of the enclosing function.
mutable bstatics : varinfo list;
static variables whose syntactic scope is restricted to the block. They are normalized as globals, since their lifetime is the whole program execution, but we maintain a syntactic scope information here for better traceability from the original source code.
mutable bstmts : stmt list;
The statements comprising the block.
}
A block is a sequence of statements with the control falling through from one element to the next. In addition, blocks are used to determine the scope of variables, through the blocals field. Variables in blocals
that have their vdefined
field set to true
must appear as the target of a Local_init
instruction directly in the bstmts
, with two exceptions: the Local_init
instruction can be part of an UnspecifiedSequence
, or of a block that has bscoping
set to false
. Such block _must not_ itself have local variables: it denotes a simple list of statements grouped together (e.g. to stay in scope of an annotation extending to the whole list).
Statements
CIL statements are the structural elements that make up the CFG. They are represented using the type Cil_types.stmt
. Every statement has a (possibly empty) list of labels. The Cil_types.stmtkind
field of a statement indicates what kind of statement it is.
Use Cil.mkStmt
to make a statement and then fill in the fields.
CIL also comes with support for control-flow graphs. The sid
field in stmt
can be used to give unique numbers to statements, and the succs
and preds
fields can be used to maintain a list of successors and predecessors for every statement. The CFG information is not computed by default. Instead you must explicitly use the functions Cfg.prepareCFG
and Cfg.computeCFGInfo
to do it.
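The relationship between the succs and preds fields can be illustrated with a small model in which each edge is recorded in both directions. This is only a sketch of the bookkeeping, on a stripped-down statement record; `mk_stmt` and `add_edge` here are hypothetical local functions and do not reflect how the CFG functions named above are implemented.

```ocaml
(* Stripped-down statement: just an identifier and the two edge lists. *)
type stmt = {
  sid : int;
  mutable succs : stmt list;
  mutable preds : stmt list;
}

let mk_stmt (sid : int) : stmt = { sid; succs = []; preds = [] }

(* Record the edge [src -> dst] in both directions so that preds stays
   the inverse of succs. *)
let add_edge (src : stmt) (dst : stmt) : unit =
  src.succs <- src.succs @ [ dst ];
  dst.preds <- dst.preds @ [ src ]

(* if (c) s1; s2;   with s0 the conditional *)
let s0 = mk_stmt 0
let s1 = mk_stmt 1
let s2 = mk_stmt 2

let () =
  add_edge s0 s1;      (* "then" branch *)
  add_edge s0 s2;      (* empty "else" falls through to s2 *)
  add_edge s1 s2       (* end of "then" falls through to s2 *)

let ids = List.map (fun s -> s.sid)
```

Statements with edges form cyclic structures as soon as a loop is present, so they are best compared through their sid and not with structural equality.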
and stmt = {
mutable labels : label list;
Whether the statement starts with some labels, case statements or default statements.
mutable skind : stmtkind;
mutable sid : int;
A number (>= 0) that is unique in a function. Filled in only after the CFG is computed.
mutable succs : stmt list;
The successor statements. They can always be computed from the skind and the context in which this statement appears. Filled in only after the CFG is computed.
mutable preds : stmt list;
The inverse of the succs function.
mutable ghost : bool;
mutable sattr : attributes;
}
and label =
| Label of string * location * bool
A real label. If the bool is true, the label comes from the input source program. If the bool is false, the label was created by CIL or some other transformation.
| Case of exp * location
| Default of location
and stmtkind =
| Instr of instr
An instruction that does not contain control flow. Control implicitly falls through.
| Return of exp option * location
The return statement. This is a leaf in the CFG.
| Goto of stmt Stdlib.ref * location
A goto statement. Appears from actual goto's in the code or from goto's that have been inserted during elaboration. The reference points to the statement that is the target of the Goto. This means that you have to update the reference whenever you replace the target statement. The target statement MUST have at least a label.
| Break of location
A break to the end of the nearest enclosing Loop or Switch.
| Continue of location
A continue to the start of the nearest enclosing Loop.
| If of exp * block * block * location
A conditional. Two successors, the "then" and the "else" branches (in this order). Both branches fall through to the successor of the If statement.
| Switch of exp * block * stmt list * location
A switch statement. exp is the index of the switch. block is the body of the switch. stmt list contains the set of statements whose labels are cases of the switch (i.e. for each case, the corresponding statement is in stmt list, a statement cannot appear more than once in the list, and statements in stmt list can have several labels corresponding to several cases).
| Loop of code_annotation list * block * location * stmt option * stmt option
A while(1) loop. The termination test is implemented in the body of a loop using a Break statement. If Cfg.prepareCFG has been called, the first stmt option will point to the stmt containing the continue label for this loop and the second will point to the stmt containing the break label for this loop.
| Block of block
Just a block of statements. Use it as a way to keep some block attributes local.
| UnspecifiedSequence of (stmt * lval list * lval list * lval list * stmt Stdlib.ref list) list
Statements whose order of execution is not specified by ISO C. This is important for the order of side effects during evaluation of expressions. Each statement comes with three lists of lvals and a list of statement references, in this order:
- lvals that are written during the sequence and whose future value depends upon the statement (it is legal to read from them, but not to write to them);
- lvals that are written during the evaluation of the statement itself;
- lvals that are read;
- function calls in the corresponding statement.
Note that this includes only a subset of the assignments of the statement. Namely, the temporary variables generated by CIL are excluded (i.e. it is assumed that the "compilation" is correct). In addition, side effects caused by function applications are not taken into account in the list. For a single statement, the written lvals are supposed to be ordered (or their order of evaluation doesn't matter), so that an alarm should be emitted only if the lvals read by a statement overlap with the lvals written (or read) by another statement of the sequence.
At this time this feature is experimental and may miss some unspecified sequences.
In case you do not care about this feature, just handle it like a block (see Cil.block_from_unspecified_sequence).
| Throw of (exp * typ) option * location
Throws an exception, C++ style. We keep the type of the expression, to match it against the appropriate catch clause. A Throw node has no successor, even if it is in a try-catch block that will catch the exception: we keep normal and exceptional control-flow completely separate, as in Jo and Chang, ICSSA 2004.
| TryCatch of block * (catch_binder * block) list * location
| TryFinally of block * block * location
On MSVC we support structured exception handling. This is what you might expect. Control can get into the finally block either from the end of the body block, or if an exception is thrown.
| TryExcept of block * instr list * exp * block * location
On MSVC we support structured exception handling. The try/except statement is a bit tricky:
__try { blk }
__except (e) {
handler
}
The argument to __except must be an expression. However, we keep a list of instructions AND an expression in case you need to make function calls. We'll print those as a comma expression. The control can get to the __except expression only if an exception is thrown. After that, depending on the value of the expression the control goes to the handler, propagates the exception, or retries the exception. The location corresponds to the try keyword.
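The Loop constructor of stmtkind described above is always a while(1) loop, with the exit test expressed as a Break in the body. A minimal sketch of that encoding follows; the exp and stmt types are simplified stand-ins rather than the real Cil_types definitions, and mk_while is a hypothetical helper name.

```ocaml
(* Simplified stand-ins for Cil_types.exp and Cil_types.stmtkind. *)
type exp = Var of string | Const of int

type stmt =
  | Instr of string                   (* opaque instruction text *)
  | Break
  | If of exp * stmt list * stmt list (* condition, then, else *)
  | Loop of stmt list                 (* body of a while(1) loop *)

(* Encode `while (cond) body` as a while(1) loop whose first
   statement breaks out as soon as cond is false. *)
let mk_while cond body = Loop (If (cond, [], [ Break ]) :: body)
```

With this encoding, the source loop `while (c) i++;` becomes a Loop whose body starts with a conditional on c, with an empty "then" branch and a Break in the "else" branch, followed by the original body.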
and catch_binder =
| Catch_exn of varinfo * (varinfo * block) list
Catch exception of given type(s). If the list is empty, only exceptions with the same type as the varinfo can be caught. If the list is non-empty, only exceptions matching the type of one of the varinfos in the list are caught. The associated block contains the operations necessary to transform the matched varinfo into the principal one. Semantics is by value (i.e. the varinfo is bound to a copy of the caught object).
This clause is a declaration point for the varinfo(s) mentioned in it. More precisely, for Catch_exn(v_0,[(v_1, b_1),..., (v_n, b_n)]), the v_i must be referenced in the slocals of the enclosing fundec, and _must not_ appear in any blocals of some block. The scope of v_0 is all the b_i and the corresponding block in the catch_binder * block list of the TryCatch node the binder belongs to. The scope of the other v_i is the corresponding b_i.
| Catch_all
default catch clause: all exceptions are caught.
Kind of exceptions that are caught by a given clause.
and instr =
| Set of lval * exp * location
An assignment. A cast is present if the exp has different type from lval
| Call of lval option * exp * exp list * location
optional: result is an lval. A cast might be necessary if the declared result type of the function is not the same as that of the destination. Actual arguments must have a type equivalent (i.e. Cil.need_cast must return false) to the one of the formals of the function. If the type of the result variable is not the same as the declared type of the function result then an implicit cast exists.
| Local_init of varinfo * local_init * location
initialization of a local variable. The corresponding varinfo must belong to the blocals list of the innermost enclosing block that does not have attribute Cil.block_no_scope_attr. Such blocks are purely here for grouping statements and do not play a role for scoping variables. See Cil_types.block definition for more information.
| Asm of attributes * string list * extended_asm option * location
An inline assembly instruction. The arguments are (1) a list of attributes (only const and volatile can appear here and only for GCC) (2) templates (CR-separated) (3) GCC extended asm information if any (4) location information
| Skip of location
| Code_annot of code_annotation * location
Instructions. They may cause effects directly but may not have control flow.
and extended_asm = {
asm_outputs : (string option * string * lval) list;
outputs must be lvals with optional names and constraints. I would like these to be actual variables, but I ran into some trouble with ASMs in the Linux sources.
asm_inputs : (string option * string * exp) list;
inputs with optional names and constraints
asm_clobbers : string list;
asm_gotos : stmt Stdlib.ref list;
list of statements this asm section may jump to. Destination must have a label.
}
GNU extended-asm information:
- a list of outputs, each of which is an lvalue with optional names and constraints.
- a list of input expressions along with constraints
- clobbered registers
- possible destination statements
Describes a location in a source file
Abstract syntax trees for annotations
and logic_constant =
| Integer of Integer.t * string option
Integer constant with a textual representation.
| LStr of string
| LWStr of int64 list
Wide character string constant.
| LChr of char
| LReal of logic_real
| LEnum of enumitem
and logic_real = {
r_literal : string;
Initial string representation s.
r_nearest : float;
Nearest approximation of s in double precision.
r_upper : float;
Smallest double u such that s <= u.
r_lower : float;
Greatest double l such that l <= s.
}
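The fields of logic_real satisfy r_lower <= s <= r_upper, with r_nearest lying between the two bounds. The sketch below builds a conservative enclosure for a literal; it is only an illustration, not how Frama-C computes these fields. It assumes that float_of_string rounds to the nearest double and that the literal is finite, and the bounds it produces are valid but not necessarily the tightest ones required by the field descriptions. enclose is a hypothetical helper name.

```ocaml
(* Mirror of the logic_real record above, with immutable fields. *)
type logic_real = {
  r_literal : string;
  r_nearest : float;
  r_upper : float;
  r_lower : float;
}

(* Conservative enclosure: since the nearest double is at most half an
   ulp away from the exact value s, the doubles just below and just
   above it always bracket s. These bounds may be one step wider than
   the greatest lower and smallest upper doubles. *)
let enclose literal =
  let n = float_of_string literal in
  { r_literal = literal;
    r_nearest = n;
    r_upper = Float.succ n;
    r_lower = Float.pred n }
```

For a literal such as 0.1, which has no exact double representation, a tight enclosure would use r_nearest itself as one of the two bounds; deciding which one requires exact arithmetic on the decimal literal, which this sketch deliberately avoids.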
and logic_type =
| Ctype of typ
| Ltype of logic_type_info * logic_type list
a user-defined logic type with its parameters
| Lvar of string
| Linteger
mathematical integers, i.e. Z
| Lreal
mathematical reals, i.e. R
| Larrow of logic_type list * logic_type
and identified_term = {
it_id : int;
it_content : term;
}
and logic_label =
| StmtLabel of stmt Stdlib.ref
| FormalLabel of string
label of global annotation.
| BuiltinLabel of logic_builtin_label
logic label referring to a particular program point.
and logic_builtin_label =
| Here
| Old
| Pre
| Post
| LoopEntry
| LoopCurrent
| Init
builtin logic labels defined in ACSL.
Terms
C Expressions as logic terms follow C constructs (with prefix T)
and term = {
term_node : term_node;
term_loc : Filepath.position * Filepath.position;
position in the source file.
term_type : logic_type;
term_name : string list;
names of the term if any. A name can be an arbitrary string, where '"' and '\'' are escaped by a \, and which does not end with a \. Hence, "name" and 'name' should be recognized as a unique label by most tools.
}
the various kind of terms.
lvalue: base address and offset.
and term_lhost =
| TVar of logic_var
| TResult of typ
value returned by a C function. Only used in post-conditions or assigns
| TMem of term
base address of an lvalue.
and model_info = {
mi_name : string;
mi_field_type : logic_type;
mi_base_type : typ;
type to which the field is associated.
mi_decl : location;
where the field has been declared.
mutable mi_attr : attributes;
attributes tied to the field.
}
and logic_info = {
mutable l_var_info : logic_var;
We use only fields lv_name and lv_id of l_var_info. We should factorize lv_type and l_type+l_profile below.
mutable l_labels : logic_label list;
label arguments of the function.
mutable l_tparams : string list;
mutable l_type : logic_type option;
return type. None for predicates
mutable l_profile : logic_var list;
mutable l_body : logic_body;
}
description of a logic function or predicate.
and builtin_logic_info = {
mutable bl_name : string;
mutable bl_labels : logic_label list;
mutable bl_params : string list;
mutable bl_type : logic_type option;
mutable bl_profile : (string * logic_type) list;
}
and logic_body =
| LBnone
no definition and no reads clause
| LBreads of identified_term list
read accesses performed by a function.
| LBterm of term
direct definition of a function.
| LBpred of predicate
direct definition of a predicate.
| LBinductive of (string * logic_label list * string list * predicate) list
and logic_type_info = {
mutable lt_name : string;
lt_params : string list;
mutable lt_def : logic_type_def option;
definition of the type. None for abstract types.
mutable lt_attr : attributes;
attributes associated to the logic type.
}
Description of a logic type.
and logic_var_kind =
| LVGlobal
global logic function or predicate.
| LVC
Logic counterpart of a C variable.
| LVFormal
formal parameter of a logic function / predicate or \lambda abstraction
| LVQuant
Bound by a quantifier (\exists or \forall)
| LVLocal
origin of a logic variable.
and logic_var = {
mutable lv_name : string;
mutable lv_id : int;
mutable lv_type : logic_type;
mutable lv_kind : logic_var_kind;
mutable lv_origin : varinfo option;
when the logic variable stems from a C variable, set to the original C variable.
mutable lv_attr : attributes;
attributes tied to the logic variable
}
description of a logic variable
and logic_ctor_info = {
mutable ctor_name : string;
ctor_type : logic_type_info;
type to which the constructor belongs.
ctor_params : logic_type list;
types of the parameters of the constructor.
}
Description of a constructor of a logic sum-type.
Predicates
variables bound by a quantifier.
and relation =
| Rlt
| Rgt
| Rle
| Rge
| Req
| Rneq
and predicate_kind =
| Assert
| Check
| Admit
and toplevel_predicate = {
tp_kind : predicate_kind;
whether the annotation is only used to check that ip_content holds, but stays invisible for other verification tasks (see description of ACSL's check keyword).
tp_statement : predicate;
}
main statement of an annotation.
and predicate = {
pred_name : string list;
pred_loc : location;
position in the source code.
pred_content : predicate_node;
}
predicates with a location and an optional list of names
variant of a loop or a recursive function.
and allocation =
| FreeAlloc of identified_term list * identified_term list
tsets. Empty list means \nothing.
| FreeAllocAny
Nothing specified. Semantics depends on where it is written.
and deps =
| From of identified_term list
tsets. Empty list means \nothing.
| FromAny
Nothing specified. Any location can be involved.
dependencies of an assigned location.
and assigns =
| WritesAny
Nothing specified. Anything can be written.
| Writes of from list
list of locations that can be written. Empty list means \nothing.
zone assigned with its dependencies.
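The difference between an unspecified assigns clause and an empty one is easy to get wrong: WritesAny allows everything, while Writes with an empty list allows nothing. A minimal sketch, assuming a simplified assigns type where locations are plain strings instead of terms with their dependencies; may_write is a hypothetical helper name.

```ocaml
(* Simplified stand-in: the real Writes carries a list of assigned
   zones together with their dependencies, not plain strings. *)
type assigns =
  | WritesAny               (* nothing specified *)
  | Writes of string list   (* the empty list encodes \nothing *)

(* May the location loc be written under the given clause? *)
let may_write clause loc =
  match clause with
  | WritesAny -> true
  | Writes locs -> List.mem loc locs
```

The same reading applies to the From and FromAny constructors of deps, and to FreeAlloc and FreeAllocAny, where an empty list also means \nothing.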
and spec = {
mutable spec_behavior : behavior list;
mutable spec_variant : variant option;
variant for recursive functions.
mutable spec_terminates : identified_predicate option;
mutable spec_complete_behaviors : string list list;
list of complete behaviors. It is possible to have more than one set of complete behaviors
mutable spec_disjoint_behaviors : string list list;
list of disjoint behaviors. It is possible to have more than one set of disjoint behaviors
}
Function or statement contract. This type shares the name of its constructors with Logic_ptree.spec.
Extension to standard ACSL clause with a unique identifier.
The integer is a (unique) identifier. The boolean flag is true if the annotation can be assigned a property status.
Use Logic_const.new_acsl_extension to create a new ACSL extension with a fresh id. Each extension is associated with a keyword, and can be either a global annotation, the clause of a function contract, a code annotation, or a loop annotation. An extension can be registered through the function Acsl_extension.register_xxx.
It is _not_ possible to register the same keyword for annotations at two different levels (e.g. global and behavior), as this would make the grammar ambiguous.
and acsl_extension_kind =
| Ext_id of int
id used internally by the extension itself.
| Ext_terms of term list
| Ext_preds of predicate list
a list of predicates, the most common case for extensions
| Ext_annot of string * acsl_extension list
Where the corresponding extension keyword is expected to be found.
and ext_code_annot_context =
| Ext_here
at current program point.
| Ext_next_stmt
| Ext_next_loop
| Ext_next_both
can be found both as normal code annot or loop annot.
Behavior of a function or statement. This type shares the name of its constructors with Logic_ptree.behavior.
and termination_kind =
| Normal
| Exits
| Breaks
| Continues
| Returns
kind of termination a post-condition applies to. See ACSL manual.
and loop_pragma =
| Unroll_specs of term list
| Widen_hints of term list
| Widen_variables of term list
Pragmas for the value analysis plugin of Frama-C.
and slice_pragma =
| SPexpr of term
| SPctrl
| SPstmt
Pragmas for the slicing plugin of Frama-C.
and impact_pragma =
| IPexpr of term
| IPstmt
Pragmas for the impact plugin of Frama-C.
The various kinds of pragmas.
and code_annotation_node =
| AAssert of string list * toplevel_predicate
assertion to be checked. The list of strings is the list of behaviors to which this assertion applies.
| AStmtSpec of string list * spec
statement contract (potentially restricted to some enclosing behaviors).
| AInvariant of string list * bool * toplevel_predicate
loop/code invariant. The list of strings is the list of behaviors to which this invariant applies. The boolean flag is true for normal loop invariants and false for invariant-as-assertions.
| AVariant of variant
loop variant. Note that there can be at most one variant associated to a given statement
| AAssigns of string list * assigns
loop assigns (see b_assigns in the behaviors for other assigns). At most one clause associated to a given (statement, behavior) couple.
| AAllocation of string list * allocation
loop allocation clause (see b_allocation in the behaviors for other allocation clauses). At most one clause associated to a given (statement, behavior) couple.
| APragma of pragma
| AExtended of string list * bool * acsl_extension
extension in a code or loop annotation. Boolean flag is true for loop extensions and false for code extensions
all annotations that can be found in the code. This type shares the name of its constructors with Logic_ptree.code_annot.
function contract.
and code_annotation = {
annot_id : int;
annot_content : code_annotation_node;
content of the annotation.
}
global annotations, not attached to a statement or a function.
type kinstr =
| Kstmt of stmt
| Kglobal
type cil_function =
| Definition of fundec * location
| Declaration of funspec * varinfo * varinfo list option * location
Declaration(spec,f,args,loc) represents a leaf function f with specification spec and arguments args, at location loc. As with the TFun constructor of Cil_types.typ, the arg list is optional, to distinguish void f() (None) from void f(void) (Some []).
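The None versus Some [] distinction can be sketched with a small printer. This is an illustration only: formals are reduced to strings holding their C declarations, the return type is fixed to void, and prototype is a hypothetical helper, not part of Frama-C.

```ocaml
(* Render a prototype for a void function from an optional formal list.
   None means the declaration gave no parameter information at all,
   while Some [] means an explicitly empty parameter list. *)
let prototype name (args : string list option) =
  match args with
  | None -> Printf.sprintf "void %s()" name
  | Some [] -> Printf.sprintf "void %s(void)" name
  | Some formals ->
    Printf.sprintf "void %s(%s)" name (String.concat ", " formals)
```

Code that only tests whether the list is empty would conflate the first two cases, which is why the option is kept in the Declaration constructor.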
Internal representation of decorated C functions
Only field fundec can be used directly. Use Annotations.funspec, Annotations.add_* and Annotations.remove_* to query or modify field spec.
type syntactic_scope =
| Global
Any global symbol, whether static or not.
| Program
Only non-static global symbols.
| Translation_unit of Filepath.Normalized.t
Any global visible within the given C source file.
| Formal of kernel_function
formal parameter of the given function.
| Block_scope of stmt
locals (including static locals) of the block to which the given statement belongs.
| Whole_function of kernel_function
same as above, but any local variable of the given function, regardless of the block to which it is tied, will be considered.
Various syntactic scopes through which an identifier might be searched. Note that for this purpose static variables are still tied to the block where they were declared in the original source (see Cil_types.block).
type mach = {
sizeof_short : int;
sizeof_int : int;
sizeof_long : int;
sizeof_longlong : int;
sizeof_ptr : int;
sizeof_float : int;
sizeof_double : int;
sizeof_longdouble : int;
sizeof_void : int;
sizeof_fun : int;
size_t : string;
ssize_t : string;
wchar_t : string;
ptrdiff_t : string;
intptr_t : string;
uintptr_t : string;
int_fast8_t : string;
int_fast16_t : string;
int_fast32_t : string;
int_fast64_t : string;
uint_fast8_t : string;
uint_fast16_t : string;
uint_fast32_t : string;
uint_fast64_t : string;
wint_t : string;
sig_atomic_t : string;
time_t : string;
alignof_short : int;
alignof_int : int;
alignof_long : int;
alignof_longlong : int;
alignof_ptr : int;
alignof_float : int;
alignof_double : int;
alignof_longdouble : int;
alignof_str : int;
alignof_fun : int;
char_is_unsigned : bool;
little_endian : bool;
alignof_aligned : int;
has__builtin_va_list : bool;
compiler : string;
cpp_arch_flags : string list;
version : string;
weof : string;
wordsize : string;
posix_version : string;
bufsiz : string;
eof : string;
fopen_max : string;
filename_max : string;
host_name_max : string;
tty_name_max : string;
l_tmpnam : string;
path_max : string;
tmp_max : string;
rand_max : string;
mb_cur_max : string;
nsig : string;
errno : (string * string) list;
machdep_name : string;
custom_defs : string;
}
Definition of a machine model (architecture + compiler).