Suppose we want to process a stream of events that represent the behavior of some memory allocator. We expect the following kinds of events:
(acquire SITE PTR LEN)represents an allocation event, where SITE is an allocation site (i.e., an address of a program instruction that performs an allocation),
PTRis a pointer to the allocated data, and
LENis the data size;
(release SITE PTR LEN)represents a memory deallocation event;
(violate SITE PTR KIND)represents a violation of a memory allocator invariants of the specified KIND.
Suppose we want to build a rule, that will report each violation as a tuple that contains a kind of the violation and all three sites, that are involved in the incident.
(((acquire ?asite ?ptr ?) (release ?rsite ?ptr ?) (violate ?vsite ?ptr ?kind) ((?kind ?asite ?rsite ?vsite)))
This rule can be read as, for all acquire, release, and violate events that have the same
PTR produce a fact of a violation of the specified kind that contains the program locations of these three events. For example, given the following sequence of events:
(acquire L10 0xBAD 10) (release L20 0XBAD 10) (violate L30 0XBAD use-after-free) (violate L40 0xBAD double-free) (violate L50 0XBAD use-after-free)
The following facts will be produced if the rule is sequentially applied to the above sequence:
(use-after-free L10 L20 L30) (double-free L10 L20 L40) (use-after-free L10 L20 L50)
A tuple is represented by an arbitrary S-expression. A variable is an atom that starts with the question mark. A special variable
? (one question mark symbol) may occur on the left hand side of a rule, and represents a freshly created variable. Every occurrence of the
? symbol represents a different variable. Both sides of a rule may be empty (represented by the 0-tuple
rule,r ::= ((p1 .. pM) (f1 .. fN)) sexp,s ::= atom | (s1 .. sM) patt,p ::= sexp fact,f ::= sexp atom,a ::= ?sequence-of-chars?
(((duck ?name) (does ?action)) ((looks like ?name did ?action)))
This rule contains two patterns
(duck ?name) and
(does ?action) and one production fact
(looks like ?name did ?action).
A rule r is a tuple (P,F) where P is a sequence of patterns p1,...,pM, F is a sequence of facts f1,...,fN, and M,N are non-negative numbers. A pattern p and fact f are arbitrary n-tuples (terms) t, that contain atoms, other tuples, and free variables.
rule,r ::= (B,P,F) P ::= p1,... F ::= f1,... B ::= v1->f,... V ::= v1,... T ::= t1,... term,p,f,s,t ::= v | a | (t,..,tM) atom,x,y,z ::= ?atom? v ::= ?variable?
We use a regular sequent calculus (with all structural rules) to represent semantics. A rule with a sequence of patterns P matches with a sequence of facts with the given valuations, denoted as
B,T |- P, if it can be proved using the rules below (plus regular rules of the sequent calculus). The (match-rule) rule, expands rules (eliminating tuples on the top level). The (match-tuple) rule further expands patterns until each pattern is either an atom, in that case it is proved by the axiom of identity, or to a variable, in that case a match can be proved with the (match-var) rule, that says that a variable matches a term only if it is bound to that term.
Note, that exchange and weakening rules of sequent calculus allows us to drop and rearrange facts. The latter makes the order of rules irrelevant. If for a sequence of facts, there are several valuations of variables that will lead to a match, then all these matches are provided. In other words, the engine will generate all derivable facts in an unspecified but consistent order.
B,T |- p1 .. B,T |- pM ----------------------- (match-rule) B,T |- (p1,..,pM) B,t1,T |- s1 .. B,tM,T |- sM ---------------------------- (match-tuple) B,(t1,..,tM),T |- (s1,..,sM) B,T |- v->t ---------- (match-var) B,t,T |- v
type tuple = Core_kernel.Sexp.t
representation of a tuple
type fact = tuple
representation of a fact
module Rule : sig ... end
Matching rule specification.