package bap-std
Disassembled program.
A project contains the data that was reconstructed during disassembly, semantic analysis, and any number of other analyses.
A project also lets you associate arbitrary data with memory regions and program terms, and even attach it globally to the project itself, so it can be seen as a knowledge base of deeply interconnected facts.
Besides delivering information from BAP to passes, a project can also be used as a communication medium between different passes (see Working with project).
type t = project
val bin_size_state : state Core_kernel.Bin_prot.Size.sizer
val bin_write_state : state Core_kernel.Bin_prot.Write.writer
val bin_writer_state : state Core_kernel.Bin_prot.Type_class.writer
val bin_read_state : state Core_kernel.Bin_prot.Read.reader
val __bin_read_state__ : (int -> state) Core_kernel.Bin_prot.Read.reader
val bin_reader_state : state Core_kernel.Bin_prot.Type_class.reader
val bin_state : state Core_kernel.Bin_prot.Type_class.t
IO interface to a project data structure.
include Regular.Std.Data.S with type t := t
val size_in_bytes : ?ver:string -> ?fmt:string -> t -> int
val of_bytes : ?ver:string -> ?fmt:string -> Regular.Std.bytes -> t
val to_bytes : ?ver:string -> ?fmt:string -> t -> Regular.Std.bytes
val blit_to_bytes :
?ver:string ->
?fmt:string ->
Regular.Std.bytes ->
t ->
int ->
val of_bigstring : ?ver:string -> ?fmt:string -> Core_kernel.bigstring -> t
val to_bigstring : ?ver:string -> ?fmt:string -> t -> Core_kernel.bigstring
val blit_to_bigstring :
?ver:string ->
?fmt:string ->
Core_kernel.bigstring ->
t ->
int ->
module Io : sig ... end
module Cache : sig ... end
val add_reader :
?desc:string ->
ver:string ->
string ->
t Regular.Std.reader ->
val add_writer :
?desc:string ->
ver:string ->
string ->
t Regular.Std.writer ->
val available_readers : unit -> info list
val default_reader : unit -> info
val available_writers : unit -> info list
val default_writer : unit -> info
val default_printer : unit -> info option
val find_reader : ?ver:string -> string -> t Regular.Std.reader option
val find_writer : ?ver:string -> string -> t Regular.Std.writer option
val create :
?package:string ->
?state:state ->
?disassembler:string ->
?brancher:brancher source ->
?symbolizer:symbolizer source ->
?rooter:rooter source ->
?reconstructor:reconstructor source ->
input ->
t Core_kernel.Or_error.t
create input
creates a project from the provided input source.
The input code regions are speculatively disassembled and the set of basic blocks is determined using the algorithm described in Disasm.Driver. After that, the concrete whole-program control-flow graph (CFG) is built, which can be accessed with the Project.disasm function. The whole-program CFG is then partitioned into a set of subroutines using dominator analysis; see Disasm.Subroutines for details. Based on this partition, a symbol table, which is a set of subroutine control-flow graphs, is built. The symbol table, which can be accessed with Project.symbols, also contains information about the interprocedural control flow. Finally, the symbol table is translated into the intermediate representation, which can be accessed using the Project.program function. The whole process is pictured below.
 ---------------------
(    Disassembling    )
 ---------------------
           |
+---------------------+
|                     |
|  All instructions   |
|  and basic blocks   |
|                     |
+---------------------+
           |
 ---------------------
( CFG reconstruction  )
 ---------------------
           |
+---------------------+
|                     |
|  The whole program  |
|  control-flow graph |
|                     |
+---------------------+
           |
 ---------------------
(    Partitioning     )
 ---------------------
           |
+---------------------+
|                     |
| The quotient set of |
| basic blocks        |
|                     |
+---------------------+
           |
 ---------------------
( Constructing Symtab )
 ---------------------
           |
+---------------------+
|                     |
|  The symbol table   |
|  and the callgraph  |
|                     |
+---------------------+
           |
 ---------------------
(  IR Reconstruction  )
 ---------------------
           |
+---------------------+
|                     |
|  The IR of the      |
|  binary program     |
|                     |
+---------------------+
The disassembling process is fully integrated with the knowledge base. If the input source provides information about symbols and their location, then this information will be automatically reflected to the knowledge base.
The brancher, symbolizer, and rooter parameters have been ignored since 2.0.0; the information they carry can instead be reflected to the knowledge base using Brancher.provide, Symbolizer.provide, and Rooter.provide, respectively.
val empty : Bap_core_theory.Theory.Target.t -> t
empty target
creates an empty project for the given target.
val target : t -> Bap_core_theory.Theory.Target.t
target project
returns the target system of the project.
val specification : t -> Ogre.doc
specification p
returns the specification of the binary.
map_program t ~f
maps the IR representation of the program with the function f.
Note: since the program is computed lazily, passes that transform the program representation should prefer this function to program composed with with_program, so that the transformation is not run if the program is never used.
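The laziness argument can be illustrated with a small standalone sketch. This is a toy model built only on the OCaml standard library, not BAP's implementation: the record type, the `builds` counter, and the helper `double` are all hypothetical, and the functions merely mimic the roles of program, with_program, and map_program.

```ocaml
(* Toy model, not BAP's implementation: the "program" is an int list
   that is expensive to build, so it is kept behind a lazy value. *)
type project = { program : int list Lazy.t }

(* Counts how many times the program was actually built. *)
let builds = ref 0

let create () =
  { program = lazy (incr builds; [1; 2; 3]) }

(* Eager accessors, analogous in spirit to program / with_program. *)
let program p = Lazy.force p.program
let with_program _p prog = { program = Lazy.from_val prog }

(* Deferred transformation, analogous in spirit to map_program:
   [f] is applied only if somebody later forces the program. *)
let map_program p ~f = { program = lazy (f (Lazy.force p.program)) }

(* Hypothetical transformation used for the demonstration. *)
let double = List.map (fun x -> 2 * x)

let () =
  let p = create () in
  let lazy_p = map_program p ~f:double in
  Printf.printf "builds after map_program: %d\n" !builds;
  let eager_p = with_program p (double (program p)) in
  Printf.printf "builds after with_program: %d\n" !builds;
  ignore (lazy_p, eager_p)
```

In this model the first line reports zero builds, because nothing has forced the program yet, while composing program with with_program forces the build immediately.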
memory t
returns the memory as an interval tree marked with arbitrary values.
val memory_slot :
(Bap_core_theory.Theory.Unit.cls, value memmap) Bap_core_theory.KB.slot
the memory of the unit in the knowledge base.
tag_memory project region tag value
tags the given region of memory in project with the given tag and value. Example: Project.tag_memory project tainted color red
substitute p region tag value
is like tag_memory, but it also applies substitutions in the provided string value, as per the OCaml standard library's Buffer.add_substitute. Example:
Project.substitute project comment "$symbol starts at $symbol_addr"
The following substitutions are supported:
- name of region of file to which it belongs. For example, in ELF this name will correspond to the section name
- name or address of the symbol to which this memory belongs
- assembler listing of the memory region
- BIL code of the tagged memory region
- name or address of a basic block to which this region belongs
- $min_addr, $addr - starting address of a memory region
- address of the last byte of a memory region.
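The substitution mechanism itself comes from the standard library, so it can be tried without BAP. The following is a minimal sketch: the `lookup` table and its values are invented for illustration, whereas in a real project the variables would be resolved from the tagged memory region.

```ocaml
(* Hypothetical lookup table standing in for the data that would be
   derived from the tagged region; the names and values are made up. *)
let lookup = function
  | "symbol" -> "main"
  | "symbol_addr" -> "0x4004f0"
  | var -> "$" ^ var (* leave unknown variables visibly unexpanded *)

(* Expands $var and ${var} occurrences in [template] using [lookup]. *)
let substitute template =
  let buf = Buffer.create (String.length template) in
  Buffer.add_substitute buf lookup template;
  Buffer.contents buf

let () = print_endline (substitute "$symbol starts at $symbol_addr")
```

With this table the program prints `main starts at 0x4004f0`. Note that Buffer.add_substitute treats underscores as part of a variable name, which is why `$symbol_addr` is looked up as a single variable rather than as `$symbol` followed by `_addr`.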
with_memory project
updates the project memory. It is recommended to use tag_memory and substitute instead of this function, if possible.
Extensible record
A project can also be viewed as an extensible record, in which one can store arbitrary values. Example:
let p = Project.set project color `green
This will set the field color to the value `green.
set project field value
sets field to the given value. If field was already set, then the new value overrides the old one. Otherwise the field is added.
has project field
checks whether field exists or not. Useful for fields of type unit, which are actually isomorphic to bool fields, e.g., if Project.has project mark
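The idea of an extensible record with typed fields can be sketched in plain OCaml. This is a toy model under stated assumptions, not how BAP implements it: the `Record` module, its `field` constructor, and the `color` and `mark` fields are all hypothetical, and the typed storage relies on an extensible variant used as a universal type.

```ocaml
module Record = struct
  (* Universal type: every field adds its own private constructor. *)
  type univ = ..

  type 'a field = { id : int; inj : 'a -> univ; prj : univ -> 'a option }

  module Store = Map.Make (Int)

  type t = univ Store.t

  let next_id = ref 0

  (* Creates a fresh field that can hold values of type [a]. *)
  let field (type a) () : a field =
    let module M = struct type univ += V of a end in
    incr next_id;
    { id = !next_id;
      inj = (fun x -> M.V x);
      prj = (function M.V x -> Some x | _ -> None) }

  let empty : t = Store.empty

  (* Adds the field, or overrides its value if it was already set. *)
  let set t f v = Store.add f.id (f.inj v) t
  let get t f = Option.bind (Store.find_opt f.id t) f.prj
  let has t f = Option.is_some (get t f)
end

(* Hypothetical fields mirroring the names used in the text. *)
let color : [ `green | `red ] Record.field = Record.field ()
let mark : unit Record.field = Record.field ()

let () =
  let p = Record.set Record.empty color `green in
  let p = Record.set p mark () in
  if Record.has p mark then print_endline "mark is set";
  match Record.get p color with
  | Some `green -> print_endline "color is green"
  | Some `red -> print_endline "color is red"
  | None -> print_endline "color is unset"
```

A field of type unit carries no information beyond its presence, which is why `has` on such a field behaves like reading a boolean flag.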
module Info : sig ... end
Information obtained during project reconstruction.
module State : sig ... end
The core state of the project.
module Input : sig ... end
Input information.
Registering passes
To add a new pass, call one of the following register_* functions.
val register_pass :
?autorun:bool ->
?runonce:bool ->
?deps:string list ->
?name:string ->
(t -> t) ->
register_pass ?autorun ?runonce ?deps ?name pass
registers a pass over a project.
If autorun is true, then the host program will run this pass automatically. If runonce is true, then the pass will be run only once for a given project; each repeated attempt to run it is ignored. The runonce parameter defaults to false when autorun is false, and to true otherwise.
The deps parameter is a list of dependencies. Each dependency is the name of a pass that should be run before the pass. The dependencies are run in the specified order every time the pass is run.
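The interaction of runonce and deps can be modelled in a few lines. The sketch below is a standalone toy registry, not BAP's pass manager: the `project` record, the `trace` field, the `run_pass` driver, and the mandatory `~name` argument are assumptions made for illustration. In particular, skipping the dependencies of an already-run runonce pass is a modelling choice that the text above does not settle.

```ocaml
(* Toy project: only remembers which passes ran, newest first. *)
type project = { trace : string list }

type pass = {
  deps : string list;
  runonce : bool;
  run : project -> project;
}

let registry : (string, pass) Hashtbl.t = Hashtbl.create 16

(* [runonce] defaults to the value of [autorun], as described above.
   Unlike the real function, [name] is mandatory in this model. *)
let register_pass ?(autorun = false) ?runonce ?(deps = []) ~name run =
  let runonce = Option.value runonce ~default:autorun in
  Hashtbl.replace registry name { deps; runonce; run }

(* Runs the dependencies in the given order, then the pass itself.
   A runonce pass that already ran on this project is skipped. *)
let rec run_pass proj name =
  let pass = Hashtbl.find registry name in
  if pass.runonce && List.mem name proj.trace then proj
  else
    let proj = List.fold_left run_pass proj pass.deps in
    let proj = pass.run proj in
    { trace = name :: proj.trace }

let () =
  register_pass ~runonce:true ~name:"init" (fun p -> p);
  register_pass ~deps:["init"] ~name:"report" (fun p -> p);
  let p = run_pass { trace = [] } "report" in
  let p = run_pass p "report" in
  print_endline (String.concat " " (List.rev p.trace))
```

In this model the program prints `init report report`: the dependency runs before the first `report`, and is skipped on the second request because it was registered with runonce.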
To get access to command-line arguments, use Plugin.argv.
val register_pass' :
?autorun:bool ->
?runonce:bool ->
?deps:string list ->
?name:string ->
(t -> unit) ->
register_pass' pass
registers a pass that doesn't modify the project and is run only for its side effects (see register_pass).
val passes : unit -> pass list
passes ()
returns all currently registered passes.
val find_pass : string -> pass option
find_pass name
returns a pass with the given name.
module Pass : sig ... end
A program analysis pass.
module Collator : sig ... end
A pass that collates projects.
module Analysis : sig ... end
Knowledge base analyses.