package bap-std


Basic disassembler.

This is a target agnostic basic low-level disassembler.

type pred = [
  | `Valid
      stop on first valid insn
  | Kind.t
]

predicate to drive the disassembler

include sig ... end
val pred_of_sexp : Sexplib.Sexp.t -> pred
val __pred_of_sexp__ : Sexplib.Sexp.t -> pred
val sexp_of_pred : pred -> Sexplib.Sexp.t

Basic types

type (+'a, +'k) insn

insn basic instruction.

See Insn module for a more detailed description.

@typevar 'a = asm | empty, denotes whether assembly representation is available for the given instruction.

@typevar 'k = kinds | empty, denotes whether semantic kinds are available for the given instruction.

type (+'a, +'k) insns = (mem * ('a, 'k) insn option) list

insns is a list of pairs, where each pair consists of a memory region occupied by an instruction, and the instruction itself.

type empty

witnesses the absence of the information

type asm

witnesses the presence of the assembly string

type kinds

witnesses the presence of the semantic kinds

type full_insn = (asm, kinds) insn

abbreviate an instruction with full information.

include sig ... end
val sexp_of_full_insn : full_insn -> Sexplib.Sexp.t
val compare_full_insn : full_insn -> full_insn -> int
type ('a, 'k) t

Disassembler.

The 'a and 'k type variables specify the disassembler's modes of operation. During disassembly it can store extra information that might be useful. However, since storing it takes extra time and space, it is disabled by default.

The first type variable specifies whether storing assembly strings is enabled. It can be switched using the store_asm and drop_asm functions. When it is enabled, this type variable is set to asm, which gives access to the functions that return this information. Otherwise, this type variable is set to empty, which prevents you from accessing assembler information.

The second type variable stands for kinds, i.e. to store or not to store extra information about instruction kind.

Note: at some points you can have an access to this information even if you don't enable it explicitly.
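
The phantom-type mechanism described above can be illustrated with a minimal standalone sketch. It is a toy model, not BAP's implementation: the module Toy, its string payload, and the accessor asm_string are hypothetical, and only one mode parameter is modelled instead of two.

```ocaml
(* Toy model of phantom-typed modes; NOT BAP's implementation.
   [Toy], [asm_string] and the string payload are hypothetical. *)
module Toy : sig
  type empty
  type asm
  type 'a t

  val create : string -> empty t

  (* Switching the mode only changes the phantom parameter. *)
  val store_asm : _ t -> asm t
  val drop_asm : _ t -> empty t

  (* Only callable when the phantom parameter is [asm]. *)
  val asm_string : asm t -> string
end = struct
  type empty
  type asm
  type 'a t = { text : string }

  let create text = { text }
  let store_asm d = { text = d.text }
  let drop_asm d = { text = d.text }
  let asm_string d = d.text
end

let d = Toy.create "mov rax, rbx"
let d' = Toy.store_asm d
let s = Toy.asm_string d'

(* [Toy.asm_string d] would be rejected by the type checker,
   because [d] has type [Toy.empty Toy.t]. *)
```

The point of the design is that a disabled mode is a compile-time error rather than a runtime check.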

type (+'a, +'k, 's, 'r) state

Disassembler state.

A word of caution: this state is valid only inside the handler functions of the run function. It shouldn't be stored anywhere. The first two type variables are bound to the two corresponding variables of the disassembler type ('a, 'k) t. The last pair of type variables is bound to the input and output types of user functions. They are made different so that a function can be run in an arbitrary monad. For simple cases, they can be made the same.

val with_disasm : ?debug_level:int -> ?cpu:string -> backend:string -> string -> f:((empty, empty) t -> 'a Core_kernel.Std.Or_error.t) -> 'a Core_kernel.Std.Or_error.t

with_disasm ?debug_level ?cpu ~backend ~f target creates a disassembler, passing all options to the create function, and applies the function f to it. Once f is evaluated, the disassembler is closed with the close function.
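
The create/apply/close sequence can be sketched with stand-in types. This is a minimal illustration under assumptions: handle, with_handle, and the string error type are hypothetical and replace BAP's disassembler and Or_error, and the sketch does not model what happens if f raises an exception.

```ocaml
(* Hypothetical stand-in for a disassembler handle; not BAP's type. *)
type handle = { target : string; mutable closed : bool }

let create ~backend target : (handle, string) result =
  if backend = "" then Error "no backend given"
  else Ok { target; closed = false }

let close h = h.closed <- true

(* Create the handle, apply [f], then close it whatever [f] returned.
   If creation fails, [f] is never called. *)
let with_handle ~backend target ~f =
  match create ~backend target with
  | Error e -> Error e
  | Ok h ->
    let r = f h in
    close h;
    r

(* Usage: remember the handle so we can observe that it was closed. *)
let seen = ref None

let result =
  with_handle ~backend:"llvm" "x86_64" ~f:(fun h ->
      seen := Some h;
      Ok (String.length h.target))
```

Because the handle is closed as soon as f returns, f should not let it escape, as the usage above does only for demonstration.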

val create : ?debug_level:int -> ?cpu:string -> backend:string -> string -> (empty, empty) t Core_kernel.Std.Or_error.t

create ?debug_level ?cpu ~backend target creates a disassembler for the specified target. All parameters are backend specific; consult the concrete backend for more information. In general, the greater debug_level is, the more debug information will be output by the backend. To silence the backend, set it to 0, which is the default value. Example:

create ~debug_level:3 ~backend:"llvm" "x86_64"

val close : (_, _) t -> unit

close d closes a disassembler d.

val store_asm : (_, 'k) t -> (asm, 'k) t

enables storing assembler information

val store_kinds : ('a, _) t -> ('a, kinds) t

enables storing instruction kinds information

val run : ?backlog:int -> ?stop_on:pred list -> ?invalid:(('a, 'k, 's, 'r) state -> mem -> 's -> 'r) -> ?stopped:(('a, 'k, 's, 'r) state -> 's -> 'r) -> ?hit:(('a, 'k, 's, 'r) state -> mem -> (asm, kinds) insn -> 's -> 'r) -> ('a, 'k) t -> return:('s -> 'r) -> init:'s -> mem -> 'r

run ?stop_on ?invalid ?stopped dis mem ~init ~return ~hit performs recursive disassembly of the specified memory mem. The process of disassembly can be driven using the stop, step, back and jump functions, described later.

  • parameter backlog

    defines a size of history of states, that can be used for backtracking. Defaults to some positive natural number.

  • parameter stop_on

    defines a set of predicates that will be checked on each step to decide whether the disassembler should stop and call the user-provided hit function, or continue. The decision is made according to the rule: if exists stop_on then stop, i.e., if any predicate in the set evaluates to true, then stop the disassembly and pass control to the user function hit. A few notes: only valid instructions can match predicates, and if the set is empty, then it always evaluates to false.

  • parameter init

    initial value of user data, that can be passed through handlers (cf., fold)

  • parameter return

    a function that lifts user data type 's to type 'r. It is useful when you need to perform disassembly in some monad, like Or_error, or Lwt. Otherwise, just use ident function and assume that 's == 'r.

    During disassembly, user-provided callbacks are invoked by the engine. At least two parameters are passed to each callback: state and user_data. user_data is arbitrary data of type 's with which the folding over the memory is actually performed. state encapsulates the current state of the disassembler and provides the continuation functions, namely stop, step, jump and back, that drive the process of disassembly. These functions are used to pass control back to the disassembler.

    stopped state user_data is called when there is no more data to disassemble. This handler is optional and defaults to stop.

    invalid state mem user_data is an optional handler that is called on each invalid instruction (i.e., a portion of data that is not a valid instruction). It defaults to step, i.e., to skipping.

    hit state mem insn data is called when one of the predicates specified by the user was hit. insn is the instruction that satisfies the predicate. mem is the memory region spanned by the instruction. data is the user data. insn can be queried for the assembly string and kinds even if the corresponding modes are disabled.
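
The interplay of stop_on, the handlers, and the continuation functions can be modelled in a few lines. The following is a toy sketch, not BAP's engine: memory is a list of integers, the decoder and the kind and pred types are hypothetical, the handlers do not receive a mem argument, hit is made mandatory, and jump, back and backlog are left out.

```ocaml
(* Toy model of a continuation-driven disassembly loop; NOT BAP's code. *)
type kind = Call | Ret | Other
type insn = { kind : kind }
type pred = Valid | Kind of kind

(* Hypothetical decoder: byte 0 is invalid, 1 is a call, 2 is a return. *)
let decode = function
  | 0 -> None
  | 1 -> Some { kind = Call }
  | 2 -> Some { kind = Ret }
  | _ -> Some { kind = Other }

type ('s, 'r) state = {
  rest : int list;  (* bytes not decoded yet *)
  preds : pred list;
  return : 's -> 'r;
  hit : ('s, 'r) state -> insn -> 's -> 'r;
  invalid : ('s, 'r) state -> 's -> 'r;
  stopped : ('s, 'r) state -> 's -> 'r;
}

let matches insn = function Valid -> true | Kind k -> insn.kind = k

(* Finish and lift the user data into the result type. *)
let stop s data = s.return data

(* Continue from the current point until a predicate matches,
   an invalid byte is met, or the memory is exhausted. *)
let rec step s data =
  match s.rest with
  | [] -> s.stopped s data
  | b :: rest ->
    let s = { s with rest } in
    match decode b with
    | None -> s.invalid s data
    | Some insn ->
      (* "if exists stop_on then stop": an empty set never matches. *)
      if List.exists (matches insn) s.preds then s.hit s insn data
      else step s data

let run ?(stop_on = []) ?(invalid = step) ?(stopped = stop)
    ~hit ~return ~init mem =
  step { rest = mem; preds = stop_on; return; hit; invalid; stopped } init

(* Count valid instructions up to and including the first [Ret]. *)
let count =
  run ~stop_on:[Valid] ~return:(fun n -> n) ~init:0
    ~hit:(fun s insn n ->
        if insn.kind = Ret then stop s (n + 1) else step s (n + 1))
    [3; 0; 1; 2; 5]
```

In this sketch every handler ends by calling a continuation, which is how control returns to the loop; passing ~return:(fun n -> Ok n) instead would run the same fold with a result-typed output.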

val insn_of_mem : (_, _) t -> mem -> (mem * (asm, kinds) insn option * [ `left of mem | `finished ]) Core_kernel.Std.Or_error.t

insn_of_mem dis mem disassembles one instruction from the given memory region mem. Returns a tuple (imem, insn, left), where imem is the piece of memory consumed during disassembly, insn is Some ins if disassembly was successful and None otherwise, and left is either `left rest, where rest complements imem to the original mem, or `finished when no memory remains.
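
The shape of this result can be illustrated with a toy decoder over strings. This is a sketch under assumptions: the two-character instruction encoding, the string-based memory and error types, and decode_all are all hypothetical and unrelated to any real backend.

```ocaml
(* Toy single-instruction decoder; NOT BAP's [insn_of_mem].
   A toy instruction is an uppercase letter followed by a digit ("A1").
   Anything else consumes one byte and yields [None]. *)
let insn_of_mem mem =
  let n = String.length mem in
  if n = 0 then Error "empty memory"
  else
    let is_insn =
      n >= 2
      && mem.[0] >= 'A' && mem.[0] <= 'Z'
      && mem.[1] >= '0' && mem.[1] <= '9'
    in
    let len = if is_insn then 2 else 1 in
    let imem = String.sub mem 0 len in
    let insn = if is_insn then Some imem else None in
    let left =
      if len = n then `finished
      else `left (String.sub mem len (n - len))
    in
    Ok (imem, insn, left)

(* Decode a whole region by following the [`left] remainder.
   The result has the shape of the [insns] type: (memory, insn option). *)
let rec decode_all mem acc =
  match insn_of_mem mem with
  | Error _ -> List.rev acc
  | Ok (imem, insn, `finished) -> List.rev ((imem, insn) :: acc)
  | Ok (imem, insn, `left rest) -> decode_all rest ((imem, insn) :: acc)

let decoded = decode_all "A1?B2" []
```

Here the invalid byte "?" is still reported with the memory it occupied, paired with None.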

val addr : (_, _, _, _) state -> addr

current position of the disassembler

val preds : (_, _, _, _) state -> pred list

current set of predicates

val with_preds : ('a, 'k, 's, 'r) state -> pred list -> ('a, 'k, 's, 'r) state

updates the set of predicates, that rules the stop condition.

val insns : ('a, 'k, _, _) state -> ('a, 'k) insns

a queue of instructions disassembled in this step

val last : ('a, 'k, 's, 'r) state -> int -> ('a, 'k) insns

last s n returns the last n instructions disassembled in this step. If there are fewer than n instructions, then a shorter list is returned.
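
The "at most n" behaviour can be sketched on a plain list. This is only an illustration: last_n is a hypothetical helper, and the oldest-first ordering of the result is an assumption, since the ordering is not specified here.

```ocaml
(* Return the last [n] elements of [xs], or all of [xs] if it is shorter.
   Hypothetical helper; the element order (oldest first) is an assumption. *)
let last_n xs n =
  let len = List.length xs in
  List.filteri (fun i _ -> i >= len - n) xs

let two = last_n [1; 2; 3; 4] 2
let all = last_n [1; 2] 5
```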

val memory : (_, _, _, _) state -> mem

the memory region we're currently working on

val stop : (_, _, 's, 'r) state -> 's -> 'r

stop the disassembly and return the provided value.

val step : (_, _, 's, 'r) state -> 's -> 'r

continue disassembling from the current point. You can change the set of predicates before stepping. If you want to continue from a different address, use jump.

val jump : (_, _, 's, 'r) state -> mem -> 's -> 'r

jump to the specified memory and continue disassembly in it.

For example, if you want to jump to a specified address, and you're working in the Or_error monad, then you can write:

view ~from:addr (memory state) >>= fun mem -> jump state mem data

val back : (_, _, 's, 'r) state -> 's -> 'r

restarts the last step.

module Insn : sig ... end

Basic instruction. This instruction is an opaque pointer into C-backend, thus it is protected with phantom types.

module Trie : sig ... end

Trie maps over instructions

val available_backends : unit -> string list

enumerates names of available disassembler backends.

