Bitvector -- a type for representing binary values.
Overview
A numeric value with a 2-complement binary representation. It is good for representing addresses, offsets and other numeric values.
Each value is attributed by a its bit-width. All arithmetic operations over values are done modulo their widths. It is an error to apply arithmetic operation to values with different widths. Default implementations will raise a Width
exception, however there exists a family of modules that provide arithmetic operations lifted to an Or_error.t
monad. It is suggested to use them, if you know what kind of operands you're expecting.
Clarifications endianness and bit-ordering
Bitvector should be considered as an number with an arbitrary width. That means, that as with all numbers it is subject to endianness. When we iterate over bitvector using some container interface we always start from the byte with the lower address. Depending on endianness it will be either least significant bytes (little-endian), or most significant (big-endian). Sometimes id does matter, sometimes it doesn't. In a latter case you can just use a default native-endian interface. But in a former case, please consider using explicit modules, either Bytes_LE
or Bytes_BE
, even if you know that your system is LE
. Things change.
Bits are always numbered from right to left, with least significant bit having a zero index, and most significant having index equal to width - 1
. That means, they're endianness agnostic.
Clarification on size-morphism
Size-monomorphic operations (as opposed to size-polymorphic comparison) doesn't allow to compare two operands with different sizes, and either raise exception or return Error
. If we would have type safe interface, with type t
defined as type 'a t
, where 'a
stands for size, then size-monomorphic operations will have type 'a t -> 'a t -> _
, and size-polymorphic 'a t -> 'b t -> _
.
By default, size-polymorphic comparison is used (for rationale of this decision look at the implementation of a hash function). To understand the ordering relation one can think that a lexical ordering is specified on a tuple (x,n)
, where x
is the number and n
is the size. For example, the following sequence is in an ascending order:
0x0:1, 0x0:32, 0x0:64, 0x1:1, 0x1:32, 0xD:4, 0xDEADBEEF:32
.
A size-monomorphic interfaced is exposed in a Mono
submodule. So if you want a monomorphic map, then just use Mono.Map
module. Note, Mono
submodule doesn't provide Table
, since we cannot guarantee that all keys in a hash-table have equal size.
Clarification on signs
By default all numbers represented by a bitvector are considered unsigned. This includes comparisons, e.g., of_int (-1) ~width:32
is greater than zero. If you need to perform signed operation, you can use signed
operator to temporary cast your value to signed. We use temporary to emphasize that, the signedness property won't propagate to the result of the operation, e.g. result of the following expression: Int_exn.(signed x / y)
will not be signed.
If any operand of a binary operation is signed, then a signed version of an operation is used.
Remember to use explicit casts, whenever you really need a signed representation. Examples:
let x = of_int ~-6 ~width:8
let y = to_int x (* y = 250 *)
let z = to_int (signed x) (* z = ~-6 *)
let zero = of_int 0 ~width:8
let p = x < zero (* p = false *)
let q = signed x < zero (* p = true *)
Clarification on string representation
As a part of Identifiable
interface bitvector provides a pair of complement functions: to_string
and of_string
, that provides facilities to store bitvector as a human readable string, and to restore it from string. The format of the representation is the following (in EBNF):
repr = [sign], base, digit, {digit}, ":", size | true | false;
sign = "+" | "-";
base = "0x" | "0b" | "0o";
size = dec, {dec};
digit = dec | oct | hex;
dec = ?decimal digit?;
oct = ?octal digit?;
hex = ?hexadecimal digit?;
Examples: 0x5D:32, 0b0101:16, 5:64, +5:8, +0x5D:16, true, false.
.
Form false
is a shortcut for 0:1
, as well as true
is 1:1
.
If base
is omitted base-10 is assumed. The output format is lways "0x", hex, {hex}
in an unsigned form.
word
is an abbreviation to Bitvector.t
Common Interfaces
Bitvector is a value, first of all, so it supports a common set of a value interface: it can be stored, compared, it can be a key in a dictionary, etc. Moreover, being a number it can be compared with zero and applied to a common set of integer operations.
Bitvector implements a common set of operations that are expected from integral values.
include Regular.Std.Regular.S with type t := t
val t_of_sexp : Sexplib.Sexp.t -> t
val sexp_of_t : t -> Sexplib.Sexp.t
val bin_t : t Core_kernel.Std.Bin_prot.Type_class.t
val bin_read_t : t Core_kernel.Std.Bin_prot.Read.reader
val __bin_read_t__ : (int -> t) Core_kernel.Std.Bin_prot.Read.reader
val bin_reader_t : t Core_kernel.Std.Bin_prot.Type_class.reader
val bin_size_t : t Core_kernel.Std.Bin_prot.Size.sizer
val bin_write_t : t Core_kernel.Std.Bin_prot.Write.writer
val bin_writer_t : t Core_kernel.Std.Bin_prot.Type_class.writer
val to_string : t -> string
val str : unit -> t -> string
val pps : unit -> t -> string
val ppo : Core_kernel.Std.out_channel -> t -> unit
val pp_seq : Format.formatter -> t Core_kernel.Std.Sequence.t -> unit
val pp : Format.formatter -> t -> unit
val (>=) : t -> t -> bool
val (<=) : t -> t -> bool
val (<>) : t -> t -> bool
val equal : t -> t -> bool
val compare : t -> t -> int
val ascending : t -> t -> int
val descending : t -> t -> int
val between : t -> low:t -> high:t -> bool
val clamp_exn : t -> min:t -> max:t -> t
val clamp : t -> min:t -> max:t -> t Core_kernel.Or_error.t
val validate_lbound :
min:t Core_kernel.Maybe_bound.t ->
t Core_kernel.Validate.check
val validate_ubound :
max:t Core_kernel.Maybe_bound.t ->
t Core_kernel.Validate.check
val validate_bound :
min:t Core_kernel.Maybe_bound.t ->
max:t Core_kernel.Maybe_bound.t ->
t Core_kernel.Validate.check
val hashable : t Core_kernel.Std.Hashable.Hashtbl.Hashable.t
module Table : sig ... end
type info = string * [ `Ver of string ] * string option
val size_in_bytes : ?ver:string -> ?fmt:string -> t -> int
val of_bytes : ?ver:string -> ?fmt:string -> Regular.Std.bytes -> t
val to_bytes : ?ver:string -> ?fmt:string -> t -> Regular.Std.bytes
val blit_to_bytes :
?ver:string ->
?fmt:string ->
Regular.Std.bytes ->
t ->
int ->
unit
val of_bigstring : ?ver:string -> ?fmt:string -> Core_kernel.Std.bigstring -> t
val to_bigstring : ?ver:string -> ?fmt:string -> t -> Core_kernel.Std.bigstring
val blit_to_bigstring :
?ver:string ->
?fmt:string ->
Core_kernel.Std.bigstring ->
t ->
int ->
unit
module Cache : sig ... end
val add_reader :
?desc:string ->
ver:string ->
string ->
t Regular.Std.reader ->
unit
val add_writer :
?desc:string ->
ver:string ->
string ->
t Regular.Std.writer ->
unit
val available_readers : unit -> info list
val default_reader : unit -> info
val set_default_reader : ?ver:string -> string -> unit
val with_reader : ?ver:string -> string -> (unit -> 'a) -> 'a
val available_writers : unit -> info list
val default_writer : unit -> info
val set_default_writer : ?ver:string -> string -> unit
val with_writer : ?ver:string -> string -> (unit -> 'a) -> 'a
val default_printer : unit -> info option
val set_default_printer : ?ver:string -> string -> unit
val with_printer : ?ver:string -> string -> (unit -> 'a) -> 'a
val find_reader : ?ver:string -> string -> t Regular.Std.reader option
val find_writer : ?ver:string -> string -> t Regular.Std.writer option
include Integer.S with type t := t
include Integer.Base with type t := t
abs x
absolute value of x
lnot x
is a logical negation of x
(1-complement)
logand x y
is a conjunction of x
and y
logand x y
is a conjunction of x
and y
logor x y
is a disjunction of x
and y
logxor x y
is exclusive or between x
and y
lshift x y
shift x
by y
bits left
rshift x y
shift x
by y
bits to the right
val arshift : t -> t -> t
arshift x y
shift x
by y
bits to the right and fill with the sign bit.
A common set of infix operators
module Mono : Core_kernel.Std.Comparable with type t := t
A comparable interface with size-monomorphic comparison.
type endian =
| LittleEndian
least significant byte comes first
| BigEndian
most significant byte comes first
Specifies the order of bytes in a word.
include sig ... end
val endian_of_sexp : Sexplib.Sexp.t -> endian
val sexp_of_endian : endian -> Sexplib.Sexp.t
val bin_endian : endian Core_kernel.Std.Bin_prot.Type_class.t
val bin_read_endian : endian Core_kernel.Std.Bin_prot.Read.reader
val __bin_read_endian__ : (int -> endian) Core_kernel.Std.Bin_prot.Read.reader
val bin_reader_endian : endian Core_kernel.Std.Bin_prot.Type_class.reader
val bin_size_endian : endian Core_kernel.Std.Bin_prot.Size.sizer
val bin_write_endian : endian Core_kernel.Std.Bin_prot.Write.writer
val bin_writer_endian : endian Core_kernel.Std.Bin_prot.Type_class.writer
Constructors
val of_string : string -> t
of_bool x
is a bitvector with length 1
and value b0
if x
is false and b1
otherwise.
val of_int : width:int -> int -> t
of_int ~width n
creates a bitvector of the specified bit-width
with the value equal to n
. If bits of the n
that doesn't fit into width
are ignored.
val of_int32 : ?width:int -> int32 -> t
of_int32 ?width n
creates a bitvector of the specified bit-width
with the value equal to n
. If bits of the n
that doesn't fit into width
are ignored. Parameter width
defaults to 32
.
val of_int64 : ?width:int -> int64 -> t
of_int32 ?width n
creates a bitvector of the specified bit-width
with the value equal to n
. If bits of the n
that doesn't fit into width
are ignored. Parameter width
defaults to 32
.
Some predefined constant constructors
b0 = of_bool false
is a zero bit
b1 = of_bool true
is a one bit
Helpful shortcuts
one width
number one with a specified width
, is a shortcut for of_int 1 ~width
zero width
zero with a specified width
, is a shortcut for of_int 0 ~width
ones width
is a number with a specified width
, and all bits set to 1. It is a shortcut for lnot (zero width)
val of_binary : ?width:int -> endian -> string -> t
of_binary ?width endian num
creates a bitvector from a string interpreted as a sequence of bytes in a specified order.
The result is always positive. num
argument is not shared. width
defaults to String.length num
Conversions to built-in integers
val to_int : t -> int Core_kernel.Std.Or_error.t
to_int x
projects x
in to OCaml int
.
val to_int32 : t -> int32 Core_kernel.Std.Or_error.t
to_int32 x
projects x
in to int32
val to_int64 : t -> int64 Core_kernel.Std.Or_error.t
to_int64 x
projects x
in to int64
val string_of_value : ?hex:bool -> t -> string
string_of_value ?hex x
returns a textual representation of x
value. If hex
is true (defaul), then it is in hexadecimal format without a leading (0x). Otherwise the decimal format is used.
signed t
casts t to a signed type, so that any operations applied on t
will be signed
is_zero bv
is true iff all bits are set to zero.
is_ones bv
is true if the least significant bit is equal to one
bitwidth bv
return a bit-width, i.e., the amount of bits
extract bv ~hi ~lo
extracts a subvector from bv
, starting from bit hi
and ending with lo
. Bits are enumerated from right to left (from least significant to most), starting from zero. hi
maybe greater then size
.
hi
defaults to width bv - 1
lo
defaults to 0
.
Example:
extract (of_int 17 ~width:8) ~hi:4 ~lo:3
will result in a two bit vector consisting of the forth and third bits, i.e., equal to a number 2
.
lo
and hi
should be non-negative numbers. lo
must be less then a width bv
and hi >= lo
.
extract_exn bv ~hi ~lo
is the same as extract
, but will raise an exception on error.
concat b1 b2
concatenates two bitvectors
succ n
returns next value after n
. It is not guaranteed that signed (succ n) > signed n
pred n
returns a value preceding n
.
val nsucc : t -> int -> t
nsucc m n
is Fn.apply_n_times ~n succ m
, but more efficient.
val npred : t -> int -> t
npred m n
is Fn.apply_n_times ~n pred addr
, but more efficient.
Iteration over bitvector components
enum_bytes x order
returns a sequence of bytes of x
in a specified order
. Each byte is represented as a bitvector
itself.
enum_bytes x order
returns bytes of x
in a specified order
, with bytes represented by char
type
enum_bits x order
returns bits of x
in a specified order
. order
defines only the ordering of words in a bitvector, bits will always be in MSB first order. The length of the sequence is always a power of 8
.
Comparison with zero
Note, we're not including With_zero
interface, since it refers to the `Sign` module, that is available only in core_kernel >= 113.33.00.
val validate_positive : t Core_kernel.Std.Validate.check
validate_positive
validates that a value is positive.
val validate_non_negative : t Core_kernel.Std.Validate.check
validate_non_negative
validates that a value is non negative.
val validate_negative : t Core_kernel.Std.Validate.check
validate_negative
validates that a value is negative.
val validate_non_positive : t Core_kernel.Std.Validate.check
validate_non_positive
validates that a value is not positive.
val is_positive : t -> bool
is_positive x
is true if x
is greater than zero. Always true if x
is unsigned.
val is_non_negative : t -> bool
is_non_negative x
is true if x
is greater than or equal to zero. Tautology if x
is unsigned.
val is_negative : t -> bool
is_negative x
is true if x
is strictly less than zero. It is a contradiction if x
is not signed.
val is_non_positive : t -> bool
is_non_positive x
is true if x
is less than zero. It is a contradiction if x
is not signed.
Arithmetic that raises exceptions.
module Trie : sig ... end
Prefix trees for bitvectors.