Module Std.Word
Shortcut for bitvectors that represent words
Bitvector -- an integer with modular arithmentics.
Overview
A numeric value with the 2-complement binary representation. It is good for representing addresses, offsets and other arithmetic values.
Each value is attributed by a bitwidth and signedness. All arithmetic operations over values are done modulo their widths. It is an error to apply arithmetic operation to values with different widths. Default implementations will raise a an exception, however there exists a family of modules that provide arithmetic operations lifted to an Or_error.t monad. It is suggested to use them, if you know what kind of operands you're expecting.
Clarification on signs
By default, all are numbers represented with bitvectors are considered unsigned. This includes the ordering, e.g., of_int (-1) ~width:32 is greater than of_int 0 ~width:32. If you need to perform a signed operation, you can use the signed operator create a signed word with the same value.
If any operand of a binary operation is signed, then a signed version of an operation is used, i.e., the other operand is upcasted to the signed kind.
Remember to use explicit casts, whenever you really need a signed representation. Examples:
let x = of_int ~-6 ~width:8
let y = to_int x (* y = 250 *)
let z = to_int (signed x) (* z = ~-6 *)
let zero = of_int 0 ~width:8
let p = x < zero (* p = false *)
let q = signed x < zero (* p = true *)
Clarification on size-morphism
Size-monomorphic operations (as opposed to size-polymorphic) expect operands of the same size. When applied to operands of different sizes they either raise exceptions or return an Error variant as the result. All arithmetic operations are size-monomorphic and we provide interface that use either exceptions or Result.t to indicate the outcome.
The comparison operation is size-polymorphic by default and takes the size of the bitvector into account. Bitvectors with equal values but different sizes are unequal. The precise order matches with the order of pairs, where the first constituent is the bitvector value, and the second is its size, for example, the following sequence is in an ascending order:
0x0:1, 0x0:32, 0x0:64, 0x1:1, 0x1:32, 0xD:4, 0xDEADBEEF:32
.
A size-monomorphic interfaced is exposed in a Mono submodule. So if you want a monomorphic map, then just use Mono.Map module. Note, Mono submodule doesn't provide Table, since we cannot guarantee that all keys in a hash-table have equal size. The order functions provided by the Mono module will raise an exception when applied to bitvectors with different sizes.
In the default and Mono orders, if either of two values is signed (see Clarification on signs) then the values will be ordered as 2-complement signed integers.
Another alternative orders are Signed_value_order, Unsigned_value_order, and Literal_order. They will be briefly described below.
Signed_value_order is size-polymoprhic and it simply ignores the sizes of bitvectors and orders them by values, e.g., the following bitvectors are ordered in the Value.Signed order, FF:8; 0:1; 0F:8; FF:32, and 0:1 is equal to 0:32. See Clarification on size-morphism for more details on the signedness of operations. Note, that the size of a word still affects the order since it defines the position of the most significant bit.
Unsigned_value_order ignores the sign and the size of words and compares them by the unsigned order of their values. he following numbers are ordered with the Unsigned_value_order order, 0:1, 1:32, 0F:8 FF:8, and FF:32 is equal to FF:8. Unsigned_value_order is faster than then any previously described order and is useful when the size of the words should be ignored (or is known to be equal and therefore could be ignored).
Literal_order is the fastest order that takes into account all constituents of bitvectors, like if we will treat a bitvector as triple of its value, size, and sign and order bitvectors using the lexicographical order.
Clarification on string representation
As a part of Identifiable interface bitvector provides a pair of complement functions: to_string and of_string, that provides facilities to store bitvector as a human readable string, and to restore it from string. The format of the representation is the following (in EBNF):
repr = [sign], [base], digit, {digit}, ":", size, [kind]
sign = "+" | "-";
base = "0x" | "0b" | "0o";
size = dec, {dec};
digit = dec | oct | hex;
dec = ?decimal digit?;
oct = ?octal digit?;
hex = ?hexadecimal digit?;
kind = u | s
Examples: 0x5D:32s, 0b0101:16u, 5:64, +5:8, +0x5D:16.
If base is omitted base-10 is assumed. If the kind is omitted, then the usigned kind is assumed. The output format is always in a hex representation with a full prefix. .
word is an abbreviation to Bitvector.t
Common Interfaces
A bitvector is a value, first of all, so it supports a common set of a value interface: it can be stored, compared, it can be a key in a dictionary, etc. Moreover, being a number it can be compared with zero and applied to a common set of integer operations.
include Regular.Std.Regular.S
with type t := t
with type comparator_witness = Bitvector.comparator_witness
include Core_kernel.Bin_prot.Binable.S with type t := t
val bin_size_t : t Bin_prot.Size.sizerval bin_write_t : t Bin_prot.Write.writerval bin_read_t : t Bin_prot.Read.readerval __bin_read_t__ : (int -> t) Bin_prot.Read.readerval bin_shape_t : Bin_prot.Shape.tval bin_writer_t : t Bin_prot.Type_class.writerval bin_reader_t : t Bin_prot.Type_class.readerval bin_t : t Bin_prot.Type_class.tinclude Sexplib0.Sexpable.S with type t := t
val t_of_sexp : Sexplib0__.Sexp.t -> tval sexp_of_t : t -> Sexplib0__.Sexp.tinclude Ppx_compare_lib.Comparable.S with type t := t
include Regular.Std.Printable.S with type t := t
val to_string : t -> stringto_string x returns a human-readable representation of x
val str : unit -> t -> stringstr () t is formatted output function that matches "%a" conversion format specifier in functions, that prints to string, e.g., sprintf, failwithf, errorf and, surprisingly all Lwt printing function, including Lwt_io.printf and logging (or any other function with type ('a,unit,string,...) formatN`. Example:
Or_error.errorf "type %a is not valid for %a"
Type.str ty Exp.str exp
val pps : unit -> t -> stringval ppo : Core_kernel.Out_channel.t -> t -> unitwill print to a standard output_channel, useful for using in printf, fprintf, etc.
prints a sequence of values of type t
this will include pp function from Core that has type t printer, and can be used in Format.printf family of functions
include Core_kernel.Pretty_printer.S with type t := t
include Core_kernel.Comparable.S_binable
with type t := t
with type comparator_witness = Bitvector.comparator_witness
val (>=) : t -> t -> boolval (<=) : t -> t -> boolval (<>) : t -> t -> boolval equal : t -> t -> boolval compare : t -> t -> intval ascending : t -> t -> intval descending : t -> t -> intval between : t -> low:t -> high:t -> boolval clamp_exn : t -> min:t -> max:t -> tval clamp : t -> min:t -> max:t -> t Base__.Or_error.tval validate_lbound : min:t Core__.Maybe_bound.t -> t Validate.checkval validate_ubound : max:t Core__.Maybe_bound.t -> t Validate.checkval validate_bound :
min:t Core__.Maybe_bound.t ->
max:t Core__.Maybe_bound.t ->
t Validate.checkinclude Core_kernel.Hashable.S_binable with type t := t
val hash_fold_t : t Base__Ppx_hash_lib.hash_foldval hash : t -> Base__Ppx_hash_lib.Std.Hash.hash_valueval hashable : t Core__.Hashtbl.Hashable.tmodule Table : sig ... endinclude Regular.Std.Data.S with type t := t
type info = string * [ `Ver of string ] * string optionname,Ver v,desc information attached to a particular reader or writer.
Data representation version. After any change in data representation the version should be increased.
Serializers that are derived from a data representation must have the same version as a version of the data structure, from which it is derived. This kind of serializers can only read and write data of the same version.
Other serializers can actually read and write data independent on its representation version. A serializer, that can't store data of current version simply shouldn't be added to a set of serializers.
It is assumed, that if a reader and a writer has the same name and version, then whatever was written by the writer should be readable by the reader. The round-trip equality is not required, thus it is acceptable if some information is lost.
It is also possible, that a reader and a writer that has the same name are compatible. In that case it is recommended to use semantic versioning.
val size_in_bytes : ?ver:string -> ?fmt:string -> t -> intsize_in_bytes ?ver ?fmt datum returns the amount of bytes that is needed to represent datum in the given format and version
of_bytes ?ver ?fmt bytes deserializes a value from bytes.
to_bytes ?ver ?fmt datum serializes a datum to a sequence of bytes.
blit_to_bytes ?ver ?fmt buffer datum offset copies a serialized representation of datum into a buffer, starting from the offset.
of_bigstring ?ver ?fmt buf deserializes a datum from bigstring
of_bigstring ?ver ?fmt datum serializes a datum to a sequence of bytes represented as bigstring
blit_to_bigstring ?ver ?fmt buffer datum offset copies a serialized representation of datum into a buffer, starting from offset.
Input/Output functions for the given datum.
module Cache : sig ... endadd_reader ?desc ~ver name reader registers a new reader with a provided name, version ver and optional description desc
add_writer ?desc ~ver name writer registers a new writer with a provided name, version ver and optional description desc
val available_readers : unit -> info listavailable_reader () lists available readers for the data type
val default_reader : unit -> infodefault_reader returns information about default reader
val set_default_reader : ?ver:string -> string -> unitset_default_reader ?ver name sets new default reader. If version is not specified then the latest available version is used. Raises an exception if a reader with a given name doesn't exist.
val with_reader : ?ver:string -> string -> (unit -> 'a) -> 'awith_reader ?ver name operation temporary sets a default reader to a reader with a specified name and version. The default reader is restored after operation is finished.
val available_writers : unit -> info listavailable_writer () lists available writers for the data type
val default_writer : unit -> infodefault_writer returns information about the default writer
val set_default_writer : ?ver:string -> string -> unitset_default_writer ?ver name sets new default writer. If version is not specified then the latest available version is used. Raises an exception if a writer with a given name doesn't exist.
val with_writer : ?ver:string -> string -> (unit -> 'a) -> 'awith_writer ?ver name operation temporary sets a default writer to a writer with a specified name and version. The default writer is restored after operation is finished.
val default_printer : unit -> info optiondefault_writer optionally returns an information about default printer
val set_default_printer : ?ver:string -> string -> unitset_default_printer ?ver name sets new default printer. If version is not specified then the latest available version is used. Raises an exception if a printer with a given name doesn't exist.
val with_printer : ?ver:string -> string -> (unit -> 'a) -> 'awith_printer ?ver name operation temporary sets a default printer to a printer with a specified name and version. The default printer is restored after operation is finished.
Low level access to serializers
find_reader ?ver name lookups a reader with a given name. If version is not specified, then a reader with maximum version is returned.
find_writer ?ver name lookups a writer with a given name. If version is not specified, then a writer with maximum version is returned.
Bitvector implements a common set of operations that are expected from integral values.
include Integer.S with type t := t
include Integer.Base with type t := t
abs x absolute value of x
lnot x is a logical negation of x (1-complement)
logand x y is a conjunction of x and y
logand x y is a conjunction of x and y
logor x y is a disjunction of x and y
logxor x y is exclusive or between x and y
lshift x y shift x by y bits left
rshift x y shift x by y bits to the right
val arshift : t -> t -> tarshift x y shift x by y bits to the right and fill with the sign bit.
A common set of infix operators
The comparable interface with size-monomorphic comparison.
Compare by value, ignore size, but take into account the sign.
Compare by value, ignore both the size and the sign.
The lexicographical order of (value,size,sign) triples.
type endian = endian = | LittleEndianleast significant byte comes first
| BigEndianmost significant byte comes first
Specifies the order of bytes in a word.
val bin_shape_endian : Core_kernel.Bin_prot.Shape.tval bin_size_endian : endian Core_kernel.Bin_prot.Size.sizerval bin_write_endian : endian Core_kernel.Bin_prot.Write.writerval bin_writer_endian : endian Core_kernel.Bin_prot.Type_class.writerval bin_read_endian : endian Core_kernel.Bin_prot.Read.readerval __bin_read_endian__ : (int -> endian) Core_kernel.Bin_prot.Read.readerval bin_reader_endian : endian Core_kernel.Bin_prot.Type_class.readerval bin_endian : endian Core_kernel.Bin_prot.Type_class.tval sexp_of_endian : endian -> Sexplib0.Sexp.tval endian_of_sexp : Sexplib0.Sexp.t -> endianConstructors
create v w creates a word from bitvector v of width w.
code_addr t x uses target's address size to create a word.
Same as create x (Theory.Target.code_addr_size t).
data_addr t x uses target's code address size to create a word.
Same as create x (Theory.Target.data_addr_size t).
data_word t x uses target's word size to create a word.
Same as create x (Theory.Target.bits t).
val of_string : string -> tof_bool x is a bitvector with length 1 and value b0 if x is false and b1 otherwise.
val of_int : width:int -> int -> tof_int ~width n creates a bitvector of the specified bit-width with the value equal to n. If bits of the n that doesn't fit into width are ignored.
val of_int32 : ?width:int -> int32 -> tof_int32 ?width n creates a bitvector of the specified bit-width with the value equal to n. If bits of the n that doesn't fit into width are ignored. Parameter width defaults to 32.
val of_int64 : ?width:int -> int64 -> tof_int32 ?width n creates a bitvector of the specified bit-width with the value equal to n. If bits of the n that doesn't fit into width are ignored. Parameter width defaults to 32.
Some predefined constant constructors
Helpful shortcuts
one width number one with a specified width, is a shortcut for of_int 1 ~width
zero width zero with a specified width, is a shortcut for of_int 0 ~width
ones width is a number with a specified width, and all bits set to 1. It is a shortcut for lnot (zero width)
val of_binary : ?width:int -> endian -> string -> tof_binary ?width endian num creates a bitvector from a string interpreted as a sequence of bytes in a specified order.
The result is always positive and unsigned. The num argument is not shared. width defaults to the length of num in bits, i.e. 8 * String.length num.
Conversions to OCaml built in integer types
to_bitvec x returns a Bitvec represenation of x
val to_int : t -> int Core_kernel.Or_error.tto_int x projects x in to OCaml int.
val to_int32 : t -> int32 Core_kernel.Or_error.tto_int32 x projects x in to int32
val to_int64 : t -> int64 Core_kernel.Or_error.tto_int64 x projects x in to int64
val to_int_exn : t -> intto_int_exn x projects x in to OCaml int.
val to_int32_exn : t -> int32to_int32_exn x projects x in to int32
val to_int64_exn : t -> int64to_int64_exn x projects x in to int64
printf "%a" pp x prints x into a formatter. This is a default printer, controlled by set_default_printer. Multiple formats are available, see the available_writers for the actual list of formats and a format description. Out of box it defaults to pp_hex_full. Note, the printf function from examples refers to the Format.printf, thus it is assumed that the Format module is open in the scope.
printf "%a" pp_hex x prints x in the hexadecimal format omitting suffixes, and the prefix if it is not necessary. Example,
# printf "%a\n" pp_hex (Word.of_int32 0xDEADBEEFl);;
0xDEADBEEF
# printf "%a\n" pp_hex (Word.of_int32 0x1);;
1
printf "%a" pp_dec x prints x in the decimal format omitting suffixes and prefixes. Example,
# printf "%a\n" pp_dec (Word.of_int32 0xDEADBEEFl);;
3735928559
# printf "%a\n" pp_dec (Word.of_int32 0x1);;
1
printf "%a" pp_oct x prints x in the octal format omitting suffixes, and the prefix if it is not necessary. Example,
# printf "%a\n" pp_oct (Word.of_int32 0xDEADBEEFl);;
0o33653337357
# printf "%a\n" pp_oct (Word.of_int32 0x1);;
1
printf "%a" pp_bin x prints x in the binary (0 and 1) format omitting suffixes, and the prefix if it is not necessary. Example,
# printf "%a\n" pp_bin (Word.of_int32 0xDEADBEEFl);;
0b11011110101011011011111011101111
# printf "%a\n" pp_bin (Word.of_int32 0x1);;
1
printf "%a" pp_hex_full x prints x in the hexadecimal format with suffixes, and the prefix if it is necessary. Example,
# printf "%a\n" pp_hex_full (Word.of_int32 0xDEADBEEFl);;
0xDEADBEEF:32u
# printf "%a\n" pp_hex_full (Word.of_int32 0x1);;
1:32u
printf "%a" pp_dec_full x prints x in the decimal format with suffixes and prefixes. Example,
# printf "%a\n" pp_dec_full (Word.of_int32 0xDEADBEEFl);;
3735928559:32u
# printf "%a\n" pp_dec_full (Word.of_int32 0x1);;
1:32u
printf "%a" pp_oct_full x prints x in the octal format with suffixes, and the prefix if it is necessary. Example,
# printf "%a\n" pp_oct_full (Word.of_int32 0xDEADBEEFl);;
0o33653337357:32u
# printf "%a\n" pp_oct_full (Word.of_int32 0x1);;
1:32u
printf "%a" pp_bin_full x prints x in the binary (0 and 1) format omitting suffixes, and the prefix if it is necessary. Example,
# printf "%a\n" pp_bin_full (Word.of_int32 0xDEADBEEFl);;
0b11011110101011011011111011101111:32u
# printf "%a\n" pp_bin_full (Word.of_int32 0x1);;
1:32u
val pp_generic :
?case:[ `upper | `lower ] ->
?prefix:[ `auto | `base | `none | `this of string ] ->
?suffix:[ `none | `full | `size ] ->
?format:[ `hex | `dec | `oct | `bin ] ->
Format.formatter ->
t ->
unitpp_generic ?case ?prefix ?suffix ?format ppf x - a printer to suit all tastes.
Note: this is a generic printer factory that should be used if none of the nine preinstantiated suits you.
val string_of_value : ?hex:bool -> t -> stringstring_of_value ?hex x returns a textual representation of the x value, i.e., ignores size and signedness. If hex is true (default), then it is in the hexadecimal representation, otherwise the decimal representation is used. The returned value is not prefixed. No leading zeros are printed. If a value is signed and negative, then a leading negative sign is printed. Hexadecimal letter literals are printed in a lowercase format.
signed t casts t to a signed type, so that any operations applied on t will be signed.
unsigned t casts t to an unsigned type, so that any operations applied to it will interpret t as an unsigned word.
is_zero bv is true iff all bits are set to zero.
is_ones bv is true if the least significant bit is equal to one
bitwidth bv return a bit-width, i.e., the amount of bits
extract bv ~hi ~lo extracts a subvector from bv, starting from bit hi and ending with lo. Bits are enumerated from right to left (from least significant to most), starting from zero. hi maybe greater than size.
hi defaults to width bv - 1 lo defaults to 0.
Example:
extract (of_int 17 ~width:8) ~hi:4 ~lo:3 will result in a two bit vector consisting of the forth and third bits, i.e., equal to a number 2.
lo and hi should be non-negative numbers. lo must be less then a width bv and hi >= lo.
extract_exn bv ~hi ~lo is the same as extract, but will raise an exception on error.
concat b1 b2 concatenates two bitvectors
succ n returns next value after n. It is not guaranteed that signed (succ n) > signed n
pred n returns a value preceding n.
val nsucc : t -> int -> tnsucc m n is Fn.apply_n_times ~n succ m, but more efficient.
val npred : t -> int -> tnpred m n is Fn.apply_n_times ~n pred addr, but more efficient.
val gcd : t -> t -> t Core_kernel.Or_error.tgcd x y is the greatest common divisor of x and y in the integers. Note that this is not always the greatest common divisor in the bitvectors of fixed length. For example, in the 32-bit unsigned integers, 2 = 2 + 2^32 = 2(1 + 2^31). Thus, 1 + 2^31 is a divisor of 2, even though gcd 2 2 = 2. Two properties that still hold are: 1. Both x and y are multiples of gcd x y, and 2. gcd x y <= min (abs x) (abs y)
val lcm : t -> t -> t Core_kernel.Or_error.tlcm x y is the least common multiple of x and y in the integers. Note that, like gcd x y, this is not always the least common multiple of x and y in the fixed- length bitvectors. See the gcd documentation for an example. The result of this function will always be some common multiple of the inputs, even in the fixed-width bitvectors.
val gcdext : t -> t -> (t * t * t) Core_kernel.Or_error.tgcdext x y returns (g, s, t) where g = gcd x y and g = s*x + t*y. See the documentation for gcd x y for why this function is tricky to use.
val gcd_exn : t -> t -> tgcd_exn x y is the same as gcd, but will raise an exception on error.
val lcm_exn : t -> t -> tlcm_exn x y is the same as lcm, but will raise an exception on error.
val gcdext_exn : t -> t -> t * t * tgcdext_exn x y is the same as gcdext, but will raise an exception on error.
Iteration over bitvector components
enum_bytes x order returns a sequence of bytes of x in a specified order. Each byte is represented as a bitvector itself.
enum_bytes x order returns bytes of x in a specified order, with bytes represented by char type
enum_bits x order returns bits of x in a specified order. order defines only the ordering of words in a bitvector, bits will always be in MSB first order. The length of the sequence is always a power of 8.
Comparison with zero
Note, we're not including With_zero interface, since it refers to the `Sign` module, that is available only in core_kernel >= 113.33.00.
val validate_positive : t Core_kernel.Validate.checkvalidate_positive validates that a value is positive.
val validate_non_negative : t Core_kernel.Validate.checkvalidate_non_negative validates that a value is non negative.
val validate_negative : t Core_kernel.Validate.checkvalidate_negative validates that a value is negative.
val validate_non_positive : t Core_kernel.Validate.checkvalidate_non_positive validates that a value is not positive.
val is_positive : t -> boolis_positive x is true if x is greater than zero. Always true if x is unsigned.
val is_non_negative : t -> boolis_non_negative x is true if x is greater than or equal to zero. Tautology if x is unsigned.
val is_negative : t -> boolis_negative x is true if x is strictly less than zero. It is a contradiction if x is not signed.
val is_non_positive : t -> boolis_non_positive x is true if x is less than zero. It is a contradiction if x is not signed.
Arithmetic that raises exceptions.
Arithmetic operations that doesn't check the widths.
Stable marshaling interface.
module Trie : sig ... endPrefix trees for bitvectors.