package cachet

type bigstring = (char, Stdlib.Bigarray.int8_unsigned_elt, Stdlib.Bigarray.c_layout) Stdlib.Bigarray.Array1.t
val memcpy : bigstring -> src_off:int -> bigstring -> dst_off:int -> len:int -> unit
val memmove : bigstring -> src_off:int -> bigstring -> dst_off:int -> len:int -> unit
module Bstr : sig ... end

A read-only bigstring.

type slice = private {
  offset : int;
  length : int;
  payload : Bstr.t;
}

A slice is an aligned segment of bytes (according to the pagesize specified by the cache, see make) with its absolute position into the underlying block-device and size.

val pp_slice : Stdlib.Format.formatter -> slice -> unit

Pretty-printer of slices.

val bstr_of_slice : ?logical_address:int -> slice -> Bstr.t

bstr_of_slice ?logical_address slice returns a read-only bigstring according to the given slice and, optionally, the logical_address.

  • raises Invalid_argument

    if the given logical_address does not correspond to the given slice.

type 'fd map = 'fd -> pos:int -> int -> bigstring

A value map : 'fd map, when applied as map fd ~pos len, reads a bigstring at pos. map must return as much data as is available, though never more than len bytes. map never fails. Instead, an empty bigstring must be returned if, for example, the position is out of range. Depending on how the cache is configured (see make), map never reads more than pagesize bytes.
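As an illustration of this contract, here is a minimal sketch of a map over an in-memory string standing in for a block device. The name map_of_string is hypothetical and not part of Cachet; the bigstring type is redeclared locally so the snippet only depends on the standard library.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Hypothetical map over a string used as a fake block device.
   It never raises: an out-of-range position yields an empty bigstring,
   and at most [len] bytes are returned. *)
let map_of_string (s : string) ~pos len : bigstring =
  let available = String.length s - pos in
  let len = if pos < 0 || available <= 0 then 0 else min len available in
  let len = max len 0 in
  let bs = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
  for i = 0 to len - 1 do
    Bigarray.Array1.set bs i s.[pos + i]
  done;
  bs
```

Its type is string -> pos:int -> int -> bigstring, which has the shape of a string map, so a function like this could be passed as ~map to make with the string itself playing the role of the file-descriptor.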

Note about schedulers and Cachet.

Cachet assumes that map is atomic, in other words: map is a unit of work that is indivisible and guaranteed to be executed as a single, coherent, and uninterrupted operation.

In this way, the map function is considered as a "direct" computation that does not interact with a scheduler. However, reading a page can take time. It may therefore be necessary to add a cooperation point after load or the user-friendly functions.

These functions can read one or more pages. load reads one page at most.

Note about large file and Cachet.

For performance reasons, Cachet uses an int rather than an int64 for the offset (the logical address). On a 64-bit architecture, addressing the block device should not be a problem and Cachet is able to manage large block devices. However, on a 32-bit architecture, where an OCaml int is 31 bits wide (sign included), Cachet can only handle files of up to ~1 GB.

We consider that it is up to the developer to check this:

let _max_int31 = 1073741823L (* (1 lsl 30) - 1, i.e. max_int on a 32-bit architecture *)

let () =
  let fd = Unix.openfile "disk.img" Unix.[ O_RDONLY ] 0o644 in
  let stat = Unix.LargeFile.fstat fd in
  if Sys.word_size = 32 && stat.Unix.LargeFile.st_size > _max_int31
  then failwith "Too big block-device";
  ...

This way, the user finds out as early as possible whether or not the program can handle large block-devices.

type 'fd t

Type of cachet's values.

val fd : 'fd t -> 'fd

fd t is the abstract file-descriptor used by t (and specified on make).

val pagesize : 'fd t -> int

pagesize t is the page-size used by t (and specified on make).

val cache_hit : 'fd t -> int

cache_hit t is the number of times a load hit the cache.

val cache_miss : 'fd t -> int

cache_miss t is the number of times a load didn't hit the cache.

val copy : 'fd t -> 'fd t

copy t creates a new, empty cache using the same map function.

val make : ?cachesize:int -> ?pagesize:int -> map:'fd map -> 'fd -> 'fd t

make ~cachesize ~pagesize ~map fd creates a new, empty cache using map and fd for reading pagesize bytes. The size of the cache is cachesize.

  • raises Invalid_argument

    if either cachesize or pagesize is not a power of two.
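Since make rejects sizes that are not powers of two, a caller that takes these values from configuration may want to validate or round them first. The following is a minimal sketch using only the standard library; is_power_of_two and next_power_of_two are hypothetical helper names, not part of Cachet.

```ocaml
(* A positive integer is a power of two when exactly one bit is set. *)
let is_power_of_two n = n > 0 && n land (n - 1) = 0

(* Hypothetical helper: smallest power of two greater than or equal to n
   (returns 1 for any n <= 1). *)
let next_power_of_two n =
  let rec go p = if p >= n then p else go (p lsl 1) in
  go 1
```

For example, a requested page size of 3000 bytes would be rounded up to 4096 before being passed as ~pagesize.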

val load : 'fd t -> ?len:int -> int -> slice option

load t ~len logical_address loads a page at the given logical_address and returns a slice. len (defaults to 1) is the expected minimum number of bytes returned.

If the slice does not contain at least len bytes, load returns None. load t ~len:0 logical_address always returns an empty slice.
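To make the notion of an aligned page concrete, here is a sketch of the address arithmetic. It assumes that the page containing a logical address starts at that address rounded down to a multiple of pagesize; this is implied by the description of slices as aligned segments but is an assumption about the implementation. The names page_start and offset_in_page are hypothetical.

```ocaml
(* Both functions assume pagesize is a power of two, as make requires. *)

(* Absolute position of the page that contains logical_address. *)
let page_start ~pagesize logical_address =
  logical_address land lnot (pagesize - 1)

(* Position of logical_address relative to the start of its page. *)
let offset_in_page ~pagesize logical_address =
  logical_address land (pagesize - 1)
```

With a page size of 4096, the address 5000 falls in the page starting at 4096, at offset 904 within that page.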

val invalidate : 'fd t -> off:int -> len:int -> unit

invalidate t ~off ~len invalidates the cache on len bytes from off.

val is_cached : 'fd t -> int -> bool

is_cached t logical_address returns true if the requested logical_address is available in the cache, otherwise false.

User friendly functions.

Binary decoding of integers.

The functions in this section binary decode integers from byte sequences.

All following functions raise Invalid_argument if the space needed at index i to decode the integer is not available.

Little-endian (resp. big-endian) encoding means that least (resp. most) significant bytes are stored first. Big-endian is also known as network byte order. Native-endian encoding is either little-endian or big-endian depending on Sys.big_endian.

32-bit and 64-bit integers are represented by the int32 and int64 types. 8-bit and 16-bit integers are represented by the int type, which has more bits than the binary encoding. Functions that decode signed (resp. unsigned) 8-bit or 16-bit integers represented by int values sign-extend (resp. zero-extend) their result.
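The effect of byte order and of sign extension can be observed with the standard library's String.get_* decoders, which use the same naming scheme. This sketch decodes from a plain string rather than from a cache, and needs a recent enough OCaml for these String functions to be available.

```ocaml
(* Bytes: ff fe 00 00 00 2a *)
let buf = "\xff\xfe\x00\x00\x00\x2a"

let () =
  (* The same two bytes, zero-extended or sign-extended into an int. *)
  assert (String.get_uint16_le buf 0 = 0xfeff);
  assert (String.get_int16_le buf 0 = -257);
  (* Byte order decides which byte is the most significant one. *)
  assert (String.get_uint16_be buf 0 = 0xfffe);
  assert (String.get_int16_be buf 0 = -2);
  (* 32-bit values are returned as int32, not int. *)
  assert (String.get_int32_be buf 2 = 42l);
  assert (String.get_int32_le buf 2 = 0x2a000000l)
```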

exception Out_of_bounds of int

If Cachet tries to retrieve a byte outside the block device, this exception is raised.

val get_int8 : 'fd t -> int -> int

get_int8 t logical_address is t's signed 8-bit integer starting at byte index logical_address.

val get_uint8 : 'fd t -> int -> int

get_uint8 t logical_address is t's unsigned 8-bit integer starting at byte index logical_address.

val get_uint16_ne : 'fd t -> int -> int

get_uint16_ne t i is t's native-endian unsigned 16-bit integer starting at byte index i.

val get_uint16_le : 'fd t -> int -> int

get_uint16_le t i is t's little-endian unsigned 16-bit integer starting at byte index i.

val get_uint16_be : 'fd t -> int -> int

get_uint16_be t i is t's big-endian unsigned 16-bit integer starting at byte index i.

val get_int16_ne : 'fd t -> int -> int

get_int16_ne t i is t's native-endian signed 16-bit integer starting at byte index i.

val get_int16_le : 'fd t -> int -> int

get_int16_le t i is t's little-endian signed 16-bit integer starting at byte index i.

val get_int16_be : 'fd t -> int -> int

get_int16_be t i is t's big-endian signed 16-bit integer starting at byte index i.

val get_int32_ne : 'fd t -> int -> int32

get_int32_ne t i is t's native-endian 32-bit integer starting at byte index i.

val get_int32_le : 'fd t -> int -> int32

get_int32_le t i is t's little-endian 32-bit integer starting at byte index i.

val get_int32_be : 'fd t -> int -> int32

get_int32_be t i is t's big-endian 32-bit integer starting at byte index i.

val get_int64_ne : 'fd t -> int -> int64

get_int64_ne t i is t's native-endian 64-bit integer starting at byte index i.

val get_int64_le : 'fd t -> int -> int64

get_int64_le t i is t's little-endian 64-bit integer starting at byte index i.

val get_int64_be : 'fd t -> int -> int64

get_int64_be t i is t's big-endian 64-bit integer starting at byte index i.

val get_string : 'fd t -> len:int -> int -> string

get_string t ~len logical_address loads the pages needed, from the cache or using map, and copies the len bytes available at logical_address.

You can use syscalls to find out how many times get_string can call map at most.

  • raises Out_of_bounds

    if logical_address and len byte(s) are not accessible.
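The way a read can span several pages may be modelled as follows. This is a simplified sketch, not Cachet's implementation: it has no cache, the map it takes returns a string instead of a bigstring, and read_string as well as the local Out_of_bounds exception are stand-ins defined here for illustration only.

```ocaml
(* Local stand-in for the exception of the same name. *)
exception Out_of_bounds of int

(* Hypothetical model: copy len bytes starting at off, issuing one
   page-sized map call per page touched. pagesize must be a power of two. *)
let read_string ~pagesize ~(map : 'fd -> pos:int -> int -> string) fd ~len off =
  let buf = Buffer.create len in
  let rec go off remaining =
    if remaining > 0 then begin
      let page = off land lnot (pagesize - 1) in
      let data = map fd ~pos:page pagesize in
      let skip = off - page in
      let avail = String.length data - skip in
      (* A short or empty page means the range leaves the device. *)
      if avail <= 0 then raise (Out_of_bounds off);
      let n = min avail remaining in
      Buffer.add_substring buf data skip n;
      go (off + n) (remaining - n)
    end
  in
  go off len;
  Buffer.contents buf
```

With a page size of 4, reading 5 bytes at offset 2 touches two pages: 2 bytes come from the first page and 3 from the second.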

val get_seq : 'fd t -> int -> string Stdlib.Seq.t

get_seq t off returns a string Seq.t which loads pages starting at off and continuing until the end of the underlying block-device.

val next : 'fd t -> slice -> slice option

next t slice returns the next slice from the block-device after the given slice.

val iter : 'fd t -> ?len:int -> fn:(int -> unit) -> int -> unit

iter t ?len ~fn off iterates over each byte starting at off, until len (or the end of the block-device if len is not specified).

val blit_to_bytes : 'fd t -> src_off:int -> bytes -> dst_off:int -> len:int -> unit

blit_to_bytes t ~src_off dst ~dst_off ~len copies len bytes from the cached block-device represented by t, starting at index src_off as the logical address, to byte sequence dst, starting at index dst_off.

This function can read several pages depending on the size of the dst buffer.

  • raises Invalid_argument

    if src_off and len do not designate a valid range of the block-device, or if dst_off and len do not designate a valid range of dst.

val syscalls : 'fd t -> logical_address:int -> len:int -> int

syscalls t ~logical_address ~len returns the maximum number (if the cache is empty) of calls to map to load a segment of the block-device according to the logical_address and the size len (in bytes) of the segment.
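The bound can be reasoned about with simple page arithmetic. The sketch below assumes that an empty cache needs exactly one map call per page touched by the segment; the actual computation in Cachet may differ, and pages_spanned is a hypothetical name.

```ocaml
(* Number of pages touched by the byte range
   [logical_address, logical_address + len). *)
let pages_spanned ~pagesize ~logical_address ~len =
  if len <= 0 then 0
  else
    let first = logical_address / pagesize in
    let last = (logical_address + len - 1) / pagesize in
    last - first + 1
```

With a page size of 4096, a 200-byte segment starting at address 4000 crosses a page boundary and therefore counts 2 pages, while a 4096-byte segment starting at 0 counts 1.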
