String.Sub
Substrings.
A substring defines a possibly empty subsequence of bytes in a base string.
The positions of a string s of length l are the slits found before each byte and after the last byte of the string. They are labelled from left to right by increasing number in the range [0;l].
positions  0   1   2   3   4    l-1    l
           +---+---+---+---+     +-----+
  indices  | 0 | 1 | 2 | 3 | ... | l-1 |
           +---+---+---+---+     +-----+
The ith byte index is between positions i and i+1.
Formally we define a substring of s as being a subsequence of bytes defined by a start and a stop position. The former is always smaller than or equal to the latter. When both positions are equal the substring is empty. Note that for a given base string there are as many empty substrings as there are positions in the string.
As with strings, we index the bytes of a substring using zero-based indices.
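The definition above can be made concrete with a small model. The following is a minimal sketch using only the OCaml standard library. The record type and the functions v, length, get and to_string are hypothetical stand-ins written for illustration; they mirror the documented behaviour and are not the library's actual implementation.

```ocaml
(* Hypothetical model of a substring: a base string plus two positions. *)
type sub = { base : string; start : int; stop : int }

(* Smart constructor enforcing 0 <= start <= stop <= String.length base. *)
let v ?(start = 0) ?stop base =
  let stop = match stop with None -> String.length base | Some p -> p in
  if start < 0 || start > stop || stop > String.length base
  then invalid_arg "v: invalid positions"
  else { base; start; stop }

(* The number of bytes is the distance between the two positions. *)
let length s = s.stop - s.start

(* Index i of the substring is byte (start + i) of the base string. *)
let get s i =
  if i < 0 || i >= length s then invalid_arg "get: index out of bounds"
  else s.base.[s.start + i]

let to_string s = String.sub s.base s.start (length s)

let () =
  let a = v ~start:2 ~stop:6 "Revolt now!" in
  assert (length a = 4);
  assert (get a 0 = 'v');
  assert (to_string a = "volt");
  (* One empty substring per position: 12 of them for an 11-byte base. *)
  let empties = List.init 12 (fun p -> v ~start:p ~stop:p "Revolt now!") in
  assert (List.for_all (fun e -> length e = 0) empties)
```

Note that the zero-based index 0 of a designates the byte at index 2 of the base string.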
See how to use substrings to parse data.
type t = sub
The type for substrings.
val empty : sub
empty is the empty substring of the empty string String.empty.
val v : ?start:int -> ?stop:int -> string -> sub
v ~start ~stop s is the substring of s that starts at position start (defaults to 0) and stops at position stop (defaults to String.length s).
val start_pos : sub -> int
start_pos s is s's start position in the base string.
val stop_pos : sub -> int
stop_pos s is s's stop position in the base string.
val base_string : sub -> string
base_string s is s's base string.
val length : sub -> int
length s is the number of bytes in s.
val get : sub -> int -> char
get s i is the byte of s at its zero-based index i.
val get_byte : sub -> int -> int
get_byte s i is Char.to_int (get s i).
val head : ?rev:bool -> sub -> char option
head s is Some (get s h) with h = 0 if rev = false (default) or h = length s - 1 if rev = true. None is returned if s is empty.
val of_string : string -> sub
of_string s is v s.
val to_string : sub -> string
to_string s is the bytes of s as a string.
rebase s is v (to_string s). This puts s on a base string made solely of its bytes.
val hash : sub -> int
hash s is Hashtbl.hash s.
See the graphical guide.
tail s is s without its first (rev is false, default) or last (rev is true) byte or s if it is empty.
extend ~rev ~max ~sat s extends s by at most max consecutive sat-satisfying bytes of the base string located after stop s (rev is false, default) or before start s (rev is true). If max is unspecified the extension is limited by the extents of the base string of s. sat defaults to fun _ -> true.
reduce ~rev ~max ~sat s reduces s by at most max consecutive sat satisfying bytes of s located before stop s (rev is false, default) or after start s (rev is true). If max is unspecified the reduction is limited by the extents of the substring s. sat defaults to fun _ -> true.
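The behaviour of extend and reduce can be sketched with a small standard-library model. The record type below is a hypothetical stand-in for substrings and the two functions follow the descriptions above; this is an illustration under those assumptions, not the library's code.

```ocaml
(* Hypothetical model of a substring, not the library's representation. *)
type sub = { base : string; start : int; stop : int }

(* Grow s over at most max bytes of the base string that lie outside s,
   for as long as they satisfy sat. *)
let extend ?(rev = false) ?max ?(sat = fun _ -> true) s =
  let len = String.length s.base in
  let budget = match max with None -> len | Some m -> min m len in
  if rev then begin
    (* Move start left while the byte just before it satisfies sat. *)
    let limit = Stdlib.max 0 (s.start - budget) in
    let rec go p = if p > limit && sat s.base.[p - 1] then go (p - 1) else p in
    { s with start = go s.start }
  end else begin
    (* Move stop right while the byte just after it satisfies sat. *)
    let limit = min len (s.stop + budget) in
    let rec go p = if p < limit && sat s.base.[p] then go (p + 1) else p in
    { s with stop = go s.stop }
  end

(* Shrink s by at most max of its own bytes, for as long as they satisfy sat. *)
let reduce ?(rev = false) ?max ?(sat = fun _ -> true) s =
  let len = s.stop - s.start in
  let budget = match max with None -> len | Some m -> min m len in
  if rev then begin
    (* Move start right while the byte just after it satisfies sat. *)
    let limit = s.start + budget in
    let rec go p = if p < limit && sat s.base.[p] then go (p + 1) else p in
    { s with start = go s.start }
  end else begin
    (* Move stop left while the byte just before it satisfies sat. *)
    let limit = s.stop - budget in
    let rec go p = if p > limit && sat s.base.[p - 1] then go (p - 1) else p in
    { s with stop = go s.stop }
  end

let () =
  let base = "Revolt now!" in
  let a = { base; start = 2; stop = 6 } in
  let d = { base; start = 7; stop = 11 } in
  assert (extend a = { base; start = 2; stop = 11 });
  assert (extend ~rev:true a = { base; start = 0; stop = 6 });
  assert (extend ~max:2 a = { base; start = 2; stop = 8 });
  assert (reduce ~sat:(fun c -> c = '!') d = { base; start = 7; stop = 10 })
```

In this sketch the result always stays on the same base string; only one of the two positions moves.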
extent s s' is the smallest substring that includes all the positions of s and s'.
overlap s s' is the smallest substring that includes all the positions common to s and s' or None if there are no such positions. Note that the overlap substring may be empty.
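As an illustration, extent and overlap reduce to taking minima and maxima of positions. The sketch below uses a hypothetical record model of substrings and only the standard library. Raising Invalid_argument when the two substrings are on different base strings is an assumption of this sketch.

```ocaml
(* Hypothetical model of a substring, not the library's representation. *)
type sub = { base : string; start : int; stop : int }

(* Assumption of this sketch: both arguments must share their base string. *)
let check_same_base s s' =
  if s.base != s'.base then invalid_arg "substrings on different base strings"

(* Smallest substring that includes all the positions of s and s'. *)
let extent s s' =
  check_same_base s s';
  { s with start = min s.start s'.start; stop = max s.stop s'.stop }

(* Smallest substring that includes the positions common to s and s'. *)
let overlap s s' =
  check_same_base s s';
  let start = max s.start s'.start and stop = min s.stop s'.stop in
  if start > stop then None else Some { s with start; stop }

let () =
  let base = "Revolt now!" in
  let a = { base; start = 2; stop = 6 } in
  let b = { base; start = 0; stop = 3 } in
  let c = { base; start = 7; stop = 7 } in
  let d = { base; start = 7; stop = 11 } in
  assert (extent a b = { base; start = 0; stop = 6 });
  assert (overlap a b = Some { base; start = 2; stop = 3 });
  (* a stops at position 6 and c sits at position 7: nothing in common. *)
  assert (overlap a c = None);
  (* A single shared position yields an empty, but present, overlap. *)
  assert (overlap d c = Some { base; start = 7; stop = 7 })
```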
append s s' is like String.append. The substrings can be on different bases and the result is on a base string that holds exactly the appended bytes.
concat ~sep ss is like String.concat. The substrings can all be on different bases and the result is on a base string that holds exactly the concatenated bytes.
val is_empty : sub -> bool
is_empty s is length s = 0.
is_prefix is like String.is_prefix. Only bytes are compared; affix can be on a different base string.
is_infix is like String.is_infix. Only bytes are compared; affix can be on a different base string.
is_suffix is like String.is_suffix. Only bytes are compared; affix can be on a different base string.
val for_all : (char -> bool) -> sub -> bool
for_all is like String.for_all on the substring.
val exists : (char -> bool) -> sub -> bool
exists is like String.exists on the substring.
same_base s s' is true iff the substrings s and s' have the same base string according to physical equality.
equal_bytes s s' is true iff the substrings s and s' have exactly the same bytes. The substrings can be on a different base string.
compare_bytes s s' compares the bytes of s and s' in lexicographical order. The substrings can be on a different base string.
compare s s' compares the positions of s and s' in lexicographical order.
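The difference between comparing bytes and comparing positions can be sketched as follows. The record type is a hypothetical model of substrings, and compare_pos is a name chosen here to stand in for compare so that it does not shadow the standard library function. Converting to strings before comparing is done for brevity only.

```ocaml
(* Hypothetical model of a substring, not the library's representation. *)
type sub = { base : string; start : int; stop : int }

let to_string s = String.sub s.base s.start (s.stop - s.start)

(* Physical equality on the base strings. *)
let same_base s s' = s.base == s'.base

(* Only the bytes matter: bases and positions are ignored. *)
let equal_bytes s s' = String.equal (to_string s) (to_string s')
let compare_bytes s s' = String.compare (to_string s) (to_string s')

(* Only the positions matter: start first, then stop. *)
let compare_pos s s' = Stdlib.compare (s.start, s.stop) (s'.start, s'.stop)

let () =
  let base = "abab" in
  let x = { base; start = 0; stop = 2 } in
  let y = { base; start = 2; stop = 4 } in
  let z = { base = "ab"; start = 0; stop = 2 } in
  assert (same_base x y && not (same_base x z));
  (* x, y and z all hold the bytes "ab". *)
  assert (equal_bytes x y && equal_bytes x z);
  assert (compare_bytes x y = 0);
  (* Same bytes, but x is positioned before y. *)
  assert (compare_pos x y < 0)
```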
Extracted substrings are always on the same base string as the substring s acted upon.
with_range is like String.sub_with_range. The indices are the substring's zero-based ones, not those in the base string.
with_index_range is like String.sub_with_index_range. The indices are the substring's zero-based ones, not those in the base string.
trim is like String.trim. If all bytes are dropped, returns an empty substring located in the middle of the argument.
span is like String.span. For a substring s a left empty span is start s and a right empty span is stop s.
take is like String.take.
drop is like String.drop.
cut is like String.cut. sep can be on a different base string.
cuts is like String.cuts. sep can be on a different base string.
fields is like String.fields.
find ~rev sat s is the substring of s (if any) that spans the first byte that satisfies sat in s after position start s (rev is false, default) or before stop s (rev is true). None is returned if there is no matching byte in s.
find_sub ~rev ~sub s is the substring of s (if any) that spans the first match of sub in s after position start s (rev is false, default) or before stop s (rev is true). Only bytes are compared and sub can be on a different base string. None is returned if there is no match of sub in s.
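As a sketch of the search behaviour described for find, the following standard-library model scans a substring in either direction and returns a one-byte substring on the same base string. The record type is a hypothetical stand-in and the code only illustrates the documented semantics.

```ocaml
(* Hypothetical model of a substring, not the library's representation. *)
type sub = { base : string; start : int; stop : int }

(* First byte of s that satisfies sat, scanning from start (rev = false)
   or from stop (rev = true). The result stays on the base string of s. *)
let find ?(rev = false) sat s =
  let hit i = Some { s with start = i; stop = i + 1 } in
  let rec fwd i =
    if i >= s.stop then None
    else if sat s.base.[i] then hit i
    else fwd (i + 1)
  in
  let rec bwd i =
    if i < s.start then None
    else if sat s.base.[i] then hit i
    else bwd (i - 1)
  in
  if rev then bwd (s.stop - 1) else fwd s.start

let () =
  let base = "Revolt now!" in
  let all = { base; start = 0; stop = 11 } in
  let a = { base; start = 2; stop = 6 } in
  let is_o c = c = 'o' in
  assert (find is_o all = Some { base; start = 3; stop = 4 });
  assert (find ~rev:true is_o all = Some { base; start = 8; stop = 9 });
  (* The search never looks outside of the substring. *)
  assert (find (fun c -> c = 'n') a = None)
```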
filter sat s is like String.filter. The result is on a base string that holds only the filtered bytes.
filter_map f s is like String.filter_map. The result is on a base string that holds only the filtered bytes.
map is like String.map. The result is on a base string that holds only the mapped bytes.
mapi is like String.mapi. The result is on a base string that holds only the mapped bytes. The indices are the substring's zero-based ones, not those in the base string.
val fold_left : ('a -> char -> 'a) -> 'a -> sub -> 'a
fold_left is like String.fold_left.
val fold_right : (char -> 'a -> 'a) -> sub -> 'a -> 'a
fold_right is like String.fold_right.
val iter : (char -> unit) -> sub -> unit
iter is like String.iter.
val iteri : (int -> char -> unit) -> sub -> unit
iteri is like String.iteri. The indices are the substring's zero-based ones, not those in the base string.
val pp : Format.formatter -> sub -> unit
pp ppf s prints s's bytes on ppf.
val dump : Format.formatter -> sub -> unit
dump ppf s prints s as a syntactically valid OCaml string on ppf using Ascii.escape_string.
val dump_raw : Format.formatter -> sub -> unit
dump_raw ppf s prints an unspecified raw internal representation of s on ppf.
val of_char : char -> sub
of_char c is a substring that contains the byte c.
val to_char : sub -> char option
to_char s is the single byte in s or None if there is no byte or more than one in s.
val of_bool : bool -> sub
of_bool b is a string representation for b. Relies on Stdlib.string_of_bool.
val to_bool : sub -> bool option
to_bool s is a bool from s, if any. Relies on Stdlib.bool_of_string.
val of_int : int -> sub
of_int i is a string representation for i. Relies on Stdlib.string_of_int.
val to_int : sub -> int option
to_int s is an int from s, if any. Relies on Stdlib.int_of_string.
val of_nativeint : nativeint -> sub
of_nativeint i is a string representation for i. Relies on Nativeint.to_string.
val to_nativeint : sub -> nativeint option
to_nativeint s is a nativeint from s, if any. Relies on Nativeint.of_string.
val of_int32 : int32 -> sub
of_int32 i is a string representation for i. Relies on Int32.to_string.
val to_int32 : sub -> int32 option
to_int32 s is an int32 from s, if any. Relies on Int32.of_string.
val of_int64 : int64 -> sub
of_int64 i is a string representation for i. Relies on Int64.to_string.
val to_int64 : sub -> int64 option
to_int64 s is an int64 from s, if any. Relies on Int64.of_string.
val of_float : float -> sub
of_float f is a string representation for f. Relies on Stdlib.string_of_float.
val to_float : sub -> float option
to_float s is a float from s, if any. Relies on Stdlib.float_of_string.
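The conversions can be pictured as the standard library conversion applied to the bytes of the substring, with failure mapped to None. The sketch below shows this for integers on a hypothetical record model of substrings. It illustrates the stated reliance on Stdlib.string_of_int and Stdlib.int_of_string and makes no claim about how the library implements it.

```ocaml
(* Hypothetical model of a substring, not the library's representation. *)
type sub = { base : string; start : int; stop : int }

let to_string s = String.sub s.base s.start (s.stop - s.start)

(* The result spans the whole of a fresh base string. *)
let of_int i =
  let base = string_of_int i in
  { base; start = 0; stop = String.length base }

(* int_of_string raises Failure on invalid input; map that to None. *)
let to_int s =
  try Some (int_of_string (to_string s)) with Failure _ -> None

let () =
  let base = "x=42;" in
  (* Only the bytes between positions 2 and 4 are parsed. *)
  assert (to_int { base; start = 2; stop = 4 } = Some 42);
  assert (to_int { base; start = 0; stop = 4 } = None);
  assert (to_string (of_int (-7)) = "-7")
```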
Graphical guide
+---+---+---+---+---+---+---+---+---+---+---+
| R | e | v | o | l | t |   | n | o | w | ! |
+---+---+---+---+---+---+---+---+---+---+---+
        |---------------| a
        | start a
                        | stop a
            |-----------| tail a
        |-----------| tail ~rev:true a
        |-----------------------------------| extend a
|-----------------------| extend ~rev:true a
|-------------------------------------------| base a
|-----------| b
| start b
            | stop b
    |-------| tail b
|-------| tail ~rev:true b
|-------------------------------------------| extend b
|-----------| extend ~rev:true b
|-------------------------------------------| base b
|-----------------------| extent a b
        |---| overlap a b
                            | c
                            | start c
                            | stop c
                            | tail c
                            | tail ~rev:true c
                            |---------------| extend c
|---------------------------| extend ~rev:true c
|-------------------------------------------| base c
        |-------------------| extent a c
None overlap a c
                            |---------------| d
                            | start d
                                            | stop d
                                |-----------| tail d
                            |-----------| tail ~rev:true d
                            |---------------| extend d
|-------------------------------------------| extend ~rev:true d
|-------------------------------------------| base d
                            |---------------| extent d c
                            | overlap d c