Basic String Utils (Labeled version of CCString)
Strings
make n c
is a string of length n
with each index holding the character c
.
init n ~f
is a string of length n
with index i
holding the character f i
(called in increasing index order).
Return a new string that contains the same bytes as the given byte sequence.
Return a new byte sequence that contains the same bytes as the given string.
get s i
is the character at index i
in s
. This is the same as writing s.[i]
.
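A minimal sketch of the constructors and indexing described above. It assumes that Stdlib's StringLabels module (OCaml 4.14 or later) can stand in for this module for these functions; the value names are illustrative only.

```ocaml
(* Assumption: Stdlib's StringLabels stands in for this module here. *)

(* Three copies of the same character. *)
let dashes = StringLabels.make 3 '-'

(* f is called on indices 0, 1, 2, 3, 4, in that order. *)
let alphabet = StringLabels.init 5 ~f:(fun i -> Char.chr (Char.code 'a' + i))

(* get s i and s.[i] read the same character. *)
let third = StringLabels.get alphabet 2
```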
Concatenating
Note. The Stdlib.(^)
binary operator concatenates two strings.
concat ~sep ss
concatenates the list of strings ss
, inserting the separator string sep
between each.
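For illustration, a short sketch contrasting concat with the (^) operator, again assuming Stdlib's StringLabels as a stand-in for this module:

```ocaml
(* Assumption: Stdlib's StringLabels stands in for this module here. *)

(* The separator only appears between elements, never at the ends. *)
let csv = StringLabels.concat ~sep:", " [ "a"; "b"; "c" ]

(* (^) joins exactly two strings at a time. *)
let greeting = "Hello" ^ ", " ^ "world"

(* Concatenating an empty list yields the empty string. *)
let empty = StringLabels.concat ~sep:", " []
```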
Predicates and comparisons
starts_with
~prefix s
is true
if and only if s
starts with prefix
.
ends_with
~suffix s
is true
if and only if s
ends with suffix
.
contains_from s start c
is true
if and only if c
appears in s
after position start
.
rcontains_from s stop c
is true
if and only if c
appears in s
before position stop+1
.
contains s c
is String.contains_from
s 0 c
.
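The predicates above can be sketched as follows, assuming Stdlib's StringLabels (OCaml 4.14 or later) as a stand-in for this module. In the Stdlib versions, the start and stop positions themselves take part in the search.

```ocaml
(* Assumption: Stdlib's StringLabels stands in for this module here. *)
let s = "hello"

let has_prefix = StringLabels.starts_with ~prefix:"he" s
let has_suffix = StringLabels.ends_with ~suffix:"lo" s

(* 'h' only occurs at index 0, so a search starting at index 1 misses it. *)
let h_from_1 = StringLabels.contains_from s 1 'h'

(* Index 2 holds 'l', and the start position is included in the search. *)
let l_from_2 = StringLabels.contains_from s 2 'l'

(* Indices 0 and 1 hold "he", so 'l' is not found up to index 1. *)
let l_up_to_1 = StringLabels.rcontains_from s 1 'l'
let l_up_to_2 = StringLabels.rcontains_from s 2 'l'

let has_e = StringLabels.contains s 'e'
```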
Extracting substrings
sub s ~pos ~len
is a string of length len
, containing the substring of s
that starts at position pos
and has length len
.
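A small sketch of sub, assuming Stdlib's StringLabels as a stand-in. The helper safe_sub is hypothetical and not part of this module; it relies on the Stdlib behaviour of raising Invalid_argument when pos and len do not designate a valid range.

```ocaml
(* Assumption: Stdlib's StringLabels stands in for this module here. *)
let word = StringLabels.sub "hello world" ~pos:6 ~len:5

(* Hypothetical helper: turn the Invalid_argument failure into an option. *)
let safe_sub s ~pos ~len =
  try Some (StringLabels.sub s ~pos ~len) with Invalid_argument _ -> None
```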
Transforming
map f s
is the string resulting from applying f
to all the characters of s
in increasing order.
mapi ~f s
is like map
but the index of the character is also passed to f
.
fold_left f x s
computes f (... (f (f x s.[0]) s.[1]) ...) s.[n-1]
, where n
is the length of the string s
.
fold_right f s x
computes f s.[0] (f s.[1] ( ... (f s.[n-1] x) ...))
, where n
is the length of the string s
.
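To make the traversal order concrete, here is a sketch using the unlabeled Stdlib String functions (OCaml 4.13 or later), assumed here to behave like the functions documented above:

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)
let upper = String.map Char.uppercase_ascii "abc"

(* mapi also receives the index, so only the first character is changed. *)
let capitalized =
  String.mapi (fun i c -> if i = 0 then Char.uppercase_ascii c else c) "abc"

(* fold_left visits s.[0] first, so consing onto the accumulator reverses. *)
let reversed_chars = String.fold_left (fun acc c -> c :: acc) [] "abc"

(* fold_right visits s.[n-1] first, so consing preserves the order. *)
let chars = String.fold_right (fun c acc -> c :: acc) "abc" []
```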
trim s
is s
without leading and trailing whitespace. Whitespace characters are: ' '
, '\x0C'
(form feed), '\n'
, '\r'
, and '\t'
.
escaped s
is s
with special characters represented by escape sequences, following the lexical conventions of OCaml.
All characters outside the US-ASCII printable range [0x20;0x7E] are escaped, as well as backslash (0x5C) and double-quote (0x22).
The function Scanf.unescaped
is a left inverse of escaped
, i.e. Scanf.unescaped (escaped s) = s
for any string s
(unless escaped s
fails).
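A brief sketch of trim and of the escaped / Scanf.unescaped round trip, assuming Stdlib's String as a stand-in for this module:

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)
let trimmed = String.trim " \t hi there \n"

(* Five characters and a newline: a, double quote, b, backslash, c, newline. *)
let raw = "a\"b\\c\n"

(* The quote, the backslash and the newline each become a two-character escape. *)
let esc = String.escaped raw

(* Scanf.unescaped undoes the escaping. *)
let round_trip = Scanf.unescaped esc
```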
Traversing
iteri
is like iter
, but the function is also given the corresponding character index.
Searching
index_from s i c
is the index of the first occurrence of c
in s
after position i
.
index_from_opt s i c
is the index of the first occurrence of c
in s
after position i
(if any).
rindex_from s i c
is the index of the last occurrence of c
in s
before position i+1
.
rindex_from_opt s i c
is the index of the last occurrence of c
in s
before position i+1
(if any).
index s c
is String.index_from
s 0 c
.
index_opt s c
is String.index_from_opt
s 0 c
.
rindex s c
is String.rindex_from
s (length s - 1) c
.
rindex_opt s c
is String.rindex_from_opt
s (length s - 1) c
.
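The search functions can be sketched as follows, assuming Stdlib's String as a stand-in. In the Stdlib versions the position i itself is included in the search, and the variants without _opt raise Not_found when the character is absent.

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)
let s = "hello"

let first_l = String.index s 'l'
let last_l = String.rindex s 'l'

(* Position 3 itself holds 'l' and is part of the search. *)
let l_from_3 = String.index_from s 3 'l'

(* Searching backwards from position 2 finds the 'l' at index 2. *)
let l_back_from_2 = String.rindex_from s 2 'l'

(* The _opt variants return None instead of raising. *)
let no_z = String.index_opt s 'z'

let z_raises =
  try ignore (String.index s 'z'); false with Not_found -> true
```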
Strings and Sequences
to_seqi s
is like to_seq
but also tuples the corresponding index.
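For example, assuming Stdlib's String.to_seq and String.to_seqi behave like the functions of this module, the two sequences differ only by the attached index:

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)
let plain = List.of_seq (String.to_seq "abc")

(* Each character is paired with its index in the string. *)
let indexed = List.of_seq (String.to_seqi "abc")
```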
UTF decoding and validations
UTF-8
val get_utf_8_uchar : t -> int -> Uchar.utf_decode
get_utf_8_uchar b i
decodes a UTF-8 character at index i
in b
.
val is_valid_utf_8 : t -> bool
is_valid_utf_8 b
is true
if and only if b
contains valid UTF-8 data.
UTF-16BE
val get_utf_16be_uchar : t -> int -> Uchar.utf_decode
get_utf_16be_uchar b i
decodes a UTF-16BE character at index i
in b
.
val is_valid_utf_16be : t -> bool
is_valid_utf_16be b
is true
if and only if b
contains valid UTF-16BE data.
UTF-16LE
val get_utf_16le_uchar : t -> int -> Uchar.utf_decode
get_utf_16le_uchar b i
decodes a UTF-16LE character at index i
in b
.
val is_valid_utf_16le : t -> bool
is_valid_utf_16le b
is true
if and only if b
contains valid UTF-16LE data.
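A minimal sketch of UTF-8 validation and decoding, assuming the Stdlib String and Uchar functions of OCaml 4.14 or later as a stand-in for this module:

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)

(* 'h' followed by U+00E9, which UTF-8 encodes on two bytes. *)
let s = "h\xC3\xA9"

let valid = String.is_valid_utf_8 s

(* Decode the character that starts at byte index 1. *)
let d = String.get_utf_8_uchar s 1
let decoded_ok = Uchar.utf_decode_is_valid d
let code_point = Uchar.to_int (Uchar.utf_decode_uchar d)
let byte_len = Uchar.utf_decode_length d

(* 0xFF never appears in valid UTF-8 data. *)
let bad = String.is_valid_utf_8 "\xFF"
```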
Binary decoding of integers
The functions in this section binary decode integers from strings.
All following functions raise Invalid_argument
if the characters needed at index i
to decode the integer are not available.
Little-endian (resp. big-endian) encoding means that least (resp. most) significant bytes are stored first. Big-endian is also known as network byte order. Native-endian encoding is either little-endian or big-endian depending on Sys.big_endian
.
32-bit and 64-bit integers are represented by the int32
and int64
types, which can be interpreted either as signed or unsigned numbers.
8-bit and 16-bit integers are represented by the int
type, which has more bits than the binary encoding. These extra bits are sign-extended (or zero-extended) for functions which decode 8-bit or 16-bit integers and represent them with int
values.
get_uint8 b i
is b
's unsigned 8-bit integer starting at character index i
.
get_int8 b i
is b
's signed 8-bit integer starting at character index i
.
get_uint16_ne b i
is b
's native-endian unsigned 16-bit integer starting at character index i
.
get_uint16_be b i
is b
's big-endian unsigned 16-bit integer starting at character index i
.
get_uint16_le b i
is b
's little-endian unsigned 16-bit integer starting at character index i
.
get_int16_ne b i
is b
's native-endian signed 16-bit integer starting at character index i
.
get_int16_be b i
is b
's big-endian signed 16-bit integer starting at character index i
.
get_int16_le b i
is b
's little-endian signed 16-bit integer starting at character index i
.
get_int32_ne b i
is b
's native-endian 32-bit integer starting at character index i
.
val seeded_hash : int -> t -> int
A seeded hash function for strings, with the same output value as Hashtbl.seeded_hash
. This function allows this module to be passed as argument to the functor Hashtbl.MakeSeeded
.
get_int32_be b i
is b
's big-endian 32-bit integer starting at character index i
.
get_int32_le b i
is b
's little-endian 32-bit integer starting at character index i
.
get_int64_ne b i
is b
's native-endian 64-bit integer starting at character index i
.
get_int64_be b i
is b
's big-endian 64-bit integer starting at character index i
.
get_int64_le b i
is b
's little-endian 64-bit integer starting at character index i
.
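The effect of signedness and endianness can be sketched on a small byte string, assuming the Stdlib String decoders (OCaml 4.13 or later) as a stand-in for this module:

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)
let s = "\x01\x02\x03\x04\xFF\xFE"

(* The byte 0xFF read as unsigned, then as signed. *)
let u8 = String.get_uint8 s 4
let i8 = String.get_int8 s 4

(* The same two bytes, most significant first and least significant first. *)
let u16_be = String.get_uint16_be s 0
let u16_le = String.get_uint16_le s 0

(* 0xFFFE sign-extended into an int. *)
let i16_be = String.get_int16_be s 4

let i32_be = String.get_int32_be s 0
let i32_le = String.get_int32_le s 0

(* Only two bytes remain at index 4, so a 32-bit read fails. *)
let out_of_bounds =
  try ignore (String.get_int32_be s 4); false
  with Invalid_argument _ -> true
```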
val length : t -> int
length s
returns the length (number of characters) of the given string s
.
blit ~src ~src_pos ~dst ~dst_pos ~len
copies len
characters from string src
starting at character index src_pos
, to the Bytes sequence dst
starting at character index dst_pos
. Like String.blit
. Compatible with the -safe-string
option.
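A short sketch of blit, assuming Stdlib's StringLabels.blit, which takes the same labels, as a stand-in for this module:

```ocaml
(* Assumption: Stdlib's StringLabels stands in for this module here. *)

(* An 8-byte mutable destination filled with dots. *)
let dst = Bytes.make 8 '.'

(* Copy the 3 characters "ell" of the source into dst, starting at index 2. *)
let () = StringLabels.blit ~src:"hello" ~src_pos:1 ~dst ~dst_pos:2 ~len:3

let result = Bytes.to_string dst
```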
val fold : f:('a -> char -> 'a) -> init:'a -> t -> 'a
fold ~f ~init s
folds on chars by increasing index. Computes f(… (f (f init s.[0]) s.[1]) …) s.[n-1]
.
val foldi : f:('a -> int -> char -> 'a) -> 'a -> t -> 'a
foldi ~f init s
is just like fold
, but it also passes the index of each char as the second argument to the folded function f
.
Conversions
to_seq s
returns the Seq.t
of characters contained in the string s
. Renamed from to_std_seq
since 3.0.
val to_list : t -> char list
to_list s
returns the list
of characters contained in the string s
.
pp_buf buf s
prints s
to the buffer buf
. Renamed from pp
since 2.0.
val pp : Format.formatter -> t -> unit
pp f s
prints the string s
within quotes to the formatter f
. Renamed from print
since 2.0.
Strings
compare s1 s2
compares the strings s1
and s2
and returns an integer that indicates their relative position in the sort order.
pad ?side ?c n s
ensures that the string s
is at least n
bytes long, and pads it on the side
with c
if it's not the case.
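The padding rule can be sketched in plain OCaml. The function pad_sketch below is hypothetical and is not this module's implementation; side and c are required arguments here because the text above does not state their defaults, and the `Left / `Right variants are an assumption about how the side is named.

```ocaml
(* Hypothetical stand-in for pad, written with Stdlib only. *)
let pad_sketch ~side ~c n s =
  let missing = n - String.length s in
  if missing <= 0 then s (* already at least n bytes long *)
  else
    let filler = String.make missing c in
    match side with
    | `Left -> filler ^ s
    | `Right -> s ^ filler
```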
val of_gen : char gen -> string
of_gen gen
converts a gen
of characters to a string.
val of_iter : char iter -> string
of_iter iter
converts an iter
of characters to a string.
val of_seq : char Seq.t -> string
of_seq seq
converts a seq
of characters to a string. Renamed from of_std_seq
since 3.0.
to_array s
returns the array of characters contained in the string s
.
find ?start ~sub s
returns the starting index of the first occurrence of sub
within s
or -1
.
val find_all : ?start:int -> sub:string -> string -> int gen
find_all ?start ~sub s
finds all occurrences of sub
in s
, including overlapping instances, and returns them in a generator gen
.
find_all_l ?start ~sub s
finds all occurrences of sub
in s
and returns them in a list.
mem ?start ~sub s
is true
iff sub
is a substring of s
.
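The semantics of find, find_all_l and mem (the -1 result, and overlapping matches) can be sketched with a naive search. The functions below are hypothetical, use Stdlib only, and are not the algorithm this module uses.

```ocaml
(* Hypothetical helper: does sub occur in s at index i? *)
let occurs_at ~sub s i =
  let n = String.length sub in
  i >= 0 && i + n <= String.length s && String.sub s i n = sub

(* First index at or after start where sub occurs, or -1. *)
let find_sketch ?(start = 0) ~sub s =
  let last = String.length s - String.length sub in
  let rec go i =
    if i > last then -1
    else if occurs_at ~sub s i then i
    else go (i + 1)
  in
  go (max start 0)

(* All occurrences; restarting one past each match keeps overlapping ones. *)
let find_all_sketch ?(start = 0) ~sub s =
  let rec go i acc =
    match find_sketch ~start:i ~sub s with
    | -1 -> List.rev acc
    | j -> go (j + 1) (j :: acc)
  in
  go start []

let mem_sketch ?(start = 0) ~sub s = find_sketch ~start ~sub s <> -1
```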
rfind ~sub s
finds sub
in string s
from the right, returns its first index or -1
. Should only be used with very small sub
.
replace ?which ~sub ~by s
replaces some occurrences of sub
by by
in s
.
is_sub ~sub ~sub_pos s ~pos ~sub_len
returns true
iff the substring of sub
starting at position sub_pos
and of length sub_len
is a substring of s
starting at position pos
.
chop_prefix ~pre s
removes pre
from s
if pre
really is a prefix of s
, returns None
otherwise.
chop_suffix ~suf s
removes suf
from s
if suf
really is a suffix of s
, returns None
otherwise.
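A minimal sketch of the option-returning behaviour described above, written with Stdlib only (OCaml 4.13 or later). The names chop_prefix_sketch and chop_suffix_sketch are hypothetical and do not reflect this module's implementation.

```ocaml
(* Hypothetical stand-in: Some rest when pre is a prefix of s, None otherwise. *)
let chop_prefix_sketch ~pre s =
  if String.starts_with ~prefix:pre s then
    let n = String.length pre in
    Some (String.sub s n (String.length s - n))
  else None

(* Hypothetical stand-in: Some rest when suf is a suffix of s, None otherwise. *)
let chop_suffix_sketch ~suf s =
  if String.ends_with ~suffix:suf s then
    Some (String.sub s 0 (String.length s - String.length suf))
  else None
```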
val lines_gen : string -> string gen
lines_gen s
returns a generator gen
of the lines of s
(splits along '\n').
val lines_iter : string -> string iter
lines_iter s
returns the iter
of the lines of s
(splits along '\n').
val lines_seq : string -> string Seq.t
lines_seq s
returns the Seq.t
of the lines of s
(splits along '\n').
val concat_iter : sep:string -> string iter -> string
concat_iter ~sep iter
concatenates all strings of iter
, separated with sep
.
val concat_gen : sep:string -> string gen -> string
concat_gen ~sep gen
concatenates all strings of gen
, separated with sep
.
val concat_seq : sep:string -> string Seq.t -> string
concat_seq ~sep seq
concatenates all strings of seq
, separated with sep
.
val unlines_gen : string gen -> string
unlines_gen gen
concatenates all strings of gen
, separated with '\n'.
val unlines_iter : string iter -> string
unlines_iter iter
concatenates all strings of iter
, separated with '\n'.
val unlines_seq : string Seq.t -> string
unlines_seq seq
concatenates all strings of seq
, separated with '\n'.
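The relationship between the lines_*, concat_* and unlines_* families can be sketched on lists and Seq.t with Stdlib only. The function names below are hypothetical, and this module may treat a trailing newline differently from this sketch.

```ocaml
(* Hypothetical stand-in: split along '\n'. *)
let lines_sketch s = String.split_on_char '\n' s

(* Hypothetical stand-in: join with '\n' between elements. *)
let unlines_sketch ls = String.concat "\n" ls

(* Hypothetical stand-in for concat_seq: force the sequence, then join. *)
let concat_seq_sketch ~sep seq = String.concat sep (List.of_seq seq)
```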
set s i c
creates a new string which is a copy of s
, except for index i
, which becomes c
.
iter ~f s
applies function f
on each character of s
. Alias to String.iter
.
filter_map ~f s
calls (f a0) (f a1) … (f an)
where a0 … an
are the characters of s. It returns the string of characters ci
such that f ai = Some ci
(when f
returns None
, the corresponding element of s
is discarded).
filter ~f s
discards characters of s
not satisfying f
.
uniq ~eq s
removes consecutive duplicate characters in s
.
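The behaviour of filter_map, filter and uniq can be sketched with a Buffer. These functions are hypothetical, Stdlib-only stand-ins that follow the descriptions above, not this module's implementation.

```ocaml
(* Keep c' whenever f c = Some c'; drop the character when f c = None. *)
let filter_map_sketch ~f s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c -> match f c with Some c' -> Buffer.add_char buf c' | None -> ())
    s;
  Buffer.contents buf

(* Keep only the characters satisfying f. *)
let filter_sketch ~f s =
  filter_map_sketch ~f:(fun c -> if f c then Some c else None) s

(* Drop a character when it is eq to the one just before it. *)
let uniq_sketch ~eq s =
  let buf = Buffer.create (String.length s) in
  String.iteri
    (fun i c -> if i = 0 || not (eq s.[i - 1] c) then Buffer.add_char buf c)
    s;
  Buffer.contents buf
```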
flat_map ?sep ~f s
maps each char of s
to a string, then concatenates them all.
for_all ~f s
is true
iff all characters of s
satisfy the predicate f
.
exists ~f s
is true
iff some character of s
satisfies the predicate f
.
drop_while ~f s
discards any characters of s
starting from the left, up to the first character c
not satisfying f c
.
rdrop_while ~f s
discards any characters of s
starting from the right, up to the first character c
not satisfying f c
.
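A sketch of the two drop functions, written with Stdlib only. The names drop_while_sketch and rdrop_while_sketch are hypothetical and follow the descriptions above rather than this module's code.

```ocaml
(* Skip characters from the left while f holds, keep the rest. *)
let drop_while_sketch ~f s =
  let n = String.length s in
  let rec first i = if i < n && f s.[i] then first (i + 1) else i in
  let i = first 0 in
  String.sub s i (n - i)

(* Skip characters from the right while f holds, keep the rest. *)
let rdrop_while_sketch ~f s =
  let rec last i = if i > 0 && f s.[i - 1] then last (i - 1) else i in
  String.sub s 0 (last (String.length s))
```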
ltrim s
trims space on the left (see String.trim
for more details).
rtrim s
trims space on the right (see String.trim
for more details).
Operations on 2 strings
iter2 ~f s1 s2
iterates on pairs of chars.
iteri2 ~f s1 s2
iterates on pairs of chars with their index.
fold2 ~f ~init s1 s2
folds on pairs of chars.
for_all2 ~f s1 s2
returns true
iff all pairs of chars satisfy the predicate f
.
exists2 ~f s1 s2
returns true
iff some pair of chars satisfies the predicate f
.
Ascii functions
Those functions are deprecated in String
since 4.03, so we provide a stable alias for them even in older versions.
capitalize_ascii s
returns a copy of s
with the first character set to uppercase using the US-ASCII character set. See String
.
uncapitalize_ascii s
returns a copy of s
with the first character set to lowercase using the US-ASCII character set. See String
.
uppercase_ascii s
returns a copy of s
with all lowercase letters translated to uppercase using the US-ASCII character set. See String
.
lowercase_ascii s
returns a copy of s
with all uppercase letters translated to lowercase using the US-ASCII character set. See String
.
equal_caseless s1 s2
compares s1
and s2
for equality, ignoring ASCII letter case.
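For illustration, the ASCII case functions behave as follows in Stdlib's String, assumed here to match the aliases of this module. The helper equal_caseless_sketch is hypothetical and only approximates equal_caseless by lowercasing both sides.

```ocaml
(* Assumption: Stdlib's String stands in for this module here. *)

(* Only the first character is affected. *)
let cap = String.capitalize_ascii "hello world"
let uncap = String.uncapitalize_ascii "Hello"

(* Every ASCII letter is affected; digits and punctuation are left alone. *)
let up = String.uppercase_ascii "abc-123"
let low = String.lowercase_ascii "ABC-123"

(* Hypothetical stand-in for equal_caseless. *)
let equal_caseless_sketch s1 s2 =
  String.lowercase_ascii s1 = String.lowercase_ascii s2
```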
Same as of_hex
but fails harder.
Finding
A relatively efficient algorithm for finding sub-strings.
module Find : sig ... end
Splitting
module Split : sig ... end
split_on_char ~by s
splits the string s
along the given char by
.
split ~by s
splits the string s
along the given string by
. Alias to Split.list_cpy
.
Utils
compare_versions s1 s2
compares version strings s1
and s2
, considering that numbers are above text.
compare_natural s1 s2
implements natural sort order, comparing chunks of digits as natural numbers. See https://en.wikipedia.org/wiki/Natural_sort_order
edit_distance ?cutoff s1 s2
is the edit distance between the two strings s1
and s2
. This satisfies the classical distance axioms: it is always non-negative, symmetric, and satisfies the triangle inequality distance s1 s2 + distance s2 s3 >= distance s1 s3
.
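To make the distance axioms concrete, here is a sketch that assumes the distance in question is the Levenshtein distance (insertions, deletions and substitutions each cost 1). The function levenshtein is a hypothetical dynamic-programming stand-in using Stdlib only; it ignores the ?cutoff parameter and is not this module's implementation.

```ocaml
(* Hypothetical stand-in for edit_distance, without cutoff support. *)
let levenshtein s1 s2 =
  let n = String.length s1 and m = String.length s2 in
  (* prev.(j): distance between the processed prefix of s1 and the first j
     characters of s2. Initially the prefix is empty, so the distance is j. *)
  let prev = Array.init (m + 1) (fun j -> j) in
  let cur = Array.make (m + 1) 0 in
  for i = 1 to n do
    cur.(0) <- i;
    for j = 1 to m do
      let cost = if s1.[i - 1] = s2.[j - 1] then 0 else 1 in
      cur.(j) <-
        min (min (prev.(j) + 1) (cur.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit cur 0 prev 0 (m + 1)
  done;
  prev.(m)
```

With this sketch, levenshtein "kitten" "sitten" + levenshtein "sitten" "sitting" is at least levenshtein "kitten" "sitting", as the triangle inequality requires.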
Infix operators
module Infix : sig ... end