package tyre

Typed Regular Expressions

Sources

1.0.tar.gz
sha256=63ca1915da896640534b5cf928d220198709ec74b899d55b830fb0ceccebd633
sha512=536440d090046569449c7752315d568b3447e84c8c0e555a35a20a504a96a538ed9bc4e8e5f78e5860744ba863023331aa0a9893bf2028ae280cad678ec8d59c

Module Tyre.Charset

Sets of characters

Sets of characters support more operations than general regular expressions (for example, you can take the difference of two sets), so they have a dedicated type that allows these operations.

To convert a set to a regular Tyre.t, use charset or rep_charset.

type t

A set of characters.

val not : t -> t

not s is any - s

val union : t list -> t
val inter : t list -> t
val diff : t -> t -> t

diff a b is the set of chars that are in a but not in b

val compl : t list -> t

compl sets is not (union sets)

val (||) : t -> t -> t

a || b is union [a; b]

val (&&) : t -> t -> t

a && b is inter [a; b]

val (-) : t -> t -> t

a - b is diff a b
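The relations stated above between not, union, inter, diff and compl can be checked on a small model. The sketch below uses only the OCaml standard library and represents a set as a membership predicate; the module name Model, the function not_ and the helper equal are hypothetical names for illustration, and this is not how Tyre or Re represent sets internally.

```ocaml
(* A stdlib-only model: a set of chars is a membership predicate.
   This is NOT Tyre's representation; it only mirrors the documented laws. *)
module Model = struct
  type t = char -> bool

  let any : t = fun _ -> true
  let union (sets : t list) : t = fun c -> List.exists (fun s -> s c) sets
  let inter (sets : t list) : t = fun c -> List.for_all (fun s -> s c) sets
  let diff (a : t) (b : t) : t = fun c -> a c && not (b c)
  let not_ (s : t) : t = diff any s (* "not s is any - s" *)
  let compl (sets : t list) : t = not_ (union sets)
  let range lo hi : t = fun c -> lo <= c && c <= hi
end

(* Compare two modelled sets on all 256 values of the char type. *)
let equal (a : Model.t) (b : Model.t) =
  let rec go i = i > 255 || (a (Char.chr i) = b (Char.chr i) && go (i + 1)) in
  go 0

let lower = Model.range 'a' 'z'
let digit = Model.range '0' '9'
let hex_lower = Model.range 'a' 'f'

let () =
  (* compl [a; b] coincides with removing a, then b, from any *)
  assert (equal (Model.compl [ lower; digit ])
            Model.(diff (diff any lower) digit));
  (* diff a b coincides with inter [a; not b] *)
  assert (equal (Model.diff lower hex_lower)
            (Model.inter [ lower; Model.not_ hex_lower ]));
  (* 'g' is a lowercase letter outside a..f, 'c' is inside a..f *)
  assert ((Model.diff lower hex_lower) 'g');
  assert (not ((Model.diff lower hex_lower) 'c'))
```

The model compares sets by enumerating all 256 values of char, which is only practical because the char type is that small.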

val char : char -> t

The singleton set

val range : char -> char -> t

The range of characters between the two bounds, ordered by character code. Both bounds are included.

val set : string -> t

any character in the string
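As a sketch of how the constructors and operators above might be combined, the following builds a few sets with a local open of Tyre.Charset. It assumes the tyre package (version 1.0) is installed and linked, and it uses only the functions documented on this page; the names ident_start, ident_rest and consonant are hypothetical.

```ocaml
(* Assumes the tyre library (1.0) is available, e.g. (libraries tyre) in dune.
   Only values documented in Tyre.Charset are used. *)

(* First character of an identifier: an ASCII letter or an underscore. *)
let ident_start : Tyre.Charset.t =
  Tyre.Charset.(range 'a' 'z' || range 'A' 'Z' || char '_')

(* Following characters: the same set plus decimal digits. *)
let ident_rest : Tyre.Charset.t =
  Tyre.Charset.(union [ ident_start; range '0' '9' ])

(* Hypothetical example: lowercase ASCII letters that are not vowels. *)
let consonant : Tyre.Charset.t =
  Tyre.Charset.(range 'a' 'z' - set "aeiou")
```

Inside `Tyre.Charset.( ... )` the names `not`, `||`, `&&` and `-` refer to the set operations declared in this module, so boolean or integer expressions are best computed outside the local open.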

Predefined character sets

In general, these sets match latin1 characters, that is, values of the OCaml Stdlib.char type.

The exact characters matched are not documented in Re; the documentation below was written using the source: https://ocaml.orange/p/re/latest/doc/src/re/cset.ml.html .

val any : t

any character including newline

val notnl : t

any character except a new line

val wordc : t

wordc is the union of alnum and char '_'

val alpha : t

alpha is the union of lower, upper and set "\170\186"

val alnum : t

alnum is the union of alpha and digit

val ascii : t

chars with code 0 to 127, bounds included

val blank : t

blank is a space ' ' or a tab \t.

val cntrl : t

control characters. union of range '\000' '\031' and range '\127' '\159'.

val digit : t

digit is set "0123456789"

val graph : t

union of range '\033' '\126' and range '\160' '\255'

val lower : t

lower is the set of lowercase latin1 letters.

Includes range 'a' 'z', char 'µ', range '\223' '\246' = set "ßàáâãäåæçèéêëìíîïðñòóôõö" and range '\248' '\255' = set "øùúûüýþÿ"

val print : t

printable latin1 characters. range '\032' '\126' || range '\160' '\255'
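The documented ranges for graph and print differ only in their lower bound ('\033' versus '\032'). The sketch below checks this with the standard library alone, modelling both sets as predicates built from the ranges quoted on this page; it does not call Tyre or Re, and the helper names in_range and print_only are hypothetical.

```ocaml
(* Stdlib-only model of the two documented range unions. *)
let in_range lo hi c = lo <= c && c <= hi

let graph c = in_range '\033' '\126' c || in_range '\160' '\255' c
let print c = in_range '\032' '\126' c || in_range '\160' '\255' c

(* All chars accepted by [print] but rejected by [graph]. *)
let print_only =
  List.filter (fun c -> print c && not (graph c)) (List.init 256 Char.chr)

(* The only such character is the space, code 32. *)
let () = assert (print_only = [ ' ' ])
```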

val punct : t

latin1 punctuation.

set "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~\160¡¢£¤¥¦§¨©«¬\173®¯°±²³´¶·¸¹»¼½¾¿×÷"

val space : t

space is set " \t\n\013"
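Escapes such as '\013', '\170' and '\255' on this page follow OCaml's lexical convention, where a backslash followed by three digits is a decimal character code (not octal, as in C). A minimal standard-library check, with space_codes as a hypothetical name:

```ocaml
(* In OCaml, '\ddd' denotes the char with DECIMAL code ddd. *)
let () =
  assert (Char.code '\013' = 13); (* carriage return *)
  assert ('\013' = '\r');
  assert (Char.code '\170' = 170);
  assert (Char.code '\255' = 255)

(* The characters of the documented string " \t\n\013", as codes. *)
let space_codes =
  List.map Char.code (List.of_seq (String.to_seq " \t\n\013"))

(* space, tab, line feed, carriage return *)
let () = assert (space_codes = [ 32; 9; 10; 13 ])
```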

val upper : t

upper is the set of latin1 uppercase letters. This includes ascii uppercase letters, that is range 'A' 'Z', but also the ranges range '\192' '\214' = set "ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ" and range '\216' '\222' = set "ØÙÚÛÜÝÞ"
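The documented lower and upper ranges line up at an offset of 32 in character code, except for a few lowercase characters. The sketch below models both sets with the standard library from the ranges quoted on this page and lists the lowercase codes with no uppercase counterpart 32 positions earlier. The helper names are hypothetical, and '\181' is used on the assumption that char 'µ' denotes the Latin-1 micro sign.

```ocaml
(* Stdlib-only model of the documented ranges; does not call Tyre or Re. *)
let in_range lo hi c = lo <= c && c <= hi

let upper c =
  in_range 'A' 'Z' c || in_range '\192' '\214' c || in_range '\216' '\222' c

let lower c =
  in_range 'a' 'z' c
  || c = '\181' (* assumed Latin-1 code of the micro sign *)
  || in_range '\223' '\246' c
  || in_range '\248' '\255' c

(* True when the char 32 codes below [c] belongs to the modelled [upper]. *)
let has_upper_32_below c =
  let code = Char.code c in
  code >= 32 && upper (Char.chr (code - 32))

(* Codes of lowercase chars without such a counterpart. *)
let unpaired_lower =
  List.init 256 Char.chr
  |> List.filter (fun c -> lower c && not (has_upper_32_below c))
  |> List.map Char.code

(* 181 is the micro sign, 223 is ß, 255 is ÿ. *)
let () = assert (unpaired_lower = [ 181; 223; 255 ])
```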

val xdigit : t

hexadecimal digit. range '0' '9' || range 'a' 'f' || range 'A' 'F'