Compile a regular expression. The following constructs are recognized:
.        Matches any character except newline.
*        (postfix) Matches the preceding expression zero, one or several times.
+        (postfix) Matches the preceding expression one or several times.
?        (postfix) Matches the preceding expression once or not at all.
[..]     Character set. Ranges are denoted with -, as in [a-z]. An initial ^, as in [^0-9], complements the set. To include a ] character in a set, make it the first character of the set. To include a - character in a set, make it the first or the last character of the set.
^        Matches at beginning of line: either at the beginning of the matched string, or just after a '\n' character.
$        Matches at end of line: either at the end of the matched string, or just before a '\n' character.
\|       (infix) Alternative between two expressions.
\(..\)   Grouping and naming of the enclosed expression.
\1       The text matched by the first \(...\) expression (\2 for the second expression, and so on up to \9).
\b       Matches word boundaries.
\        Quotes special characters. The special characters are $^\.*+?[].
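As a rough illustration of the constructs listed above, the following sketch exercises a character set with a postfix operator, grouping, alternation, the beginning-of-line anchor, a back-reference, and quoting of a special character. The helper names (first_number, starts_with_pet, line_starting_with_b, first_doubled_letter, is_a_dot_b) are hypothetical and chosen only for this example; it assumes the str library is linked.

```ocaml
(* Illustrative sketch; assumes the str library is linked, e.g.
   ocamlfind ocamlopt -package str -linkpkg example.ml *)

(* Character set, postfix +, and grouping: capture the first run of digits. *)
let first_number s =
  let r = Str.regexp {|\([0-9]+\)|} in
  match Str.search_forward r s 0 with
  | _ -> Some (Str.matched_group 1 s)
  | exception Not_found -> None

(* Infix alternation: true if s starts with "cat" or with "dog". *)
let starts_with_pet s = Str.string_match (Str.regexp {|cat\|dog|}) s 0

(* The ^ anchor also matches just after a newline:
   offset of the first line that starts with the letter b. *)
let line_starting_with_b s =
  match Str.search_forward (Str.regexp {|^b|}) s 0 with
  | pos -> Some pos
  | exception Not_found -> None

(* Back-reference inside the pattern: first doubled lowercase letter. *)
let first_doubled_letter s =
  match Str.search_forward (Str.regexp {|\([a-z]\)\1|}) s 0 with
  | _ -> Some (Str.matched_string s)
  | exception Not_found -> None

(* A quoted dot matches only a literal dot, not any character. *)
let is_a_dot_b s = Str.string_match (Str.regexp {|a\.b|}) s 0
```

For instance, first_number "abc 123 def 45" evaluates to Some "123", and is_a_dot_b accepts "a.b" but rejects "axb".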
In regular expressions you will often use backslash characters; it's easier to use a quoted string literal {|...|} to avoid having to escape backslashes.
For example, the following expression:
let r = Str.regexp {|hello \([A-Za-z]+\)|} in
Str.replace_first r {|\1|} "hello world"
returns the string "world".
If you want a regular expression that matches a literal backslash character, you need to double it: Str.regexp {|\\|}.
You can use regular string literals "..." too; however, you will have to escape backslashes. The example above can be rewritten with a regular string literal as:
let r = Str.regexp "hello \\([A-Za-z]+\\)" in
Str.replace_first r "\\1" "hello world"
And the regular expression for matching a backslash becomes a quadruple backslash: Str.regexp "\\\\".
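To make the two spellings of the backslash pattern concrete, here is a minimal sketch showing that the quoted literal and the regular literal denote the same two-character string, and using the resulting expression in a replacement. The names quoted, escaped, backslash and to_forward_slashes are hypothetical, introduced only for this illustration; it assumes the str library is linked.

```ocaml
(* Illustrative sketch; assumes the str library is linked. *)

(* Both literals denote the same two-character pattern: backslash, backslash. *)
let quoted = {|\\|}
let escaped = "\\\\"

(* The compiled expression matches one literal backslash in the input. *)
let backslash = Str.regexp quoted

(* Replace every literal backslash with a forward slash. *)
let to_forward_slashes path = Str.global_replace backslash "/" path
```

With these definitions, quoted = escaped holds, and to_forward_slashes {|C:\temp\notes.txt|} evaluates to "C:/temp/notes.txt".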