Match on Groups in Regular Expressions using ppx_regexp
Task
Text Processing / Regular Expressions / Match on Groups in Regular Expressions
Use Regexp to parse a YYYY-MM-DD date.
Opam Packages Used
- ppx_regexp Tested with version: 0.5.1 — Used libraries: ppx_regexp
- re Tested with version: 1.12.0 — Used libraries: re
Code
Extracting components from a date string
- We use `match%pcre` to pattern match against a string using a regex.
- The regex pattern is enclosed in `{re|...|re}` string delimiters (it does not matter whether you use named delimiters or not, i.e. `re` has no special meaning here).
- Named capture groups are created using the `(?<name>...)` syntax.
- `\d` means "match a digit", `{4}` means "exactly 4 times".
let () =
  match%pcre "Date: 1972-01-23 " with
  (* The outer group (?<date>...) captures the whole date; the inner groups
     capture its components. Each named group is bound as a string. *)
  | {|(?<date>(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d))|} ->
    Printf.printf "Date found: (%s)\n" date;
    Printf.printf "Year: (%s)\n" year;
    Printf.printf "Month: (%s)\n" month;
    Printf.printf "Day: (%s)\n" day
  | _ -> print_string "Date not found\n"
Discussion
The `re` library supports multiple syntaxes, and provides concurrent pattern matching.
The `ppx_regexp` package provides a preprocessor extension (PPX) that introduces syntactic sugar (e.g. `match%pcre`) for the PCRE syntax.
To work with this package, we recommend referencing the PCRE syntax or any PCRE cheat sheet.
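For comparison, the same extraction can be sketched with the `re` library alone, without the PPX. This is only an illustration of what the syntactic sugar saves you from writing, not a description of the code `ppx_regexp` generates. The names `date_re` and `parse_date` are hypothetical, and the sketch uses numbered groups rather than named ones.

```ocaml
(* Requires the re library, e.g. (libraries re) in a dune file. *)

(* Compile a Perl-style pattern once. Groups 1, 2 and 3 are the year,
   month and day. The pattern is not anchored, so the date may appear
   anywhere in the input. *)
let date_re = Re.compile (Re.Perl.re {|(\d{4})-(\d\d)-(\d\d)|})

(* Hypothetical helper: return the three components as strings,
   or None when the input contains no date. *)
let parse_date s =
  match Re.exec_opt date_re s with
  | Some groups ->
    Some (Re.Group.get groups 1, Re.Group.get groups 2, Re.Group.get groups 3)
  | None -> None

let () =
  match parse_date "Date: 1972-01-23 " with
  | Some (year, month, day) ->
    Printf.printf "Year: (%s), Month: (%s), Day: (%s)\n" year month day
  | None -> print_string "Date not found\n"
```

With `match%pcre`, the group lookups by index disappear: each named group is bound directly to a variable in the match arm.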