package pcre2

Bindings to the Perl Compatible Regular Expressions library (version 2)

Sources

8.0.3.tar.gz
sha512=614bd7d44460ea7c35a61dcff14546e16eb7bbb959be02cf77463d4448c01e2462f10656ca8b1f21fead752a148ce94943de99dff8106a50eef1468e1d2f99f9

Description

pcre2-ocaml offers library functions for string pattern matching and substitution, similar to the functionality offered by the Perl language.

Published: 16 Feb 2025

README

PCRE2-OCaml - Perl Compatible Regular Expressions for OCaml

Fork of the original pcre-ocaml project for PCRE2 support.

These are the bindings as needed by the Haxe compiler. I do not plan on maintaining this repository.

This OCaml library interfaces with the C library PCRE2, providing Perl-compatible regular expressions for string matching.

Features

PCRE2-OCaml offers:

  • Pattern searching

  • Subpattern extraction

  • String splitting by patterns

  • Pattern substitution
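As a rough illustration of these four operations, the sketch below assumes that Pcre2 keeps the high-level function names of the original pcre-ocaml library (pmatch, exec, get_substring, split, replace). The patterns, helper names, and sample strings are invented for the example and are not taken from the project.

```ocaml
(* Requires the pcre2 library to be installed and linked. *)

(* Compile each pattern once, at module initialisation. *)
let digits_rex = Pcre2.regexp "\\d+"
let pair_rex = Pcre2.regexp "(\\w+)=(\\w+)"
let comma_rex = Pcre2.regexp "\\s*,\\s*"
let dash_rex = Pcre2.regexp "(\\w+)-(\\w+)"

(* Pattern searching: true if the subject contains a run of digits. *)
let has_digits s = Pcre2.pmatch ~rex:digits_rex s

(* Subpattern extraction: groups 1 and 2 of the first "key=value" match.
   Pcre2.exec raises Not_found when nothing matches. *)
let key_value s =
  let subs = Pcre2.exec ~rex:pair_rex s in
  (Pcre2.get_substring subs 1, Pcre2.get_substring subs 2)

(* String splitting: cut on commas, ignoring blanks around them. *)
let fields s = Pcre2.split ~rex:comma_rex s

(* Pattern substitution: $1 and $2 in the template refer to the groups. *)
let swap s = Pcre2.replace ~rex:dash_rex ~templ:"$2-$1" s
```

With these definitions, fields "a , b,c" should yield the three fields a, b and c, and swap "hello-world" should yield world-hello.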

Reasons to choose PCRE2-OCaml:

  • The PCRE2 library by Philip Hazel is mature and stable, implementing nearly all Perl regular expression features. High-level OCaml functions (split, replace, etc.) are compatible with Perl functions, as much as OCaml allows. Some developers find Perl-style regex syntax more intuitive and powerful than the Emacs-style regex used in OCaml's Str module.

  • PCRE2-OCaml is reentrant and thread-safe, unlike the Str module. This reentrancy offers convenience, eliminating concerns about library state.

  • The high-level replacement and substitution functions, which are written in OCaml, are faster than their counterparts in the Str module. When compiled to native code, they can even outperform Perl's C-based functions.

  • Returned data is never shared with other results or with the library itself, so destructive updates to it are safe and have no side effects.

  • The library interface uses labels and default arguments for enhanced programming comfort.

Usage

To generate and browse the API documentation, run:

$ odig odoc pcre2

Or, from a checkout of the source tree, this may also work:

$ dune build @doc

Functions support two flag types:

  1. Convenience flags: Readable and concise, translated internally on each call. Example:

    let rex = Pcre2.regexp ~flags:[`ANCHORED; `CASELESS] "some pattern" in
    (* ... *)
    

    These are easy to use but may incur overhead in loops. For performance optimization, consider the next approach.

  2. Internal flags: Convenience flags translated once, ahead of time, into the library's internal representation, so that the translation is not repeated on every call inside a loop. Example:

    let iflags = Pcre2.cflags [`ANCHORED; `CASELESS] in
    for i = 1 to 1000 do
      let rex = Pcre2.regexp ~iflags "some pattern constructed at runtime" in
      (* ... *)
    done
    

    Translating flags outside loops saves cycles. For the same reason, avoid compiling a regular expression inside a loop:

    for i = 1 to 1000 do
      let chunks = Pcre2.split ~pat:"[ \t]+" "foo bar" in
      (* ... *)
    done
    

    Instead, compile the regular expression once, before the loop:

    let rex = Pcre2.regexp "[ \t]+" in
    for i = 1 to 1000 do
      let chunks = Pcre2.split ~rex "foo bar" in
      (* ... *)
    done
    

Functions use optional arguments with intuitive defaults. For instance, Pcre2.split defaults to whitespace as the pattern. The examples directory contains applications demonstrating PCRE2-OCaml's functionality.
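As a small illustration of those defaults, the sketch below calls Pcre2.split with no pattern argument at all and relies on the whitespace default mentioned above. The input string is made up for the example.

```ocaml
(* No ~rex or ~pat is given, so the default whitespace pattern applies. *)
let words = Pcre2.split "alpha  beta\tgamma"

(* Print one word per line. *)
let () = List.iter print_endline words
```

Passing ~rex or ~pat, as in the loop examples earlier, overrides the default.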

Restartable (Partial) Pattern Matching

PCRE2 includes a DFA match function that can resume a partial match when more input arrives; the bindings expose it as pcre2_dfa_match. It cannot extract submatches or split strings, but it is useful for streaming and search tasks.

Example of a partial match restarted:

utop # open Pcre2;;
utop # let rex = regexp "12+3";;
val rex : regexp = <abstr>
utop # let workspace = Array.make 40 0;;
val workspace : int array =
  [|0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0;
    0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0|]
utop # pcre2_dfa_match ~rex ~flags:[`PARTIAL_SOFT] ~workspace "12222";;
Exception: Pcre2.Error Partial.
utop # pcre2_dfa_match ~rex ~flags:[`PARTIAL_SOFT; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre2.Error Partial.
utop # pcre2_dfa_match ~rex ~flags:[`PARTIAL_SOFT; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre2.Error Partial.
utop # pcre2_dfa_match ~rex ~flags:[`PARTIAL_SOFT; `DFA_RESTART] ~workspace "223xxxx";;
- : int array = [|0; 3; 0|]

Refer to the pcre2_dfa_match documentation and the dfa_restart example for more information.
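The session above can be wrapped into a small chunk-by-chunk driver. This is only a sketch under stated assumptions: the function name pcre2_dfa_match, the flags, the Pcre2.Error Partial exception, and the 40-element workspace are taken from the session, while the outcome type and the feed and scan helpers are hypothetical names introduced here for illustration.

```ocaml
(* Outcome of feeding one chunk to the matcher (hypothetical type). *)
type outcome =
  | Matched of int array  (* offsets reported by the DFA matcher *)
  | Need_more             (* partial match so far; supply more input *)

(* Hypothetical helper: feed one chunk, resuming the previous partial match
   when [restart] is true. The same workspace array must be reused across
   calls, since it carries the state needed to restart. *)
let feed ~rex ~workspace ~restart chunk =
  let flags =
    if restart then [`PARTIAL_SOFT; `DFA_RESTART] else [`PARTIAL_SOFT]
  in
  try Matched (Pcre2.pcre2_dfa_match ~rex ~flags ~workspace chunk)
  with Pcre2.Error Pcre2.Partial -> Need_more

(* Hypothetical driver: feed chunks in order until one completes the match.
   Any other exception (for example Not_found on a failed match) propagates. *)
let scan chunks =
  let rex = Pcre2.regexp "12+3" in
  let workspace = Array.make 40 0 in
  let rec go restart = function
    | [] -> None
    | chunk :: rest ->
      (match feed ~rex ~workspace ~restart chunk with
       | Matched offsets -> Some offsets
       | Need_more -> go true rest)
  in
  go false chunks
```

Calling scan ["12222"; "2222222"; "223xxxx"] replays the three chunks from the session; the first two should report a partial match and the last one should complete it.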

Contact Information and Contributing

Submit bug reports, feature requests, and contributions via the GitHub issue tracker.

For the latest information, visit: https://github.com/camlp5/pcre2-ocaml

Dependencies (4)

  1. conf-libpcre2-8 build
  2. dune-configurator
  3. ocaml >= "4.08"
  4. dune >= "2.7"

Dev Dependencies (2)

  1. odoc with-doc
  2. ounit2 with-test

Used by (5)

  1. camlp5 >= "8.02.01"
  2. pa_ppx >= "0.14"
  3. pa_ppx_q_ast >= "0.11"
  4. pa_ppx_regexp >= "0.02"
  5. pa_ppx_static >= "0.02"

Conflicts

None
