Re2 is now a "stdless" library: there is no need to open Re2.Std anymore.
Re2.Std.Re2 is now simply Re2.
Re2.Std.Parser is now simply Re2.Parser.
Re2.Regex is now simply Re2.
Re.Parser is left unchanged.
Parser.Decimal.int now requires at least one digit, i.e. it disallows zero digits.
Regex.Multiple: an efficient way to ask which of several regexes matches a string.
Added the Re2.Parser.any_string combinator.
There are no tests because any_string is constructed only from the tested API, and it has almost no interesting properties that can be verified.
Improved the performance of Re2.find_submatches on big patterns with many unmatched submatches, e.g. (ABC)|(DEF)|(GHI)|(KLM)|...
Without the fix:
+-------------------------------------------------------+--------------+------------+----------+----------+------------+
| Name                                                  |     Time/Run |    mWd/Run | mjWd/Run | Prom/Run | Percentage |
+-------------------------------------------------------+--------------+------------+----------+----------+------------+
| [re2_internal.ml] find_submatches with many Nones:5   |     406.81ns |     30.00w |          |          |      0.08% |
| [re2_internal.ml] find_submatches with many Nones:10  |   2_385.11ns |    207.00w |          |          |      0.47% |
| [re2_internal.ml] find_submatches with many Nones:50  |  12_772.97ns |  2_072.00w |    0.33w |    0.33w |      2.53% |
| [re2_internal.ml] find_submatches with many Nones:100 |  43_196.95ns |  7_191.00w |    2.03w |    2.03w |      8.56% |
| [re2_internal.ml] find_submatches with many Nones:200 | 504_884.95ns | 29_316.00w |   16.05w |   16.05w |    100.00% |
+-------------------------------------------------------+--------------+------------+----------+----------+------------+
With it:
+-------------------------------------------------------+--------------+-----------+----------+----------+------------+
| Name                                                  |     Time/Run |   mWd/Run | mjWd/Run | Prom/Run | Percentage |
+-------------------------------------------------------+--------------+-----------+----------+----------+------------+
| [re2_internal.ml] find_submatches with many Nones:5   |     408.24ns |    30.00w |          |          |      0.12% |
| [re2_internal.ml] find_submatches with many Nones:10  |   1_607.67ns |   163.00w |          |          |      0.48% |
| [re2_internal.ml] find_submatches with many Nones:50  |   3_223.89ns |   563.00w |          |          |      0.96% |
| [re2_internal.ml] find_submatches with many Nones:100 |   5_288.09ns | 1_063.00w |    0.20w |    0.20w |      1.58% |
| [re2_internal.ml] find_submatches with many Nones:200 | 334_107.81ns | 2_063.00w |    0.79w |    0.79w |    100.00% |
+-------------------------------------------------------+--------------+-----------+----------+----------+------------+
Fixed build on FreeBSD.
Excised direct mention of g++ from the re2 Makefile, preferring the built-in CXX macro. This fixes the build on FreeBSD (yes, really).
Made Re2 depend only on Core_kernel, not Core.
Fixes janestreet/re2#6
Fixed a bug in Re2.find_all_exn, extant since 2014-01-23, in which it returned spurious extra matches.
Using pattern b and input aaaaaaaaaaaab was expected to return a single match at the end of the input, but instead returned the match multiple times, approximately as many times as input length / min(match length, 1).
Added tests for this function and also for get_matches, which uses the same code.
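The expected behaviour in the find_all_exn example (pattern b, input aaaaaaaaaaaab, exactly one match at the end) can be illustrated with a small stand-alone sketch. It uses C++ std::regex as a stand-in engine, not Re2 or its OCaml bindings, and match_positions is a hypothetical helper name introduced only for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of Re2): returns the start offset of every
// non-overlapping match of `pattern` in `input`, scanning left to right.
std::vector<std::size_t> match_positions(const std::string& pattern,
                                         const std::string& input) {
  const std::regex re(pattern);
  std::vector<std::size_t> positions;
  // std::sregex_iterator resumes each search after the end of the previous
  // match, so a given match is reported once.
  for (std::sregex_iterator it(input.begin(), input.end(), re), end;
       it != end; ++it) {
    positions.push_back(static_cast<std::size_t>(it->position(0)));
  }
  return positions;
}
```

Calling match_positions("b", "aaaaaaaaaaaab") should yield a single offset, the index of the last character, which is the behaviour the fixed find_all_exn is described as having.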
Upgraded to upstream library version 20140304.
Added options Dot_nl and Never_capture.
Added Re2.Std, so that one should now use Re2 via module Re2 = Re2.Std.Re2.
At some future date, we will rename the Regex module to Re2_internal to force the stragglers to update to the new convention.
Fixed a bug with replace_exn and anchoring.
Fixed this bug:
$ R.replace_exn ~f:(fun _ -> "a") (R.create_exn "^") "XYZ";;
- : string = "aXaYaZa"
$ R.replace_exn ~f:(fun _ -> "a") (R.create_exn "^X") "XXXYXXZ";;
- : string = "aaaYXXZ"
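The two results above are the buggy outputs: a pattern anchored with ^ should match only at the start of the input, so only one replacement is expected. As a rough illustration of that expected behaviour, here is a sketch that uses C++ std::regex_replace as a stand-in; it is not Re2 and not the OCaml replace_exn, and replace_all is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Re2): replaces every match of `pattern`
// in `input` with the literal text `replacement`.
// Assumption: `replacement` contains no '$', which std::regex_replace would
// otherwise interpret as a format escape.
std::string replace_all(const std::string& pattern,
                        const std::string& input,
                        const std::string& replacement) {
  // After the first match, later searches know that a preceding character
  // exists, so '^' cannot match again further into the string.
  return std::regex_replace(input, std::regex(pattern), replacement);
}
```

With this stand-in, replace_all("^", "XYZ", "a") gives "aXYZ" and replace_all("^X", "XXXYXXZ", "a") gives "aXXYXXZ": the anchor is honoured once, at the beginning of the input.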
Fixed bugs in Re2.Regexp.find_all and Re2.Regexp.find_first.
Fixed a bug in the C bindings that could cause a segfault.
Fixed a bug where mlre2__create_re in C can give OCaml a freed C string.
The bug was in:
if (!compiled->ok()) {
  compile_error = compiled->error().c_str();
  delete compiled;
  compiled = NULL;
  caml_raise_with_string(*caml_named_value("mlre2__Regex_compile_failed"),
                         compile_error);
}
This is in mlre2__create_re, on the path taken when we fail to compile the regular expression. Notice how we delete the re object before we use its error string. (In C++, c_str() returns a pointer to the internal data of the string object; it does NOT create a copy. Likewise, error() just returns a reference to the regular expression object's error string member, *error_.)
So if caml_raise_with_string has to allocate on the heap to create the exception or the copy of the string, that allocation might invalidate the pointer before the string is copied.
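One way to avoid this class of bug is to copy the error message into an owning std::string while the regex object is still alive, and only then delete the object. The sketch below illustrates that ordering under stated assumptions: FakeRegex is a hypothetical stand-in that mirrors only the ok() and error() calls seen in the snippet, compile_error_message is a hypothetical helper, and the OCaml runtime call is left out. The changelog does not show the actual fix made in the bindings, so this is not a reproduction of it.

```cpp
#include <string>

// Hypothetical stand-in for the compiled regex object. Only ok() and error()
// mirror the calls in the snippet above; treating "(" as the one invalid
// pattern is an arbitrary choice for the illustration.
class FakeRegex {
 public:
  explicit FakeRegex(const std::string& pattern)
      : error_(pattern == "(" ? "missing closing )" : "") {}
  bool ok() const { return error_.empty(); }
  // Returns a reference to a member, so it is only valid while *this lives.
  const std::string& error() const { return error_; }

 private:
  std::string error_;
};

// Hypothetical helper: returns the compile error, or "" on success.
// The message is copied into an owning std::string *before* the object is
// deleted, so no pointer into the deleted object's storage is kept.
std::string compile_error_message(const std::string& pattern) {
  FakeRegex* compiled = new FakeRegex(pattern);
  std::string message;
  if (!compiled->ok()) {
    message = compiled->error();  // deep copy while `compiled` is alive
  }
  delete compiled;
  compiled = nullptr;
  return message;  // safe: `message` owns its own buffer
}
```

In real bindings the owned copy would then be handed to the exception-raising call; how that interacts with the OCaml runtime's non-local exit is outside the scope of this sketch.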