Bignum.of_string
needs to handle different formats for its input. The previous version of the code was trying to parse the common format (floats), and in case of failure, was attempting to use a different format (based on the error). This resulted in the string being parsed twice in some cases.
This version is a complete rewriting of of_string
to do the parsing in one step. The new code for to_string
encode an automaton and remembers the positions of the various elements of the string (depending on the format).
This feature uses a function which has been upstreamed in the new version of ZArith (1.4) which is a variant of the Zarith.of_string
function to work with substrings. This variant alone is responsible for a big part of the performance improvement.
The new version of the code performs better than the original one in all cases. The performance improvement are variable depending on the micro benchmark. See below.
We also tried to implement the lexing engine using OCamllex. This makes for a much more concise description, but the performance are significantly lower. OCamllex produces code which allocates some table and some state, which is avoided in the hand written code. Also, it will allocate the sub strings matched.
New version (patch for ZArith + of_substring, reimplementation of of_string)
┌─────────────────────────────────────────────────────────────────────┬──────────────┬───────────┬──────────┬──────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │ ├─────────────────────────────────────────────────────────────────────┼──────────────┼───────────┼──────────┼──────────┼────────────┤ │ bigint\_bench.ml
random │ 48_381.13ns │ 7_166.00w │ 1.24w │ 1.24w │ 45.12% │ │ bigint\_bench.ml:vs. Big\_int
plus_self │ 293.96ns │ 72.00w │ │ │ 0.27% │ │ bigint\_bench.ml:vs. Big\_int
plus_other │ 807.62ns │ 124.00w │ │ │ 0.75% │ │ bigint\_bench.ml:vs. Big\_int
mult_self │ 353.98ns │ 91.00w │ │ │ 0.33% │ │ bigint\_bench.ml:vs. Big\_int
mult_other │ 783.78ns │ 128.00w │ │ │ 0.73% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (decimal) │ 14_415.44ns │ 475.00w │ │ │ 13.44% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (scientific) │ 61_363.80ns │ 3_929.00w │ │ │ 57.23% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (fraction) │ 24_957.02ns │ 303.00w │ │ │ 23.28% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (decimal) │ 15_867.52ns │ 1_523.00w │ │ │ 14.80% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (scientific) │ 33_345.31ns │ 4_206.00w │ │ │ 31.10% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (fraction) │ 31_770.26ns │ 3_779.00w │ │ │ 29.63% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (decimal) │ 9_726.82ns │ 380.00w │ │ │ 9.07% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (scientific) │ 28_141.40ns │ 2_059.00w │ │ │ 26.25% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (fraction) │ 70_436.16ns │ 5_541.00w │ │ │ 65.69% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (decimal) │ 27_000.73ns │ 1_994.00w │ │ │ 25.18% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (scientific) │ 66_057.63ns │ 6_217.00w │ │ │ 61.61% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (fraction) │ 107_219.89ns │ 8_097.00w │ │ │ 100.00% │ │ bignum\_bench.ml:Bignum binprot
roundtrip compact │ 5_997.81ns │ 581.00w │ │ │ 5.59% │ │ bignum\_bench.ml:Bignum binprot
roundtrip classic │ 18_522.20ns │ 779.00w │ │ │ 17.27% │ │ bignum\_bench.ml:round
round_decimal:0 │ 8_479.49ns │ 463.00w │ │ │ 7.91% │ │ bignum\_bench.ml:round
round_decimal:3 │ 24_621.71ns │ 2_115.00w │ │ │ 22.96% │ │ bignum\_bench.ml:round
round_decimal:6 │ 26_896.35ns │ 2_437.00w │ │ │ 25.09% │ │ bignum\_bench.ml:round
round_decimal:9 │ 29_428.19ns │ 2_730.00w │ │ │ 27.45% │ │ bignum\_bench.ml:round
round │ 8_452.31ns │ 459.00w │ │ │ 7.88% │ └─────────────────────────────────────────────────────────────────────┴──────────────┴───────────┴──────────┴──────────┴────────────┘
Original version
┌─────────────────────────────────────────────────────────────────────┬──────────────┬───────────┬──────────┬──────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │ ├─────────────────────────────────────────────────────────────────────┼──────────────┼───────────┼──────────┼──────────┼────────────┤ │ bigint\_bench.ml
random │ 51_218.04ns │ 7_166.00w │ 1.25w │ 1.25w │ 43.26% │ │ bigint\_bench.ml:vs. Big\_int
plus_self │ 336.84ns │ 72.00w │ │ │ 0.28% │ │ bigint\_bench.ml:vs. Big\_int
plus_other │ 837.73ns │ 124.00w │ │ │ 0.71% │ │ bigint\_bench.ml:vs. Big\_int
mult_self │ 411.03ns │ 91.00w │ │ │ 0.35% │ │ bigint\_bench.ml:vs. Big\_int
mult_other │ 808.03ns │ 128.00w │ │ │ 0.68% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (decimal) │ 29_650.60ns │ 2_415.00w │ │ │ 25.04% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (scientific) │ 92_495.93ns │ 6_465.00w │ │ │ 78.12% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (fraction) │ 39_482.77ns │ 2_060.00w │ │ │ 33.35% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (decimal) │ 16_195.93ns │ 1_523.00w │ │ │ 13.68% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (scientific) │ 34_227.78ns │ 4_059.00w │ │ │ 28.91% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (fraction) │ 32_856.17ns │ 3_779.00w │ │ │ 27.75% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (decimal) │ 19_745.71ns │ 2_149.00w │ │ │ 16.68% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (scientific) │ 51_024.99ns │ 3_853.00w │ │ │ 43.09% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (fraction) │ 88_884.15ns │ 7_819.00w │ │ │ 75.07% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (decimal) │ 32_812.27ns │ 2_498.00w │ │ │ 27.71% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (scientific) │ 77_518.77ns │ 6_369.00w │ │ │ 65.47% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (fraction) │ 118_402.78ns │ 8_907.00w │ │ │ 100.00% │ │ bignum\_bench.ml:Bignum binprot
roundtrip compact │ 8_947.02ns │ 371.00w │ │ │ 7.56% │ │ bignum\_bench.ml:Bignum binprot
roundtrip classic │ 22_799.74ns │ 1_039.00w │ │ │ 19.26% │ │ bignum\_bench.ml:round
round_decimal:0 │ 8_176.74ns │ 463.00w │ │ │ 6.91% │ │ bignum\_bench.ml:round
round_decimal:3 │ 25_798.77ns │ 2_115.00w │ │ │ 21.79% │ │ bignum\_bench.ml:round
round_decimal:6 │ 28_561.23ns │ 2_437.00w │ │ │ 24.12% │ │ bignum\_bench.ml:round
round_decimal:9 │ 30_861.38ns │ 2_730.00w │ │ │ 26.06% │ │ bignum\_bench.ml:round
round │ 8_237.26ns │ 459.00w │ │ │ 6.96% │ └─────────────────────────────────────────────────────────────────────┴──────────────┴───────────┴──────────┴──────────┴────────────┘
Tentative version using OCamllex
┌─────────────────────────────────────────────────────────────────────┬──────────────┬────────────┬──────────┬──────────┬────────────┐ │ Name │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │ ├─────────────────────────────────────────────────────────────────────┼──────────────┼────────────┼──────────┼──────────┼────────────┤ │ bigint\_bench.ml
random │ 48_164.21ns │ 7_166.00w │ 1.25w │ 1.25w │ 39.99% │ │ bigint\_bench.ml:vs. Big\_int
plus_self │ 285.84ns │ 72.00w │ │ │ 0.24% │ │ bigint\_bench.ml:vs. Big\_int
plus_other │ 768.12ns │ 124.00w │ │ │ 0.64% │ │ bigint\_bench.ml:vs. Big\_int
mult_self │ 343.14ns │ 91.00w │ │ │ 0.28% │ │ bigint\_bench.ml:vs. Big\_int
mult_other │ 780.00ns │ 128.00w │ │ │ 0.65% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (decimal) │ 26_931.12ns │ 3_108.00w │ │ │ 22.36% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (scientific) │ 79_750.28ns │ 6_599.00w │ 0.11w │ 0.11w │ 66.21% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
of_string (fraction) │ 34_988.94ns │ 4_300.00w │ │ │ 29.05% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (decimal) │ 15_958.17ns │ 1_523.00w │ │ │ 13.25% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (scientific) │ 32_495.25ns │ 4_059.00w │ │ │ 26.98% │ │ bignum\_bench.ml:Bignum of\_string/to\_string
to_string (fraction) │ 31_802.75ns │ 3_779.00w │ │ │ 26.40% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (decimal) │ 18_742.81ns │ 2_924.00w │ │ │ 15.56% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (scientific) │ 45_282.09ns │ 4_622.00w │ │ │ 37.60% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
of_sexp (fraction) │ 86_907.83ns │ 8_777.00w │ 0.15w │ 0.15w │ 72.16% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (decimal) │ 35_727.73ns │ 4_493.00w │ │ │ 29.66% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (scientific) │ 82_247.61ns │ 8_273.00w │ 0.13w │ 0.13w │ 68.29% │ │ bignum\_bench.ml:Bignum of\_sexp/to\_sexp
to_sexp (fraction) │ 120_445.25ns │ 10_688.00w │ 0.12w │ 0.12w │ 100.00% │ │ bignum\_bench.ml:Bignum binprot
roundtrip compact │ 6_734.49ns │ 371.00w │ │ │ 5.59% │ │ bignum\_bench.ml:Bignum binprot
roundtrip classic │ 21_773.79ns │ 1_890.00w │ │ │ 18.08% │ │ bignum\_bench.ml:round
round_decimal:0 │ 8_306.45ns │ 463.00w │ │ │ 6.90% │ │ bignum\_bench.ml:round
round_decimal:3 │ 24_714.96ns │ 2_115.00w │ │ │ 20.52% │ │ bignum\_bench.ml:round
round_decimal:6 │ 26_894.27ns │ 2_437.00w │ │ │ 22.33% │ │ bignum\_bench.ml:round
round_decimal:9 │ 29_343.81ns │ 2_730.00w │ │ │ 24.36% │ │ bignum\_bench.ml:round
round │ 8_296.05ns │ 459.00w │ │ │ 6.89% │ └─────────────────────────────────────────────────────────────────────┴──────────────┴────────────┴──────────┴──────────┴────────────┘