package ocaml-base-compiler
Floating-point arithmetic
OCaml's floating-point numbers follow the IEEE 754 standard, using double-precision (64-bit) numbers. Floating-point operations never raise an exception on overflow, underflow, division by zero, etc. Instead, special IEEE numbers are returned as appropriate, such as infinity for 1.0 /. 0.0, neg_infinity for -1.0 /. 0.0, and nan ('not a number') for 0.0 /. 0.0. These special numbers then propagate through floating-point computations as expected: for instance, 1.0 /. infinity is 0.0, and any arithmetic operation with nan as argument returns nan as result.
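As a minimal sketch of the behaviour described above (the binding names are illustrative, and the functions are assumed to be reached through the standard library's Float module), the special values can be produced and propagated like this:

```ocaml
(* Special IEEE values are returned instead of raising exceptions. *)
let pos_inf = 1.0 /. 0.0        (* infinity *)
let neg_inf = -1.0 /. 0.0       (* neg_infinity *)
let not_a_number = 0.0 /. 0.0   (* nan *)

(* The special values propagate through later computations. *)
let zero = 1.0 /. pos_inf
let still_nan = not_a_number +. 1.0

let () =
  Printf.printf "%g %g %g %g %g\n" pos_inf neg_inf not_a_number zero still_nan
```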
fma x y z returns x * y + z, with a best effort for computing this expression with a single rounding, using either hardware instructions (providing full IEEE compliance) or a software emulation. Note: since software emulation of the fma is costly, make sure that you are using hardware fma support if performance matters.
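The sketch below illustrates what a single rounding can buy, assuming the function is available as Float.fma. The names x, p and residual are illustrative. When the fused operation really rounds only once, residual is the rounding error of the product p; no particular printed value is claimed here, since it depends on the fma implementation in use.

```ocaml
(* x = 1 + 2^-52, so the exact square is 1 + 2^-51 + 2^-104. *)
let x = 1.0 +. Float.epsilon

(* The ordinary product is rounded to a double. *)
let p = x *. x

(* x * x - p computed in one fused step. *)
let residual = Float.fma x x (-. p)

let () =
  Printf.printf "fma 2 3 4 = %g\n" (Float.fma 2.0 3.0 4.0);
  Printf.printf "residual  = %g\n" residual
```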
rem a b returns the remainder of a with respect to b. The returned value is a -. n *. b, where n is the quotient a /. b rounded towards zero to an integer.
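A small sketch of this definition, assuming the function is available as Float.rem. The helper rem_by_formula is hypothetical and only spells out the formula above; for large quotients it may round differently from rem itself.

```ocaml
(* Hypothetical helper: the formula a -. n *. b with n truncated towards zero. *)
let rem_by_formula a b =
  let n = Float.trunc (a /. b) in
  a -. n *. b

let () =
  (* The result takes the sign of the dividend a. *)
  Printf.printf "%g %g\n" (Float.rem 7.5 2.0) (Float.rem (-7.5) 2.0);
  Printf.printf "%g %g\n" (rem_by_formula 7.5 2.0) (rem_by_formula (-7.5) 2.0)
```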
succ x returns the floating-point number right after x, i.e., the smallest floating-point number greater than x. See also next_after.
pred x returns the floating-point number right before x, i.e., the greatest floating-point number smaller than x. See also next_after.
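To make the two neighbour functions concrete, here is a short sketch assuming they are available as Float.succ and Float.pred. The binding names are illustrative.

```ocaml
(* The neighbours of 1.0 in double precision. *)
let above_one = Float.succ 1.0
let below_one = Float.pred 1.0

let () =
  (* %h prints the exact hexadecimal representation. *)
  Printf.printf "succ 1.0 = %h\n" above_one;
  Printf.printf "pred 1.0 = %h\n" below_one;
  (* pred undoes succ on finite values. *)
  Printf.printf "round trip ok: %b\n" (Float.pred above_one = 1.0)
```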
A special floating-point value denoting the result of an undefined operation such as 0.0 /. 0.0. Stands for 'not a number'. Any floating-point operation with nan as argument returns nan as result. As for floating-point comparisons, =, <, <=, > and >= return false and <> returns true if one or both of their arguments is nan.
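The comparison rules can be checked directly. This is a minimal sketch assuming the value is available as Float.nan; the binding names are illustrative.

```ocaml
let x = Float.nan

(* Every ordering or equality test involving nan is false... *)
let all_false = [ x = x; x < 1.0; x <= 1.0; x > 1.0; x >= 1.0 ]

(* ...while <> is true, even against itself. *)
let differs = x <> x

let () =
  List.iter (Printf.printf "%b ") all_false;
  Printf.printf "| %b\n" differs
```

Because x = x is false for nan, testing for nan with equality does not work; is_nan is the reliable check.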
The difference between 1.0 and the smallest exactly representable floating-point number greater than 1.0.
is_finite x is true iff x is finite, i.e., not infinite and not nan.
is_infinite x is true iff x is infinity or neg_infinity.
is_nan x is true iff x is not a number (see nan).
Truncate the given floating-point number to an integer. The result is unspecified if the argument is nan or falls outside the range of representable integers.
Convert the given string to a float. The string is read in decimal (by default) or in hexadecimal (marked by 0x or 0X). The format of decimal floating-point numbers is [-] dd.ddd (e|E) [+|-] dd, where d stands for a decimal digit. The format of hexadecimal floating-point numbers is [-] 0(x|X) hh.hhh (p|P) [+|-] dd, where h stands for a hexadecimal digit and d for a decimal digit. In both cases, at least one of the integer and fractional parts must be given; the exponent part is optional. The _ (underscore) character can appear anywhere in the string and is ignored. Depending on the execution platform, other representations of floating-point numbers can be accepted, but should not be relied upon.
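The accepted formats can be tried out as follows. This sketch assumes the conversion is exposed as Float.of_string, together with the option-returning variant Float.of_string_opt; the helper parse is hypothetical.

```ocaml
(* Hypothetical helper: report how a string is read, without raising. *)
let parse s =
  match Float.of_string_opt s with
  | Some f -> Printf.printf "%S -> %g\n" s f
  | None -> Printf.printf "%S -> not accepted\n" s

let () =
  (* Decimal, decimal with underscores, hexadecimal, and an invalid string. *)
  List.iter parse [ "1e3"; "1_000.5"; "0x1.8p1"; "abc" ]
```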
type fpclass = fpclass =
The five classes of floating-point numbers, as determined by the classify_float function.
val classify_float : float -> fpclass
Return the class of the given floating-point number: normal, subnormal, zero, infinite, or not a number.
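A sketch of how the five classes might be inspected, assuming the constructors carry the standard library names FP_normal, FP_subnormal, FP_zero, FP_infinite and FP_nan. The helper describe is hypothetical.

```ocaml
(* Hypothetical helper: name the class of a float. *)
let describe x =
  match Float.classify_float x with
  | FP_normal -> "normal"
  | FP_subnormal -> "subnormal"
  | FP_zero -> "zero"
  | FP_infinite -> "infinite"
  | FP_nan -> "nan"

let () =
  List.iter
    (fun x -> Printf.printf "%g is %s\n" x (describe x))
    [ 1.0; Float.succ 0.0; -0.0; Float.neg_infinity; Float.nan ]
```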
expm1 x computes exp x -. 1.0, giving numerically-accurate results even if x is close to 0.0.
log1p x computes log(1.0 +. x) (natural logarithm), giving numerically-accurate results even if x is close to 0.0.
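The accuracy difference near zero can be observed with a sketch like the following, assuming the functions are available as Float.expm1 and Float.log1p. For a tiny x, both results should be very close to x itself, whereas the naive expressions lose most of their significant digits.

```ocaml
let x = 1e-10

(* Naive forms: 1.0 +. x and exp x are rounded before the subtraction or log. *)
let naive_expm1 = Float.exp x -. 1.0
let naive_log1p = Float.log (1.0 +. x)

let () =
  Printf.printf "expm1: %.17g  naive: %.17g\n" (Float.expm1 x) naive_expm1;
  Printf.printf "log1p: %.17g  naive: %.17g\n" (Float.log1p x) naive_log1p
```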
Arc cosine. The argument must fall within the range [-1.0, 1.0]. Result is in radians and is between 0.0 and pi.
Arc sine. The argument must fall within the range [-1.0, 1.0]. Result is in radians and is between -pi/2 and pi/2.
atan2 y x returns the arc tangent of y /. x. The signs of x and y are used to determine the quadrant of the result. Result is in radians and is between -pi and pi.
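The role of the signs is easiest to see by comparing with a plain arc tangent of the quotient. This sketch assumes the functions are available as Float.atan2 and Float.atan, and the constant as Float.pi.

```ocaml
(* The point (-1, 1) lies in the second quadrant. *)
let with_quadrant = Float.atan2 1.0 (-1.0)

(* The quotient alone cannot distinguish (-1, 1) from (1, -1). *)
let quotient_only = Float.atan (1.0 /. (-1.0))

let () =
  Printf.printf "atan2 1 (-1) = %g (3pi/4 = %g)\n" with_quadrant
    (3.0 *. Float.pi /. 4.0);
  Printf.printf "atan (1 / -1) = %g\n" quotient_only
```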
hypot x y returns sqrt(x *. x +. y *. y), that is, the length of the hypotenuse of a right-angled triangle with sides of length x and y, or, equivalently, the distance of the point (x,y) to origin. If one of x or y is infinite, returns infinity even if the other is nan.
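One practical reason to prefer hypot over the literal formula is that the intermediate squares can overflow. The sketch below assumes the function is available as Float.hypot; the helper naive is hypothetical and simply transcribes the formula.

```ocaml
(* Hypothetical helper: the formula written out directly. *)
let naive x y = Float.sqrt (x *. x +. y *. y)

let () =
  Printf.printf "hypot 3 4 = %g\n" (Float.hypot 3.0 4.0);
  (* 1e200 *. 1e200 overflows to infinity in the naive version. *)
  Printf.printf "naive: %g  hypot: %g\n" (naive 1e200 1e200)
    (Float.hypot 1e200 1e200);
  (* An infinite side wins over nan. *)
  Printf.printf "hypot infinity nan = %g\n" (Float.hypot Float.infinity Float.nan)
```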
trunc x rounds x to the nearest integer whose absolute value is less than or equal to the absolute value of x.
round x rounds x to the nearest integer with ties (fractional values of 0.5) rounded away from zero, regardless of the current rounding direction. If x is an integer, +0., -0., nan, or infinite, x itself is returned.
Round above to an integer value. ceil f returns the least integer value greater than or equal to f. The result is returned as a float.
Round below to an integer value. floor f returns the greatest integer value less than or equal to f. The result is returned as a float.
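The four rounding functions differ mainly on negative inputs and on ties. A small comparison, assuming they are available as Float.trunc, Float.round, Float.ceil and Float.floor (the helper show is hypothetical):

```ocaml
(* Hypothetical helper: print all four roundings of one value. *)
let show x =
  Printf.printf "%5.1f: trunc %g  round %g  ceil %g  floor %g\n" x
    (Float.trunc x) (Float.round x) (Float.ceil x) (Float.floor x)

let () = List.iter show [ 2.5; -2.5; 2.7; -2.7 ]
```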
next_after x y returns the next representable floating-point value following x in the direction of y. More precisely, if y is greater (resp. less) than x, it returns the smallest (resp. largest) representable number greater (resp. less) than x. If x equals y, the function returns y. If x or y is nan, a nan is returned. Note that next_after max_float infinity = infinity and that next_after 0. infinity is the smallest denormalized positive number. If x is the smallest denormalized positive number, next_after x 0. = 0.
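The cases listed above can be exercised with a sketch such as this one, assuming the function is available as Float.next_after. The binding names are illustrative.

```ocaml
let up = Float.next_after 1.0 2.0            (* one step towards 2.0 *)
let down = Float.next_after 1.0 0.0          (* one step towards 0.0 *)
let tiny = Float.next_after 0. Float.infinity (* smallest positive denormal *)

let () =
  Printf.printf "up = %h, down = %h, tiny = %h\n" up down tiny;
  Printf.printf "past max_float: %g\n"
    (Float.next_after Float.max_float Float.infinity);
  Printf.printf "tiny back to zero: %g\n" (Float.next_after tiny 0.)
```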
copy_sign x y returns a float whose absolute value is that of x and whose sign is that of y. If x is nan, returns nan. If y is nan, returns either x or -. x, but it is not specified which.
sign_bit x is true iff the sign bit of x is set. For example sign_bit 1. and sign_bit 0. are false while sign_bit (-1.) and sign_bit (-0.) are true.
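Since -0. and 0. compare equal, the sign bit is the way to tell them apart. A brief sketch, assuming the functions are available as Float.sign_bit and Float.copy_sign (neg_zero is an illustrative name):

```ocaml
let neg_zero = -0.

let () =
  (* Equality cannot distinguish the two zeros... *)
  Printf.printf "-0. = 0.         : %b\n" (neg_zero = 0.);
  (* ...but the sign bit can. *)
  Printf.printf "sign_bit (-0.)   : %b\n" (Float.sign_bit neg_zero);
  Printf.printf "sign_bit 0.      : %b\n" (Float.sign_bit 0.);
  (* copy_sign transfers the sign, including that of a negative zero. *)
  Printf.printf "copy_sign 3 (-0.): %g\n" (Float.copy_sign 3.0 neg_zero)
```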
frexp f returns the pair of the significand and the exponent of f. When f is zero, the significand x and the exponent n of f are equal to zero. When f is non-zero, they are defined by f = x *. 2 ** n and 0.5 <= x < 1.0.
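The decomposition can be verified by recombining the two parts. This sketch assumes the functions are available as Float.frexp and Float.ldexp, where ldexp x n computes x times 2 to the power n; the helper decompose is hypothetical.

```ocaml
(* Hypothetical helper: split f, print the parts, and rebuild the value. *)
let decompose f =
  let x, n = Float.frexp f in
  Printf.printf "%g = %g * 2^%d\n" f x n;
  Float.ldexp x n

let () =
  List.iter (fun f -> ignore (decompose f)) [ 8.0; 10.0; 0.1; 0.0 ]
```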
compare x y returns 0 if x is equal to y, a negative integer if x is less than y, and a positive integer if x is greater than y. compare treats nan as equal to itself and less than any other float value. This treatment of nan ensures that compare defines a total ordering relation.
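A total order matters most when sorting. The following sketch, which assumes the function is available as Float.compare, sorts a list containing nan; under the ordering described above, nan ends up first.

```ocaml
(* nan sorts before every other value, including neg_infinity. *)
let sorted = List.sort Float.compare [ 2.0; Float.nan; 1.0; Float.neg_infinity ]

let () =
  List.iter (Printf.printf "%g ") sorted;
  print_newline ()
```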
min x y returns the minimum of x and y. It returns nan when x or y is nan. Moreover min (-0.) (+0.) = -0.
max x y returns the maximum of x and y. It returns nan when x or y is nan. Moreover max (-0.) (+0.) = +0.
min_max x y is (min x y, max x y), just more efficient.
min_num x y returns the minimum of x and y treating nan as missing values. If both x and y are nan, nan is returned. Moreover min_num (-0.) (+0.) = -0.
max_num x y returns the maximum of x and y treating nan as missing values. If both x and y are nan, nan is returned. Moreover max_num (-0.) (+0.) = +0.
min_max_num x y is (min_num x y, max_num x y), just more efficient. Note that in particular min_max_num x nan = (x, x) and min_max_num nan y = (y, y).
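The two families differ only in how they treat nan, which the sketch below contrasts. It assumes the functions are available under the same names in the Float module; the binding names are illustrative.

```ocaml
(* min propagates nan; min_num treats it as a missing value. *)
let propagated = Float.min 1.0 Float.nan
let ignored = Float.min_num 1.0 Float.nan

let () =
  Printf.printf "min 1 nan         = %g\n" propagated;
  Printf.printf "min_num 1 nan     = %g\n" ignored;
  let lo, hi = Float.min_max 3.0 1.0 in
  Printf.printf "min_max 3 1       = (%g, %g)\n" lo hi;
  let lo, hi = Float.min_max_num Float.nan 2.0 in
  Printf.printf "min_max_num nan 2 = (%g, %g)\n" lo hi
```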
val hash : t -> int
The hash function for floating-point numbers.
module Array : sig ... end
module ArrayLabels : sig ... end