package ocaml-r

Module OCamlR

Bindings for the R interpreter.

It encapsulates the functionality of the libR.so shared library provided by the R software. This makes it possible to embed the R interpreter in Objective Caml, to execute R code from Objective Caml, and to exchange data structures between R and Objective Caml. A lot of information on R internals can be found in the official documentation as well as in this document.

THREAD SAFETY

This is a rather low-level binding of R functionality. As such, it is no more thread-safe than R itself, which is not thread-safe at all. Therefore, avoid real threading unless you know what you're doing.

DATA CONVERSION

R is an array-oriented language. Therefore, simple values such as a boolean, a string, a number, are in fact encapsulated, in R, in an array of booleans, an array of strings, an array of numbers. For a simple value, the array has only one element.

Moreover, as R is scientific software, it is important that data types be correctly matched between Objective Caml and R. At the moment, they are not: consider, for instance, the 31/32-bit and 63/64-bit integer width mismatches.
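To make the width mismatch concrete: R integer vectors hold 32-bit signed values and reserve the smallest one for NA, whereas a native OCaml int is 31 or 63 bits wide depending on the platform. The sketch below is not part of the binding; fits_in_r_int and to_r_int are hypothetical helper names showing one way a caller might check a value before handing it to R.

```ocaml
(* Hypothetical helpers, not part of OCamlR. *)

(* [fits_in_r_int n] is true when [n] survives a round trip through a
   32-bit signed integer and is not the value R reserves for NA_integer_
   (the minimum 32-bit integer). *)
let fits_in_r_int (n : int) : bool =
  let n32 = Int32.of_int n in
  Int32.to_int n32 = n && not (Int32.equal n32 Int32.min_int)

(* [to_r_int n] returns the 32-bit representation of [n], or [None] when
   [n] cannot be represented as a non-NA R integer. *)
let to_r_int (n : int) : int32 option =
  if fits_in_r_int n then Some (Int32.of_int n) else None
```

On a 64-bit platform, a value such as 1 lsl 40 is rejected rather than silently truncated.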

Internal representation of R values.

type sexp

Low-level SEXP typing.

type 'a sxp = private sexp
type nilsxp = [ `Nil ] sxp
type symsxp = [ `Sym ] sxp
type langsxp = [ `Lang ] sxp
type listsxp = [ `List ] sxp
type dotsxp = [ `Dot ] sxp
type closxp = [ `Clo ] sxp
type envsxp = [ `Env ] sxp
type promsxp = [ `Prom ] sxp
type specialsxp = [ `Special ] sxp
type builtinsxp = [ `Builtin ] sxp
type vecsxp = [ `Vec ] sxp
type charsxp = [ `Char ] sxp
type lglsxp = [ `Lgl ] sxp
type intsxp = [ `Int ] sxp
type realsxp = [ `Real ] sxp
type strsxp = [ `Str ] sxp
type rawsxp = [ `Raw ] sxp
type exprsxp = [ `Expr ] sxp
type 'a nonempty_list = [< `List | `Lang | `Dots ] as 'a

R-ints: Language objects (LANGSXP) are calls (including formulae and so on). Internally they are pairlists with first element a reference to the function to be called with remaining elements the actual arguments for the call. Although this is not enforced, many places in the code assume that the pairlist is of length one or more, often without checking.
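The 'a sxp family above is an instance of phantom typing: every value is a sexp at run time, and the polymorphic-variant parameter only exists at compile time to record which kind of R value it is. The following self-contained model illustrates the technique; the Model module and all of its functions are hypothetical and do not mirror OCamlR's internals.

```ocaml
(* A model of the phantom-type technique, not OCamlR's implementation. *)
module Model : sig
  type sexp
  type 'a sxp = private sexp
  type intsxp = [ `Int ] sxp
  type strsxp = [ `Str ] sxp

  val ints : int list -> intsxp
  val strs : string list -> strsxp
  val sum : intsxp -> int                    (* integer vectors only *)
  val length : [< `Int | `Str ] sxp -> int   (* either kind of vector *)
  val kind : sexp -> string                  (* any untyped value *)
end = struct
  type sexp = Ints of int list | Strs of string list
  type 'a sxp = sexp
  type intsxp = [ `Int ] sxp
  type strsxp = [ `Str ] sxp

  let ints l = Ints l
  let strs l = Strs l
  let sum = function
    | Ints l -> List.fold_left ( + ) 0 l
    | Strs _ -> invalid_arg "sum: not an integer vector"
  let length = function Ints l -> List.length l | Strs l -> List.length l
  let kind = function Ints _ -> "integer" | Strs _ -> "character"
end

let v = Model.ints [ 1; 2; 3 ]
let total = Model.sum v
let n = Model.length (Model.strs [ "a"; "b" ])

(* A typed value can be coerced back to the untyped [sexp]. *)
let k = Model.kind (v : Model.intsxp :> Model.sexp)
```

In this model, an expression such as Model.sum (Model.strs ["a"]) is rejected by the type checker, because the two phantom parameters do not unify.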

type internallist = [
  | `Nil
  | `List
  | `Lang
  | `Dots
]

Type of low-level internal list. In R, such internal lists may be empty, a pairlist, or a call, which is somewhat similar to a closure ready for execution.

type 'a pairlist = [< `Nil | `List ] as 'a
type vector = [
  | `Char
  | `Lgl
  | `Int
  | `Real
  | `Str
  | `Raw
  | `Expr
  | `Vec
]
module type SXP = sig ... end
module Sexp : SXP with type t = sexp
module Nilsxp : sig ... end
module Dotsxp : sig ... end
module Envsxp : sig ... end
module Langsxp : sig ... end
module Symsxp : sig ... end

Vector types

R is an array-oriented language. Therefore, simple values such as a boolean, a string, a number, are in fact encapsulated, in R, in an array of booleans, an array of strings, an array of numbers. For a simple value, the array has only one element.

module type Vector = sig ... end
module type Atomic_vector = sig ... end
module Intsxp : Atomic_vector with type t = intsxp and type repr = int

R array of integer values

module Lglsxp : Atomic_vector with type t = lglsxp and type repr = bool

R array of boolean values

module Realsxp : Atomic_vector with type t = realsxp and type repr = float

R array of float values

module Strsxp : Atomic_vector with type t = strsxp and type repr = string

R array of string values

module Vecsxp : Vector with type t = vecsxp and type repr = Sexp.t

R list

Value inspection

module Sexptype : sig ... end

Algebraic datatype reflecting R's dynamic typing.

Parsing R code.

type parse_status =
  | Parse_Null
  | Parse_OK
  | Parse_Incomplete
  | Parse_Error
  | Parse_EOF

Outcome of a parsing request.
exception Parsing_failure of parse_status * string

Exception raised when parsing fails.

val parse_string : ?max:int -> string -> langsxp list

Parse a string of R code into R calls.

  • parameter max

    If omitted, parse the whole R code, even if there are multiple statements. Otherwise, maximum number of statements to parse.

val parse : string -> langsxp

Parse the first R statement in the given R code.
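Since both parsing functions signal failure through Parsing_failure, callers will usually want to catch it and report the status. The sketch below re-declares the type and the exception locally so that it compiles without the library; with OCamlR installed, one would match on the library's own exception instead. The helpers string_of_status and try_parse, as well as the stub parser, are hypothetical, and the meaning of the string payload is not specified above, so it is only echoed back.

```ocaml
(* Local re-declarations mirroring the signatures documented above. *)
type parse_status =
  | Parse_Null
  | Parse_OK
  | Parse_Incomplete
  | Parse_Error
  | Parse_EOF

exception Parsing_failure of parse_status * string

let string_of_status = function
  | Parse_Null -> "null"
  | Parse_OK -> "ok"
  | Parse_Incomplete -> "incomplete"
  | Parse_Error -> "error"
  | Parse_EOF -> "eof"

(* Hypothetical helper: run [parse] on [code] and turn a parsing failure
   into a [result] instead of letting the exception escape. *)
let try_parse (parse : string -> 'a) (code : string) : ('a, string) result =
  match parse code with
  | v -> Ok v
  | exception Parsing_failure (status, payload) ->
    Error
      (Printf.sprintf "R parse failure (%s): %s"
         (string_of_status status) payload)

(* Stub standing in for a real parser: it rejects input that ends with
   an opening parenthesis and otherwise returns the input unchanged. *)
let stub_parse (code : string) : string =
  let n = String.length code in
  if n > 0 && code.[n - 1] = '(' then
    raise (Parsing_failure (Parse_Incomplete, code))
  else code
```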

Evaluation of R code and calls.

exception Runtime_error of langsxp * string
module type Conversion = sig ... end
module Enc : Conversion with type 'a t = 'a -> Sexp.t
module Dec : Conversion with type 'a t = Sexp.t -> 'a
val symbol : ?generic:bool -> string -> Sexp.t

Retrieves an R symbol from the symbol table, given its name.

val eval_string : string -> Sexp.t

eval_string takes a string containing R code and feeds it to the R interpreter. You get the resulting value back. The typing of this function is deliberately unsafe in order to allow the user to type it precisely.

Bug: currently, if you try to execute a statement that refers to symbols that haven't been loaded, you get a segfault. For instance, evaluating a string containing the rbinom symbol without the R.stats package being loaded raises a segfault.
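The "deliberately unsafe" typing means that evaluation returns a bare Sexp.t and the caller is expected to assign a precise OCaml type, presumably through the Enc and Dec modules, whose type equations ('a -> Sexp.t and Sexp.t -> 'a) are given above. The sketch below models that pattern with stand-in types only: sexp, the members of Enc and Dec, untyped_sqrt, and typed are all hypothetical and merely mimic the shape of the real API.

```ocaml
(* Stand-in for Sexp.t; the real type is abstract. *)
type sexp = Int of int | Real of float | Str of string

module Enc = struct
  type 'a t = 'a -> sexp
  let int : int t = fun n -> Int n
  let float : float t = fun x -> Real x
end

module Dec = struct
  type 'a t = sexp -> 'a
  let float : float t = function
    | Real x -> x
    | Int n -> float_of_int n
    | Str _ -> invalid_arg "Dec.float: not a number"
end

(* Stand-in for an untyped R function of one argument. *)
let untyped_sqrt : sexp -> sexp = function
  | Int n -> Real (sqrt (float_of_int n))
  | Real x -> Real (sqrt x)
  | Str _ -> invalid_arg "sqrt: non-numeric argument"

(* Give an untyped function a precise OCaml type by encoding the
   argument on the way in and decoding the result on the way out. *)
let typed (enc : 'a Enc.t) (dec : 'b Dec.t) (f : sexp -> sexp) : 'a -> 'b =
  fun x -> dec (f (enc x))

let sqrt_int : int -> float = typed Enc.int Dec.float untyped_sqrt
```

The design choice is that type safety is established once, at the wrapper, so code that calls sqrt_int never sees an untyped value.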

type arg

Convenience functions to wrap up arguments, when mapping R functions to Objective Caml functions.

val arg : 'a Enc.t -> ?name:string -> 'a -> arg
val opt_arg : 'a Enc.t -> string -> 'a option -> arg
val call : Sexp.t -> arg list -> Sexp.t

call f args evaluates the R function f on the list of arguments args. An absent optional argument (None) is ignored; any other argument carries a value and an optional name. The typing of this function is deliberately unsafe in order to allow the user to type it precisely.
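To illustrate how arg and opt_arg combine, here is a small self-contained model in which R values are represented by their printed form. The representation of arg as an option, the encoders, and show_call are all assumptions made for the sake of the example; the real arg type is abstract and call evaluates the function rather than printing it.

```ocaml
(* Stand-ins: R values are modelled as strings. *)
type sexp = string
type arg = (string option * sexp) option

(* A mandatory argument, optionally named. *)
let arg (enc : 'a -> sexp) ?name (x : 'a) : arg = Some (name, enc x)

(* A named argument that disappears when its value is [None]. *)
let opt_arg (enc : 'a -> sexp) (name : string) (x : 'a option) : arg =
  Option.map (fun v -> (Some name, enc v)) x

(* Render the call that would be made, dropping absent arguments. *)
let show_call (f : string) (args : arg list) : string =
  let show = function
    | None, v -> v
    | Some n, v -> n ^ " = " ^ v
  in
  let shown = List.map show (List.filter_map Fun.id args) in
  Printf.sprintf "%s(%s)" f (String.concat ", " shown)

let example =
  show_call "rnorm"
    [ arg string_of_int 5;
      opt_arg (Printf.sprintf "%g") "mean" None;
      opt_arg (Printf.sprintf "%g") "sd" (Some 2.0) ]
```

Here example is the string "rnorm(5, sd = 2)": the absent mean argument has been dropped, which is what lets an OCaml wrapper expose R's default arguments as OCaml optional arguments.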

Initialisation

We provide two mechanisms to activate an R interpreter from OCaml-R:

The first mechanism consists of low-level bindings to the initialisation and termination functions of the libR.so shared library. While this is a rather flexible approach, it has the downside of not being very static, particularly if your intention is to write Objective Caml bindings for a set of interdependent R packages.

The second mechanism is a static, functorial approach: You just have to create a module with the Interpreter functor to initialise the R interpreter. You provide initialisation details through a module of module type Environment, and Interpreter will set it up correctly.

This functorial facility is available from the OCaml_R module: This OCaml_R module has the sole purpose of initialising the R interpreter with the Standard Environment module. No need to worry about initialisation details.

To create bindings for a set of interdependent R packages, you simply have to make them depend on the findlib R.interpreter package, which involves the OCaml_R module. This is also convenient on the toplevel, where you simply have to invoke the #require "R.interpreter" directive to set up the interpreter.
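The functorial approach relies on a general OCaml idiom: applying a functor runs its body once, so a module obtained this way can perform initialisation as a side effect of being linked. The sketch below shows the idiom with mock components. The fields of Environment, the values in Standard, and the started log are all hypothetical, since the real signatures are elided on this page, and no R interpreter is actually started.

```ocaml
(* Hypothetical stand-in for the real Environment signature. *)
module type Environment = sig
  val name : string
  val options : string list
end

(* Records what a real initialisation would have been asked to do. *)
let started : string list ref = ref []

module Interpreter (Env : Environment) = struct
  (* Runs exactly once, when the functor is applied. *)
  let () =
    started := String.concat " " (Env.name :: Env.options) :: !started
end

module Standard : Environment = struct
  let name = "R"
  let options = [ "--no-save"; "--quiet" ]
end

(* Applying the functor is what triggers the (mock) initialisation.
   Modules that depend on this one are evaluated after it. *)
module Started = Interpreter (Standard)
```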

exception Initialisation_failed

Denotes failure to initialise the R interpreter.

val init : ?name:string -> ?argv:string list -> ?env:(string * string) list -> ?packages:string list option -> ?sigs:bool -> unit -> unit

init initialises the embedded R interpreter.

  • parameter name

    Name of program. Defaults to Sys.argv.(0).

  • parameter argv

    Command line options given to libR.so. Defaults to rest of Sys.argv.

  • parameter env

    Environment variables to be set for R. Defaults to reasonable values.

  • parameter packages

    Packages to be loaded at startup. If None, load the usual standard library.

  • parameter sigs

    If false, stops R from setting its signal handlers. Defaults to false.

val terminate : unit -> unit

Terminates the R session.
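With the low-level mechanism, the caller is responsible for pairing initialisation with termination, including when the code in between raises. One way to do this is a bracket function built on Fun.protect. The sketch below is generic and uses stubs; with_session is a hypothetical helper, not part of the binding.

```ocaml
(* Hypothetical helper: run [f] between [init] and [terminate], making
   sure [terminate] runs even if [f] raises. If [init] itself raises,
   [terminate] is not called. *)
let with_session ~(init : unit -> unit) ~(terminate : unit -> unit)
    (f : unit -> 'a) : 'a =
  init ();
  Fun.protect ~finally:terminate f

(* Demonstration with stubs that only record the order of events. *)
let events : string list ref = ref []
let record (e : string) () = events := e :: !events

let answer =
  with_session ~init:(record "init") ~terminate:(record "terminate")
    (fun () -> record "work" (); 6 * 7)
```

With the binding itself, one would presumably pass (fun () -> OCamlR.init ()) as init and OCamlR.terminate as terminate, based on the signatures documented above.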

module type Environment = sig ... end

Environment is the type of a module containing all the information and data necessary to set up the R interpreter properly.

The Standard module contains initialisation details for libR.so. This information is determined when the binding is compiled.

Functor used to statically initialise an R interpreter, given the initialisation details supplied by the Env module.

module Low_level : sig ... end
val attributes : sexp -> (Symsxp.description * sexp) list
val pairlist_of_list : (sexp * sexp) list -> [> internallist ] sxp