package SZXX
Module SZXX.Xlsx
type 'a cell_parser = {
  string : location -> Base.string -> 'a;
  formula : location -> formula:Base.string -> Base.string -> 'a;
  error : location -> formula:Base.string -> Base.string -> 'a;
  boolean : location -> Base.string -> 'a;  (* "1" for true *)
  number : location -> Base.string -> 'a;  (* May contain a decimal part *)
  date : location -> Base.string -> 'a;  (* ISO-8601 format *)
  null : 'a;
}
A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
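As a rough illustration of "make your own", the sketch below builds a cell parser that targets a small variant. To keep it compilable on its own, it restates the record locally with the standard library's string instead of Base.string, and it uses a placeholder for location, whose definition is not shown on this page. The variant cell and the name my_cell_parser are hypothetical.

```ocaml
(* Placeholder: the real [location] type is defined by SZXX and is not shown here. *)
type location = unit

(* Local copy of the documented record, so the sketch compiles without SZXX. *)
type 'a cell_parser = {
  string : location -> string -> 'a;
  formula : location -> formula:string -> string -> 'a;
  error : location -> formula:string -> string -> 'a;
  boolean : location -> string -> 'a;
  number : location -> string -> 'a;
  date : location -> string -> 'a;
  null : 'a;
}

(* Hypothetical application type. *)
type cell =
  | Text of string
  | Number of float
  | Bool of bool
  | Date of string
  | Failed of string
  | Empty

let my_cell_parser : cell cell_parser = {
  string = (fun _loc s -> Text s);
  (* Keep the computed value and ignore the formula text. *)
  formula = (fun _loc ~formula:_ s -> Text s);
  error = (fun _loc ~formula s -> Failed (formula ^ " -> " ^ s));
  (* Booleans arrive as "1" for true. *)
  boolean = (fun _loc s -> Bool (String.equal s "1"));
  (* Numbers may contain a decimal part; fall back to text if unparsable. *)
  number =
    (fun _loc s ->
      match float_of_string_opt s with
      | Some f -> Number f
      | None -> Text s);
  (* Dates arrive in ISO-8601 format; kept as text here. *)
  date = (fun _loc s -> Date s);
  null = Empty;
}
```

With the real library, a record of the same shape would be passed as the cell_parser argument of the streaming functions.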
Convenience cell_parser to convert from XLSX types to String
val yojson_cell_parser :
[> `Bool of Base.bool
| `Float of Base.float
| `String of Base.string
| `Null ]
cell_parser
Convenience cell_parser to convert from XLSX types to JSON
XLSX dates are stored as floats. Convert from a float to a Ptime.date
XLSX datetimes are stored as floats. Convert from a float to a Ptime.t
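To make the float encoding concrete, here is a minimal sketch of the arithmetic, assuming the common 1900 date system in which serial 25569 corresponds to 1970-01-01 and the fractional part is the time of day. It uses only the standard library and returns plain tuples and seconds rather than Ptime values; the function names are hypothetical, and this is not necessarily how SZXX implements the conversion.

```ocaml
(* Convert a day count relative to 1970-01-01 into (year, month, day),
   using the well-known "civil from days" arithmetic on the proleptic
   Gregorian calendar. *)
let civil_of_unix_days days =
  let z = days + 719468 in
  let era = (if z >= 0 then z else z - 146096) / 146097 in
  let doe = z - (era * 146097) in
  let yoe = (doe - (doe / 1460) + (doe / 36524) - (doe / 146096)) / 365 in
  let doy = doe - ((365 * yoe) + (yoe / 4) - (yoe / 100)) in
  let mp = ((5 * doy) + 2) / 153 in
  let d = doy - (((153 * mp) + 2) / 5) + 1 in
  let m = if mp < 10 then mp + 3 else mp - 9 in
  let y = yoe + (era * 400) + if m <= 2 then 1 else 0 in
  (y, m, d)

(* Assumption: 1900 date system, serial 25569 = 1970-01-01. *)
let unix_epoch_serial = 25569

(* The integer part of the serial selects the calendar day. *)
let date_of_xlsx_serial (serial : float) : int * int * int =
  civil_of_unix_days (int_of_float (floor serial) - unix_epoch_serial)

(* The fractional part is the time of day, so the whole serial maps
   linearly to seconds since the Unix epoch. *)
let unix_seconds_of_xlsx_serial (serial : float) : float =
  (serial -. float_of_int unix_epoch_serial) *. 86400.
```

Serials at or below 60 fall in the range affected by the spreadsheet convention that treats 1900 as a leap year, which this sketch does not handle.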
Convert from a column reference such as "D7" or "AA2" to a 0-based column index
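The conversion described above amounts to reading the leading letters of the reference as a base-26 number. A minimal sketch, assuming uppercase letters only; the name column_index_of_ref is hypothetical and is not the SZXX function:

```ocaml
(* Read the leading A-Z letters of a cell reference as a bijective base-26
   number (A = 1 ... Z = 26, AA = 27), then shift to a 0-based index.
   Returns None when the reference does not start with a letter. *)
let column_index_of_ref (cell_ref : string) : int option =
  let n = String.length cell_ref in
  let rec go i acc =
    if i < n then
      match cell_ref.[i] with
      | 'A' .. 'Z' as c ->
        go (i + 1) ((acc * 26) + (Char.code c - Char.code 'A' + 1))
      | _ -> (i, acc)
    else (i, acc)
  in
  let letters, acc = go 0 0 in
  if letters = 0 then None else Some (acc - 1)
```

Under these assumptions "D7" maps to index 3 and "AA2" to index 26; the row digits are ignored.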
val stream_rows_double_pass :
?filter_sheets:(sheet_id:Base.int -> raw_size:Base.int64 -> Base.bool) ->
sw:Eio.Std.Switch.t ->
_ Eio.File.ro ->
'a cell_parser ->
'a row Base.Sequence.t
Stream parsed rows from an XLSX file. This function is GUARANTEED to run in constant memory, without buffering.
SZXX.Xlsx.stream_rows_double_pass ?filter_sheets ~sw file cell_parser
filter_sheets: Default: all sheets. Sheet IDs start at 1. Note: a sheet's ID does not necessarily match its position in Excel.
sw: A regular Eio.Switch.t
file: A file opened with Eio.Path.open_in or Eio.Path.with_open_in. If your XLSX document is not a file (e.g. an HTTP transfer), then use SZXX.Xlsx.stream_rows_single_pass
cell_parser: A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
SZXX will wait for you to consume rows from the Sequence before extracting more.
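The filter_sheets argument is an ordinary labelled predicate over the sheet ID and the sheet's raw size. The sketch below shows two such predicates; their names and the 50,000,000 threshold are hypothetical, and int64 from the standard library stands in for Base.int64.

```ocaml
(* Keep only the sheet whose ID is 1. Note that this ID does not
   necessarily correspond to the first tab shown in Excel. *)
let only_first_sheet ~(sheet_id : int) ~raw_size:(_ : int64) : bool =
  sheet_id = 1

(* Skip any sheet whose raw size exceeds an arbitrary, illustrative limit. *)
let skip_huge_sheets ~sheet_id:(_ : int) ~(raw_size : int64) : bool =
  Int64.compare raw_size 50_000_000L <= 0

(* With the real library, a predicate of this shape would be passed as:
   SZXX.Xlsx.stream_rows_double_pass ~filter_sheets:only_first_sheet
     ~sw file cell_parser *)
```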
val stream_rows_single_pass :
?max_buffering:Base.int ->
?filter:(Xml.DOM.element row -> Base.bool) ->
?filter_sheets:(sheet_id:Base.int -> raw_size:Base.int64 -> Base.bool) ->
sw:Eio.Std.Switch.t ->
feed:Feed.t ->
'a cell_parser ->
'a row Base.Sequence.t
Stream parsed rows from an XLSX document. This function will only buffer rows encountered before the SST (see README.md). Consider using SZXX.Xlsx.stream_rows_double_pass if your XLSX is stored as a file.
SZXX.Xlsx.stream_rows_single_pass ?max_buffering ?filter ?filter_sheets ~sw ~feed cell_parser
max_buffering: Default: unlimited. Sets a limit to the number of rows that may be buffered. Raises an exception if it runs out of buffer space before reaching the SST.
filter: Use this filter to drop uninteresting rows and reduce the number of rows that must be buffered. If necessary, use SZXX.Xlsx.Expert.parse_row_without_sst to access cell-level data. This function is called on every row of every sheet (except sheets excluded by ?filter_sheets).
filter_sheets: Default: all sheets. Sheet IDs start at 1. Note: a sheet's ID does not necessarily match its position in Excel.
sw: A regular Eio.Switch.t
feed: A producer of raw input data. Create a feed by using the SZXX.Feed module.
cell_parser: A cell parser converts from XLSX types to your own data type (usually a variant). Use SZXX.Xlsx.string_cell_parser or SZXX.Xlsx.yojson_cell_parser to get started quickly, then make your own.
As much as possible, SZXX will wait for you to consume rows from the Sequence before extracting more.
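The interaction between max_buffering and filter can be pictured with a toy model: rows that arrive before the SST are held back, rows rejected by the filter never take up buffer space, and exceeding the limit raises. This is only an illustration of the behaviour described above, not SZXX's implementation; the event type, the Buffer_full exception, and buffer_until_sst are all hypothetical, and this page does not say which exception SZXX actually raises.

```ocaml
(* Toy input: rows in document order, with a marker where the SST appears. *)
type event = Row of string | Sst

(* Hypothetical exception carrying the limit that was exceeded. *)
exception Buffer_full of int

(* Return the rows that had to be held back until the SST was reached.
   Rows rejected by [filter] are dropped and never count against the limit. *)
let buffer_until_sst ?max_buffering ?(filter = fun _ -> true) events =
  let rec go buffered count = function
    | [] | Sst :: _ -> List.rev buffered
    | Row r :: rest when not (filter r) -> go buffered count rest
    | Row r :: rest -> (
      match max_buffering with
      | Some limit when count >= limit -> raise (Buffer_full limit)
      | _ -> go (r :: buffered) (count + 1) rest)
  in
  go [] 0 events
```

In this model, a stricter filter lets a smaller max_buffering suffice, which matches the guidance above to drop uninteresting rows early.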