package biocaml

module Accu : sig ... end

A data structure (based on Hashtbl) for accumulating values.

module Bam : sig ... end

Read and write BAM format.

module Bamstats : sig ... end
module Bar : sig ... end

Affymetrix's BAR files. Their Tiling Analysis Software (TAS) produces BAR files in binary format but this module supports only the text format generated by selecting the "Export probe analysis as TXT" option.

module Bed : sig ... end

BED data files.

module Bgzf : sig ... end

I/O on Blocked GNU Zip format (BGZF) files

module Bin_pred : sig ... end

Performance measurement of binary classifiers.

module Biocaml_result : sig ... end

Extension of Core's Result. Internal use only.

module Bpmap : sig ... end

Affymetrix's BPMAP files. Only text format supported. Binary BPMAP files must first be converted to text using Affymetrix's probe exporter tool.

module Cel : sig ... end

Affymetrix's CEL files. Only text format supported. Binary files must first be converted to text using Affymetrix's conversion tool. This tool does not change the file extension, so be sure your file really is in text format.

module Chr : sig ... end

Chromosome names. A chromosome name, as defined by this module, consists of two parts. An optional prefix "chr" (case-insensitive), followed by a suffix identifying the chromosome. The possible suffixes (case-insensitive) are:

module Entrez : sig ... end

Entrez Utilities API

module Fasta : sig ... end

FASTA files. The FASTA family of file formats has several mutually incompatible descriptions. Roughly, FASTA files are in the format:

module Fastq : sig ... end

FASTQ files. The FASTQ file format is a repeated sequence of 4 lines:

module File_mapper : sig ... end
module Future : sig ... end
module Future_unix : sig ... end
module GenomeMap : sig ... end

Data structures to represent sets of (possibly annotated) genomic regions

module Gff : sig ... end

GFF files.

module Histogram : sig ... end

Histograms with polymorphic bin types.

module Interval_tree : sig ... end

Interval tree (data structure)

module Iset : sig ... end

DIET: Discrete Interval Encoding Trees

module Jaspar : sig ... end

Access to Jaspar database

module Line : sig ... end
module Lines : sig ... end

Manipulate the lines of a file.

module Math : sig ... end

Numeric mathematics.

module Msg : sig ... end

Consistent printing of errors, warnings, and bugs. An error is a user mistake that prevents continuing program execution, a warning is a milder problem that the program continues to execute through, and a bug is a mistake in the software.

module MzData : sig ... end
module Phred_score : sig ... end

PHRED quality scores.

module Pos : sig ... end

File positions. A position within a file is defined by:

module Psl : sig ... end
module Pwm : sig ... end

Position-weight matrix

module RSet : sig ... end

Efficient integer sets for cases where many elements are expected to form large contiguous sequences of integers.
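
To illustrate why contiguous runs make such a set compact, here is a minimal sketch that stores the set as a sorted list of disjoint inclusive ranges. It is not Biocaml's RSet implementation or API; the names `add_range`, `mem` and `cardinal` are hypothetical, chosen only for this example.

```ocaml
(* Hypothetical sketch, not Biocaml's RSet: an integer set stored as a
   sorted list of disjoint, non-adjacent inclusive ranges (lo, hi). *)
type t = (int * int) list

let empty : t = []

(* Insert the inclusive range [lo, hi], merging with any range that
   overlaps it or is directly adjacent to it. *)
let rec add_range (lo, hi) (s : t) : t =
  match s with
  | [] -> [ (lo, hi) ]
  | (l, h) :: rest ->
    if hi < l - 1 then (lo, hi) :: s
    else if lo > h + 1 then (l, h) :: add_range (lo, hi) rest
    else add_range (min lo l, max hi h) rest

(* Membership test: linear in the number of ranges, not of integers. *)
let mem x (s : t) = List.exists (fun (l, h) -> l <= x && x <= h) s

(* Number of integers in the set. *)
let cardinal (s : t) = List.fold_left (fun n (l, h) -> n + (h - l + 1)) 0 s
```

With this representation a set of a million consecutive integers costs a single pair, which is the motivation the description above points at.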

module Range : sig ... end

Ranges of contiguous integers (integer intervals). A range is a contiguous sequence of integers from a lower bound to an upper bound. For example, [2, 10] is the set of integers from 2 through 10, inclusive of 2 and 10.
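
As a rough illustration of the closed-interval semantics described above, the following sketch models a range as a pair of inclusive bounds. The record type and the functions `make`, `size`, `mem` and `intersect` are hypothetical names for this example and are not taken from Biocaml's Range interface.

```ocaml
(* Hypothetical sketch of a closed integer range; not Biocaml's Range API. *)
type range = { lo : int; hi : int }

(* Build a range, rejecting a lower bound above the upper bound. *)
let make lo hi = if lo <= hi then Some { lo; hi } else None

(* Both bounds are included, so [2, 10] holds 9 integers. *)
let size r = r.hi - r.lo + 1

let mem x r = r.lo <= x && x <= r.hi

(* Intersection of two ranges, or None when they are disjoint. *)
let intersect a b = make (max a.lo b.lo) (min a.hi b.hi)
```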

module Roman_num : sig ... end

Roman numerals. Values greater than or equal to 1 are valid Roman numerals.
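
A minimal sketch of the usual integer-to-numeral conversion, assuming standard subtractive notation (IV, IX, XL, and so on). The function name `to_roman` is hypothetical and does not come from Biocaml's Roman_num interface; values below 1 are rejected, matching the validity rule stated above.

```ocaml
(* Hypothetical sketch; not Biocaml's Roman_num API. *)
let symbols =
  [ 1000, "M"; 900, "CM"; 500, "D"; 400, "CD"; 100, "C"; 90, "XC";
    50, "L"; 40, "XL"; 10, "X"; 9, "IX"; 5, "V"; 4, "IV"; 1, "I" ]

(* Greedy conversion: repeatedly emit the largest symbol that still fits.
   Returns None for values below 1, which are not valid Roman numerals. *)
let to_roman n =
  if n < 1 then None
  else begin
    let buf = Buffer.create 16 in
    let rest = ref n in
    List.iter
      (fun (v, s) ->
        while !rest >= v do
          Buffer.add_string buf s;
          rest := !rest - v
        done)
      symbols;
    Some (Buffer.contents buf)
  end
```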

module Sam : sig ... end

SAM files. Documentation here assumes familiarity with the SAM specification.

module Sbml : sig ... end

SBML file parser. Currently only level 2 version 4 is supported.

module Seq : sig ... end

Nucleic acid sequences. A nucleic acid code is any of A, C, G, T, U, R, Y, K, M, S, W, B, D, H, V, N, or X. See IUB/IUPAC standards for further information. Gaps are not supported. Internal representation uses uppercase, but constructors are case-insensitive. By convention the first nucleic acid in a sequence is numbered 1.
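
The conventions above (a fixed code alphabet, case-insensitive construction, uppercase storage, 1-based positions, no gaps) can be sketched as follows. This is an illustration only, with hypothetical function names; it is not Biocaml's Seq interface.

```ocaml
(* Hypothetical sketch; not Biocaml's Seq API. *)

(* The nucleic acid codes listed in the description, in uppercase. *)
let is_nucleic_acid c =
  String.contains "ACGTURYKMSWBDHVNX" (Char.uppercase_ascii c)

(* Case-insensitive constructor: returns the uppercase sequence, or None
   if any character (including a gap such as '-') is not a valid code. *)
let of_string s =
  let ok = ref true in
  String.iter (fun c -> if not (is_nucleic_acid c) then ok := false) s;
  if !ok then Some (String.uppercase_ascii s) else None

(* 1-based access: position 1 is the first nucleic acid. *)
let nth s i =
  if i >= 1 && i <= String.length s then Some s.[i - 1] else None
```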

module Seq_range : sig ... end

Range on a sequence, where the sequence is represented by an identifier.

module Sgr : sig ... end

Sequence Graph (SGR) files.

module Solexa_score : sig ... end

Solexa quality scores.

module Strand : sig ... end

Strand names. There are various conventions for referring to the two strands of DNA. This module provides an of_string function that parses the various conventions into a canonical representation, which we define to be '-' or '+'.
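
The idea of normalizing several spellings to a canonical '+' or '-' might look like the sketch below. The accepted spellings here ("fwd", "rev", and so on) are assumptions made for illustration; the actual list recognized by the module's of_string is not given on this page, and `strand_of_string` is a hypothetical name.

```ocaml
(* Hypothetical sketch; the accepted spellings are assumptions and are
   not taken from Biocaml's Strand module. *)
let strand_of_string s =
  match String.lowercase_ascii (String.trim s) with
  | "+" | "f" | "fwd" | "forward" -> Some '+'
  | "-" | "r" | "rev" | "reverse" -> Some '-'
  | _ -> None
```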

module Table : sig ... end

Generic “tables” (like CSV, TSV, Bed …).

module Tfxm : sig ... end

Buffered transforms. A buffered transform represents a method for converting a stream of inputs to a stream of outputs. However, inputs can also be buffered, i.e. you can feed inputs to the transform and pull out outputs later. There is no requirement that 1 input produces exactly 1 output. It is common that multiple input values are needed to construct a single output, and vice versa.
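
To make the feed-then-pull model concrete, here is a small sketch of a transform that needs two inputs to build one output. The record type and the names `feed`, `next` and `pairs` are hypothetical choices for this example and should not be read as Biocaml's Tfxm interface.

```ocaml
(* Hypothetical sketch of a buffered transform; not Biocaml's Tfxm API. *)
type ('input, 'output) transform = {
  feed : 'input -> unit;          (* buffer one input *)
  next : unit -> 'output option;  (* pull one output, if one is ready *)
}

(* Groups inputs in pairs: two inputs are consumed per output, so pulling
   after a single input yields None until a second input arrives. *)
let pairs () : ('a, 'a * 'a) transform =
  let pending = Queue.create () in
  let feed x = Queue.add x pending in
  let next () =
    if Queue.length pending >= 2 then begin
      let a = Queue.pop pending in
      let b = Queue.pop pending in
      Some (a, b)
    end
    else None
  in
  { feed; next }
```

Feeding 1, 2 and 3 and then pulling twice gives one pair followed by None, with 3 left in the buffer waiting for a partner.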

module Track : sig ... end

Track files in UCSC Genome Browser format. The following documentation assumes knowledge of concepts explained on the UCSC Genome Browser's website. Basically, a track file is one of several types of data (WIG, GFF, etc.), possibly preceded by comments, browser lines, and a track line. This module allows only a single data track within a file, although UCSC specifies that multiple tracks may be provided together.

module Transcripts : sig ... end

Transcripts are integer intervals containing a list of exons. Exons are themselves defined as a list of integer intervals.
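
One way to picture this structure is the sketch below. The type and field names are hypothetical, and treating intervals as inclusive on both ends is an assumption of this example rather than something stated by the module.

```ocaml
(* Hypothetical types; not Biocaml's Transcripts API.
   Intervals are assumed inclusive on both ends. *)
type interval = { lo : int; hi : int }
type transcript = { span : interval; exons : interval list }

(* In this sketch a transcript is well formed when every exon is a
   non-empty interval lying inside the transcript's span. *)
let well_formed t =
  List.for_all
    (fun e -> e.lo <= e.hi && t.span.lo <= e.lo && e.hi <= t.span.hi)
    t.exons

(* Total number of positions covered by exons (assumes no overlaps). *)
let exonic_length t =
  List.fold_left (fun n e -> n + (e.hi - e.lo + 1)) 0 t.exons
```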

module Vcf : sig ... end

Parsing of VCF files.

module Wig : sig ... end

WIG data.

module Zip : sig ... end

Streaming interface to the Zlib library.