package biocaml




FASTQ files. The FASTQ file format is a repeated sequence of 4 lines:

    @name
    sequence
    +comment
    qualities
    ...

The name line begins with an @ character, which is omitted in the parsed item type provided by this module. Any spaces after the @ are retained, although the specification implies that there should not be any. Trailing whitespace is also retained, since well-formed files should not normally contain any.

The comment line, which begins with a +, is handled similarly. The purpose of the comment line is unclear and it is rarely used. Also, "comment" may not be the correct term for this line.
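The four-line layout and the handling of the leading @ and + can be illustrated with a minimal sketch that uses only the OCaml standard library. The names `record`, `strip_prefix` and `record_of_lines` are hypothetical and are not part of this module; the sketch only mirrors the behaviour described above (prefix characters dropped, all other whitespace kept).

```ocaml
(* Hypothetical record mirroring the four lines of one FASTQ entry. *)
type record = {
  name : string;
  sequence : string;
  comment : string;
  qualities : string;
}

(* Drop a required leading character, or report what was found instead. *)
let strip_prefix (c : char) (line : string) : (string, string) result =
  if String.length line > 0 && line.[0] = c then
    Ok (String.sub line 1 (String.length line - 1))
  else Error (Printf.sprintf "expected a line starting with %c: %S" c line)

(* Build a record from exactly four lines. Whitespace after the prefix
   character and at the end of a line is kept untouched. *)
let record_of_lines (lines : string list) : (record, string) result =
  match lines with
  | [ l1; l2; l3; l4 ] -> (
      match (strip_prefix '@' l1, strip_prefix '+' l3) with
      | Ok name, Ok comment ->
          Ok { name; sequence = l2; comment; qualities = l4 }
      | Error e, _ | _, Error e -> Error e)
  | _ -> Error "a FASTQ entry needs exactly four lines"
```

An empty comment (a bare `+` line) yields `comment = ""` in this sketch.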

The name line may be structured into two parts: a sequence identifier and an optional description. We provide a function split_name to parse such a value. However, an item's name field contains the unparsed string because it is unclear whether FASTQ files really follow this convention. The format of the description is also unspecified. When one is provided, it usually has additional structure of its own, so the minimal parsing done by split_name is of limited use anyway.

Illumina uses a systematic format for the name line that serves as a unique sequence identifier. Use Illumina.sequence_id_of_string to parse an item's name field when you have fastq files produced by Casava version >= 1.8. Earlier versions of Casava returned a different format, which is not currently supported in this module (it could be easily added).
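As an illustration of the kind of structure involved, Casava 1.8 and later write names of the form `instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`. The sketch below parses that layout with the standard library only. The type `illumina_id`, its field names, and the function `illumina_id_of_string` are hypothetical; they are not the types or functions of the Illumina submodule, whose exact interface is not shown on this page.

```ocaml
(* Hypothetical record for a Casava >= 1.8 name line (without the '@'). *)
type illumina_id = {
  instrument : string;
  run_number : int;
  flowcell_id : string;
  lane : int;
  tile : int;
  x_pos : int;
  y_pos : int;
  read : int;
  is_filtered : bool;
  control_number : int;
  index : string;
}

(* Returns None when the string does not match the expected layout. *)
let illumina_id_of_string (s : string) : illumina_id option =
  match String.split_on_char ' ' s with
  | [ left; right ] -> (
      match (String.split_on_char ':' left, String.split_on_char ':' right) with
      | ( [ instrument; run; flowcell_id; lane; tile; x; y ],
          [ read; filtered; control; index ] ) -> (
          let is_filtered =
            match filtered with
            | "Y" -> Some true
            | "N" -> Some false
            | _ -> None
          in
          match
            ( int_of_string_opt run,
              int_of_string_opt lane,
              int_of_string_opt tile,
              int_of_string_opt x,
              int_of_string_opt y,
              int_of_string_opt read,
              int_of_string_opt control,
              is_filtered )
          with
          | ( Some run_number,
              Some lane,
              Some tile,
              Some x_pos,
              Some y_pos,
              Some read,
              Some control_number,
              Some is_filtered ) ->
              Some
                { instrument; run_number; flowcell_id; lane; tile; x_pos;
                  y_pos; read; is_filtered; control_number; index }
          | _ -> None)
      | _ -> None)
  | _ -> None
```

Returning an option keeps the sketch short; a real parser would more usefully report which field failed.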

The qualities line is returned as a plain string, but it is required to be decodable as either Phred or Solexa scores. Modules Phred_score and Solexa_score can be used to parse as needed.
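To make the decoding step concrete, here is a minimal standalone sketch of turning a qualities string into Phred scores. It assumes the common ASCII offset of 33 (older Illumina files used 64) and does not use the Phred_score module, whose interface is not shown on this page; `phred_scores_of_string` and `error_probability` are hypothetical names.

```ocaml
(* Decode each character as (ASCII code - offset).
   Returns None if any character falls below the offset. *)
let phred_scores_of_string ?(offset = 33) (qualities : string) : int list option =
  let scores =
    List.init (String.length qualities) (fun i ->
        Char.code qualities.[i] - offset)
  in
  if List.for_all (fun q -> q >= 0) scores then Some scores else None

(* A Phred score Q corresponds to an error probability of 10^(-Q/10). *)
let error_probability (q : int) : float = 10. ** (-.float_of_int q /. 10.)
```

For example, `phred_scores_of_string "!I5"` gives `Some [0; 40; 20]` under offset 33.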

Older FASTQ files allowed the sequence and qualities strings to span multiple lines. This is discouraged and is not supported by this module.

type item = {
  name : string;
  sequence : string;
  comment : string;
  qualities : string;
}

val item_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> item

val sexp_of_item : item -> Ppx_sexp_conv_lib.Sexp.t

val split_name : string -> string * string option

Split a name string into a sequence identifier and an optional description. It is assumed that the given string is from an item's name field, i.e. that it doesn't contain a leading @ char.
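The behaviour described here can be approximated with the standard library alone. The sketch below assumes that the identifier and description are separated by the first space character; that separator is an assumption rather than something this page states, and `split_name_sketch` is a hypothetical name, not the library function.

```ocaml
(* Split at the first space: identifier before it, description after it.
   Assumption: a single space is the separator. *)
let split_name_sketch (name : string) : string * string option =
  match String.index_opt name ' ' with
  | None -> (name, None)
  | Some i ->
      let id = String.sub name 0 i in
      let description =
        String.sub name (i + 1) (String.length name - i - 1)
      in
      (id, Some description)
```

Note that the description is returned as one unparsed string, which matches the "minimal amount of parsing" mentioned earlier.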

module MakeIO (Future : Future.S) : sig ... end

include sig ... end

val read : 
  Future_unix.Reader.t ->
  item Core_kernel.Or_error.t Future_unix.Pipe.Reader.t

val write : 
  Future_unix.Writer.t ->
  item Future_unix.Pipe.Reader.t ->
  unit Future_unix.Deferred.t

val write_file : 
  ?perm:int ->
  ?append:bool ->
  string ->
  item Future_unix.Pipe.Reader.t ->
  unit Future_unix.Deferred.t

module Illumina : sig ... end

Low-level Printing

val item_to_string : item -> string

This function converts item values to strings that can be dumped to a file, i.e. they contain full lines, including all end-of-line characters.
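A rough picture of such a serialisation, written against a local copy of the record type so that it runs without the library, might look as follows. It assumes "\n" as the end-of-line character, and `item_to_string_sketch` is a hypothetical name; the real function may differ in detail.

```ocaml
(* Local copy of the record type, so the sketch is self-contained. *)
type item = {
  name : string;
  sequence : string;
  comment : string;
  qualities : string;
}

(* Re-attach the '@' and '+' prefixes and terminate every line with "\n". *)
let item_to_string_sketch (r : item) : string =
  Printf.sprintf "@%s\n%s\n+%s\n%s\n" r.name r.sequence r.comment r.qualities
```

Because every line, including the last, ends with a newline, the results for successive items can be concatenated directly.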

Low-level Parsing

val name_of_line : ?pos:Pos.t -> Line.t -> string Core_kernel.Or_error.t

val sequence_of_line : ?pos:Pos.t -> Line.t -> string

val comment_of_line : ?pos:Pos.t -> Line.t -> string Core_kernel.Or_error.t

val qualities_of_line : 
  ?pos:Pos.t ->
  ?sequence:string ->
  Line.t ->
  string Core_kernel.Or_error.t

qualities_of_line ?sequence line parses the given qualities line in the context of a previously parsed sequence. The sequence is needed to ensure that the correct number of quality scores is provided. If it is not provided, this check is omitted.
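The length check described above amounts to comparing two string lengths when a sequence is supplied. A minimal sketch, using plain strings and the standard result type in place of Line.t and Or_error.t (the name `check_qualities` is hypothetical):

```ocaml
(* Accept the qualities string unless a sequence is given and its length
   differs; with no sequence, the check is skipped entirely. *)
let check_qualities ?sequence (qualities : string) : (string, string) result =
  match sequence with
  | Some s when String.length s <> String.length qualities ->
      Error
        (Printf.sprintf "sequence has %d characters but qualities has %d"
           (String.length s) (String.length qualities))
  | Some _ | None -> Ok qualities
```

Called as `check_qualities ~sequence:"ACGT" "III"`, the sketch returns an error; without `~sequence` the same string is accepted.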