Bio_io.Fasta_record
A record type for FASTA files.
If you have a FASTA file like this:
>s1 apple pie
ACTG
actg
Then you would get a record like this:
Fasta_record.id record (* "s1" *)
Fasta_record.desc record (* Some "apple pie" *)
Fasta_record.seq record (* "ACTGactg" *)
If the header has no description, like this:
>s1
ACTG
actg
Then you would get a record like this:
Fasta_record.id record (* "s1" *)
Fasta_record.desc record (* None *)
Fasta_record.seq record (* "ACTGactg" *)
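The two examples above differ only in whether the header line carries a description. A minimal sketch of that split, using only the OCaml standard library, might look like the following. The helpers split_header and join_seq are hypothetical and are not the parser bio_io uses; the sketch assumes the id ends at the first space and that sequence lines are joined with nothing between them.

```ocaml
(* Hypothetical helper, not part of bio_io: split a FASTA header line
   into an id and an optional description at the first space. *)
let split_header (line : string) : string * string option =
  (* Drop the leading '>' if it is present. *)
  let body =
    if String.length line > 0 && line.[0] = '>' then
      String.sub line 1 (String.length line - 1)
    else line
  in
  match String.index_opt body ' ' with
  | None -> (body, None)
  | Some i ->
      let id = String.sub body 0 i in
      let desc = String.sub body (i + 1) (String.length body - i - 1) in
      (id, Some desc)

(* Hypothetical helper: sequence lines are joined without separators. *)
let join_seq (lines : string list) : string = String.concat "" lines

let () =
  assert (split_header ">s1 apple pie" = ("s1", Some "apple pie"));
  assert (split_header ">s1" = ("s1", None));
  assert (join_seq [ "ACTG"; "actg" ] = "ACTGactg")
```

Note that the case of the sequence characters is preserved, matching the "ACTGactg" value shown above.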
To change part of a Fasta_record, use the with_* functions. E.g., Fasta_record.with_id "apple" record gives you a t with the id set to "apple".
create ~id ~desc ~seq creates a new t. It should not raise, because any values of the correct types are accepted.
of_header_exn header returns a t from a FASTA header. May raise exceptions. It is used internally for parsing FASTA files; code consuming the bio_io module probably won't need to call it.
of_header header is like of_header_exn header, except that it returns an Or_error.t rather than raising exceptions.
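The pairing of a raising function with a non-raising variant can be sketched with the standard library alone. In the sketch below, header_body_exn and header_body are hypothetical stand-ins (not bio_io functions), the stdlib result type plays the role of Or_error.t, and the rule that a header must start with '>' is an assumption made for illustration.

```ocaml
(* Hypothetical stand-in, not bio_io code: return the header text after
   the leading '>', raising if the marker is missing. *)
let header_body_exn (header : string) : string =
  if String.length header > 0 && header.[0] = '>' then
    String.sub header 1 (String.length header - 1)
  else invalid_arg "FASTA header must start with '>'"

(* Non-raising variant: catch the exception and return it as an [Error].
   The stdlib [result] type stands in for [Or_error.t] here. *)
let header_body (header : string) : (string, string) result =
  match header_body_exn header with
  | body -> Ok body
  | exception Invalid_argument msg -> Error msg

let () =
  assert (header_body ">s1" = Ok "s1");
  match header_body "s1" with
  | Ok _ -> assert false
  | Error msg -> print_endline ("could not parse header: " ^ msg)
```

The non-raising form makes the failure case visible in the type, so callers must handle it explicitly.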
to_string t returns a string representation of t ready to print to a FASTA output file.
to_string_nl t ~nl returns a string representation of t ready to print to a FASTA output file, followed by the trailing newline string nl. nl defaults to "\n".
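To make the difference between the two printing functions concrete, here is a small stand-in written against a local record type. It is only a sketch: the record type is not bio_io's t, and the output layout (">id desc" on one line, the sequence on the next) is an assumption based on the usual FASTA convention rather than something stated above.

```ocaml
(* Stand-in record type for illustration only; it is not bio_io's [t]. *)
type record = { id : string; desc : string option; seq : string }

(* Assumed layout: ">id desc" on the first line, the sequence on the second. *)
let to_string (r : record) : string =
  match r.desc with
  | None -> Printf.sprintf ">%s\n%s" r.id r.seq
  | Some d -> Printf.sprintf ">%s %s\n%s" r.id d r.seq

(* Same as [to_string], followed by a newline string that defaults to "\n". *)
let to_string_nl ?(nl = "\n") (r : record) : string = to_string r ^ nl

let () =
  let r = { id = "s1"; desc = Some "apple pie"; seq = "ACTGactg" } in
  assert (to_string r = ">s1 apple pie\nACTGactg");
  assert (to_string_nl r = ">s1 apple pie\nACTGactg\n");
  assert (to_string_nl ~nl:"\r\n" r = ">s1 apple pie\nACTGactg\r\n")
```

An optional nl argument is handy when the output file needs a different line ending, such as "\r\n".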
serialize t returns the Sexp of t as a string.
equal this other returns true if all fields of the two t values are the same.
id t returns the id of t.
desc t returns the desc (description) of t.
seq t returns the seq of t.
seq_length t returns the length of the seq of t.
If you construct a record by hand (e.g., with create), and the sequence contains spaces or other unexpected characters, they will be counted in the length. E.g.,
let r = Fasta_record.create ~id:"apple" ~desc:None ~seq:"a a" in
assert (Int.(3 = Fasta_record.seq_length r))
with_id new_id t returns a t with new_id instead of the original id.
with_seq new_seq t returns a t with new_seq instead of the original seq.
with_desc new_desc t returns a t with new_desc instead of the original desc.
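Because each with_* function takes the new value first and the record last, calls can be chained with the |> operator. The sketch below shows the idea on a local stand-in record type, not bio_io's t, and assumes the functions return a fresh value and leave their input unchanged, as is usual for immutable OCaml records.

```ocaml
(* Stand-in record type for illustration only; it is not bio_io's [t]. *)
type record = { id : string; desc : string option; seq : string }

(* Each function copies the record, replacing a single field. *)
let with_id new_id r = { r with id = new_id }
let with_seq new_seq r = { r with seq = new_seq }
let with_desc new_desc r = { r with desc = new_desc }

let () =
  let original = { id = "s1"; desc = Some "apple pie"; seq = "ACTGactg" } in
  let updated =
    original |> with_id "apple" |> with_desc None |> with_seq "AAAA"
  in
  (* All three fields were replaced in the new value. *)
  assert (updated = { id = "apple"; desc = None; seq = "AAAA" });
  (* The original record is untouched. *)
  assert (original.id = "s1")
```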