package b0
Module B0_url
Sloppy URL processing.
URL standards are in a sorry state. This module takes a sloppy approach to URL processing. It only breaks URLs into their components and classifies them.
Warning. None of the functions here perform percent encoding or decoding. Use Percent when deemed appropriate.
URLs
The type for schemes, without the ':' separator.
The type for HOST:PORT authorities.
The type for paths.
The type for queries, without the '?' separator.
The type for fragments, without the '#' separator.
The type for URLs.
Kinds
The type for kinds of relative references. Represents this alternation.
The type for kinds of URLs. Represents this alternation.
kind u determines the kind of u. It decides that u is absolute if u starts with a scheme followed by ':'.
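The classification can be pictured with a small standalone sketch. It does not use the b0 library: kind_sketch, has_scheme and starts_with are hypothetical names, the set of scheme characters is an assumption, and the reading of `Scheme as "starts with //" is an assumption as well. The variant names are the ones this page uses.

```ocaml
(* Sketch only: a simplified classifier, not the b0 implementation. *)
type relative_kind = [ `Scheme | `Absolute_path | `Relative_path | `Empty ]
type kind = [ `Absolute | `Relative of relative_kind ]

(* Assumption: characters allowed in a scheme. *)
let is_scheme_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '+' | '-' | '.' -> true
  | _ -> false

(* [true] if [u] starts with a non-empty run of scheme characters
   immediately followed by ':'. *)
let has_scheme u =
  match String.index_opt u ':' with
  | None | Some 0 -> false
  | Some i ->
      let rec loop k = k >= i || (is_scheme_char u.[k] && loop (k + 1)) in
      loop 0

let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

let kind_sketch u : kind =
  if has_scheme u then `Absolute
  else if u = "" then `Relative `Empty
  else if starts_with ~prefix:"//" u then `Relative `Scheme
  else if u.[0] = '/' then `Relative `Absolute_path
  else `Relative `Relative_path
```

Under these assumptions "https://example.org/a" classifies as `Absolute, while "a/b:c" stays a relative path because '/' is not a scheme character.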
Operations
val of_url :
t ->
?scheme:scheme option ->
?authority:authority option ->
?path:path option ->
?query:query option ->
?fragment:fragment option ->
unit ->
t
of_url u () is a new URL with unspecified components defaulting to those of u. If a component is specified with None, it is deleted.
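The three-way behaviour (omitted, None, Some v) follows from optional arguments whose type is itself an option. A minimal sketch of that pattern, assuming a hypothetical two-field record parts and a hypothetical function of_parts in place of the real URL type:

```ocaml
(* Hypothetical stand-in for a decomposed URL; not the b0 representation. *)
type parts = { scheme : string option; path : string option }

(* Inside the function an optional argument [?scheme] of declared type
   [string option] is seen as a [string option option]. *)
let of_parts u ?scheme ?path () =
  let pick override current = match override with
    | None -> current   (* argument omitted: keep the component of [u] *)
    | Some v -> v       (* [Some None] deletes, [Some (Some x)] replaces *)
  in
  { scheme = pick scheme u.scheme; path = pick path u.path }
```

With this sketch, of_parts u () returns the components of u unchanged, of_parts u ~scheme:None () drops the scheme, and of_parts u ~path:(Some "/b") () replaces the path.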
append root u is u if kind u is `Absolute. Otherwise it uses root to make u absolute according to its relative_kind. The result is guaranteed to be absolute if root is absolute; it may be surprising or nonsensical if root isn't (FIXME can't we characterize that more?).
to_absolute ~scheme ~root_path transforms u depending on the value of kind u:
- If `Absolute then this is u itself.
- If `Relative `Scheme then u is given the scheme scheme.
- If `Relative `Absolute_path then u is given the scheme scheme.
- If `Relative `Relative_path then u is given the scheme scheme and the path of u is prepended by root_path (if any).
- If `Relative `Empty then u is given the scheme scheme and the path is root_path (if any).
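The case analysis above can be sketched on a decomposed URL. Everything here is illustrative: the record parts, the helper join and the function to_absolute_sketch are hypothetical, the kind is passed in explicitly rather than computed, root_path is assumed to be a string option, and "prepended" is assumed to mean joining with a single '/'.

```ocaml
type relative_kind = [ `Scheme | `Absolute_path | `Relative_path | `Empty ]
type kind = [ `Absolute | `Relative of relative_kind ]

(* Hypothetical decomposed URL; not the b0 representation. *)
type parts = { scheme : string option; path : string }

(* Assumption: a single '/' separates [root] and [p]. *)
let join root p =
  if root <> "" && root.[String.length root - 1] = '/' then root ^ p
  else root ^ "/" ^ p

let to_absolute_sketch ~scheme ~root_path (k : kind) u =
  match k with
  | `Absolute -> u
  | `Relative (`Scheme | `Absolute_path) -> { u with scheme = Some scheme }
  | `Relative `Relative_path ->
      (* Prepend [root_path] to the path of [u], if there is one. *)
      let path = match root_path with
        | None -> u.path
        | Some r -> join r u.path
      in
      { scheme = Some scheme; path }
  | `Relative `Empty ->
      (* The path becomes [root_path], if there is one. *)
      let path = match root_path with None -> u.path | Some r -> r in
      { scheme = Some scheme; path }
```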
Authorities
Scraping
list_of_text_scrape ?root s roughly finds absolute and relative URLs in the ASCII compatible (including UTF-8) textual data s by looking, in order:
- For the next href or src substring, then trying to parse the content of an HTML attribute. This may result in relative or absolute paths.
- For the next http substrings in s, then delimiting an URL depending on the previous characters and checking that the delimited URL starts with http:// or https://.
Relative URLs are appended to root if provided. Otherwise they are kept as is. The result may have duplicates.
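A rough standalone sketch of the second step (the http scan) is given below. It is not the b0 implementation: scrape_http, closing_of and the delimiting rules are assumptions chosen for illustration. The character before the URL selects a closing delimiter, otherwise the URL ends at whitespace. The href and src step and the handling of root are left out.

```ocaml
(* [true] if [prefix] occurs in [s] at index [i]. *)
let starts_with_at ~prefix s i =
  let n = String.length prefix in
  i + n <= String.length s && String.sub s i n = prefix

(* Assumption: the closing delimiter is guessed from the previous character. *)
let closing_of = function
  | '"' -> Some '"' | '\'' -> Some '\'' | '(' -> Some ')' | '<' -> Some '>'
  | _ -> None

let is_space c = c = ' ' || c = '\t' || c = '\n' || c = '\r'

let scrape_http s =
  let len = String.length s in
  let rec loop acc i =
    if i >= len then List.rev acc else
    if not (starts_with_at ~prefix:"http://" s i
            || starts_with_at ~prefix:"https://" s i)
    then loop acc (i + 1) else
    let close = if i = 0 then None else closing_of s.[i - 1] in
    let stop c = is_space c || Some c = close in
    let rec find_end j =
      if j >= len || stop s.[j] then j else find_end (j + 1)
    in
    let j = find_end i in
    (* Duplicates are kept, as in the description above. *)
    loop (String.sub s i (j - i) :: acc) j
  in
  loop [] 0
```

A caller that wants unique results would have to deduplicate the list itself, for example with List.sort_uniq String.compare.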
Formatting
pp formats an URL. For now this is just Format.pp_print_string.
pp_kind formats an unspecified representation of kinds.