Mechaml.Agent
Scraping agent
Mechaml is a web agent that lets you fetch pages, submit forms, save downloaded content, and manage cookies, headers and redirections.
It is built on top of Cohttp, Lwt and Lambdasoup.
The HttpResponse module defines a type and operations to extract content and metadata from the server response.
Create a new empty agent. ~max_redirect sets how many consecutive times the agent automatically follows the Location header after an HTTP 302 or 303 response; the limit guards against redirect loops. Set it to 0 to disable automatic redirection.
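The redirect limit described above can be illustrated with a small self-contained model. This is only a sketch: the `response` type, `follow`, and `looping_server` are hypothetical names invented here and are not part of Mechaml, and the choice to hand back the last redirect once the limit is reached is an assumption.

```ocaml
(* Toy model of a server response: either a final body or a redirect
   to another location. Illustrative only, not Mechaml's types. *)
type response =
  | Final of string
  | Redirect of string

(* Follow redirects at most [max_redirect] consecutive times.
   [fetch] is a hypothetical stand-in for performing an HTTP request. *)
let rec follow ~max_redirect fetch url =
  match fetch url with
  | Redirect location when max_redirect > 0 ->
    follow ~max_redirect:(max_redirect - 1) fetch location
  | response -> response

(* A server that redirects forever between two URLs. *)
let looping_server = function
  | "/a" -> Redirect "/b"
  | _ -> Redirect "/a"

let () =
  (* With a limit of 0, the first redirect is returned without being followed. *)
  assert (follow ~max_redirect:0 looping_server "/a" = Redirect "/b");
  (* With a limit of 3, the loop is cut after three hops. *)
  assert (follow ~max_redirect:3 looping_server "/a" = Redirect "/a")
```

Bounding the number of hops guarantees termination even when a server redirects in a cycle.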
The following functions perform a GET request to the specified URI. get "http://www.site/some/url" agent sends an HTTP GET request and returns the updated state of the agent together with the server response.
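A possible way to use get is sketched below. Only the call shape `get url agent` and the `~max_redirect` parameter come from the text above; the constructor name `Agent.init` and the accessor `Agent.HttpResponse.content` are assumptions about the library's API and should be checked against the actual signatures. `fetch_page` is a hypothetical helper.

```ocaml
(* Assumes the mechaml and lwt packages are available. *)
open Mechaml

(* Hypothetical helper: fetch a URL and return the response body. *)
let fetch_page (url : string) : string Lwt.t =
  let open Lwt.Infix in
  (* Assumption: Agent.init is the constructor that creates an empty agent. *)
  let agent = Agent.init ~max_redirect:5 () in
  Agent.get url agent >|= fun (_updated_agent, response) ->
  (* Assumption: HttpResponse.content returns the body as a string. *)
  Agent.HttpResponse.content response
```

Running it would look like `Lwt_main.run (fetch_page "http://www.site/some/url")`. The updated agent is discarded here; a longer session would pass it on to the next request so that cookies and headers carry over.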
Same as get, but works directly with links instead of URIs.
The following functions send a raw POST request to the specified URI.
Submit a filled form
Save some downloaded content in a file
save_image "/path/to/myfile.jpg" image agent loads the image using get, opens myfile.jpg, writes the content to it asynchronously, and then returns the result.
save_content "/path/to/myfile.html" content writes the specified content to a file using Lwt asynchronous I/O.
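As a rough idea of what writing content with Lwt asynchronous I/O involves, here is a minimal sketch built on `Lwt_io`. It is not Mechaml's implementation; `save_content_sketch` is a hypothetical name, and the temporary file path is chosen only for the demonstration.

```ocaml
(* Assumes the lwt.unix package is available. *)

(* Hypothetical counterpart of save_content: open [path] for writing,
   write [content], and close the channel when the write completes. *)
let save_content_sketch (path : string) (content : string) : unit Lwt.t =
  Lwt_io.with_file ~mode:Lwt_io.output path (fun channel ->
    Lwt_io.write channel content)

let () =
  (* Write to a fresh temporary file rather than a fixed location. *)
  let path = Filename.temp_file "page" ".html" in
  Lwt_main.run (save_content_sketch path "<html></html>");
  print_endline ("wrote " ^ path)
```

`Lwt_io.with_file` closes the channel even if the write fails, which is why it is preferred here over opening and closing the file by hand.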
(see Cookiejar)
Return the current Cookiejar
Set the current Cookiejar
Add a single cookie to the current Cookiejar
Remove a single cookie from the Cookiejar
Return the default headers sent when performing HTTP requests
Use the specified headers as new default headers
Add a single key/value pair to the default headers
Remove a single key/value pair from the default headers
Set the maximum number of consecutive redirections (to avoid infinite loops). Use 0 to disable automatic redirection.
The default maximum number of consecutive redirections
This module defines a monad that implicitly manages the agent state while running inside the Lwt monad. It is essentially the state monad (for Agent.t) stacked on top of the Lwt monad.
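The stacking described above can be sketched without any library. In the example below the identity monad stands in for Lwt and an integer counter stands in for Agent.t. All module and function names (`MONAD`, `Identity`, `StateT`, `Counter`, `fake_get`) are hypothetical and chosen for illustration; they do not reflect the actual contents of the module.

```ocaml
(* Minimal monad signature for the base layer. *)
module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
end

(* Identity monad, standing in for Lwt so the sketch needs no library. *)
module Identity = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
end

(* State monad stacked on a base monad: a computation takes a state and
   yields, inside the base monad, the new state paired with a result. *)
module StateT (Base : MONAD) (S : sig type t end) = struct
  type 'a m = S.t -> (S.t * 'a) Base.t
  let return (x : 'a) : 'a m = fun s -> Base.return (s, x)
  let bind (m : 'a m) (f : 'a -> 'b m) : 'b m =
    fun s -> Base.bind (m s) (fun (s', x) -> f x s')
  let get : S.t m = fun s -> Base.return (s, s)
  let set (s : S.t) : unit m = fun _ -> Base.return (s, ())
  let run (s : S.t) (m : 'a m) : (S.t * 'a) Base.t = m s
end

(* Toy agent state: the number of requests performed so far. *)
module IntState = struct type t = int end
module Counter = StateT (Identity) (IntState)

(* A hypothetical "request" that bumps the counter and returns a body. *)
let fake_get url : string Counter.m =
  Counter.bind Counter.get (fun n ->
    Counter.bind (Counter.set (n + 1)) (fun () ->
      Counter.return ("body of " ^ url)))

(* Two requests in sequence; the state is threaded implicitly. *)
let two_requests =
  Counter.bind (fake_get "/a") (fun _ -> fake_get "/b")

let () =
  let (count, body) = Counter.run 0 two_requests in
  assert (count = 2 && body = "body of /b")
```

The caller never passes the state between the two requests by hand; `bind` does it, which is the convenience the monad offers over threading the agent explicitly through every call.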