Mechaml.Agent
Scraping agent
Mechaml is a web agent that lets you fetch pages, follow links, submit forms, and manage cookies, headers and redirections.
It is built on top of Cohttp, Lwt and Lambdasoup.
The HttpResponse module defines a type and operations to extract content and metadata from a server response.
Create a new empty agent. ~max_redirect sets how many consecutive times the agent will automatically follow the Location header on HTTP 302 or 303 responses; the limit prevents redirect loops. Set it to 0 to disable automatic redirection.
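The effect of a redirection limit can be illustrated with a small self-contained sketch. This is not Mechaml's implementation: the response type, the follow function and the toy server below are hypothetical names introduced only to show how a bound on consecutive redirections cuts a loop.

```ocaml
(* Hypothetical, simplified model: a "server" maps a URL to a response. *)
type response =
  | Ok_page of string  (* final content *)
  | Redirect of string (* stands for a 302/303 with a Location target *)

(* Follow at most [max_redirect] consecutive redirections.
   Returns the last response received and the number of hops followed. *)
let follow ~max_redirect server url =
  let rec go followed url =
    match server url with
    | Redirect target when followed < max_redirect -> go (followed + 1) target
    | response -> (response, followed)
  in
  go 0 url

(* A toy server with a redirect loop between /a and /b. *)
let looping_server = function
  | "/a" -> Redirect "/b"
  | "/b" -> Redirect "/a"
  | _ -> Ok_page "home"

let () =
  (* With a limit of 3, the loop is cut after three hops. *)
  let _, hops = follow ~max_redirect:3 looping_server "/a" in
  assert (hops = 3);
  (* With a limit of 0, the first redirection is returned as-is. *)
  let response, hops = follow ~max_redirect:0 looping_server "/a" in
  assert (hops = 0 && response = Redirect "/b")
```

In this model, a limit of 0 hands the redirect response back to the caller, which matches the "disable automatic redirection" behaviour described above.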
Perform a GET request to the specified URI. get "http://www.site/some/url" agent sends an HTTP GET request and returns the updated state of the agent together with the server's response.
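A minimal usage sketch follows, assuming the mechaml and lwt.unix packages are installed. The names Agent.init and Agent.HttpResponse.content are assumptions not spelled out on this page (the constructor described as "Create a new empty agent" and an accessor for the response body); check them against the module signature before relying on them. fetch_body is a hypothetical helper.

```ocaml
open Mechaml

(* Fetch [url] and return the response body as a string.
   Assumed names: Agent.init, Agent.HttpResponse.content. *)
let fetch_body (url : string) : string Lwt.t =
  let agent = Agent.init ~max_redirect:5 () in
  Lwt.map
    (fun (_updated_agent, response) -> Agent.HttpResponse.content response)
    (Agent.get url agent)

let () =
  (* Only hit the network when a URL is given on the command line. *)
  if Array.length Sys.argv > 1 then
    print_endline (Lwt_main.run (fetch_body Sys.argv.(1)))
```

The updated agent is discarded here for brevity; a real session would keep it and pass it to the next request so that cookies and headers carry over.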
Same as get, but works directly with links instead of URIs
Send a raw post request to the specified URI
Submit a filled form
Save some downloaded content in a file
save_image "/path/to/myfile.jpg" image agent loads the image using get, opens myfile.jpg, asynchronously writes the content to it, and returns the result
save_content "/path/to/myfile.html" content writes the specified content to a file using Lwt's asynchronous IO
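For readers unfamiliar with Lwt's asynchronous IO, here is one plausible way to write a string to a file with Lwt_io. It is a sketch under the assumption that the lwt.unix package is available, not a copy of how save_content is implemented; write_file is a hypothetical name.

```ocaml
(* Write [content] to the file at [path] without blocking other Lwt threads.
   The file is created if missing and truncated otherwise. *)
let write_file (path : string) (content : string) : unit Lwt.t =
  Lwt_io.with_file ~mode:Lwt_io.output path (fun channel ->
      Lwt_io.write channel content)

let () =
  let path = Filename.temp_file "mechaml_sketch" ".html" in
  let html = "<html><body>hello</body></html>" in
  Lwt_main.run (write_file path html);
  (* Read the file back with the standard library to check the round trip. *)
  let ic = open_in_bin path in
  let written = really_input_string ic (in_channel_length ic) in
  close_in ic;
  assert (written = html);
  Sys.remove path
```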
(see Cookiejar)
Return the current Cookiejar
Set the current Cookiejar
Add a single cookie to the current Cookiejar
Remove a single cookie from the Cookiejar
Return the default headers sent when performing HTTP requests
Use the specified headers as new default headers
Add a single key/value pair to the default headers
Remove a single key/value pair from the default headers
Maximum number of consecutive redirections, to avoid infinite loops (use 0 to disable automatic redirection)
The default maximum number of consecutive redirections
This module defines a monad that implicitly manages the agent's state inside the Lwt monad. It is essentially the state monad (for Agent.t) stacked with the Lwt monad.
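The stacking can be sketched without any dependency. In the example below, the Io module is an identity stand-in for Lwt, and the agent record, get and run are hypothetical simplifications; only the shape of the computation type (a function from the current state to the new state paired with a result, wrapped in the underlying monad) reflects the description above.

```ocaml
(* Stand-in for Lwt: an identity "promise", so the sketch needs no library. *)
module Io = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
end

(* Hypothetical, minimal agent state. *)
type agent = { requests : int; last_url : string option }

(* A computation takes the current agent and yields the new agent and a result. *)
type 'a m = agent -> (agent * 'a) Io.t

let return (x : 'a) : 'a m = fun agent -> Io.return (agent, x)

(* Run [x], then feed its result and the updated agent to [f]. *)
let bind (x : 'a m) (f : 'a -> 'b m) : 'b m =
 fun agent -> Io.bind (x agent) (fun (agent', value) -> f value agent')

let ( >>= ) = bind

(* A fake request that only records the URL in the state. *)
let get (url : string) : string m =
 fun agent ->
  Io.return
    ({ requests = agent.requests + 1; last_url = Some url }, "body of " ^ url)

let run (agent : agent) (action : 'a m) : (agent * 'a) Io.t = action agent

let () =
  (* The agent is never mentioned in the chain: bind threads it through. *)
  let action =
    get "/login" >>= fun _ ->
    get "/profile" >>= fun body ->
    return (String.length body)
  in
  let final_agent, len = run { requests = 0; last_url = None } action in
  assert (final_agent.requests = 2);
  assert (final_agent.last_url = Some "/profile");
  assert (len = String.length "body of /profile")
```

The design benefit is that successive requests can be chained without passing the updated agent by hand at every step.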