Fork child processes to perform work on multiple cores.
ForkWork is intended for workloads that a master process can partition into independent jobs, each of which will typically take a while to execute (several seconds, or more). Also, the resulting values should not be too massive, since they must be marshalled for transmission back to the master process.
Get the number of processors believed to be available. The library attempts to detect this at program startup (currently only works on Linux), and if that fails it defaults to 4.
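The detection method is not specified here. As a rough sketch of how such a startup probe could work on Linux, the hypothetical helper `detect_ncores` below counts the "processor" entries in /proc/cpuinfo and falls back to 4 when the file cannot be read or contains no such entries. This is an assumption for illustration and not necessarily what ForkWork does internally.

```ocaml
(* Hypothetical helper, not part of ForkWork: count "processor" entries in
   /proc/cpuinfo, returning [fallback] if the file is missing or has none. *)
let detect_ncores ?(fallback = 4) () =
  try
    let ic = open_in "/proc/cpuinfo" in
    let count = ref 0 in
    (try
       while true do
         let line = input_line ic in
         (* "processor" is 9 characters long *)
         if String.length line >= 9 && String.sub line 0 9 = "processor"
         then incr count
       done
     with End_of_file -> ());
    close_in ic;
    if !count > 0 then !count else fallback
  with Sys_error _ -> fallback

let () = Printf.printf "detected %d processor(s)\n" (detect_ncores ())
```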
Override the number of processors believed to be available.
These map functions suffice for many use cases.
Map a list or array, forking one child process per item to map. In general, the result type 'b should not include anything that's difficult to marshal, including functions, exceptions, weak arrays, or custom values from C bindings.
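To see why functions are a problem, the following sketch uses only the standard Marshal module. The helper `can_marshal` is a hypothetical name introduced for this example. It tries to serialize a value with the default flags and reports whether that succeeded.

```ocaml
(* Hypothetical helper: true if [v] can be marshalled with the default flags. *)
let can_marshal v =
  try
    ignore (Marshal.to_string v []);
    true
  with Invalid_argument _ -> false

let () =
  (* Plain data round-trips through a string, as it would through a pipe. *)
  let s = Marshal.to_string [1; 2; 3] [] in
  let back : int list = Marshal.from_string s 0 in
  assert (back = [1; 2; 3]);
  assert (can_marshal ("ok", 3.14));
  (* A closure is rejected unless the Marshal.Closures flag is passed. *)
  assert (not (can_marshal (fun x -> x + 1)))
```

A check like this can be run on a sample result before committing to a long parallel job.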
If a child process ends with an exception, the master process waits for any other running child processes to exit, and then raises an exception to the caller. However, the exception raised to the caller may not be the same one raised in the child process (see below). If multiple child processes end with exceptions, it is undefined which one the caller learns about. Once any exception is detected, no new child processes will be forked.
Due to limitations of OCaml's marshalling capabilities, communication of exceptions from a child process to the master process is tightly restricted:
If a child process raises ForkWork.ChildExn lst, the same exception is re-raised in the master process. You can put any information you want into the string list, including marshalled values.
If a child process ends with any other exception exn, the master process sees either ForkWork.ChildExn ["_"; Printexc.to_string exn] or ForkWork.ChildExn ["_"; Printexc.to_string exn; Printexc.get_backtrace ()], depending on the status of Printexc.backtrace_status ().
Consequently, if you use ChildExn with information to be interpreted by the master process, you probably should not put the string "_" as the first element of the list.
Another, more type-safe option is to encode errors in the result type instead of raising an exception. The disadvantage of this is that ForkWork would still proceed with running all the remaining map operations.
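The convention described above can be sketched without the library. In the example below, `ChildExn` is a local stand-in for ForkWork.ChildExn, and `encode_exn` and `describe` are hypothetical helpers: the first mimics what the master process would see for a given child exception, the second shows how a master might tell wrapped application errors apart from unexpected ones by checking for the leading "_".

```ocaml
(* Local stand-in for ForkWork.ChildExn so the sketch compiles on its own. *)
exception ChildExn of string list

(* Hypothetical: the string list the master would see for a child exception. *)
let encode_exn (exn : exn) : string list =
  match exn with
  | ChildExn lst -> lst
  | other ->
    let msg = Printexc.to_string other in
    if Printexc.backtrace_status ()
    then ["_"; msg; Printexc.get_backtrace ()]
    else ["_"; msg]

(* Hypothetical master-side interpretation: a leading "_" marks an exception
   that the child did not wrap in ChildExn itself. *)
let describe = function
  | "_" :: msg :: _ -> "unexpected exception in child: " ^ msg
  | fields -> "application error: " ^ String.concat ", " fields

let () =
  assert (encode_exn (ChildExn ["bad input"; "row 7"]) = ["bad input"; "row 7"]);
  print_endline (describe (encode_exn Not_found));
  print_endline (describe ["bad input"; "row 7"])
```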
The lower-level interface provides much more control over child process scheduling and result retrieval. For example, the master process does not have to be blocked while child processes are running, and the result of any individual child process can be retrieved as soon as it finishes.
Types
A child process can either complete successfully with a result or end with an exception, as described above.
Forking child processes
val manager : ?maxprocs:int -> unit -> 'a mgr
Create a job manager.
ForkWork.fork mgr f x forks a child process to compute (f x). If the manager already has maxprocs outstanding jobs, then by default fork blocks until one of them exits.
An exception is raised by fork if and only if ~nonblocking:true is given and the manager already has maxprocs outstanding child processes.
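For readers unfamiliar with the underlying technique, here is a minimal sketch of forking a child to compute (f x) and marshalling the result back over a pipe. It assumes a Unix-like system and linking against the unix library. The function `fork_compute` is a hypothetical name, it ignores maxprocs entirely, and it is not ForkWork's actual implementation.

```ocaml
(* Hypothetical sketch (requires the unix library): run [f x] in a child
   process and return a thunk that blocks until the result is available. *)
let fork_compute (f : 'a -> 'b) (x : 'a) : unit -> 'b =
  (* Flush first so buffered output is not duplicated in the child. *)
  flush stdout;
  flush stderr;
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Child: compute, marshal the result into the pipe, then exit. *)
    Unix.close rd;
    let code =
      try
        let oc = Unix.out_channel_of_descr wr in
        Marshal.to_channel oc (f x) [];
        close_out oc;
        0
      with _ -> 1
    in
    exit code
  | pid ->
    (* Master: keep the read end; reading is deferred until the thunk runs. *)
    Unix.close wr;
    let ic = Unix.in_channel_of_descr rd in
    fun () ->
      let v : 'b = Marshal.from_channel ic in
      close_in ic;
      ignore (Unix.waitpid [] pid);
      v

let () =
  let jobs = List.map (fork_compute (fun n -> n * n)) [1; 2; 3; 4] in
  let results = List.map (fun await -> await ()) jobs in
  assert (results = [1; 4; 9; 16])
```

In this sketch a child that fails writes nothing, so the master's read raises End_of_file. A real job manager has to handle that case, along with limiting the number of concurrent children.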
Retrieving results
Non-blocking query for the result of a job. By default, if a result is returned, then it is also removed from the job manager's memory, such that future calls with the same job would raise Not_found.
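The removal behaviour can be pictured with a toy in-memory model. The sketch below is purely illustrative: the `store` type and `query` function are hypothetical and only mimic the described semantics (no result yet, result returned and removed, result kept with ~keep:true, Not_found afterwards) using a hash table keyed by an integer job id.

```ocaml
(* Toy model, not ForkWork's data structure: a job id maps to None while
   the job is running and to Some r once it has finished. *)
type 'a store = (int, 'a option) Hashtbl.t

let query ?(keep = false) (store : 'a store) (job : int) : 'a option =
  match Hashtbl.find store job with  (* raises Not_found for an unknown job *)
  | None -> None
  | Some r ->
    if not keep then Hashtbl.remove store job;
    Some r

let () =
  let store : string store = Hashtbl.create 8 in
  Hashtbl.replace store 1 (Some "done");
  Hashtbl.replace store 2 None;
  assert (query store 2 = None);                    (* still running *)
  assert (query ~keep:true store 1 = Some "done");  (* returned and kept *)
  assert (query store 1 = Some "done");             (* returned and removed *)
  assert (try ignore (query store 1); false with Not_found -> true)
```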
Non-blocking query for any available result. Repeated calls to any_result with ~keep:true may return the same result.
Get the result of the job, blocking the caller until it's available.
Get the result of any job, blocking the caller until one is available. Repeated calls to await_any_result with ~keep:true may return the same result.
An exception is raised by await_any_result if and only if no results are available and there are no outstanding jobs.
val await_all : 'a mgr -> unit
Block the caller until all outstanding jobs are done. The results of the jobs are still stored in the manager's memory, and can be retrieved as above.
val ignore_results : 'a mgr -> unit
Convenience function for child processes launched just for side-effects: for each result currently available in the job manager's memory, remove it therefrom; and if it's an exception result, raise ChildExn. The result values are lost! This function never blocks; results from any still-running child processes remain pending.
exception IPC_Failure of job * exn
Any of the result retrieval functions might raise IPC_Failure if an exception occurs while trying to receive a result from a child process. This is a severe internal error, and it's probably reasonable to clean up and abort the entire program if it occurs. Possible causes include:
Killing jobs
Kill a job. The job is removed from the manager's memory and the child process is sent SIGTERM if it's still running.
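At the operating-system level, killing a job comes down to signalling the child process. The following sketch assumes a Unix-like system and the unix library. The helper `terminate` is hypothetical: it sends SIGTERM and then always reaps the child, which may differ from what ForkWork does by default.

```ocaml
(* Hypothetical sketch (requires the unix library): send SIGTERM to a child
   process and reap it, returning its exit status. *)
let terminate (pid : int) : Unix.process_status =
  (try Unix.kill pid Sys.sigterm
   with Unix.Unix_error (Unix.ESRCH, _, _) -> ());  (* process already gone *)
  snd (Unix.waitpid [] pid)

let () =
  match Unix.fork () with
  | 0 ->
    (* Child: pretend to be a long-running job. *)
    Unix.sleep 60;
    exit 0
  | pid ->
    (* Master: the child is reported as terminated by SIGTERM. *)
    assert (terminate pid = Unix.WSIGNALED Sys.sigterm)
```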
val kill_all : ?wait:bool -> 'a mgr -> unit
Kill all outstanding jobs, and also remove all results from the job manager's memory. This effectively resets the job manager.
The master process SHOULD NOT:
use Sys.command, Unix.fork, Unix.wait, or Unix.waitpid from multiple threads at any time. Using them in a single-threaded program is possible with the following restriction: if you fork your own child processes and subsequently wait/waitpid for them, you should not interleave any ForkWork functions in between those two steps. (Sys.command always satisfies this restriction in a single-threaded program.)
Child processes SHOULD NOT:
use Unix.fork or Unix.exec* independently of each other (fork-exec and Sys.command are OK)
Lastly, there's a pedantic chance of ForkWork operations hanging or sending SIGTERM to the wrong process if/when the kernel recycles process IDs. Do not use ForkWork for avionics, nuclear equipment, etc.