package synchronizer
Install
dune-project
Dependency
Authors
Maintainers
Sources
sha256=2ac5c39db33a7d5de715aba2cf3601f84fffa84aacab67b25ab24422b0079d83
sha512=799d449967b1a539f5554a6bfe70d1918e6949b2a24402ec50b6fb1a89f9948283b716d462c19814d02d637f8d4039840f74d6ae021970b827cfc8b644590e8a
doc/synchronizer/Synchronizer/index.html
Module SynchronizerSource
This module offers a primitive which allows several threads to synchronize "getting" and "writing" work units to a common state keeper.
The main idea of this module, is that it allows one to (more) easily go from a sequential algorithm to a parallel one.
"Get-work-to-do/write-work-done" sequential algorithms
The class of algorithms that this module helps make parallel are those that are built around a central "work to do" structure, and in a loop:
- get one work unit from the "work to do" pool
- Process it
- During processing, discover updates to write to the pool, and do so
- End the processing of this work unit
Most, if not all algorithm built around queues, stacks, and priority queues can be formulated this way: getting a work unit is simply popping from the queue, and writing updates is simply pushing new work units. However, this algorithm is more general: for example the work pool could be a tree, getting a new unit of work would entail some search of the tree, and writing updates would be adding nodes to the tree, or marking existing ones.
Turning it into a parallel algorithm
Adding "work in progress" semantics to the work pool
The first step to turn this sequential algorithm into a parallel one is to add a notion of "ongoing work" to our central data structure. When sequential, one cannot observe a work pool where some items are still being worked on, because each (sequential) iteration of the loop is the processing of exactly one work unit. Once parallel, the work pool should also (in most cases) encode a notion of the work tasks being processed.
Creating a synchronizer
Once it is done, the user should write two closures (notice that these two closures do not take the work pool as arguments, because they are closure, it is directly part of their environment).
getter: unit -> get optiongetter returns the next work unit, or None if their is (currently) no work unit to be scheduled. This is only currently because another thread may later enqueue some new work unit. The work pool does not have to deal with how many threads are still live this is done by the synchronizer itself (it is actually its main added value). The type get is the one of the work unit.
writer: write -> unitwriter takes an update to apply to the work pool, of type write.
Finally, one can create the synchronizer with
synchronizer = init getter writerThe notion of pledges
As mentioned before, when going to non-sequential, the main difference is that work can now be "on going": threads may hence still write updated while processing a work unit. Knowing whether a work pool is empty is hence not sufficient to know whether or not one should stop the loop: we must also know whether someone else may re-enqueue new work. The synchronizer supports this via pledges. A pledge is made to indicate that new updates may be done to the work pool. Each started pledge must be ended once the current work unit won't lead to more updates (corresponding to the end of the loop in the sequential algorithm).
Rewriting the main loop
The loop outline above now becomes:
- get one work unit from the "work to do" pool, atomically pledging that you may still write updates
- Process the work unit
- During processing, discover updates to write to the pool, and do so
- End the processing of this work unit by ending the pledge
This loop can also be done using the provided work_while function.
A synchronizer with work units of type 'get and updates of type 'write
Create a new synchronizer with the provided getter and writer (see module doc).
Get a work unit from the synchronizer. If pledge is true, atomically create a new pledge.
Blocks if no work unit is available but some may become in the future (i.e. there are still active pledges and the synchronizer has not been marked closed).
Returns None if no work unit is available and none will be in the future.
Make a new pledge to the synchronizer (see module doc).
Mark the synchronizer closed.
The synchronizer will return None on every subsequent get.