package xmldiff

  1. Overview
  2. Docs
On This Page
  1. Utilities
Legend:
Page
Library
Module
Module type
Parameter
Class
Class type
Source

Module XmldiffSource

Computing diffs on XML trees.

Algorithm adapted from Gregory Cobena's Phd. thesis: "Gestion des changements pour les données semi-structurées du Web (Change management of semi-structured data on the Web)"

Sourcetype name = Xmlm.name

= string * string . Representation of tags and attribute names.

Sourcemodule Nmap : Map.S with type key = name

Mapping over names.

Sourcetype 'a xmlt = [
  1. | `E of name * string Nmap.t * 'a list
    (*

    A node (tag, attributes, children)

    *)
  2. | `D of string
    (*

    CDATA leaf

    *)
]

XML tree. The type is parametrized because it is shared with an internal richer representation.

Sourcetype xmltree = xmltree xmlt

Simple XML tree.

Sourcetype label =
  1. | Node of string
  2. | Text of string
Sourcetype patch_path =
  1. | Path_cdata of int
    (*

    Path_cdata n refers to the nth CData leaf.

    *)
  2. | Path_node of Xmlm.name * int * patch_path option
    (*

    Path_node (tag, n, more) refers to the nth element with tag tag. If more <> None then the path goes on into the children of the referenced node.

    *)

A path to a node in an XML tree where to perform an operation.

Sourcetype position = [
  1. | `FirstChild
  2. | `After
]

When inserting or moving a node, this indicates whether to insert the tree as first node of the node corresponding to the given path, or as a right sibling.

Sourcetype patch_operation =
  1. | PInsert of xmltree * position
    (*

    Insert the given XML tree to the given path and position.

    *)
  2. | PDelete
    (*

    Delete the referenced node.

    *)
  3. | PUpdateCData of string
    (*

    Change the referenced node to a CData with the given contents.

    *)
  4. | PUpdateNode of Xmlm.name * string Nmap.t
    (*

    Update the referenced node to be a tag with the given attributes.

    *)
  5. | PReplace of xmltree
    (*

    Replace the referenced node by the given tree.

    *)
  6. | PMove of patch_path * position
    (*

    Move the node to the given path and position. In this case, the position must be considered when the node has been removed from the tree (this is important when moving a node under the same parent).

    *)

The patch operations. Each operation is to be performed at a given node (position) in the tree, referenced by a patch_path.

Sourcetype patch = (patch_path * patch_operation) list
Sourceval diff : ?cut:(name -> string Nmap.t -> xmltree list -> bool) -> xmltree -> xmltree -> patch

diff t1 t2 returns the patch p to change t1 into t2.

  • parameter cut

    is called on XML tree tag nodes; returning true means that that subnodes won't be compared separately, but the whole node will be compared as is. This is useful in case of big XML trees, otherwise a lot of comparisons are done between the two trees. For example, one can cut on "p", "pre", "ul" and "code" tags in a regular web page. Default behaviour is to cut nothing.

Sourceval diff_with_final_tree : ?cut:(name -> string Nmap.t -> xmltree list -> bool) -> xmltree -> xmltree -> patch * xmltree

Same as diff but returns also the patched tree, to be able to compare it with the given target tree (for testing purpose).

Utilities

Sourceval xml_of_string : string -> xmltree
Sourceval xml_of_file : string -> xmltree
Sourceval atts_of_map : string Nmap.t -> (name * string) list
Sourceval atts_of_list : (name * string) list -> string Nmap.t
Sourceval string_of_xml : ?cut:bool -> xmltree -> string
Sourceval string_of_name : (string * string) -> string
Sourceval string_of_atts : string Nmap.t -> string
Sourceval string_of_path : patch_path -> string
Sourceval string_of_patch_operation : (patch_path * patch_operation) -> string
Sourceval string_of_patch : patch -> string
OCaml

Innovation. Community. Security.