ACGtk depends on several OCaml libraries. Fortunately, the OCaml package manager opam makes the installation fast and easy.
The installation typically goes as follows:
1. Install opam using your preferred distribution/packaging mode.
2. Set up an opam environment. opam environments, called switches, can use the OCaml compiler provided by your distribution or a locally installed compiler.
Don't forget to add the required libraries to the path by running
$ eval `opam config env`
ACGtk also depends on several external libraries, i.e., libraries that are not OCaml libraries and need to be installed by your OS distribution. opam can take care of this as well, by running
$ opam depext acgtk
Then continue with the ACG toolkit installation by running
$ opam install acgtk
This will install the binaries acgc and acg in the .opam/OCAML_VERSION/bin directory, where OCAML_VERSION is the OCaml version that you are using with opam: either system if you are using the compiler installed by your OS distribution, or a version number (such as 4.05.0) otherwise.
It will also install:
- acg.el in the .opam/OCAML_VERSION/share/emacs/site-lisp directory
- the examples directory in the .opam/OCAML_VERSION/share/acgtk directory
To get the actual path of the share directory, just run
$ opam var share
For best results (correct rendering of symbols in the graphical output), please also install the free DejaVu fonts for your OS distribution.
There is an ACG emacs mode, acg.el, in the emacs directory.
Its main purpose is to be loaded when editing an ACG data file (with signatures and lexicons). It is automatically loaded for files with a .acg extension.
It basically provides compilation directives and next-error searching:
- M-x compile (or C-c C-c) to call the compiler (acgc)
- M-x next-error (or C-x `) to search for the next error (if any) and highlight it.
Copy the following lines into your .emacs:
(setq load-path (cons "EMACS_DIR_PATH" load-path))
(setq auto-mode-alist (cons '("\\.acg" . acg-mode) auto-mode-alist))
(autoload 'acg-mode "acg" "Major mode for editing ACG definitions" t)
where EMACS_DIR_PATH is .opam/OCAML_VERSION/share/acgtk/emacs.
Alternatively, for a site-wide installation, copy acg.el under an acg directory in your site-lisp directory (typically /usr/share/emacs/site-lisp/ or /usr/local/share/emacs/site-lisp/).
Then create a 50acg.el file in the /etc/emacs/site-start.d directory and copy the following lines into it:
(setq load-path (cons "EMACS_DIR_PATH" load-path))
(setq auto-mode-alist (cons '("\\.acg" . acg-mode) auto-mode-alist))
(autoload 'acg-mode "acg" "Major mode for editing ACG definitions" t)
where EMACS_DIR_PATH is the acg directory in your site-lisp directory (typically /usr/share/emacs/site-lisp/acg).
The required libraries are installed automatically by opam. For reference, they are:
ocaml (>=4.13.0)
dune (>=3.5)
menhir (>=20211230)
ocamlgraph
ANSITerminal (>=0.8)
fmt
logs
mtime (>=2.0.0)
cmdliner (>=1.1.0)
sedlex
cairo2
yojson (>=1.6.0)
readline
Optionally, in order to make some changes to the code, the following libraries are required as well:
The external (non-OCaml) libraries should be installed automatically by running
$ opam depext acgtk
They are:
To find out which packages your OS requires in order to install ACGtk, you can run:
$ opam list --rec --required-by acgtk --external
and look at the output corresponding to your system.
Example files for both acgc and acg are provided in the examples directory.
acgc: Compiling Abstract Categorial Grammars
acgc is a "compiler" of ACG source code, i.e., files containing definitions of signatures and lexicons. It basically checks whether they are correctly written (syntactically and w.r.t. types and constant typing) and outputs an .acgo object file.
Run
$ acgc --help
to get help.
For instance, let's assume a file my_acg.acg. A basic usage of the acgc command could be:
$ acgc -o my_acg.acgo my_acg.acg
This will produce a my_acg.acgo file (note that this is the default name if the -o option is not provided).
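The default naming rule can be pictured with a small Python sketch. It only illustrates the rule as this document states it (the output name is derived from the last source file given); the helper name default_object_name is hypothetical, it does not call acgc, and it models the file name only, not the directory in which acgc writes the file.

```python
from pathlib import PurePath


def default_object_name(files):
    """Return the object file name derived from the last .acg source file.

    Hypothetical helper: it mirrors the documented naming rule only.
    """
    # Keep only source files; object files (.acgo) do not name the output.
    sources = [f for f in files if PurePath(f).suffix == ".acg"]
    if not sources:
        raise ValueError("no .acg source file given")
    # Swap the extension of the last source file and drop any directory part.
    return PurePath(sources[-1]).with_suffix(".acgo").name


print(default_object_name(["my_acg.acg"]))                   # my_acg.acgo
print(default_object_name(["lib.acgo", "sig.acg", "lex.acg"]))  # lex.acgo
```

With several source files, only the last one contributes to the name, which is why an explicit -o is often clearer.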
acg: Using Abstract Categorial Grammars
acg is an interpreter for the scripting language used to work with user-defined ACGs. The parameters of the interpreter are listed by running acg --help.
To get the list of available commands of the scripting language, run
$ acg
It will launch the interpreter in interactive mode. Then, at the prompt, type
ACGtk> help
Type CTRL-D to exit from the program.
It is also possible to run scripts with acg. Run
$ acg script.acgs
to execute the script script.acgs in non-interactive mode.
Assuming the file my_acg.acgo was produced as explained above, running:
$ acg
will open a prompt in which you can type:
ACGtk> load "my_acg.acgo"
to load the data compiled from the my_acg.acg file into my_acg.acgo.
Alternatively, you could directly load the data file:
ACGtk> load "my_acg.acg"
Assuming you have defined the signature Sig and the lexicon Lex, you can then run the following commands:
ACGtk> "lambda x.some_cst x: NP ->S" | check signature=Sig
to check whether lambda x.some_cst x is a term of type NP ->S according to Sig.
Run:
ACGtk> "lambda x.some_cst x: NP ->S" | realize lexicon=Lex
to compute the image of lambda x.some_cst x by Lex (assuming this term and this type are correct according to the abstract signature of Lex).
Run:
ACGtk> "John+loves+Mary: string" | parse lexicon=Lex type=S
to check whether the term John+loves+Mary of type string has an antecedent of type S by Lex, assuming that string is the interpretation of the type S by Lex.
You can look at the examples/tag.acg file for an example.
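The three commands above share one shape: a quoted "term: type" pair piped into a command that takes key=value options. The Python sketch below only formats such a line so that the parts are easy to tell apart; piped_command is a hypothetical helper, it does not talk to acg, and it does not escape quotes inside the term.

```python
def piped_command(term, typ, command, **options):
    """Format a line of the form: "term: type" | command key=value ..."""
    # Options are rendered in the order they are passed.
    opts = " ".join(f"{key}={value}" for key, value in options.items())
    return f'"{term}: {typ}" | {command} {opts}'.strip()


print(piped_command("lambda x.some_cst x", "NP ->S", "check", signature="Sig"))
print(piped_command("John+loves+Mary", "string", "parse", lexicon="Lex", type="S"))
```

The two printed lines are the check and parse commands shown above, without the ACGtk> prompt.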
An ACG data file is a sequence of signature or lexicon declarations, separated with a ;.
signature my_sig_name=
sig_entries
end
sig_entries is a list of sig_entry, separated with a ;. A sig_entry can be:
a type declaration as in
NP, S : type;
a type definition as in
o : type;
string = o -> o;
Note that type constructors are -> (or →) and => (or ⇒) for the linear and intuitionistic arrows respectively.
constant declarations as in
foo: NP;
bar, dummy: NP -> S;
infix + : string -> string -> string;
prefix - : bool -> bool;
binder All : (e =>t) -> t;
binder ∃ : (e =>t) -> t;
infix > : bool -> bool -> bool; (*This means implication*)
infix ∧ : bool -> bool -> bool; (*This means conjunction*)
Note that infix and prefix are keywords to introduce symbols. Also note that comments are surrounded by (* and *).
constant definitions as in
n = lambda n. bar n : NP -> S;
infix + = lambda x y z.x(y z): string -> string -> string;
prefix - = lambda p.not p:bool -> bool;
everyone = lambda P. All x. (human x) > (P x) ;
someone = lambda P. ∃ x. (human x) ∧ (P x) ;
Note the syntax for binders (All and ∃ in the last two examples). Available constructions for terms are:
- lambda x y z.t (or λ⁰ x y z.t) for linear abstraction
- Lambda x y z.t (or λ x y z.t) for non-linear abstraction
- t u v for application (equivalent to (t u) v)
- t SYM u if SYM is an infix symbol (lowest priority). It is equal to ((SYM) t) u where SYM is used as a usual constant, with the priority of application
- SYM t if SYM is a prefix symbol (highest priority)
- BINDER x y z.t if BINDER is a binder
Prefix operators have precedence over application, and application has precedence over infix operators. Relative precedence among infix operators can be defined.
When no associativity specification is set, the default is left associative.
When no precedence definition is given, the default is higher precedence over any infix operator defined so far.
When declaring or defining an infix operator with the keyword infix, an optional specification of the associativity and the relative precedence can be given.
A specification is given between square brackets. The syntax is as follows:
infix [specification] SYM …
(the remaining part of the declaration is the same as without the specification)
A specification is a non-empty comma-separated list of:
- an associativity: Left, Right, or NonAssoc. If not present, infix operators are left associative by default.
- a relative precedence: < SYM (where SYM is a symbol). It assigns to the operator being declared or defined the greatest precedence *below* the precedence of SYM.
It is possible to use an infix symbol as a normal constant by surrounding it with left and right parentheses, so that t SYM u = (SYM) t u.
See examples/infix-examples.acg and examples/infix-examples-script for examples.
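One way to picture the precedence rules above is as an ordered list of infix symbols, from lowest to highest precedence. The Python sketch below is only a mental model under that assumption, not the toolkit's implementation; declare_infix and its below parameter are hypothetical names, with below playing the role of a < SYM specification.

```python
def declare_infix(order, sym, below=None):
    """Record sym in order, a list of infix symbols from lowest to highest precedence."""
    if sym in order:
        raise ValueError(f"{sym} is already declared")
    if below is None:
        # No precedence given: higher than every infix operator declared so far.
        order.append(sym)
    else:
        # "< SYM": the greatest precedence that is still below SYM.
        order.insert(order.index(below), sym)
    return order


order = []
declare_infix(order, "+")
declare_infix(order, "*")             # binds tighter than +
declare_infix(order, "-", below="*")  # just below *, hence above +
print(order)  # ['+', '-', '*']
```

Under this model, the third call corresponds to a declaration of the form infix [< *] - … in an ACG signature.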
There are two ways to define a lexicon:
By using the keyword lexicon or nl_lexicon, as in:
lexicon my_lex_name(abstract_sig_name) : object_sig_name =
lex_entries
end
or
nl_lexicon my_lex_name(abstract_sig_name) : object_sig_name =
lex_entries
end
With the lexicon keyword, lambda (resp. ->) is interpreted as lambda (resp. ->), whereas with nl_lexicon, lambda (resp. ->) is interpreted as Lambda (resp. =>). I.e., everything is interpreted non-linearly. This is useful when linear constraints in the object signature are of no interest (as, for instance, in context-free lambda grammars).
lex_entries is a list of lex_entry, separated with a ;. A lex_entry can be of the following forms:
abstract_atomic_type1, abstract_atomic_type2 := object_type;
abstract_const1, abstract_const2 := object_term;
By lexicon composition as in:
lexicon my_new_lex = lex_2 << lex_1
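To make the difference between lexicon and nl_lexicon concrete, the Python sketch below rewrites the linear notation of an entry into the non-linear notation it stands for under nl_lexicon. It is a purely textual illustration of the mapping described above (lambda to Lambda, -> to =>); as_nl_lexicon is a hypothetical helper, not something the toolkit provides, and it ignores the Unicode spellings and comments.

```python
import re


def as_nl_lexicon(entry):
    """Rewrite lambda / -> into Lambda / =>, as nl_lexicon interprets them."""
    # \b keeps identifiers that merely contain "lambda" untouched.
    entry = re.sub(r"\blambda\b", "Lambda", entry)
    return entry.replace("->", "=>")


print(as_nl_lexicon("lambda x. f x : NP -> S"))  # Lambda x. f x : NP => S
```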
The keywords are signature, lexicon, nl_lexicon, end, type, prefix, infix, binder, lambda, λ⁰ (means the same thing as lambda), Lambda, and λ (means the same thing as Lambda).
The reserved symbols are =, <<, ;, :, ,, (, ), ., -> (and →), => (and ⇒), and :=.
Inside a signature or a lexicon, signature, lexicon and nl_lexicon are not considered as keywords and can be used as identifiers.
Other keywords can be used as identifiers when escaped with \ (e.g., \end).
Identifiers may use Unicode characters. They follow this grammar:
IDENT := id_start id_continue* (subscripts|superscripts)* '*
where:
- id_start is the class of Unicode characters (https://www.unicode.org/reports/tr31/) with the ID_Start property
- id_continue is the class of Unicode characters (https://www.unicode.org/reports/tr31/) with the ID_Continue property
- superscripts and subscripts are the corresponding classes of characters:
  subscripts: 'ᵣ', 'ᵤ', 'ᵥ', 'ᵦ', 'ᵧ', 'ᵨ', 'ᵩ', 'ᵪ', '₀', '₁', '₂', '₃', '₄', '₅', '₆', '₇', '₈', '₉', '₊', '₋', '₌', '₍', '₎', '', 'ₐ', 'ₑ', 'ₒ', 'ₓ', 'ₔ', 'ₕ', 'ₖ', 'ₗ', 'ₘ', 'ₙ', 'ₚ', 'ₛ', 'ₜ', 'ⱼ'
  superscripts: '²', '³', '¹', '⁰', 'ⁱ', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹', '⁺', '⁻', '⁼', '⁽', '⁾', 'ⁿ'
Symbols (introduced with the keywords infix, prefix, or binder) may use Unicode characters as well. They follow this grammar:
SYMBOL := symbols+ (subscripts|superscripts)* (_ | id_continue*)? (subscripts|superscripts)*
where symbols is a set of mathematical symbols.
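Returning to the IDENT grammar above, the following Python sketch approximates it as a checker. Several things are assumptions made for brevity: only the digit subscripts and superscripts are included, and Python's own identifier test (which relies on the XID_Start and XID_Continue properties) stands in for ID_Start and ID_Continue. The function names are hypothetical; this is not the lexer used by acgc.

```python
SUBSCRIPTS = set("₀₁₂₃₄₅₆₇₈₉")    # subset of the documented class
SUPERSCRIPTS = set("⁰¹²³⁴⁵⁶⁷⁸⁹")  # subset of the documented class
SCRIPTS = SUBSCRIPTS | SUPERSCRIPTS


def is_id_start(ch):
    # Approximation: XID_Start; "_" is excluded because Python accepts it.
    return ch != "_" and ch.isidentifier()


def is_id_continue(ch):
    # Approximation: XID_Continue, tested by appending ch to a valid start.
    return ("a" + ch).isidentifier()


def is_ident(text):
    """Check text against: id_start id_continue* (subscripts|superscripts)* '*"""
    body = text.rstrip("'")  # trailing primes
    # Try every split between the alphanumeric head and the script tail.
    for cut in range(len(body), 0, -1):
        head, tail = body[:cut], body[cut:]
        if (all(c in SCRIPTS for c in tail)
                and is_id_start(head[0])
                and all(is_id_continue(c) for c in head[1:])):
            return True
    return False


print(is_ident("x₁'"))  # True
print(is_ident("1x"))   # False
```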
acgc
acgc [OPTION] … FILE …
acgc parses each file FILE, which is supposed to be a file containing ACG signatures or lexicons, either as source files (typically with the .acg extension) or object files (with the .acgo extension).
If all the files are successfully parsed, a binary containing all the ACG signatures and lexicons is created. By default, the name of the generated binary file is FILEn.acgo where FILEn.acg is the last parameter (see option -o).
-c VAL, --colors=VAL (absent=auto)
    Controls the colors in output. Use yes to enable colors, no to disable them, or auto to enable them if the output is a TTY.
-d, --debug
    Starts acgc in debug mode: it will record and print backtraces of uncaught exceptions.
-I DIR, --include=DIR
    Sets DIR as a directory in which file arguments can be looked for.
-m, --magic
    Toggles on generating magic programs (experimental feature). Parsing with magic will be available in acg. Be aware that using this option may cause generated object files to be very large.
-o FILE, --output=FILE
    Sets FILE as the output file, instead of the default which is derived from the last given source file. If no source file is provided (only object files are provided), then all the signatures and lexicons will be available in FILE.
-v [VAL], --verbosity[=VAL] (default=1, absent=0)
    Sets the verbosity level. When called without argument, level is 1 (smallest not null verbosity level), but positional argument(s) FILE may need to be separated by -- if no other optional argument is provided after -v.
acg
acg [OPTION] … [FILE] …
acg parses each file FILE (if any), which is supposed to be a file containing ACG commands, and interprets them. If no file is provided, or if option --interactive is set, it then interactively runs the ACG command interpreter.
The list of available commands can be obtained by running the help command in the interpreter.
-c VAL, --colors=VAL (absent=auto)
    Controls the colors in output. Use yes to enable colors, no to disable them, or auto to enable them if the output is a TTY.
-d, --debug
    Starts acg in debug mode: it will record and print backtraces of uncaught exceptions.
-i, --interactive
    Starts interactive mode even if script files are provided.
-I DIR, --include=DIR
    Sets DIR as a directory in which file arguments can be looked for.
-m, --magic
    Toggles on using magic set rewritten programs for parsing (experimental feature). When set, parsing commands use magic rewritten programs (if available in object files generated by acgc).
-r VAL, --seed=VAL
    Seed to use for initialization of the random number generator. If this parameter is not provided, the random number generator will be initialized with a random seed.
--realize=FILE
    Sets FILE as the JSON configuration file used to render the SVG files generated by the realize command (see Graphical Output).
-s, --step-by-step
    Executes scripts step by step. This means that the execution will be paused before each command, and after printing the result of commands which return terms. Also, this will print the executed script during the execution.
The scripting language is described here.
If a svg="realize.svg" option is provided when running a realize command during an acg session, a file realize.svg is generated in the current directory.
Such a file contains a tree representation of the operations described by the term to realize (applications, abstractions). Each node contains the abstract term and its realizations by each of the lexicons specified on the command line. The graphic file can, for instance, be viewed in a web browser.
Four rendering engines are available so far to render the terms in each node:
the "logic" engine: formulas are rendered as logical formulas. Non-logical constants are in bold font, and logical connectives are rendered using UTF-8 if their names are as follows:
Ex -> ∃
ExUni -> ∃!
Ex_l -> ∃ₗ
Ex_t -> ∃ₜ
All -> ∀
All_t -> ∀ₜ
TOP -> ⊤
The -> ι
& -> ∧
> -> ⇒
~ -> ¬
[a-zA-Z]+[0-9]*
, it is rendered only using the [a-zA-Z]
part.
The association between the name of a signature and a rendering engine is declared in a configuration file that can be loaded through the --realize option of acg and that looks like:
$ cat config.json
{
    "signatures": [
        { "name": "TAG", "engine": "trees" },
        { "name": "DSTAG", "engine": "trees" },
        { "name": "CoTAG", "engine": "trees" },
        { "name": "derivations", "engine": "trees" },
        { "name": "strings", "engine" : "strings"},
        { "name": "Strings", "engine" : "strings"},
        { "name": "logic", "engine" : "logic"},
        { "name": "low_logic", "engine" : "logic"},
        { "name": "derived_trees", "engine" : "unranked trees"},
        { "name": "Derived_trees", "engine" : "unranked trees"},
        { "name": "trees", "engine" : "unranked trees"}
    ],
    "colors": {
        "node-background": (239, 239, 239),
        "background": (255,255,255)
    }
}
An example file is given in examples/config.json.
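The association declared in the signatures list amounts to a lookup from signature name to engine name. The Python sketch below shows that lookup on a few of the entries above; it is an illustration only, engine_for is a hypothetical helper, the colors entry is left out, and returning None for an unlisted signature is an assumption of the sketch (this document does not say what acg does in that case).

```python
import json

# A few entries copied from the "signatures" list of the configuration above.
SIGNATURES = json.loads("""
[
  { "name": "TAG", "engine": "trees" },
  { "name": "strings", "engine": "strings" },
  { "name": "logic", "engine": "logic" },
  { "name": "derived_trees", "engine": "unranked trees" }
]
""")


def engine_for(signature_name, signatures=SIGNATURES):
    """Return the engine declared for signature_name, or None if it is not listed."""
    for entry in signatures:
        if entry["name"] == signature_name:
            return entry["engine"]
    return None


print(engine_for("TAG"))            # trees
print(engine_for("derived_trees"))  # unranked trees
```

Note that names are matched exactly, which is why the configuration above lists both strings and Strings.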