ACGtk user documentation
Install
Installation Process
ACGtk depends on several OCaml libraries. Fortunately, the OCaml package manager opam
makes the installation fast and easy.
The installation typically goes that way:
- First install
opam
using your preferred distribution/packaging mode. - Then initialize the
opam
environment. opam
environments can use the Ocaml compiler provided by your distribution, or locally install a compiler. They are called switches. Don't forget to add the required libraries in the path running
$ eval `opam config env`
ACGtk depends on several external libraries, i.e., libraries that are not OCaml libraries and need to be installed by your OS distribution. This can be done by opam as well running
$ opam depext acgtk
then continue with the ACG toolkit installation running
$ opam install acgtk
This will install the binaries acgc
and acg
in the .opam/OCAML_VERSION/bin
directory (where OCAML_VERSION
is the OCaml version that you are using with opam. It can be system
in case you are using the compiler installed by your OS distribution, or a version number (such as 4.05.0) otherwise).
It will also install:
- the
acg.el
directory in the .opam/OCAML_VERSION/share/emacs/site-lisp
directory - the
examples
directory in the .opam/OCAML_VERSION/share/acgtk
directory
To get the actual path of the share
directory, just run
$ opam var share
For best results (correct rendering of symbols in the graphical output), please also install the free DejaVu fonts for your OS distribution.
ACG emacs mode
There is an ACG emacs mode acg.el
in the emacs directory.
Its main feature is to be loaded when editing an acg data file (with signatures and lexicons). It is automatically loaded for files with a .acg
extension
It basically contains compilation directives and next error searching.
- First load an acg file
- then run
M-x compile
(or C-c C-c
) to call the compiler (acgc
) - then run
M-x next error
(or C-x
) to search for the next error (if any) and highlight it.
Quick Way to Have It Work
Copy the following lines in your .emacs
(setq load-path (cons "EMACS_DIR_PATH" load-path))
(setq auto-mode-alist (cons '("\\.acg" . acg-mode) auto-mode-alist))
(autoload 'acg-mode "acg" "Major mode for editing ACG definitions" t)
where EMACS_DIR_PATH
is .opam/OCAML_VERSION/share/acgtk/emacs
.
Site Distribution
- Copy
acg.el
under an acg
directory in your site-lisp directory (typically /usr/share/emacs/site-lisp/
or /usr/local/share/emacs/site-lisp/
) Create a 50acg.el
file into the /etc/emacs/site-start.d
directory and copy the following lines in it:
(setq load-path (cons "EMACS_DIR_PATH" load-path))
(setq auto-mode-alist (cons '("\\.acg" . acg-mode) auto-mode-alist))
(autoload 'acg-mode "acg" "Major mode for editing ACG definitions" t)
where EMACS_DIR_PATH
is the acg directory in your site-lisp directory (typically /usr/share/emacs/site-lisp/acg
)
Dependencies
Installation of the required libraries is automatically performed by opam. But here is the list of the required libraries.
OCaml libraries
Optionally, in order to make some changes to the code, the following libraries are required as well:
Non Ocaml Libraries
These libraries should be installed automatically running
$ opam depext acgtk
They are:
- the development library for cairo
- the development library for freetype
- the development library for readline
- pkg-config
To know the required packages on your OS in order to install ACGtk, you can run:
$ opam list --rec --required-by acgtk --external
and look at the output corresponding to your system.
Update
Whenever a version of acgc
and acg
already exists and was installed using opam, it's possible to move to a more recent version (if any) by running the following commands:
$ opam update
$ opam upgrade
Quick Start
Example files both for acgc
and acg
are provided in the example directory.
acgc
: Compiling Abstract Categorial Grammars
acgc
is a "compiler" of ACG source code, i.e., files containing definitions of signatures and lexicons. It basically checks whether they are correctly written (syntactically and w.r.t. types and constant typing) and outputs an .acgo
object file.
Run
$ acgc --help
to get help.
For instance, let's assume a file my_acg.acg
. A basic usage of the acgc
command could be:
$ acgc -o my_acg.acgo my_acg.acg
This will produce a my_acg.acgo
file (note that this is the default name if the -o
option is not provided).
acg
: Using Abstract Categorial Grammars
acg
is an interpreter of commands of the specific scripting language to use user-defined ACGs. Parameters of the interpreter are available running acg --help
.
To get the list of available commands of the scripting language, run
$ acg
It will launch the interpreter in interactive mode. Then, on the prompt, type
ACGtk> help
Type CTRL-D
to exit from the program.
Running a script
It is also possible to run scripts with acg
. Run
$ acg script.acgs
to execute the script script.acgs
in non-interactive mode.
Loading Data
Assuming the file my_acg.acgo
was produced as explained above, running :
$ acg
will open a prompt in which you can type:
ACGtk> load "my_acg.acgo"
to load the data contained compiled from the my_acg.acg
file into my_acg.acgo
.
Alternatively, you could directly load the data file:
ACGtk> load "my_acg.acg"
Type-Checking Terms
Assuming you have defined the signature Sig
and the lexicon Lex
, you can then run the following commands:
ACGtk> "lambda x.some_cst x: NP ->S" | check signature=Sig
to check whether lambda x.some_cst x
is a term of type NP ->S
according to Sig
.
Interpreting (or Realizing) Terms
Run:
ACGtk> "lambda x.some_cst x: NP ->S" | realize lexicon=Lex
to compute the image of lambda x.some_cst x
by Lex
(assuming this term and this type are correct according to the abstract signature of Lex
).
Parsing Terms
Run:
ACGtk> "John+loves+Mary: string" | parse lexicon=Lex type=S
to check whether the term John+loves+Mary
of type string
has an antecedent of type S
by Lex
, assuming that string
is the interpretation of the type S
by Lex
.
Manual
Syntax of Data Files
You can look at the examples/tag.acg
file for an example).
An ACG data file is a sequence of signature or lexicon declarations, separated with a ;
.
Signatures
signature my_sig_name=
sig_entries
end
sig_entries
is a list of sig_entry
, separated with a ;
. A sig_entry
can be:
a type declaration as in
NP, S : type;
a type definition as in
o : type;
string = o -> o;
Note that type constructors are ->
(or →
) and =>
(or =>
) for the linear and intuitionistic arrows respectively.
a constant declarations as in
foo: NP;
bar, dummy: NP -> S;
infix + : string -> string -> string;
prefix - : bool -> bool;
binder All : (e =>t) -> t;
binder ∃ : (e =>t) -> t;
infix > : bool -> bool -> bool; (*This means implication*)
infix ∧ : bool -> bool -> bool; (*This means conjunction*)
Note that infix
and prefix
are keywords to introduce symbols. Also notes that comments are surrounded by (*
and *)
.
a constant definitions as in
n = lambda n. bar n : NP -> S;
infix + = lambda x y z.x(y z): string -> string -> string;
prefix - = lambda p.not p:bool -> bool;
everyone = lambda P. All x. (human x) > (P x) ;
someone = lambda P. ∃ x. (human x) ∧ (P x) ;
Note the syntax for binders (All
in the last example). Available construction for terms are:
lambda x y z.t
(or λ⁰ x y z.t
) for linear abstractionLambda x y z.t
(or λ x y z.t
) for non-linear abstractiont u v
for application (equivalent to (t u) v
)t SYM u
if SYM
is a infix symbol (lowest priority). It is equal to ((SYM) t) u
where SYM
is used as usual constant, with the priority of applicationSYM t
if SYM
is a prefix symbol (highest priority)BINDER x y z.t
if BINDER
is a binder
About Associativity and Precedence of Operators
Prefix operators have precedence over application, and application has precedence over infix operators. Relative precedence among infix operators can be defined.
When no associativity specification is set, the default is left associative.
When no precedence definition is given, the default is higher precedence over any infix operator defined so far.
When declaring or defining an infix operator with the keyword 'infix', the optional specification for the associativity and the relative precedence can be set.
A specification is given between square brackets. The syntax is as follows:
infix [specification] SYM …
(the remaining part of the declaration is the same as without the specification)
A specification is non-empty comma-separated list of:
- an (optional) associativity specification, given by one of the keywords
Left
, Right
, or NonAssoc
. If not present, left associativity is set by default to infix operators. - an (optional) precedence declaration (if not present, the highest precedence over all the infix operators defined so far is given). It is defined as
< SYM
(where SYM
is a symbol). It assigns to the operator being declared or defined the greatest precedence *below* the precedence of SYM
.
It is possible to use an infix symbol as a normal constant by surrounding it with left and right parenthesis, so that t SYM u = (SYM) t u
.
See examples/infix-examples.acg
and examples/infix-examples-script
for examples.
Lexicons
There are two ways to define a lexicon:
By using the keyword lexicon
or nl_lexicon
as in :
lexicon my_lex_name(abstract_sig_name) : object_sig_name =
lex_entries
end
or
nl_lexicon my_lex_name(abstract_sig_name) : object_sig_name =
lex_entries
end
With the lexicon
keyword, lambda
(resp. ->
) is interpreted as lambda
(resp. ->
), whereas with nl_lexicon
, lambda
(resp. ->
) is interpreted as Lambda
(resp. =>
). I.e., everything is interpreted non-linearly. It is useful when not interested in linear constraints in the object signature (as, for instance, in the context-free lambda grammars).
Lex_entries
is a list of lex_entry
, separated with a ;
. A lex_entry
can be of the following forms:
abstract_atomic_type1, abstract_atomic_type2 := object_type;
abstract_const1, abstract_const2 := object_term;
By lexicon composition as in:
lexicon my_new_lex = lex_2 << lex_1
Keywords
The keywords are signature
, lexicon
, nl_lexicon
, end
, type
, prefix
, infix
, binder
, lambda
, λ⁰
(means the same thing as lambda
), Lambda
, and λ
(means the same thing as Lambda
).
The reserved symbols are =
, <<
, ;
, :
, ,
, (
, )
, .
, ->
(and →
), =>
(and ⇒
), and :=
.
Inside a signature or a lexicon, signature
, lexicon
and nl_lexicon
are not considered as keywords and can be used as identifiers.
Other keywords can be used as identifiers when escaped with \
(e.g., \end
).
Identifiers
Identifiers may use Unicode characters. They follow the following grammar:
IDENT := id_start id_continue* (subscripts|superscripts)* '*
where:
Symbols
Symbols (introduced with the keywords infix
, prefix
, or binder
) may use Unicode characters as well. They follow the following grammar:
SYMBOL := symbols+ (subscripts|superscripts)* (_ | id_continue*)? (subscripts|superscripts)*
where symbols is a set of mathematical symbols:
- ¬ (U+00AC), × (U+D7), · (U+B7), 0xF7 (*÷*), ∀ (U+2200), ∁ (U+2201), ∂ (U+2202), ∃ (U+2203), ∄ (U+2204), ∅ (U+2205), ∆ (U+2206), ∇ (U+2207), ∈ (U+2208), ∉ (U+2209), ∊ (U+220A), ∋ (U+220B), ∌ (U+220C), ∍ (U+220D), ∎ (U+220E), 0x220F (*∏*), ∐ (U+2210), ∑ (U+2211), − (U+2212), ∓ (U+2213), ∔ (U+2214), ∕ (U+2215), ∖ (U+2216), ∗ (U+2217), ∘ (U+2218), ∙ (U+2219), √ (U+221A), ∛ (U+221B), ∜ (U+221C), ∝ (U+221D), ∞ (U+221E), 0x221F (*∟*), ∠ (U+2220), ∡ (U+2221), ∢ (U+2222), ∣ (U+2223), ∤ (U+2224), ∥ (U+2225), ∦ (U+2226), ∧ (U+2227), ∨ (U+2228), ∩ (U+2229), ∪ (U+222A), ∫ (U+222B), ∬ (U+222C), ∭ (U+222D), ∮ (U+222E), 0x222F (*∯*), ∰ (U+2230), ∱ (U+2231), ∲ (U+2232), ∳ (U+2233), ∴ (U+2234), ∵ (U+2235), ∶ (U+2236), ∷ (U+2237), ∸ (U+2238), ∹ (U+2239), ∺ (U+223A), ∻ (U+223B), ∼ (U+223C), ∽ (U+223D), ∾ (U+223E), 0x223F (*∿*), ≀ (U+2240), ≁ (U+2241), ≂ (U+2242), ≃ (U+2243), ≄ (U+2244), ≅ (U+2245), ≆ (U+2246), ≇ (U+2247), ≈ (U+2248), ≉ (U+2249), ≊ (U+224A), ≋ (U+224B), ≌ (U+224C), ≍ (U+224D), ≎ (U+224E), 0x224F (*≏*), ≐ (U+2250), ≑ (U+2251), ≒ (U+2252), ≓ (U+2253), ≔ (U+2254), ≕ (U+2255), ≖ (U+2256), ≗ (U+2257), ≘ (U+2258), ≙ (U+2259), ≚ (U+225A), ≛ (U+225B), ≜ (U+225C), ≝ (U+225D), ≞ (U+225E), 0x225F (*≟*), ≠ (U+2260), ≡ (U+2261), ≢ (U+2262), ≣ (U+2263), ≤ (U+2264), ≥ (U+2265), ≦ (U+2266), ≧ (U+2267), ≨ (U+2268), ≩ (U+2269), ≪ (U+226A), ≫ (U+226B), ≬ (U+226C), ≭ (U+226D), ≮ (U+226E), 0x226F (*≯*), ≰ (U+2270), ≱ (U+2271), ≲ (U+2272), ≳ (U+2273), ≴ (U+2274), ≵ (U+2275), ≶ (U+2276), ≷ (U+2277), ≸ (U+2278), ≹ (U+2279), ≺ (U+227A), ≻ (U+227B), ≼ (U+227C), ≽ (U+227D), ≾ (U+227E), 0x227F (*≿*), ⊀ (U+2280), ⊁ (U+2281), ⊂ (U+2282), ⊃ (U+2283), ⊄ (U+2284), ⊅ (U+2285), ⊆ (U+2286), ⊇ (U+2287), ⊈ (U+2288), ⊉ (U+2289), ⊊ (U+228A), ⊋ (U+228B), ⊌ (U+228C), ⊍ (U+228D), ⊎ (U+228E), 0x228F (*⊏*), ⊐ (U+2290), ⊑ (U+2291), ⊒ (U+2292), ⊓ (U+2293), ⊔ (U+2294), ⊕ (U+2295), ⊖ (U+2296), ⊗ (U+2297), ⊘ (U+2298), ⊙ (U+2299), ⊚ (U+229A), ⊛ (U+229B), ⊜ (U+229C), ⊝ (U+229D), ⊞ (U+229E), 0x229F (*⊟*), ⊠ (U+22A0), ⊡ (U+22A1), ⊢ (U+22A2), ⊣ (U+22A3), ⊤ (U+22A4), ⊥ (U+22A5), ⊦ (U+22A6), ⊧ (U+22A7), ⊨ (U+22A8), ⊩ (U+22A9), ⊪ (U+22AA), ⊫ (U+22AB), ⊬ (U+22AC), ⊭ (U+22AD), ⊮ (U+22AE), 0x22AF (*⊯*), ⊰ (U+22B0), ⊱ (U+22B1), ⊲ (U+22B2), ⊳ (U+22B3), ⊴ (U+22B4), ⊵ (U+22B5), ⊶ (U+22B6), ⊷ (U+22B7), ⊸ (U+22B8), ⊹ (U+22B9), ⊺ (U+22BA), ⊻ (U+22BB), ⊼ (U+22BC), ⊽ (U+22BD), ⊾ (U+22BE), 0x22BF (*⊿*), ⋀ (U+22C0), ⋁ (U+22C1), ⋂ (U+22C2), ⋃ (U+22C3), ⋄ (U+22C4), ⋅ (U+22C5), ⋆ (U+22C6), ⋇ (U+22C7), ⋈ (U+22C8), ⋉ (U+22C9), ⋊ (U+22CA), ⋋ (U+22CB), ⋌ (U+22CC), ⋍ (U+22CD), ⋎ (U+22CE), 0x22CF (*⋏*), ⋐ (U+22D0), ⋑ (U+22D1), ⋒ (U+22D2), ⋓ (U+22D3), ⋔ (U+22D4), ⋕ (U+22D5), ⋖ (U+22D6), ⋗ (U+22D7), ⋘ (U+22D8), ⋙ (U+22D9), ⋚ (U+22DA), ⋛ (U+22DB), ⋜ (U+22DC), ⋝ (U+22DD), ⋞ (U+22DE), 0x22DF (*⋟*), ⋠ (U+22E0), ⋡ (U+22E1), ⋢ (U+22E2), ⋣ (U+22E3), ⋤ (U+22E4), ⋥ (U+22E5), ⋦ (U+22E6), ⋧ (U+22E7), ⋨ (U+22E8), ⋩ (U+22E9), ⋪ (U+22EA), ⋫ (U+22EB), ⋬ (U+22EC), ⋭ (U+22ED), ⋮ (U+22EE), 0x22EF (*⋯*), 0x22F0 (*⋰*), 0x22F1 (*⋱*), 0x22F2 (*⋲*), 0x22F3 (*⋳*), 0x22F4 (*⋴*), 0x22F5 (*⋵*), 0x22F6 (*⋶*), 0x22F7 (*⋷*), 0x22F8 (*⋸*), 0x22F9 (*⋹*), 0x22FA (*⋺*), 0x22FB (*⋻*), 0x22FC (*⋼*), 0x22FD (*⋽*), 0x22FE (*⋾*),
- arrows but linear and intuitionistic ones: ← (U+2190), ↑ (U+2191), ↓ (U+2193), ↔ (U+2194), ↕ (U+2195), ↖ (U+2196), ↗ (U+2197), ↘ (U+2198), ↙ (U+2199), ↚ (U+219A), ↛ (U+219B), ↜ (U+219C), ↝ (U+219D), ↞ (U+219E), 0x219F (*↟*), ↠ (U+21A0), ↡ (U+21A1), ↢ (U+21A2), ↣ (U+21A3), ↤ (U+21A4), ↥ (U+21A5), ↦ (U+21A6), ↧ (U+21A7), ↨ (U+21A8), ↩ (U+21A9), ↪ (U+21AA), ↫ (U+21AB), ↬ (U+21AC), ↭ (U+21AD), ↮ (U+21AE), 0x21AF (*↯*), ↰ (U+21B0), ↱ (U+21B1), ↲ (U+21B2), ↳ (U+21B3), ↴ (U+21B4), ↵ (U+21B5), ↶ (U+21B6), ↷ (U+21B7), ↸ (U+21B8), ↹ (U+21B9), ↺ (U+21BA), ↻ (U+21BB), ↼ (U+21BC), ↽ (U+21BD), ↾ (U+21BE), 0x21BF (*↿*), ⇀ (U+21C0), ⇁ (U+21C1), ⇂ (U+21C2), ⇃ (U+21C3), ⇄ (U+21C4), ⇅ (U+21C5), ⇆ (U+21C6), ⇇ (U+21C7), ⇈ (U+21C8), ⇉ (U+21C9), ⇊ (U+21CA), ⇋ (U+21CB), ⇌ (U+21CC), ⇍ (U+21CD), ⇎ (U+21CE), 0x21CF (*⇏*), ⇐ (U+21D0), ⇑ (U+21D1), ⇓ (U+21D3), ⇔ (U+21D4), ⇕ (U+21D5), ⇖ (U+21D6), ⇗ (U+21D7), ⇘ (U+21D8), ⇙ (U+21D9), ⇚ (U+21DA), ⇛ (U+21DB), ⇜ (U+21DC), ⇝ (U+21DD), ⇞ (U+21DE), 0x21DF (*⇟*), ⇠ (U+21E0), ⇡ (U+21E1), ⇢ (U+21E2), ⇣ (U+21E3), ⇤ (U+21E4), ⇥ (U+21E5), ⇦ (U+21E6), ⇧ (U+21E7), ⇨ (U+21E8), ⇩ (U+21E9), ⇪ (U+21EA), ⇫ (U+21EB), ⇬ (U+21EC), ⇭ (U+21ED), ⇮ (U+21EE), 0x21EF (*⇯*), 0x21F0 (*⇰*), 0x21F1 (*⇱*), 0x21F2 (*⇲*), 0x21F3 (*⇳*), 0x21F4 (*⇴*), 0x21F5 (*⇵*), 0x21F6 (*⇶*), 0x21F7 (*⇷*), 0x21F8 (*⇸*), 0x21F9 (*⇹*), 0x21FA (*⇺*), 0x21FB (*⇻*), 0x21FC (*⇼*), 0x21FD (*⇽*), 0x21FE (*⇾*), 0x21FF (*⇿*)
acgc
Command Line
acgc [OPTION] … FILE …
acgc
parses each file FILE
, which is supposed to be a file containing ACG signatures or lexicons, either as source files (typically with the .acg
extension) or object files (with the .acgo
extension).
If all the files are successfully parsed, a binary containing all the ACG signatures and lexicons is created. By default, the name of the generated binary file is FILEn.acgo
where FILEn.acg
is the last parameter (see option -o
).
Options
-c VAL, --colors=VAL
(absent=auto
) Controls the colors in output. Use yes
to enable colors, no
to disable them, or auto
to enable them if the output is a TTY.-d, --debug
Starts acgc in debug mode: it will record and print backtraces of uncaught exceptions.-I DIR, --include=DIR
Sets DIR as a directory in which file arguments can be looked for.-m, --magic
Toggles on generating magic programs (experimental feature). Parsing with magic will be available in acg. Be aware that using this option may cause generated object files to be very large.-o FILE, --output=FILE
Sets FILE as the output file, instead of the default which is derived from the last given source file. If no source file is provided (only object files are provided), then all the signatures and lexicons will be available in FILE.-v [VAL], --verbosity[=VAL]
(default=1
, absent=0
) Sets the verbosity level. When called without argument, level is 1
(smallest not null verbosity level), but positional argument(s) FILE
may need to be separated by --
if no other optional argument is provided after -v
.
acg
Command Line
acg [OPTION] … [FILE] …
acg
parses each file FILE (if any), which is supposed to be a file containing ACG commands, and interpret them. If no file is provided, of if option --interactive
is set, it then interactively runs the ACG command interpreter.
A list of the available commands is available by running the help
command in the interpreter.
Options
-c VAL, --colors=VAL
(absent=auto
) Controls the colors in output. Use yes
to enable colors, no
to disable them, or auto
to enable them if the output is a TTY.-d, --debug
Starts acg
in debug mode: it will record and print backtraces of uncaught exceptions.-i, --interactive
Starts interactive mode even if script files are provided.-I DIR, --include=DIR
Sets DIR as a directory in which file arguments can be looked for.-m, --magic
Toggle on using magic set rewritten programs for parsing (experimental feature). When set, parsing commands using magic rewritten programs (if available in object files generated by acgc
).-r VAL, --seed=VAL
Seed to use for initialization of the random number generator. If this parameter is not provided, the random number generator will be initialized with a random seed.--realize=FILE
Sets the json config rendering file for the svg generated (by the realize command) files to FILE
(see Graphical Output).-s, --step-by-step
Executes scripts step by step. This means that the execution will be paused before each command, and after printing the result of commands which return terms. Also, this will print the executed script during the execution.
Scripting Language
The scripting language is described here.
Graphical Output
If a svg="realize.svg"
option is provided when running a realize
command during an acg
session, a file realize.svg
is generated in the current directory.
Such a file contains a representation as a tree of the operations described by the term to realize (applications, abstractions). Each node contains the abstract term and its realizations by each of the lexicons specified on the command line. The graphic file can for instance been observed through a web browser.
Four rendering engines are available so far to render the terms in each node:
The association between the name of a signature and a rendering engine is declared in a configuration file that can be loaded through the --realize
option of acg
and that looks like:
$ cat config.json
{
"signatures": [
{ "name": "TAG", "engine": "trees" },
{ "name": "DSTAG", "engine": "trees" },
{ "name": "CoTAG", "engine": "trees" },
{ "name": "derivations", "engine": "trees" },
{ "name": "strings", "engine" : "strings"},
{ "name": "Strings", "engine" : "strings"},
{ "name": "logic", "engine" : "logic"},
{ "name": "low_logic", "engine" : "logic"},
{ "name": "derived_trees", "engine" : "unranked trees"},
{ "name": "Derived_trees", "engine" : "unranked trees"},
{ "name": "trees", "engine" : "unranked trees"}
],
"colors": {
"node-background": (239, 239, 239),
"background": (255,255,255)
}
}
An example file is given in examples/config.json.