This chapter describes the OCaml source-level replay debugger ocamldebug.
Unix: The debugger is available on Unix systems that provide BSD sockets.
Windows: The debugger is available under the Cygwin port of OCaml, but not under the native Win32 ports.
Before the debugger can be used, the program must be compiled and linked with the -g option: all .cmo and .cma files that are part of the program should have been created with ocamlc -g, and they must be linked together with ocamlc -g.
Compiling with -g entails no penalty on the running time of programs: object files and bytecode executable files are bigger and take longer to produce, but the executable files run at exactly the same speed as if they had been compiled without -g.
The OCaml debugger is invoked by running the program ocamldebug with the name of the bytecode executable file as first argument:
ocamldebug [options] program [arguments]
The arguments following program are optional, and are passed as command-line arguments to the program being debugged. (See also the set arguments command.)
The following command-line options are recognized:
On start-up, the debugger will read commands from an initialization file before giving control to the user. The default file is .ocamldebug in the current directory if it exists, otherwise .ocamldebug in the user’s home directory.
The command quit exits the debugger. You can also exit the debugger by typing an end-of-file character (usually ctrl-D).
Typing an interrupt character (usually ctrl-C) will not exit the debugger, but will terminate the action of any debugger command that is in progress and return to the debugger command level.
A debugger command is a single line of input. It starts with a command name, which is followed by arguments depending on this name. Examples:
run goto 1000 set arguments arg1 arg2
A command name can be truncated as long as there is no ambiguity. For instance, go 1000 is understood as goto 1000, since there are no other commands whose name starts with go. For the most frequently used commands, ambiguous abbreviations are allowed. For instance, r stands for run even though there are others commands starting with r. You can test the validity of an abbreviation using the help command.
If the previous command has been successful, a blank line (typing just RET) will repeat it.
The OCaml debugger has a simple on-line help system, which gives a brief description of each command and variable.
Events are “interesting” locations in the source code, corresponding to the beginning or end of evaluation of “interesting” sub-expressions. Events are the unit of single-stepping (stepping goes to the next or previous event encountered in the program execution). Also, breakpoints can only be set at events. Thus, events play the role of line numbers in debuggers for conventional languages.
During program execution, a counter is incremented at each event encountered. The value of this counter is referred as the current time. Thanks to reverse execution, it is possible to jump back and forth to any time of the execution.
Here is where the debugger events (written ⋈) are located in the source code:
(f arg)⋈
fun x y z -> ⋈ ...
function pat1 -> ⋈ expr1 | ... | patN -> ⋈ exprN
expr1; ⋈ expr2; ⋈ ...; ⋈ exprN
if cond then ⋈ expr1 else ⋈ expr2
while cond do ⋈ body done for i = a to b do ⋈ body done
Exceptions: A function application followed by a function return is replaced by the compiler by a jump (tail-call optimization). In this case, no event is put after the function application.
The debugger starts executing the debugged program only when needed. This allows setting breakpoints or assigning debugger variables before execution starts. There are several ways to start execution:
The execution of a program is affected by certain information it receives when the debugger starts it, such as the command-line arguments to the program and its working directory. The debugger provides commands to specify this information (set arguments and cd). These commands must be used before program execution starts. If you try to change the arguments or the working directory after starting your program, the debugger will kill the program (after asking for confirmation).
The following commands execute the program forward or backward, starting at the current time. The execution will stop either when specified by the command or when a breakpoint is encountered.
You can jump directly to a given time, without stopping on breakpoints, using the goto command.
As you move through the program, the debugger maintains an history of the successive times you stop at. The last command can be used to revisit these times: each last command moves one step back through the history. That is useful mainly to undo commands such as step and next.
A breakpoint causes the program to stop whenever a certain point in the program is reached. It can be set in several ways using the break command. Breakpoints are assigned numbers when set, for further reference. The most comfortable way to set breakpoints is through the Emacs interface (see section 18.10).
Each time the program performs a function application, it saves the location of the application (the return address) in a block of data called a stack frame. The frame also contains the local variables of the caller function. All the frames are allocated in a region of memory called the call stack. The command backtrace (or bt) displays parts of the call stack.
At any time, one of the stack frames is “selected” by the debugger; several debugger commands refer implicitly to the selected frame. In particular, whenever you ask the debugger for the value of a local variable, the value is found in the selected frame. The commands frame, up and down select whichever frame you are interested in.
When the program stops, the debugger automatically selects the currently executing frame and describes it briefly as the frame command does.
The debugger can print the current value of simple expressions. The expressions can involve program variables: all the identifiers that are in scope at the selected program point can be accessed.
Expressions that can be printed are a subset of OCaml expressions, as described by the following grammar:
|
The first two cases refer to a value identifier, either unqualified or qualified by the path to the structure that define it. * refers to the result just computed (typically, the value of a function application), and is valid only if the selected event is an “after” event (typically, a function application). $ integer refer to a previously printed value. The remaining four forms select part of an expression: respectively, a record field, an array element, a string element, and the current contents of a reference.
When printing a complex expression, a name of the form $integer is automatically assigned to its value. Such names are also assigned to parts of the value that cannot be printed because the maximal printing depth is exceeded. Named values can be printed later on with the commands p $integer or d $integer. Named values are valid only as long as the program is stopped. They are forgotten as soon as the program resumes execution.
A shell is used to pass the arguments to the debugged program. You can therefore use wildcards, shell variables, and file redirections inside the arguments. To debug programs that read from standard input, it is recommended to redirect their input from a file (using set arguments < input-file), otherwise input to the program and input to the debugger are not properly separated, and inputs are not properly replayed when running the program backwards.
The loadingmode variable controls how the program is executed.
The debugger searches for source files and compiled interface files in a list of directories, the search path. The search path initially contains the current directory . and the standard library directory. The directory command adds directories to the path.
Whenever the search path is modified, the debugger will clear any information it may have cached about the files.
Each time a program is started in the debugger, it inherits its working directory from the current working directory of the debugger. This working directory is initially whatever it inherited from its parent process (typically the shell), but you can specify a new working directory in the debugger with the cd command or the -cd command-line option.
In some cases, you may want to turn reverse execution off. This speeds up the program execution, and is also sometimes useful for interactive programs.
Normally, the debugger takes checkpoints of the program state from time to time. That is, it makes a copy of the current state of the program (using the Unix system call fork). If the variable checkpoints is set to off, the debugger will not take any checkpoints.
When the program issues a call to fork, the debugger can either follow the child or the parent. By default, the debugger follows the parent process. The variable follow_fork_mode controls this behavior:
The debugger is compatible with the Dynlink module. However, when an external module is not yet loaded, it is impossible to set a breakpoint in its code. In order to facilitate setting breakpoints in dynamically loaded code, the debugger stops the program each time new modules are loaded. This behavior can be disabled using the break_on_load variable:
The debugger communicate with the program being debugged through a Unix socket. You may need to change the socket name, for example if you need to run the debugger on a machine and your program on another.
On the debugged program side, the socket name is passed through the CAML_DEBUG_SOCKET environment variable.
Several variables enables to fine-tune the debugger. Reasonable defaults are provided, and you should normally not have to change them.
As checkpointing is quite expensive, it must not be done too often. On the other hand, backward execution is faster when checkpoints are taken more often. In particular, backward single-stepping is more responsive when many checkpoints have been taken just before the current time. To fine-tune the checkpointing strategy, the debugger does not take checkpoints at the same frequency for long displacements (e.g. run) and small ones (e.g. step). The two variables bigstep and smallstep contain the number of events between two checkpoints in each case.
The following commands display information on checkpoints and events:
Just as in the toplevel system (section 12.2), the user can register functions for printing values of certain types. For technical reasons, the debugger cannot call printing functions that reside in the program being debugged. The code for the printing functions must therefore be loaded explicitly in the debugger.
The value path printer-name must refer to one of the functions defined by the object files loaded using load_printer. It cannot reference the functions of the program being debugged.
The most user-friendly way to use the debugger is to run it under Emacs with the OCaml mode available through MELPA and also at https://github.com/ocaml/caml-mode.
The OCaml debugger is started under Emacs by the command M-x camldebug, with argument the name of the executable file progname to debug. Communication with the debugger takes place in an Emacs buffer named *camldebug-progname*. The editing and history facilities of Shell mode are available for interacting with the debugger.
In addition, Emacs displays the source files containing the current event (the current position in the program execution) and highlights the location of the event. This display is updated synchronously with the debugger action.
The following bindings for the most common debugger commands are available in the *camldebug-progname* buffer:
In all buffers in OCaml editing mode, the following debugger commands are also available: