<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE TIP SYSTEM "http://tcl.activestate.com/cgi-bin/tct/tip/tipxml.dtd">
<!-- Converted at Thu Feb 09 11:16:40 GMT 2012 -->
<!-- TIP AutoGenerator - written by Donal K. Fellows -->

<TIP number='86'>
<header><title>Improved Debugger Support</title><author address="mailto:peter@browsex.com">Peter MacDonald</author><author address="mailto:peter@pdqi.com">Peter MacDonald</author><status type='project' state='draft' tclversion="8.7" vote='prior'>$Revision: 1.26 $</status><history></history><created day='8' month='feb' year='2002' /></header>
<abstract>This TIP proposes that Tcl store source code file-name and line-number information and make it available at script execution time. It also adds new <emph style="bold">trace</emph> and <emph style="bold">info</emph> subcommands to make it easier for a debugger to control a Tcl script, much as <emph style="italic">gdb</emph> can control a C program.</abstract>
<body><section title="Rationale">
<para>Currently, although Tcl provides quite reasonable information to users in error traces, the line numbers within those traces are always relative to the evaluation context containing them (often the procedure, but not always) and not to the script file containing the procedure. This is substantially different from virtually every other computer language and makes correlating errors with the source line that caused them much more difficult. It also makes coupling a Tcl interpreter to an external debugging tool more difficult. This TIP proposes adding new interfaces to the Tcl core to make such debugging activity easier.</para>
<para>A new <emph style="bold">trace execution</emph> option enables Tcl to track line number and source file information associated with statements being executed and to call a single callback. A new <emph style="italic">info line</emph> option provides access to line number information. As a result, it becomes a simple matter to implement a debugger for Tcl, in Tcl. Furthermore, the implementation also serves as an example of how to use the C interface, enabling similar capabilities at the lower level.</para>
<para>A simple Tcl debugger, <emph style="italic">tgdb</emph>, written in Tcl and emulating <emph style="italic">gdb</emph>, is included with this TIP to demonstrate the use of this interface. <emph style="italic">tgdb</emph> runs and controls a Tcl application in a sub-interp using <emph style="bold">trace execution</emph> and <emph style="bold">interp alias</emph>. It supports breakpoints on lines, procs, variables, etc., single-stepping, moving up and down the stack, and evals. It is designed to work both as a command-line tool and as a slave process (see <emph style="italic">Reference Implementation</emph>).</para>
<para>Finally, upon error within a procedure, the file path and absolute (as opposed to relative) line number are printed out when available, even when the procedure is called from an <emph style="bold">after</emph> or callback invocation. Aside from helping the user locate and deal with errors more easily, the message is machine parseable; for example, a tool could automatically bring the user into an editor at the offending line.</para>
</section>
<section title="Specification">
<para>First, a new <emph style="bold">execution</emph> subcommand is added to the <emph style="bold">trace</emph> command.</para>
<quote><emph style="bold">trace execution</emph> <emph style="italic">target</emph> ?<emph style="italic">level</emph>?</quote>
<para>This arranges for an execution trace to be set up for commands at nesting <emph style="italic">level</emph> or above, thereby providing a simple Tcl interface for tracing commands in order to, say, implement a debugger. With no arguments, the current target is returned. If <emph style="italic">target</emph> is the empty string, the execution trace is removed. The <emph style="italic">target</emph> argument is assumed to be a command string to be executed. When <emph style="italic">level</emph> is not specified, it defaults to 0, meaning trace all commands. For each traced command, the following data will be produced:</para>
<itemize><item.i><para>linenumber</para><para>The line number the instruction begins on.</para></item.i><item.i><para>filename</para><para>The fully normalized file name.</para></item.i><item.i><para>nestlevel</para><para>The nesting level of the command.</para></item.i><item.i><para>stacklevel</para><para>The stack call level as per <emph style="bold">info level</emph>.</para></item.i><item.i><para>curnsproc</para><para>The current fully qualified namespace/function.</para></item.i><item.i><para>cmdname</para><para>The fully qualified command name of the command to be invoked.</para></item.i><item.i><para>command</para><para>The command line to be executed including arguments.</para></item.i><item.i><para>flags</para><para>Integer bit flags; currently, bit 1 is set for a breakpoint.</para></item.i></itemize>
<para>The target is presumed to be a valid Tcl command, to which the above arguments are appended before evaluation. Any return from the target other than a normal return results in the traced command not being executed. As with all traces, execution tracing is disabled within a trace handler.</para>
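<para>As a rough illustration of the argument list above, the following sketch defines a handler that accepts the eight appended values and formats a one-line report. The proc name dbg_handler, the sample values, and the reading of "bit 1" as the lowest-order bit are assumptions made for this example. The registration line is left as a comment because trace execution exists only with the patch proposed by this TIP, so the handler is called by hand here.</para>

```tcl
# Hypothetical handler: receives the eight values listed above, in order.
proc dbg_handler {linenumber filename nestlevel stacklevel curnsproc cmdname command flags} {
    # Assumption: "bit 1" is the lowest-order bit, so an odd flags value
    # marks a breakpoint hit.
    set mark [expr {($flags % 2) == 1 ? "B" : " "}]
    return [format "%s %s:%d \[%d/%d\] %s" \
        $mark $filename $linenumber $nestlevel $stacklevel $command]
}

# With the patch applied, registration would presumably look like:
#   trace execution dbg_handler 0
# Without it, the handler can still be exercised directly with sample data:
puts [dbg_handler 12 /tmp/demo.tcl 1 1 ::main ::puts {puts hello} 1]
```

<para>The handler returns normally, so under the rule above the traced command would go on to execute; a debugger that wants to skip a command would return with an error or other non-normal code instead.</para>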
<para>Second, a new <emph style="bold">line</emph> subcommand to <emph style="bold">info</emph> gives access to the file path and line number information. It takes subcommands of its own in turn:</para>
<itemize><item.i><para><emph style="bold">info line current</emph></para><para>Returns a list with two items: the line number and file name of the current statement.</para></item.i><item.i><para><emph style="bold">info line number</emph> <emph style="italic">proc</emph> ?<emph style="italic">line</emph>?</para><para>Get/set the line number for the start of <emph style="italic">proc</emph>. In get mode, returns the definition line number in the message body.</para></item.i><item.i><para><emph style="bold">info line level</emph> <emph style="italic">number</emph></para><para>Like the <emph style="bold">info level</emph> command, but returns the line number and file name from which the call at level <emph style="italic">number</emph> originates. For use with <emph style="italic">trace execution</emph>.</para></item.i><item.i><para><emph style="bold">info line file</emph> ?<emph style="italic">proc</emph> ?<emph style="italic">file</emph>??</para><para>Get/set the sourced file path for <emph style="italic">proc</emph>. If <emph style="italic">proc</emph> is not specified, dump all known sourced file paths.</para></item.i><item.i><para><emph style="bold">info line find</emph> <emph style="italic">file line</emph></para><para>Given a file path and line number, return the name of the <emph style="italic">proc</emph> containing line number <emph style="italic">line</emph>. A new nonstatic procedure <emph style="italic">TclFindProcByLine()</emph> provides this function.</para></item.i><item.i><para><emph style="bold">info line relativeerror</emph> ?<emph style="italic">bool</emph>?</para><para>Set to 1 to disable the absolute line number and file path on a procedure error. This demotes procedure traceback errors to the same format as all other traceback errors, that is, using the relative line number and file name.</para></item.i></itemize>
<para>These exhibit the following behavior:</para>
<itemize><item.i><para><emph style="italic">What does this do when you redefine a proc?</emph></para><para>You get the values from the latest definition.</para></item.i><item.i><para><emph style="italic">What about when you use interp aliases?</emph></para><para>You get an error, as it is not considered a proc.</para></item.i><item.i><para><emph style="italic">And if proc itself gets redefined by someone&apos;s special debugger?</emph></para><para>If the definition is not the result of a source, the file/line come back as an empty string.</para></item.i></itemize>
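<para>The following sketch shows one way a script might consume the two-element result described for info line current. The helper name format_location and the sample data are hypothetical. Because info line is available only with the proposed patch, the example falls back to sample data when an unpatched interpreter rejects the subcommand.</para>

```tcl
# Hypothetical helper: turn a {line file} pair, the shape described for
# "info line current", into a file:line string.
proc format_location {loc} {
    set line [lindex $loc 0]
    set file [lindex $loc 1]
    if {$file eq ""} {
        # Per the behavior list, a definition that did not come from a
        # sourced file reports empty file/line information.
        return "(unknown location)"
    }
    return "${file}:${line}"
}

# "info line" exists only with the proposed patch; use sample data when
# the interpreter does not recognise it.
if {[catch {info line current} loc]} {
    set loc {42 /tmp/demo.tcl}
}
puts [format_location $loc]
```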
<para>Third, a new <emph style="italic">info</emph> subcommand <emph style="italic">return</emph>.</para>
<itemize><item.i><para><emph style="bold">info return</emph></para><para>When in <emph style="italic">trace execution</emph> mode, returns the saved last result of the previously executed command. Otherwise returns an empty string. Commands executed as part of a trace handler do not affect or change the saved last result. </para></item.i></itemize>
<para>Fourth, an additional flag option <emph style="italic">debug</emph> to <emph style="italic">trace add variable</emph>.</para>
<itemize><item.i><para><emph style="bold">trace add variable {read write unset debug} command</emph></para><para>Used in conjunction with <emph style="italic">read</emph>, <emph style="italic">write</emph>, and <emph style="italic">unset</emph>, this option allows a debugger to set read/write traces that will not trigger the execution trace. In other words, it specifies that the command is debugger code that is not to be traced. Normally, debugger code is entered via the <emph style="bold">trace execution</emph> handler and so has tracing disabled. This just provides a similar feature for <emph style="bold">trace variable</emph>.</para></item.i></itemize>
<para>Fifth, a new <emph style="bold">breakpoint</emph> subcommand to the <emph style="bold">trace</emph> command.</para>
<quote><emph style="bold">trace breakpoint</emph> ??<emph style="italic">line file</emph> ?<emph style="italic">level</emph> ...??</quote>
<para>The <emph style="italic">trace breakpoint</emph> subcommand manages a list of breakpoints that cause an <emph style="italic">execution trace</emph> to trigger, even when the nestlevel is exceeded. With no arguments, it returns a list of all breakpoints as a sequence of triples: line, file, and state. With two arguments, the current state for the breakpoint is returned. With three or more arguments, new breakpoints are created. If created with a state of zero, the breakpoint is considered inactive. Setting the state of a breakpoint to the empty string effectively deletes the breakpoint. A breakpoint whose state is set to an N greater than zero triggers every Nth time.</para>
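<para>The state rules above can be modelled in a few lines of plain Tcl. This is only a sketch of the described semantics, not the patch's code: the proc name bp_triggers is hypothetical, and it assumes that hits are counted from 1 and that "every Nth time" means the Nth, 2Nth, 3Nth hit and so on.</para>

```tcl
# Hypothetical model of the breakpoint state rules described above.
# state: "" (deleted), 0 (inactive), or N, where every Nth hit triggers.
# hits:  how many times execution has reached the breakpoint (from 1).
proc bp_triggers {state hits} {
    if {$state eq "" || $state == 0} {
        return 0
    }
    return [expr {$hits % $state == 0}]
}

# A breakpoint with state 3 is reached six times.
foreach hits {1 2 3 4 5 6} {
    puts "hit $hits: [bp_triggers 3 $hits]"
}
```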
</section>
<section title="Changes">
<para>Sourced file paths are stored per interp in a hash table. File/line numbering information is also stored in the <emph style="italic">Interp</emph>, <emph style="italic">Proc</emph>, <emph style="italic">After</emph>, and <emph style="italic">CallFrame</emph> structures. Newline counting/shifting code was added to <emph style="italic">proc</emph>, <emph style="italic">while</emph>, <emph style="italic">for</emph>, <emph style="italic">foreach</emph>, and <emph style="italic">if</emph>. All but the most trivial code is active only when the new TRACE_LINE_NUMBERS interp flag is set, which is the case when using <emph style="italic">trace execution</emph>.</para>
<para>Most new variables within Interp are in the struct subfield sourceInfo of type <emph style="italic">Tcl_SourceInfo</emph>, which can be retrieved via the new <emph style="italic">Tcl_GetSourceInfo(interp)</emph> stubbed/public call.</para>
</section>
<section title="Overhead/Impact">
<para>The runtime impact on Tcl should be modest: a few tens of kilobytes of memory, even for moderately large programs. Most of the space impact occurs in storing the file paths. A typical example from a large system:</para>
<verbatim><vline encoding='base64'>ICAxMDAgc291cmNlZCBmaWxlcyAqIDEwMCBieXRlcyA9IDEwSy4=</vline></verbatim>
<para>The other space overhead adds up to several words (8 bytes on a 32-bit platform) per defined procedure, plus some additional words in the <emph style="italic">Interp</emph> structure.</para>
<para>Runtime processing overhead should be negligible.</para>
<para>However, there have been no benchmarks done to validate these assertions.</para>
</section>
<section title="Reference Implementation">
<para>This patch is against Tcl 8.4.9 and represents a complete rework of the approach.</para>
<para><url ref="http://pdqi.com/download/tclline-8.4.9.diff.gz"/></para>
<para>There is a simple demonstration debugger script: <emph style="italic">tgdb.tcl</emph>.</para>
<para><url ref="http://pdqi.com/download/tgdb.tcl"/></para>
</section>
<section title="Previous/Old Reference Implementation">
<para><url ref="http://pdqi.com/download/tclline-cvs.diff.gz"/> - Patch against CVS head.</para>
<para><url ref="http://pdqi.com/download/tclline-8.4.6.diff.gz"/> - Patch against Tcl 8.4.6</para>
<para>The CVS patch was against the CVS head as of June 13, 2004. These have been lightly tested against numerous small Tcl programs.</para>
<para>There is also an initial version of a debugger: <emph style="italic">tgdb</emph>.</para>
<para><url ref="http://pdqi.com/download/tgdb-2.0.tar.gz"/></para>
<para><emph style="italic">tgdb</emph> emulates the basic commands of <emph style="italic">gdb</emph> (<emph style="italic">s</emph>, <emph style="italic">n</emph>, <emph style="italic">c</emph>, <emph style="italic">f</emph>, <emph style="italic">bt</emph>, <emph style="italic">break</emph>, <emph style="italic">info locals</emph>, etc). This newest version also supports watchpoints and display variables. With <emph style="italic">load</emph> and <emph style="italic">run</emph> commands added, <emph style="italic">tgdb</emph> should probably work even with <emph style="italic">emacs</emph> and <emph style="italic">ddd</emph>.</para>
<para>An additional package <emph style="italic">pdqi</emph> provides <emph style="italic">tdb</emph>, a GUI front-end to <emph style="italic">gdb</emph>, modified to also work with <emph style="italic">tgdb</emph>.</para>
</section>
<section title="Possible Future Enhancements">
<para>Build and store a line number table internally during parse?</para>
<para>Line number lookup via the source string. A simple way to implement this might be to look up the string against the <emph style="italic">codePtr-&gt;source+bestSrcOffset</emph> as returned by <emph style="italic">GetSrcInfoForPc()</emph>.</para>
<para>Add special handling for eval. Cases like <emph style="italic">eval $str</emph> should eventually be changed to report a line number of 0 (or more likely the line number of the original statement) for all statements with any argument involving a sub-eval. </para>
<para>Possibly implement character offsets within a line.</para>
</section>
<section title="Notes">
<para>A test has been added to tests/trace.test. The test uses a provided utility, <emph style="italic">trcline.tcl</emph>, to give some measure of the accuracy of the line number tracing.</para>
</section>
<section title="Comments and Feedback">
<para>Jeff Hobbs asked what about <emph style="bold">interp alias</emph>, etc.</para>
<itemize><item.i><para>Updated TIP to document cases</para></item.i></itemize>
<para>Jeff Hobbs notes that filename storage is inefficient and raises the question of finalization</para>
<itemize><item.i><para>Code changed to just increment ref count</para></item.i><item.i><para><emph style="bold">TODO:</emph> What needs to be done for <emph style="italic">Tcl_Finalize</emph>?</para></item.i></itemize>
<para>Neil Madden/Stephen Trier comment on info subcommand names <emph style="italic">line</emph>, <emph style="italic">file</emph> and <emph style="italic">proc</emph> and possible future uses for <emph style="italic">line</emph></para>
<itemize><item.i><para>Changed to a single subcommand <emph style="italic">line</emph> and use sub-sub commands.</para></item.i><item.i><para>Additional subsubcommands can easily be added.</para></item.i></itemize>
<para>Donal Fellows writes: Is there a way to do an equivalent of <emph style="italic">#line</emph> directives in C?</para>
<itemize><item.i><para>We can now set the line number etc. of a proc. Is that enough?</para></item.i></itemize>
<para>Donald Porter notes that changing Tcl_Parse breaks binary compatibility</para>
<itemize><item.i><para>Move all parse variables to Interp and save/restore values on entry/exit to Tcl_EvalEx and TclCompileScript.</para></item.i></itemize>
<para>Donald Porter notes that the hash table should be per Interp</para>
<itemize><item.i><para>Code changed to move hash table to Interp.</para></item.i></itemize>
<para>Mo DeJong notes: file path should be used in place of file name</para>
<itemize><item.i><para>TIP updated to use path where appropriate</para></item.i></itemize>
<para>Mo DeJong suggests to maybe use <emph style="italic">TclpObjNormalizePath(fileName)</emph></para>
<itemize><item.i><para>No action yet</para></item.i></itemize>
<para>Donal Fellows objects to no support for <emph style="bold">proc</emph>s in subevals and Andreas Kupries suggests defining a line number <emph style="italic">Tcl_Token</emph> type.</para>
<itemize><item.i><para>Add support for <emph style="bold">proc</emph> in subeval by addition to <emph style="italic">ResolvedCmdName</emph></para></item.i><item.i><para>This is now fixed.</para></item.i></itemize>
<para>Donal Fellows asks if trace is disabled in the execution handler, how tracing to a sub-interp would work, and clarification on the purpose and use of trace variable {debug}.</para>
<itemize><item.i><para>The documentation was updated to clarify these points.</para></item.i></itemize>
</section>
<section title="Copyright">
<para>This document has been placed in the public domain.</para>
<para><emph style="italic">tgdb</emph> and <emph style="italic">pdqi</emph> have a BSD copyright by Peter MacDonald and PDQ Interfaces Inc.</para>
</section>
</body></TIP>

