<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE TIP SYSTEM "http://tcl.activestate.com/cgi-bin/tct/tip/tipxml.dtd">
<!-- Converted at Thu Feb 09 11:00:39 GMT 2012 -->
<!-- TIP AutoGenerator - written by Donal K. Fellows -->

<TIP number='27'>
<header><title>CONST Qualification on Pointers in Tcl API&apos;s</title><author address="mailto:kennykb@acm.org">Kevin Kenny</author><status type='project' state='final' tclversion="8.4" vote='after'>$Revision: 1.6 $</status><history></history><created day='25' month='feb' year='2001' /><discussions url='news:comp.lang.tcl'/><discussions url='mailto:kennykb@acm.org'/></header>
<abstract>Many of the C and C++ interfaces to the Tcl library lack a CONST qualifier on the parameters that accept pointers, even though they do not, in fact, modify the data that the pointers designate. This lack causes a persistent annoyance to C/C++ programmers. Not only is the code needed to work around this problem more verbose than required; it also can lead to compromises in type safety. This TIP proposes that the C interfaces for Tcl be revised so that functions that accept pointers to constant data have type signatures that reflect the fact. The new interfaces will remain backward-compatible with the old, except that a few must be changed to return pointers to CONST data. (Changes of this magnitude, in the past, have been routine in minor releases; the author of this TIP does not see a compelling reason to wait for Tcl 9.0 to clean up these API&apos;s.)</abstract>
<body><section title="Rationale">
<para>When the Tcl library was originally written, the ANSI C standard had yet to be widely accepted, and the <emph style="italic">de facto</emph> standard language did not support a <emph style="italic">const</emph> qualifier. For this reason, none of the older Tcl API&apos;s that accept pointers have CONST qualifiers, even when it is known that the objects will not be modified.</para>
<para>In interfacing with other systems whose API&apos;s were designed after the ANSI C standard, this limitation becomes annoying. Code like:</para>
<verbatim><vline encoding='base64'>IGNvbnN0IGNoYXIqIGNvbnN0IHN0cmluZyA9ICIgLi4uIHdoYXRldmVyIC4uLiAiOw==</vline><vline encoding='base64'>IFRjbF9TZXRTdHJpbmdPYmooIFRjbF9HZXRPYmpSZXN1bHQoIGludGVycCApLA==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgIChjaGFyKikgc3RyaW5nLCAvKiBIYXZlIHRvIGNhc3QgYXdheQ==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKiBjb25zdC1uZXNzIGhlcmU=</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKiBldmVuIHRob3VnaCB0aGUgc3RyaW5n</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKiB3aWxsIG9ubHkgYmUgY29waWVk</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgKi8=</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgIC0xICk7</vline></verbatim>
<para>is more verbose than necessary. It is also unsafe: the cast allows a number of unsafe type conversions (the author of this TIP has had to debug at least one extension where an integer was cast to a character pointer in this context).</para>
<para>In a C++ environment where engineering practice forbids using C-style cast syntax, the syntax gets even more annoying, although it provides improved safety. C++ code analogous to the above snippet looks like:</para>
<verbatim><vline encoding='base64'>IGNvbnN0IGNoYXIqIGNvbnN0IHN0cmluZyA9ICIuLi53aGF0ZXZlci4uLiI7</vline><vline encoding='base64'>IFRjbF9TZXRTdHJpbmdPYmooIFRjbF9HZXRPYmpSZXN1bHQoIGludGVycCApLA==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgIGNvbnN0X2Nhc3Q8IGNoYXIqID4oIHN0cmluZyApLCAtMSApOw==</vline></verbatim>
<para>This code is hardly a paragon of readability.</para>
<para>The popular GNU C compiler also has a problem with the <emph style="italic">char *</emph> declaration of so many of the parameters. With the default set of compilation options, a call like:</para>
<verbatim><vline encoding='base64'>IFRjbF9TZXRTdHJpbmdPYmooIFRjbF9HZXRPYmpSZXN1bHQoIGludGVycCApLA==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICJIZWxsbyB3b3JsZCEiLCAtMSApOw==</vline></verbatim>
<para>results in an error; suppressing this message requires either using the obscure option <emph style="italic">-fwritable-strings</emph> on the compiler command line, or else applying awkward (and unsafe) cast syntax:</para>
<verbatim><vline encoding='base64'>IFRjbF9TZXRTdHJpbmdPYmooIFRjbF9HZXRPYmpSZXN1bHQoIGludGVycCApLA==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgIGNvbnN0X2Nhc3Q8IGNoYXIqID4oICJIZWxsbywgd29ybGQhIiApLCAtMSApOw==</vline></verbatim>
<para>Introducing CONST on parameters, however, does not bring in any incompatibility; as long as there is a prototype in scope, any ANSI-compliant compiler will implicitly convert non-CONST pointer arguments to match CONST formal parameters, with no cast required.</para>
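<para>As a minimal sketch of this behavior, the fragment below uses a hypothetical <emph style="italic">Demo_SetString</emph> function and <emph style="italic">DemoObj</emph> structure as stand-ins for a CONST-qualified <emph style="italic">Tcl_SetStringObj</emph> and <emph style="italic">Tcl_Obj</emph>. Neither name is part of Tcl, and the function body is an assumption made only so that the example compiles on its own.</para>
<verbatim><![CDATA[
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Tcl_Obj; the real structure is different. */
typedef struct DemoObj {
    char *bytes;   /* heap copy owned by the object */
    int length;    /* number of bytes, excluding the terminating NUL */
} DemoObj;

/*
 * Hypothetical analogue of a CONST-qualified Tcl_SetStringObj. The source
 * string is only copied, so the parameter is declared const. A negative
 * length means "use strlen", mirroring the -1 convention used above.
 */
static void Demo_SetString(DemoObj *obj, const char *bytes, int length)
{
    char *copy;

    if (length < 0) {
        length = (int) strlen(bytes);
    }
    copy = malloc((size_t) length + 1);
    if (copy == NULL) {
        return;                 /* leave the object unchanged on failure */
    }
    memcpy(copy, bytes, (size_t) length);
    copy[length] = '\0';
    free(obj->bytes);           /* release the previous value, if any */
    obj->bytes = copy;
    obj->length = length;
}
```
]]></verbatim>
<para>With such a prototype in scope, a caller may pass a writable character array or a string literal alike, and neither call needs a cast.</para>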
</section>
<section title="Specification">
<para>This TIP proposes that, wherever possible, Tcl API&apos;s that accept pointers to constant data have their signatures in <emph style="italic">tcl.decls</emph> and the corresponding source files adjusted to add the CONST qualifier.</para>
<para>The change introduces a potential incompatibility in that code compiled on a (hypothetical) architecture where pointers to constant data have a different representation from those to non-constant data will not load against the revised stub table. This incompatibility is, in fact, not thought to be a problem, since no known port of Tcl has encountered such an architecture.</para>
<para>If we confine the scope of this TIP to adding CONST only to parameters, we preserve complete compatibility with existing implementations. It is neither possible nor desirable, however, to preserve drop-in compatibility across all the API&apos;s. The earliest example in the stub table is the <emph style="italic">Tcl_PkgRequireEx</emph> function. This function is declared to return <emph style="italic">char *</emph>; the pointer it returns, however, is into memory managed by the Tcl library. Any attempt by an extension to scribble on this memory or free it will result in corruption of Tcl&apos;s internal data structures; it is therefore safer and more informative to return <emph style="italic">CONST char *</emph>. (This particular example is also highly unlikely to break any existing extension; the author of this TIP has yet to see one actually use the return value.)</para>
<para>Some of the API&apos;s, such as <emph style="italic">Tcl_GetStringFromObj</emph>, will continue to return pointers into writable memory inside the Tcl library. <emph style="italic">Tcl_GetStringFromObj</emph>, for instance, deals with memory that is managed co-operatively between extensions and the Tcl library; one simply must trust extensions to do the right thing (for instance, not overwrite the string representation of a shared object).</para>
<para>Some of the API&apos;s will not be modified, even though they appear to accept constant strings. For instance, <emph style="italic">Tcl_Eval</emph> modifies its string argument while it is parsing it, even though it restores its initial content when it returns. This behavior has sufficient impact on performance that it is probably not desirable to change it. The cases where the Tcl library does this sort of temporary modification, however, must be documented in the programmers&apos; manual. They affect thread safety and positioning of data in read-only memory. One can foresee that cleaning up the API&apos;s that do not suffer from this problem will mean that programmers will be less tempted to use unsafe casts on the ones that remain.</para>
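<para>The &quot;modify, then restore&quot; pattern at issue might look like the following sketch. <emph style="italic">Demo_FirstWordLength</emph> is a hypothetical function written for illustration; it is not Tcl&apos;s parser and makes no claim about how <emph style="italic">Tcl_Eval</emph> is actually implemented.</para>
<verbatim><![CDATA[
```c
#include <string.h>

/*
 * Hypothetical sketch of the "modify, then restore" pattern. The parameter
 * cannot honestly be declared const char *, because the buffer is written
 * to while the function runs, even though its content is intact on return.
 */
static size_t Demo_FirstWordLength(char *script)
{
    char *end = script;
    char saved;
    size_t length;

    while (*end != '\0' && *end != ' ') {
        end++;
    }
    saved = *end;      /* remember the delimiter */
    *end = '\0';       /* temporarily terminate the first word */
    length = strlen(script);
    *end = saved;      /* restore the caller's data before returning */
    return length;
}
```
]]></verbatim>
<para>Such a function must be given a writable buffer. Passing a string literal placed in read-only memory, or sharing the buffer with another thread during the call, is unsafe even though the content looks unchanged afterwards.</para>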
<para>Finally, there are a handful of API&apos;s that are essentially impossible to clean up portably; the ones that accept variable arguments come to mind. These will be left alone. One particular case in point is <emph style="italic">Tcl_SetResult</emph>: its third argument determines whether its second argument is constant or non-constant. In an environment without writable strings, a call like:</para>
<verbatim><vline encoding='base64'>ICAgIFRjbF9TZXRSZXN1bHQoIGludGVycCwgIkhlbGxvLCB3b3JsZCEiLCBUQ0xfU1RBVElDICk7</vline></verbatim>
<para>or</para>
<verbatim><vline encoding='base64'>ICAgIFRjbF9TZXRSZXN1bHQoIGludGVycCwgIkhlbGxvLCB3b3JsZCEiLCBUQ0xfVk9MQVRJTEUgKTs=</vline></verbatim>
<para>cannot be handled without unsafe casting. Fortunately, several alternatives are available. The most attractive appears to be:</para>
<verbatim><vline encoding='base64'>ICAgIFRjbF9TZXRPYmpSZXN1bHQoIGludGVycCwg</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgIFRjbF9OZXdTdHJpbmdPYmooICJIZWxsbywgd29ybGQhIiwgLTEgKSApOw==</vline></verbatim>
<para>which is also more informative about what is really going on. Note that <emph style="italic">TCL_STATIC</emph> no longer actually carries the static pointer around. Although <emph style="italic">Tcl_SetResult</emph> appears to do so, as soon as the command returns, code in <emph style="italic">tclExecute.c</emph> converts the string result into an object result by calling <emph style="italic">Tcl_GetObjResult</emph>. The code using <emph style="italic">Tcl_SetObjResult</emph> therefore carries no greater performance cost than the original <emph style="italic">Tcl_SetResult</emph>.</para>
</section>
<section title="Reference Implementation">
<para>The changes described in this TIP cut across too many functional areas to be implemented effectively all at once. Several people have pointed out that implementing this cleanup all at once appears to be necessary to avoid &quot;CONST pollution,&quot; where the library becomes full of code that casts away the CONST qualifier. To study this issue, the author has conducted the experiment of imposing CONST strings on the first API in the stubs table: <emph style="italic">Tcl_PkgProvideEx</emph>.</para>
<para>The first concern that arose was that several other functions used the CONST strings passed as parameters, and these functions also needed to be updated. Fortunately, all were static within <emph style="italic">tclPkg.c</emph>. Next, when updating the documentation, the author discovered that five other functions were documented in the same man page, and shared a common definition of the <emph style="italic">package</emph> and <emph style="italic">version</emph> parameters. They, too, were included in the change, and once again, the change was propagated forward into the functions that they called. (This activity is where the issue of replacing <emph style="italic">Tcl_SetResult</emph> with <emph style="italic">Tcl_SetObjResult</emph> was detected.)</para>
<para>When replacing <emph style="italic">Tcl_SetResult</emph> with <emph style="italic">Tcl_SetObjResult</emph>, the author discovered that the <emph style="italic">file</emph> parameter to <emph style="italic">Tcl_DbNewStringObj</emph> was also a constant string. With more enthusiasm than caution, he decided to attack the corresponding parameter in all the <emph style="italic">TCL_MEM_DEBUG</emph> interfaces. (In retrospect, it would probably have been easier to tackle this issue separately.) This change wound up cutting across virtually all of the external interfaces to <emph style="italic">tclStringObj.c</emph> and <emph style="italic">tclBinary.c</emph> and the associated documentation.</para>
<para>The author expects that many of the other API&apos;s will be much less closely coupled than the one studied. In particular, now that the interfaces of <emph style="italic">tclStringObj.c</emph> have been done once, they don&apos;t need to be done again! In fact, starting with the interfaces, like <emph style="italic">tclStringObj.c</emph>, that are used pervasively throughout the library and working outward would certainly have been a better course of action than tracing the dependencies forward from one function chosen almost at random.</para>
<para>The result of the experimental change was that twenty-eight external APIs, plus about a dozen static functions, needed to have the CONST qualifier added to at least one pointer. After these changes were made, the test suite compiled, linked, and passed all regression tests with all combinations of the NODEBUG and TCL_MEM_DEBUG options. Nowhere was it necessary to cast away CONST-ness.</para>
<para>Possible incompatibility with existing extensions was present only in that the return values of four functions, <emph style="italic">Tcl_PkgPresent</emph>, <emph style="italic">Tcl_PkgPresentEx</emph>, <emph style="italic">Tcl_PkgRequire</emph>, and <emph style="italic">Tcl_PkgRequireEx</emph>, had the CONST qualifier added. These four functions return pointers to memory that the caller must neither modify nor free, so the CONST qualifier is desirable, but existing extensions may depend on storing the pointer in a variable that lacks the qualifier. This level of incompatibility in a minor release has been thought acceptable in the past; the changes required to extensions are trivial, and once changed, the extensions continue to back-port cleanly to older releases.</para>
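<para>The kind of change required of an extension can be sketched as follows. All of the <emph style="italic">Demo_</emph> names are hypothetical stand-ins rather than Tcl functions; the two accessors merely model the signature of a package function before and after the CONST qualifier is added to its return type.</para>
<verbatim><![CDATA[
```c
#include <string.h>

/* Hypothetical library-owned storage, standing in for Tcl's package data. */
static char demoVersion[] = "8.4";

/* Legacy-style signature: the caller could write through the result. */
static char *Demo_PkgRequireOld(void)
{
    return demoVersion;
}

/* Revised signature: const documents that the memory is library-owned. */
static const char *Demo_PkgRequireNew(void)
{
    return demoVersion;
}

/*
 * Extension-side code after the fix. Declaring the local variable const
 * compiles against either signature, because char * converts implicitly
 * to const char *; this is why the fixed code also builds against older
 * headers.
 */
static int Demo_VersionMatches(int useNew, const char *wanted)
{
    const char *version;

    if (useNew) {
        version = Demo_PkgRequireNew();
    } else {
        version = Demo_PkgRequireOld();
    }
    return strcmp(version, wanted) == 0;
}
```
]]></verbatim>
<para>An extension that had stored the result in a plain <emph style="italic">char *</emph> variable would need only that one declaration changed.</para>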
<para>An earlier version of these changes was uploaded to the SourceForge patch manager as patch number 404026. The revised version will be added under the same patch number as soon as the author&apos;s technical problems with uploading patches are resolved. (The major difference between the two patches is that the first patch implements the two-Stub approach described under &quot;Rejected alternatives&quot; below.)</para>
<para>The success of this change has convinced the author of this TIP that the rest of the changes can be implemented in a staged manner, with little source-level incompatibility being introduced for extensions (and absolutely no incompatibility for stubs-enabled extensions compiled and linked against earlier versions of the library).</para>
</section>
<section title="Rejected alternatives">
<para>The initial version of this TIP attempted to preserve backward compatibility of stubs-enabled extensions, even on a hypothetical architecture where pointer-to-constant and pointer-to-nonconstant have different representations.</para>
<para>If this level of backward compatibility is desired, it will be necessary to provide entries in the existing stub table slots corresponding to the API&apos;s that lack the CONST qualifiers.</para>
<para>The slots in the stub table corresponding to the non-CONST API&apos;s can be filled with wrapper functions. For example, the following function definition of <emph style="italic">Tcl_SetStringObj_NONCONST</emph> relies on the implicit pointer conversion inherent in C to call the function with the new API.</para>
<verbatim><vline encoding='base64'>dm9pZA==</vline><vline encoding='base64'>VGNsX1NldFN0cmluZ09ial9OT05DT05TVChUY2xfT2JqKiBvYmosIC8qIE9iamVjdCB0byBzZXQgKi8=</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICBjaGFyKiBieXRlcywgIC8qIFN0cmluZyB2YWx1ZSB0byBhc3NpZ24gKi8=</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgICAgICBpbnQgbGVuZ3RoKSAgIC8qIExlbmd0aCBvZiB0aGUgc3RyaW5nICov</vline><vline encoding='base64'>ew==</vline><vline encoding='base64'>ICAgIFRjbF9TZXRTdHJpbmdPYmooIG9iaiwgYnl0ZXMsIGxlbmd0aCApOw==</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>This sort of definition is so simple that <emph style="italic">tools/genStubs.tcl</emph> was extended in the original patch accompanying this TIP to generate it. For example, the declaration of <emph style="italic">Tcl_SetStringObj</emph> that once appeared as:</para>
<verbatim><vline encoding='base64'>ZGVjbGFyZSA2NSBnZW5lcmljIHs=</vline><vline encoding='base64'>ICAgIHZvaWQgVGNsX1NldFN0cmluZ09iaiggVGNsX09iaiogb2JqUHRyLCBjaGFyKiBieXRlcywgaW50IGxlbmd0aCAp</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>was replaced with:</para>
<verbatim><vline encoding='base64'>ZGVjbGFyZSA0NTggLW5vbmNvbnN0IDY1IGdlbmVyaWMgew==</vline><vline encoding='base64'>ICAgIHZvaWQgVGNsX1NldFN0cmluZ09iaiggVGNsX09iaiogb2JqUHRyLCBDT05TVCBjaGFyKiBieXRlcywgaW50IGxlbmd0aCAp</vline><vline encoding='base64'>fQ==</vline></verbatim>
<para>declaring that slot 458 in the stubs table is to be used for the new API accepting a CONST char* for the string, while slot 65 remains used for the legacy implementation.</para>
<para>The difficulty with this approach, which caused it to be rejected, is that it introduces <emph style="italic">forward</emph> incompatibility. Any extension compiled against header files from after the change will fail to load against the stubs table from before the change. This incompatibility would require extension authors to maintain sets of header files for (at least) the earliest version of Tcl that they intend to support, rather than always being able to compile against the most current set. This problem was thought to be worse than the hypothetical and possibly non-existent problem of differing pointer representations.</para>
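<para>The forward incompatibility can be illustrated with a deliberately simplified model. The <emph style="italic">DemoStubs</emph> structure, its <emph style="italic">numSlots</emph> field, and <emph style="italic">Demo_LookupSlot</emph> are all hypothetical; the slot count in particular is an assumption added here so that the mismatch can be observed safely, and it is not a description of the real stub table layout. Slot numbers 65 and 458 are taken from the example above.</para>
<verbatim><![CDATA[
```c
#include <stddef.h>

typedef void (*DemoProc)(void);

/* Hypothetical, simplified stub table: a counted array of function slots. */
typedef struct DemoStubs {
    size_t numSlots;        /* slots this library version provides */
    const DemoProc *slots;  /* slot i holds the entry point for API i */
} DemoStubs;

/* Placeholder entry point used to populate slots in the model. */
static void DemoSetStringObj(void)
{
}

/*
 * Return the entry point for a slot, or NULL when the table predates it.
 * An extension built against newer headers asks for the new slot; an older
 * library has nothing there to offer.
 */
static DemoProc Demo_LookupSlot(const DemoStubs *stubs, size_t slot)
{
    if (slot >= stubs->numSlots || stubs->slots[slot] == NULL) {
        return NULL;
    }
    return stubs->slots[slot];
}
```
]]></verbatim>
<para>In this model, an extension that needs only slot 65 finds it in both an old and a new table, while one compiled to use slot 458 finds nothing in the old table.</para>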
</section>
<section title="Procedural note">
<para>The intent of this TIP is that, if approved, it will empower maintainers of individual modules to add <emph style="italic">CONST</emph> to any API where it is appropriate, provided that:</para>
<itemize><item.i><para>the change does not introduce &quot;CONST poisoning&quot;, that is, does not require type casts that remove CONST-ness;</para></item.i><item.i><para>the documentation of the API is updated to reflect the addition of the CONST qualifier; and</para></item.i></itemize>
<para>Individual TIP&apos;s detailing the changes to particular APIs shall <emph style="italic">not</emph> be required, provided that the changes comply with these guidelines.</para>
</section>
<section title="Change history">
<para>12 March 2001: Rejected the two-Stubs alternative and reworked the patches to use only one Stub per modified function.</para>
</section>
<section title="Copyright">
<para>This document has been placed in the public domain.</para>
</section>
</body></TIP>

