<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE TIP SYSTEM "http://tcl.activestate.com/cgi-bin/tct/tip/tipxml.dtd">
<!-- Converted at Thu Feb 09 07:49:47 GMT 2012 -->
<!-- TIP AutoGenerator - written by Donal K. Fellows -->

<TIP number='22'>
<header><title>Multiple Index Arguments to lindex</title><author address="mailto:dacut@kanga.org">David Cuthbert</author><author address="mailto:kennykb@acm.org">Kevin Kenny</author><author address="mailto:dgp@users.sourceforge.net">Don Porter</author><author address="mailto:fellowsd@cs.man.ac.uk">Donal K. Fellows</author><status type='project' state='final' tclversion="8.4a2" vote='after'>$Revision: 1.22 $</status><history></history><created day='19' month='jan' year='2001' /><discussions url='news:comp.lang.tcl'/><discussions url='mailto:kennykb@acm.org'/><keyword>lindex {multiple arguments} sublists</keyword></header>
<abstract>Obtaining access to elements of sublists in Tcl often requires nested calls to the <emph style="italic">lindex</emph> command. The indices are syntactically listed in most-nested to least-nested order, which is the reverse of other notations. In addition, the nesting of command substitution brackets further decreases readability. This proposal describes an extension to the <emph style="italic">lindex</emph> command that allows it to accept multiple index arguments, in least-nested to most-nested order, to automatically extract elements of sublists.</abstract>
<body><section title="Rationale">
<para>The heterogeneous nature of Tcl lists allows them to be applied to a number of useful data structures. In particular, lists can contain elements that are, themselves, valid lists. In this document, these elements are referred to as <emph style="italic">sublists.</emph></para>
<para>Extracting elements from sublists often requires nested calls to <emph style="italic">lindex.</emph> Consider, for example, the following Tcl script that prints the center element of a 3-by-3 matrix:</para>
<verbatim><vline encoding='base64'>ICAgIHNldCBBIHt7MSAyIDN9IHs0IDUgNn0gezcgOCA5fX0=</vline><vline encoding='base64'>ICAgIHB1dHMgW2xpbmRleCBbbGluZGV4ICRBIDFdIDFd</vline></verbatim>
<para>When these calls are deeply nested - e.g., embedded in an <emph style="italic">expr</emph> arithmetic expression, having results extracted through <emph style="italic">lrange,</emph> etc. - the results are difficult to read:</para>
<verbatim><vline encoding='base64'>IyBQcmludCB0aGUgc3VtIG9mIHRoZSBjZW50ZXIgZWxlbWVudHMgb2YgdHdvIDN4MyBtYXRyaWNlcw==</vline><vline encoding='base64'>c2V0IHAgW2V4cHIge1tsaW5kZXggW2xpbmRleCAkQSAxXSAxXSArIFtsaW5kZXggW2xpbmRleCAkQiAxXSAxXX1d</vline><vline encoding='base64'></vline><vline encoding='base64'>IyBHZXQgYWxsIGJ1dCB0aGUgbGFzdCBmb250IGluIHRoZSBmb2xsb3dpbmcgcGFyc2VkIHN0cnVjdHVyZTo=</vline><vline encoding='base64'>c2V0IHBzdHJ1Y3Qge3RleHQge2lnbm9yZWQtZGF0YQ==</vline><vline encoding='base64'>ICAgICAgICAgICAgICAgICAgICAgIHsgLi4uIH0=</vline><vline encoding='base64'>CQkgICAgICAgfQ==</vline><vline encoding='base64'>CQkgICAgICAge3ZhbGlkLXN0eWxlcw==</vline><vline encoding='base64'>CQkJICAge2p1c3RpZmljYXRpb24ge2xlZnQgY2VudGVyZWQgcmlnaHQgZnVsbH19</vline><vline encoding='base64'>CQkJICAge2ZvbnQge2NvdXJpZXIgaGVsdmV0aWNhIHRpbWVzfX0=</vline><vline encoding='base64'>CQkgICAgICAgfQ==</vline><vline encoding='base64'>CQkgfQ==</vline><vline encoding='base64'>cmV0dXJuIFtscmFuZ2UgW2xpbmRleCBbbGluZGV4IFtsaW5kZXggJHBzdHJ1Y3QgMl0gMl0gMV0gMCBlbmQtMV0=</vline></verbatim>
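<para>Note that the nesting in the latter example runs opposite to conventional vector indexing: the index that is applied first sits in the innermost <emph style="italic">lindex</emph> call, and the one applied last sits in the outermost. In most other languages/domains, the last line might take on one of the following forms:</para>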
<verbatim><vline encoding='base64'>cmV0dXJuIGxpc3RfcmFuZ2UocHN0cnVjdFsyXVsyXVsxXSwgMCwgZW5kLTEpOw==</vline><vline encoding='base64'></vline><vline encoding='base64'>cmV0dXJuIHBzdHJ1Y3RbWzIsIDIsIDFdXVtbMDotMV1d</vline><vline encoding='base64'></vline><vline encoding='base64'>dGVtcCA9IHBzdHJ1Y3QoMiwgMiwgMSk7</vline><vline encoding='base64'>cmVzdWx0ID0gcmFuZ2UodGVtcCwgMCwgbGVuZ3RoKHRlbXApIC0gMSk7</vline></verbatim>
<para>Allowing the <emph style="italic">lindex</emph> command to accept multiple arguments would allow this more-natural style of coding to be written in Tcl.</para>
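<para>As an illustration only, and assuming an interpreter that already implements this proposal (the TIP header gives 8.4a2), the font lookup above might collapse to a single call. The elided <emph style="italic">{ ... }</emph> payload is replaced here by an empty placeholder so that the snippet runs as written:</para>

```tcl
# Same parsed structure as above; the "{ ... }" payload is replaced by an
# empty placeholder list (an assumption made so the snippet is runnable).
set pstruct {
    text
    {ignored-data {}}
    {valid-styles
        {justification {left centered right full}}
        {font {courier helvetica times}}}
}

# Indices read left to right, least-nested first: element 2 of pstruct,
# then element 2 of that sublist, then element 1 of that one.
set fonts [lindex $pstruct 2 2 1]

# All but the last font.
puts [lrange $fonts 0 end-1]    ;# prints: courier helvetica
```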
</section>
<section title="Specification">
<enumerate><item.e index='1'><para>Under this proposal, the syntax for the <emph style="italic">lindex</emph> command is to be modified to accept one of two forms:</para><verbatim><vline encoding='base64'>ICAgICBsaW5kZXggbGlzdCBpbmRleExpc3Q=</vline></verbatim></item.e></enumerate>
<para> or</para>
<verbatim><vline encoding='base64'>ICAgICBsaW5kZXggbGlzdCBpbmRleDEgaW5kZXgyLi4u</vline></verbatim>
<quote>In either form of the command, the <emph style="italic">list</emph> parameter is expected to be a well-formed Tcl list.</quote>
<quote>In the first form of the command, the <emph style="italic">indexList</emph> argument is expected to be a Tcl list comprising one or more list indices, each of which must be an integer, the literal string <emph style="italic">end</emph>, or the literal string <emph style="italic">end-</emph> followed by an integer with no intervening whitespace. The existing <emph style="italic">lindex</emph> command is a degenerate form of this first form, where the list comprises a single index.</quote>
<quote>In the second form of the command, each of the <emph style="italic">index</emph> arguments is expected to be a list index, which again may be an integer, the literal string <emph style="italic">end</emph>, or the literal string <emph style="italic">end-</emph> followed by an integer with no intervening whitespace.</quote>
<enumerate><item.e index='2'><para>In either form of the command, once the <emph style="italic">indexList</emph> parameter is expanded, there is a single <emph style="italic">list</emph> parameter and one or more <emph style="italic">index</emph> parameters. If there is a single <emph style="italic">index</emph> parameter <emph style="italic">N</emph>, the behavior is identical to today&apos;s <emph style="italic">lindex</emph> command, and the command returns the <emph style="italic">N</emph>th element of <emph style="italic">list</emph>; if <emph style="italic">N</emph> is less than zero or at least the length of the list, an empty string is returned.</para></item.e><item.e index='3'><para>If more than one <emph style="italic">index</emph> parameter is given, then the behavior is defined recursively; the result of</para><verbatim><vline encoding='base64'>ICAgbGluZGV4ICRsaXN0ICRpbmRleDAgJGluZGV4MSAuLi4=</vline></verbatim><para>or</para><verbatim><vline encoding='base64'>ICAgbGluZGV4ICRsaXN0IFtsaXN0ICRpbmRleDAgJGluZGV4MSAuLi5d</vline></verbatim><para>is identical to that of</para><verbatim><vline encoding='base64'>ICAgbGluZGV4IFtsaW5kZXggJGxpc3QgJGluZGV4MF0gJGluZGV4MS4uLg==</vline></verbatim><para>or, equivalently,</para><verbatim><vline encoding='base64'>ICAgbGluZGV4IFtsaW5kZXggJGxpc3QgJGluZGV4MF0gW2xpc3QgJGluZGV4MS4uLl0=</vline></verbatim><para>(This specification does not constrain the implementation, which may be iterative, recursive, or even expanded inline.)</para></item.e><item.e index='4'><para>When an invalid index is given, an error of the form, <emph style="italic">bad index &quot;invalid_index&quot;: must be integer or end?-integer?</emph>, where <emph style="italic">invalid_index</emph> is the first invalid index encountered, must be returned.</para></item.e><item.e index='5'><para>If the list argument is malformed, the error resulting from an attempt to convert the list argument to a list must be returned. 
This behaviour is unchanged from the current implementation.</para></item.e></enumerate>
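<para>The recursive definition above can be modelled at script level using nothing but single-index <emph style="italic">lindex</emph> calls. The following is a minimal sketch, not the proposed implementation; the procedure name <emph style="italic">lindexNested</emph> is hypothetical and does not appear in this TIP:</para>

```tcl
# Hypothetical helper (not part of the TIP): emulates the proposed
# multi-index semantics using only single-index lindex calls.
proc lindexNested {list args} {
    # First form: a single extra argument is treated as a list of indices.
    if {[llength $args] == 1} {
        set args [lindex $args 0]
    }
    # Apply the indices left to right, least-nested first; each step
    # replaces the working list with the selected sublist or element.
    foreach index $args {
        set list [lindex $list $index]
    }
    return $list
}

set A {{1 2 3} {4 5 6} {7 8 9}}
puts [lindexNested $A 1 1]      ;# prints 5 (center element)
puts [lindexNested $A {1 1}]    ;# prints 5 (index-list form)
puts [lindexNested $A end 0]    ;# prints 7 (first element of the last row)
```

<para>The loop is the iterative unrolling of the recursion in item 3; an out-of-range index at any level yields an empty string, which then propagates through the remaining steps.</para>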
</section>
<section title="Side Effects">
<enumerate><item.e index='1'><para>Whether or not the <emph style="italic">lindex</emph> operation is successful, the underlying Tcl_Obj that represents the list argument may have its internal representation invalidated or changed to that of a list.</para></item.e></enumerate>
</section>
<section title="Discussion">
<para>Some attention must be paid to giving the <emph style="italic">lindex</emph> command adequate performance. In particular, the implementation should address the common case of</para>
<verbatim><vline encoding='base64'>ICAgIGxpbmRleCAkbGlzdCAkaQ==</vline></verbatim>
<para>where <emph style="italic">$i</emph> is a &quot;pure&quot; integer (that is, one whose string representation has not yet been formed). If the above specification is followed naively, the flow will be as follows.</para>
<para>Since <emph style="italic">objc</emph> is three, <emph style="italic">objv[2]</emph> is expected to be a list. Since it is not, it must be converted from its string representation. It does not have one yet, so the string representation must be formed. Now the string representation is parsed as a list. An array (of length one) of Tcl_Obj pointers is allocated to hold the list in question, and a Tcl_Obj is allocated to hold the single element. Memory is allocated to hold the element&apos;s string representation. Now the <emph style="italic">lindex</emph> command converts the first element of the list to an index (in this case an integer).</para>
<para>This elaborate ballet of type shimmering requires converting the integer to a string and back again. It also requires four calls to <emph style="italic">ckalloc:</emph></para>
<enumerate><item.e index='1'><para>Allocate a buffer for the string representation.</para></item.e><item.e index='2'><para>Allocate the array of Tcl_Obj pointers for the list representation.</para></item.e><item.e index='3'><para>Allocate the Tcl_Obj that represents the first (and only) element of the list.</para></item.e><item.e index='4'><para>Allocate a buffer for the string representation of that element.</para></item.e></enumerate>
<para>And at the end, the result is the same integer that was passed as a parameter originally.</para>
<para>To avoid all this overhead in the common case, the proposed implementation shall (in the case where <emph style="italic">objc==3</emph>)</para>
<enumerate><item.e index='1'><para>Test whether <emph style="italic">objv[2]</emph> designates an object whose internal representation holds an integer. If so, simply use it as an index.</para></item.e><item.e index='2'><para>Test whether <emph style="italic">objv[2]</emph> designates an object whose internal representation holds a list. If so, perform the recursive extraction of indexed elements from sublists described above.</para></item.e><item.e index='3'><para>Form the string representation of <emph style="italic">objv[2]</emph> and test whether it is <emph style="italic">end</emph> or <emph style="italic">end-</emph> followed by an integer. If so, use it as an index.</para></item.e><item.e index='4'><para>Attempt to coerce <emph style="italic">objv[2]</emph> to an integer; if successful, use the result as an index.</para></item.e><item.e index='5'><para>Attempt to coerce <emph style="italic">objv[2]</emph> to a list; if successful, use the result as an index list.</para></item.e><item.e index='6'><para>Report a malformed <emph style="italic">index</emph> argument; the <emph style="italic">indexList</emph> parameter is not a well-formed list.</para></item.e></enumerate>
<para>This logic handles all the cases of singleton lists transparently; it is effectively a simple-minded type inference that optimizes away needless conversions.</para>
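<para>Steps 1 and 2 inspect the internal representation of the Tcl_Obj, which is not visible to scripts. The string-based fallback in steps 3 to 6 can, however, be modelled in Tcl. The sketch below is only a rough model: the procedure name <emph style="italic">classifyIndexArg</emph> is hypothetical, the <emph style="italic">end-N</emph> pattern accepts unsigned decimal offsets only, and the error text is borrowed from the Specification section as an assumption about what step 6 would report:</para>

```tcl
# Hypothetical script-level model of steps 3 to 6 (not part of the TIP).
# Returns "index" if the argument would be used as a single index, or
# "indexList" if it would be treated as a list of indices.
proc classifyIndexArg {arg} {
    # Step 3: "end" or "end-" followed by an (unsigned) integer.
    if {[regexp {^end(-[0-9]+)?$} $arg]} {
        return index
    }
    # Step 4: the argument can be read as an integer.
    if {[string is integer -strict $arg]} {
        return index
    }
    # Step 5: the argument parses as a well-formed list.
    if {![catch {llength $arg}]} {
        return indexList
    }
    # Step 6: neither an index nor a well-formed list.
    error "bad index \"$arg\": must be integer or end?-integer?"
}

puts [classifyIndexArg 3]        ;# prints: index
puts [classifyIndexArg end-1]    ;# prints: index
puts [classifyIndexArg {1 0}]    ;# prints: indexList
```

<para>In this model a word such as <emph style="italic">abc</emph> is still classified as an index list, since it is a well-formed one-element list; the invalid index would only be detected later, when the element itself is converted.</para>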
<para>Assuming that the related <tipref type="text" tip="33"/> is approved, this logic will most likely be combined with the identical logic required in that proposal for parsing <emph style="italic">index</emph> arguments to the <emph style="italic">lset</emph> command.</para>
<rule/>
</section>
<section title="Comments">
<para><emph style="italic">Don Porter &lt;<url ref="mailto:dgp@users.sourceforge.net"/>&gt;</emph></para>
<quote>I agree that it would be helpful to many programmers to provide a multi-dimensional array data structure that can be accessed in the manner described in this TIP. In the <emph style="italic">struct</emph> module of <emph style="italic">tcllib</emph>, several other data structures are being developed: graph, tree, queue, stack. I would support adding another data structure to that module that provides an interface like the one described in this TIP, with the intent that all of these helpful data structures find their way into the BI distribution.</quote>
<quote>I don&apos;t see any advantage to adding complexity to [lindex] as an alternative to development of a multi-dimensional array structure. Without a compelling advantage, I&apos;m inclined against making [lindex] more complex. I like having Tcl&apos;s built-in commands provide primitive operations, and leave it to packages to combine the primitives into more useful, more complex resources.</quote>
<quote>This TIP should also consider how any changes to [lindex] mesh with the whole [listx] overhaul of Tcl&apos;s [list] command that has been discussed.</quote>
<para><emph style="italic">Dave Cuthbert &lt;<url ref="mailto:dacut@kanga.org"/>&gt; responds</emph></para>
<quote>Don makes a good point -- with a good set of data structures in tcllib, the need for this TIP is lessened or even eliminated. Nonetheless, I see this as a way of implementing the structures he describes. In other words, a more powerful primitive (which, in reality, adds fairly little complexity when measured in number of lines of code changed) would benefit these structures.</quote>
<quote>As for the [listx] overhaul, there are so many competing proposals for the specification that it is difficult to come up with a metric. In writing this TIP, I assumed a vacuum -- that is, a listx command would not be added to the core in the near future.</quote>
<para><emph style="italic">Donal K. Fellows &lt;<url ref="mailto:fellowsd@cs.man.ac.uk"/>&gt; points out</emph></para>
<quote>Although there is tcllib and [listx] to think about, they are certainly not reasons for rejecting this TIP out of hand. The availability of tcllib is not currently anything like universal (not stating whether this is a good, bad or ugly thing) and all the [listx] work will need its own TIP to make it into the core (you tend to have availability problems if it is an extension.) It is not as if the core is short of syntactic sugar right now (the [foreach] command is ample demonstration of this.)</quote>
<para><emph style="italic">Don Porter &lt;<url ref="mailto:dgp@users.sourceforge.net"/>&gt; follows up</emph></para>
<quote>I&apos;ll leave the discussion above in place so the history of this TIP is preserved, but I have withdrawn my objection.</quote>
<rule/>
<para>There was quite a discussion on <url ref="news:comp.lang.tcl"/> about using lindex to return multiple elements. For example:</para>
<verbatim><vline encoding='base64'>ICUgc2V0IGxpc3Qge2Ege2IxIGIyfSBjIGQgZX0=</vline><vline encoding='base64'>ICUgbGluZGV4ICRsaXN0IDE=</vline><vline encoding='base64'>IGIxIGIy</vline><vline encoding='base64'>ICUgbGluZGV4ICRsaXN0IDEgMA==</vline><vline encoding='base64'>IHtiMSBiMn0gYQ==</vline><vline encoding='base64'>ICUgbGluZGV4ICRsaXN0IHsxIDB9</vline><vline encoding='base64'>IGIx</vline></verbatim>
<para>In other words, the list index arguments can, themselves, be lists. Only when the argument is a list would the &quot;recursive selection&quot; procedure of the TIP be used. For multiple arguments, the behaviour is akin to</para>
<verbatim><vline encoding='base64'>IGxpbmRleCAkbGlzdCBhIGIgYyAgLT4=</vline><vline encoding='base64'>ICAgICAgbGlzdCBbbGluZGV4ICRsaXN0IGFdIFtsaW5kZXggJGxpc3QgYl0gW2xpbmRleCAkbGlzdCBjXQ==</vline></verbatim>
<para><emph style="italic">Summarised by Dave Cuthbert &lt;<url ref="mailto:dacut@kanga.org"/>&gt;</emph></para>
<para><emph style="italic">Donal K. Fellows &lt;<url ref="mailto:fellowsd@cs.man.ac.uk"/>&gt; points out</emph></para>
<quote>The problems with the above version of multiple indexing are that it loses the property that [lindex] always returns a single element (making writing robust code harder) and that it forces use of the [list] constructor a lot or inefficient type handling when some of the indices must be computed. Then there is the whole question of what happens when you have indexes that are lists of lists, which is a major can of worms.</quote>
<quote>Luckily, we could always put this sort of behaviour into a separate command (e.g. called [lselect]) which addresses at least the majority of my concerns, and which (in my opinion) need not even form part of this TIP.</quote>
<para><emph style="italic">Dave Cuthbert &lt;<url ref="mailto:dacut@kanga.org"/>&gt; adds</emph></para>
<quote>I intentionally left [lselect] out of the original TIP (and it is still not present in the 02-April-2001 version). As Donal points out, it is a major can of worms and, though there was general agreement on c.l.t that such a command would be useful, people had differing opinions on what form it should take.</quote>
<quote>Perhaps my view on TIPs is incorrect, but I try to include only sure-fire &quot;yeah, we ought to have done that a few versions ago&quot; items.</quote>
<rule/>
</section>
<section title="Notes on History of this TIP">
<para><emph style="italic">This TIP was originally written by Dave Cuthbert &lt;<url ref="mailto:dacut@kanga.org"/>&gt;, but ownership has passed (beginning of April) to Kevin Kenny &lt;<url ref="mailto:kennykb@acm.org"/>&gt;.</emph></para>
<para>This TIP underwent substantial revision in May of 2001, to add the syntax where all the <emph style="italic">index</emph> parameters could be grouped as a list rather than placed inline on the command line.</para>
</section>
<section title="See Also">
<para><tipref type="text" tip="33"/>.</para>
</section>
<section title="Copyright">
<para>This document has been placed in the public domain.</para>
</section>
</body></TIP>

