TIP #217 Version 1.1: Getting Sorted Indices out of Lsort

This is not necessarily the current version of this TIP.


TIP:217
Title:Getting Sorted Indices out of Lsort
Version:$Revision: 1.1 $
Author:James P. Salsman <james at bovik dot org>
State:Draft
Type:Project
Tcl-Version:8.4
Vote:Pending
Created:Thursday, 26 August 2004
Keywords:Tcl, lsort, parallel list

Abstract

An -indices option is proposed for the lsort command, returning the indices of the given list required to access its elements in the order in which they would be sorted.

Rationale

When corresponding parallel lists include a list of keys upon which one or more of the lists must be accessed in sorted order, it is necessary to obtain the indices required to access their elements in sorted order without actually sorting the list of keys. For example, one might have a list of first names and another corresponding list of last names, and wish to iterate over both in order. Tk listboxes, database I/O, and statistics applications often involve heavy use of use parallel lists. For this and other reasons, programming languages starting at least as early as APL up to present-day numerics oriented languages such as MATLAB have included the ability to directly obtain the indices required to access a list in sorted order.

Proposal

The lsort command shall accept a new option, -indices. When lsort is called with the option, it shall return a list of integer indices of the elements of the list given as the final argument to lsort, in the order that the elements would have been sorted had the -indices option not been specified.

Note that this means an alternative (though less efficient for sorting a single list) mechanism for producing a sorted list might be:

set resultList {}
foreach idx [lsort -indices $sourceList] {
    lappend resultList [lindex $sourceList $idx]
}

Reference Implementation and Notes

Available at http://www.bovik.org/lsort-indices-diff.txt

This reference implementation is a 119-line context diff, involving adding 16 lines of code to tclCmdIL.c, the allocating and assigning one int per list element, freed before lsort returns, and a net of one additional comparison if the new option is not invoked.

Documentation change proposed: In the lsort man page, under DESCRIPTION, change the first sentence:

"This command sorts the elements of list, returning a new list in sorted order."

... to read:

"This command sorts the elements of list, and returns a new list in sorted order, unless the -indices option is specified, in which case a list of integers is returned, corresponding to the indices of the given list's elements in the order that they otherwise would have been sorted."

Under EXAMPLES, at the end of the section, include the following lines:

 Obtaining ordered indices:

  % lsort -indices [list a c b]
  0 2 1
  % lsort -indices -unique -decreasing -float -index 0 \
          {{1.2 a} {34.5 b} {34.5 b} {5.6 c}}
  1 3 0

Copyright

This document has been placed in the public domain by the author.


Powered by TclThis is not necessarily the current version of this TIP.

TIP AutoGenerator - written by Donal K. Fellows