This is not necessarily the current version of this TIP.
| TIP: | 143 |
| Title: | An Interpreter Resource Limiting Framework |
| Version: | $Revision: 1.3 $ |
| Author: | Donal K. Fellows <donal dot k dot fellows at man dot ac dot uk> |
| State: | Draft |
| Type: | Project |
| Tcl-Version: | 8.5 |
| Vote: | Pending |
| Created: | Friday, 25 July 2003 |
This TIP introduces a mechanism for creating and manipulating per-interpreter resource limits. This stops several significant classes of denial-of-service attack, and can also be used to do things like guaranteeing an answer within a particular amount of time.
Currently it is trivial for scripts running in even safe Tcl interpreters to conduct a denial-of-service attack on the thread in which they are running. All they have to do is write a simple infinite loop with the for or while commands. Of course, it is possible to put a stop to this by hiding those commands from the interpreter, but even then it is possible to simulate the DoS effect with the help of a procedure like this (which even under ideal conditions will take over three days to run):
proc x s {eval $s; eval $s; eval $s; eval $s}
x {x {x {x {x {x {x {x {x {x {x {x {x {x {sleep 1}}}}}}}}}}}}}}
The easiest way around this, of course, is the resource quota as used by heavyweight operating systems, perhaps combined with the alarm(2) system call. Or at least that would be the way to do it if it wasn't for the fact that those are ridiculously heavy sledgehammers to take to this particular nut. Luckily we control the execution environment - it is a Tcl interpreter after all - so we can implement our own checks. It is this that is the aim of this TIP.
I propose to add a subcommand limit to the interp command and to the object-like interpreter access commands. It will be an error for any safe interpreter to call the limit subcommand; master interpreters may enable it for their safe slaves via the mechanism of interpreter aliases, as is normal for security policies.
The first argument to the interp limit command will be the name of the interpreter whose limit is to be inspected. The second argument will be the name of the limit being modified; this TIP defines two kinds of limits:
Time limit: Specifies when the interpreter will be prohibited from performing further executions. The limit is not guaranteed to be hit at exactly the time given, but will not be hit before then.
Command limit: Specifies a maximum number of commands (see info cmdcount) to be executed.
The third and subsequent arguments specify a series of properties (each of which has a name that begins with a hyphen) which may be set (if there are pairs of arguments, being the property name and the value to set it to) or read (if there is just the property name on its own). The set of properties is limit-specific, but always includes the following:
-value: Controls the actual value of the limit, and is expected to be an integer value of some form when a limit is set. If no limit is set, the -value property will always be the empty string.
-command: A Tcl script to be executed in the global namespace of the interpreter reading/writing the property when the limit is found to be exceeded in the limited interpreter. If no command callback is defined for this interpreter, the -command option will be the empty string. Note that multiple interpreters may set command callbacks that are distinct from each other; an interpreter may not see what other interpreters have installed (except by running a script in those foreign interpreters, of course). The order in which the callbacks are called is not defined by this TIP.
Still to be defined - the result of an exceptional return of a callback command, especially in the manner of the cases outlined in TIP #90.
-granularity: Limits will always be checked in locations where the Tcl_Async*() API is called, as this is called regularly by the Tcl interpreter (with a few exceptions relating to event loop handling). However, the cost of just checking a limit can be quite appreciable (it might involve system calls, say), so this property controls how frequently, out of the opportunities to check a limit, the limit is actually checked. If the granularity is 1, the limit will be checked every time. It is an error to try to set the granularity to less than 1.
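The granularity rule can be modelled with a simple counter. The following is only a sketch of one possible reading of that rule, not the proposed implementation: the LimitGate structure and the gate_set_granularity and gate_ready functions are hypothetical names invented for illustration, and it is an assumption that the opportunity counter restarts whenever the granularity is changed.

```c
#include <stdbool.h>

/* Hypothetical per-limit bookkeeping; not a structure defined by this TIP. */
typedef struct {
    int granularity;    /* perform a real check on every Nth opportunity */
    int opportunities;  /* opportunities seen since the last real check */
} LimitGate;

/* Reject granularities below 1, leaving the gate unchanged in that case. */
bool gate_set_granularity(LimitGate *gate, int granularity) {
    if (granularity < 1) {
        return false;
    }
    gate->granularity = granularity;
    gate->opportunities = 0;  /* assumption: the count restarts on change */
    return true;
}

/* Call at every opportunity; returns true when the (possibly costly)
 * limit check should actually be carried out. */
bool gate_ready(LimitGate *gate) {
    if (++gate->opportunities < gate->granularity) {
        return false;
    }
    gate->opportunities = 0;
    return true;
}
```

With a granularity of 3, gate_ready reports true on every third call; with a granularity of 1 it reports true on every call, matching the "checked every time" case.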
When an interpreter hits a limit, it first runs all command callbacks in their respective interpreters. Once that is done, the interpreter rechecks the limit (since the command callbacks might have decided to raise or remove it) and if it is still exceeded it bails out the interpreter in the following way:
A flag is set in the interpreter to mark the interpreter as having exceeded its limits.
The currently executing command/script in the interpreter is made to return with code TCL_ERROR. (This is superior to using a novel return code, as third-party extensions are usually far better at handling error cases!)
The catch command will only catch errors if the interpreter containing it does not have the flag mentioned just above set. Similarly, further trips round the internal loops of the vwait, update and tkwait commands will not proceed with that flag set. (There will also need to be a C API for reading the flag so extensions can obey the rules as well, though I do not know what to call it at this stage.) No calls to bgerror should be made.
Once the execution unwinds out of the interpreter so that no further invocations of the interpreter are left on the call-stack, the flag is reset. The same is true if the limits are adjusted in any way. Note that attempting to execute things within the interpreter without raising the limits will result in the limit being hit immediately.
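The sequence above (run the callbacks, recheck, then flag and fail) can be sketched for a command limit as follows. This is a self-contained model under stated assumptions rather than the patch's code: the Interp structure, SKETCH_OK/SKETCH_ERROR, and all function names are hypothetical, and it assumes the limit counts as exceeded once the number of commands run reaches the limit value.

```c
#include <stdbool.h>

#define MAX_CALLBACKS 8

/* Stand-ins that mirror the numeric values of TCL_OK and TCL_ERROR. */
enum { SKETCH_OK = 0, SKETCH_ERROR = 1 };

typedef struct Interp Interp;
typedef void (LimitCallback)(void *clientData, Interp *interp);

/* Hypothetical model of a limited interpreter (not Tcl's real Interp). */
struct Interp {
    long commandsRun;     /* commands executed so far */
    long commandLimit;    /* assumption: exceeded when commandsRun >= limit */
    bool limitExceeded;   /* the flag that disables catch, vwait, etc. */
    int numCallbacks;
    LimitCallback *callbacks[MAX_CALLBACKS];
    void *clientData[MAX_CALLBACKS];
};

bool over_limit(const Interp *ip) {
    return ip->commandsRun >= ip->commandLimit;
}

/* Adjusting a limit also clears the exceeded flag. */
void set_command_limit(Interp *ip, long limit) {
    ip->commandLimit = limit;
    ip->limitExceeded = false;
}

/* Run every callback, recheck, and only then flag the interpreter. */
int check_command_limit(Interp *ip) {
    if (!over_limit(ip)) {
        return SKETCH_OK;
    }
    for (int i = 0; i < ip->numCallbacks; i++) {
        ip->callbacks[i](ip->clientData[i], ip);
    }
    if (!over_limit(ip)) {
        return SKETCH_OK;      /* a callback raised or removed the limit */
    }
    ip->limitExceeded = true;
    return SKETCH_ERROR;       /* current command returns an error */
}

/* Example callback: grant *(long *)clientData additional commands. */
void grant_extra(void *clientData, Interp *ip) {
    set_command_limit(ip, ip->commandLimit + *(long *)clientData);
}
```

An interpreter with no callbacks fails as soon as the limit is reached, whereas one with grant_extra installed carries on with the raised limit.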
When resource limits are being used, unlimited master interpreters should take care to use the catch command when calling their limited slaves. Otherwise hitting the limit in the slave might well smash the master as well, just because of general error propagation. But that is good practice anyway.
Time limits are specified in terms of the time (in seconds from the epoch, as returned by clock seconds) when the limit will be hit. Setting the limit to the current time ensures that the limit will be immediately activated.
Where a time-limited interpreter creates a slave interpreter, the slave will get the same time-limit as the creating master interpreter.
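Because a time limit is an absolute time rather than a duration, guaranteeing an answer within some number of seconds means adding that duration to the current time. A minimal sketch of the arithmetic, assuming whole-second resolution; the function names are hypothetical and are not part of the proposed API:

```c
#include <stdbool.h>
#include <time.h>

/* Current time in whole seconds since the epoch. */
long now_seconds(void) {
    return (long)time(NULL);
}

/* Absolute limit value that expires `seconds` after `now`. */
long deadline_after(long now, long seconds) {
    return now + seconds;
}

/* A time limit is hit once the clock reaches the limit value, and never
 * before it; a limit equal to the current time is therefore hit at once. */
bool time_limit_hit(long limit, long now) {
    return now >= limit;
}
```

For example, deadline_after(now_seconds(), 5) gives a limit value that would stop the interpreter about five seconds from now.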
Command limits are specified in terms of the number of commands (see info cmdcount for a definition of the metric) that may be executed before the limit is hit.
Where a command-limited interpreter creates a slave interpreter, the slave will get the command-limit 0 after initialisation (i.e. after the return from the call to Tcl_CreateSlave()) and will be unable to execute any commands until the limit for the slave is raised. Master interpreters implementing security policies for safe interpreters might want to set such limits semi-automatically to something more useful by transferring command executions from the creating interpreter to its new slave.
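The inheritance rules for new slaves, and the budget transfer a security policy might perform, can be sketched as below. This is an illustration only: the Limits structure and both functions are hypothetical, and the sketch assumes the command limit can be treated as a remaining budget, which this TIP does not state.

```c
/* Hypothetical pair of limit values for one interpreter. */
typedef struct {
    long timeLimit;     /* absolute time, in seconds since the epoch */
    long commandLimit;  /* assumption: commands still allowed to run */
} Limits;

/* A new slave inherits the master's time limit and starts with a
 * command limit of 0, so it cannot run anything until that is raised. */
Limits child_limits(const Limits *parent) {
    Limits child = { parent->timeLimit, 0 };
    return child;
}

/* Move up to `amount` commands of budget from master to slave, never
 * taking more than the master has left. Returns the amount moved. */
long transfer_commands(Limits *parent, Limits *child, long amount) {
    if (amount < 0) {
        amount = 0;
    }
    if (amount > parent->commandLimit) {
        amount = parent->commandLimit;
    }
    parent->commandLimit -= amount;
    child->commandLimit += amount;
    return amount;
}
```

Capping the transfer at the master's remaining budget means that creating slaves can never increase the total number of commands available to the untrusted code.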
It is also desirable for there to be a general C API for controlling resource limits. Not only does this give extension authors control that cannot easily be smashed by accident from Tcl scripts, but it also makes the Tcl API to the limit subsystem easier to implement.
int Tcl_LimitReady(Tcl_Interp *interp)
Test whether a limit is ready to be checked according to its granularity rules. Result is boolean.
int Tcl_LimitCheck(Tcl_Interp *interp)
Test whether a limit has been exceeded. Result is boolean.
int Tcl_LimitExceeded(Tcl_Interp *interp)
Test whether the interpreter is in a state where a limit has been exceeded and not yet reset (by extending the relevant limit.) Result is boolean.
int Tcl_LimitTypeEnabled(Tcl_Interp *interp, int type)
Test whether the given type of limit is turned on. Result is boolean.
int Tcl_LimitTypeExceeded(Tcl_Interp *interp, int type)
Test whether the given type of limit has been hit and not yet reset. Result is boolean.
void Tcl_LimitTypeSet(Tcl_Interp *interp, int type)
Turn on the given type of limit.
void Tcl_LimitTypeReset(Tcl_Interp *interp, int type)
Turn off the given type of limit.
void Tcl_LimitAddHandler(Tcl_Interp *interp, int type, Tcl_LimitHandlerProc *handlerProc, ClientData clientData, Tcl_LimitHandlerDeleteProc *deleteProc)
Add a limit handler for the given type of limit and specify some caller context and a scheme for deleting that context.
void Tcl_LimitRemoveHandler(Tcl_Interp *interp, int type, Tcl_LimitHandlerProc *handlerProc, ClientData clientData)
Remove a limit handler for the given type of limit. It is not an error to remove a handler that was not set; in that case, nothing is changed.
void Tcl_LimitSetCommands(Tcl_Interp *interp, int commandLimit)
Set a limiting value on the number of commands, and reset whether the limit has been triggered. Does not enable the limit.
int Tcl_LimitGetCommands(Tcl_Interp *interp)
Get the current limit on the number of commands.
void Tcl_LimitSetTime(Tcl_Interp *interp, long timeLimit)
Set a limiting value on the latest time that the code may execute, and reset whether the limit has been triggered. Does not enable the limit.
long Tcl_LimitGetTime(Tcl_Interp *interp)
Get the current limit on the execution time.
void Tcl_LimitSetGranularity(Tcl_Interp *interp, int type, int granularity)
Set the checking granularity for the given type of limit.
int Tcl_LimitGetGranularity(Tcl_Interp *interp, int type)
Get the checking granularity for the given type of limit.
Above, type is either TCL_LIMIT_COMMANDS or TCL_LIMIT_TIME, handlerProc is a pointer to a function that takes two parameters (a ClientData and a Tcl_Interp*) and returns void, and deleteProc is a pointer to a function that takes a single parameter (a ClientData) and returns void. The key is that handlerProcs are called when a limit is hit (they are used to implement the guts of the -command option), and when the callback is deleted for any reason (including a call to Tcl_LimitRemoveHandler and deleting the limited interpreter) the deleteProc is called to release the resources consumed by the clientData context.
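The relationship between handlerProc, clientData and deleteProc can be illustrated with a small stand-alone model of a handler list. It does not use tcl.h and is not the implementation in the patch: HandlerList, handler_add and handler_remove are hypothetical names, the interpreter is represented by a plain void pointer, and the fixed-size table is an assumption made to keep the sketch short.

```c
#include <stddef.h>

#define MAX_HANDLERS 8

/* Shapes modelled on the handlerProc and deleteProc described above. */
typedef void (LimitHandlerProc)(void *clientData, void *interp);
typedef void (LimitHandlerDeleteProc)(void *clientData);

typedef struct {
    LimitHandlerProc *handlerProc;
    void *clientData;
    LimitHandlerDeleteProc *deleteProc;
} HandlerEntry;

/* Hypothetical handler table for one limit type of one interpreter. */
typedef struct {
    HandlerEntry entries[MAX_HANDLERS];
    int count;
} HandlerList;

/* Register a handler; returns 0 on success, -1 if the table is full. */
int handler_add(HandlerList *list, LimitHandlerProc *proc,
                void *clientData, LimitHandlerDeleteProc *deleteProc) {
    if (list->count >= MAX_HANDLERS) {
        return -1;
    }
    list->entries[list->count++] =
        (HandlerEntry){ proc, clientData, deleteProc };
    return 0;
}

/* Remove the handler matching both proc and clientData, calling its
 * deleteProc so the context is released. No match means no change. */
void handler_remove(HandlerList *list, LimitHandlerProc *proc,
                    void *clientData) {
    for (int i = 0; i < list->count; i++) {
        HandlerEntry *e = &list->entries[i];
        if (e->handlerProc == proc && e->clientData == clientData) {
            if (e->deleteProc != NULL) {
                e->deleteProc(e->clientData);
            }
            for (int j = i + 1; j < list->count; j++) {
                list->entries[j - 1] = list->entries[j];
            }
            list->count--;
            return;
        }
    }
}

/* Sample procs used to exercise the list. */
void noop_handler(void *clientData, void *interp) {
    (void)clientData;
    (void)interp;
}

void count_release(void *clientData) {
    ++*(int *)clientData;  /* record that the context was released */
}
```

Matching on both the procedure and its clientData lets the same handlerProc be registered several times with different contexts, each released independently.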
There are some obvious other things that could be brought within this framework, but which I've left out for various reasons:
Stack depth: We already limit this, so bringing that within this framework would be a nice regularisation. On the other hand, such a change would not be backward-compatible, and it might not be safe to perform the callbacks either (especially as overflowing the stack is fatal in a way that overrunning on time is not).
Memory: Limiting the amount of memory that an interpreter may allocate for its own use would be very nice, but it is conceptually difficult to do (what about memory shared between interpreters?), expensive to keep track of (memory debugging is known to add a lot of overhead, and memory limiting would be more intrusive) and a really big change technically too (i.e. you'd be rewriting a significant fraction of the Tcl C API to make sure that every memory allocation knows which interpreter owns it).
Most other things can be limited in Tcl right now without special support.
An example implementation of this TIP is available as a patch [1].
This document is placed in the public domain.