TIP #357 Version 1.5: Export TclLoadFile

This is not necessarily the current version of this TIP.


TIP: 357
Title: Export TclLoadFile
Version: $Revision: 1.5 $
Author: Kevin Kenny <kevin dot b dot kenny at gmail dot com>
State: Draft
Type: Project
Tcl-Version: 8.6
Vote: Pending
Created: Thursday, 01 October 2009


Rationale

In developing TDBC, the author of this TIP was advised to look, as a model for TDBC extensions, at the way that the 'oratcl' extension contrives to build on a system where Oracle is not installed. Examination of the code revealed that it operates essentially by constructing, at run time, a Stubs table for the routines in the Oracle client library. There is a maze of #if directives selecting whether this process is accomplished by the system calls that manage Unix .so files (dlopen and friends), Windows .DLL files (LoadLibrary and related calls), HP-UX .shl files (shl_load, etc.), and so on.

Tcl already has functionality so that a caller can abstract away all this complexity. It provides the capability in the TclLoadFile call, but this call is MODULE_SCOPE and not exported even in the internal Stubs table. For this reason, it is entirely unavailable to TEA-compliant extensions.

If this call were available, it would be feasible, in the 8.6 time frame, to bundle all the database-specific TDBC drivers with the core TDBC distribution, since things could be set up so that they will build anywhere, even in the absence of the databases where they connect.

However, this call was never fully rationalized in the VFS world. Its strange API (with both a Tcl_LoadHandle and a ClientData, and a function pointer for unloading the library) is, as several reviewers pointed out, not something that we want to make available to general callers. Hence, a few more routines need rework.

Specification

The TclLoadFile call shall be renamed Tcl_LoadFile and exported in the external Stubs. Its call signature shall be changed to:

 EXTERN int Tcl_LoadFile(
     Tcl_Interp *interp,
     Tcl_Obj *pathPtr,
     int symc,
     const char *symbols[],
     void *procPtrs[],
     Tcl_LoadHandle *handlePtr);

In this call, interp designates an interpreter for error reporting. pathPtr is an object containing the path of the library to be loaded. If pathPtr is a single name, the library search path of the current environment will be used to resolve it. symc is the number of symbols that are to be imported from the newly loaded library. The symbols array gives symc character strings that are the names of the imported symbols.

The return value of Tcl_LoadFile is a standard Tcl result code. If the result is TCL_ERROR, the interpreter result will contain an appropriate error message.

On return, Tcl_LoadFile fills in procPtrs with the addresses that correspond to the names in symbols. If a name cannot be resolved in the given library, the corresponding entry in procPtrs will be NULL. The handle stored through handlePtr is suitable for passing to Tcl_FindSymbol to resolve additional symbols in the library, or to Tcl_FSUnloadFile to unload the library.
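The fill-in contract described above can be illustrated with a small self-contained sketch. It does not call Tcl at all: MockExport, MockLoadFile, ResolveDriverSymbols and the db_* names are hypothetical stand-ins invented for this illustration, and the pretend library is just a table of exported names. The only point is the shape of the result, with one entry in procPtrs per requested name and NULL where a name is not exported.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a loaded library: a table of exported names.
 * Nothing here is part of Tcl; it only mimics the procPtrs contract. */
typedef void (*GenericProc)(void);

typedef struct {
    const char *name;
    GenericProc proc;
} MockExport;

static int db_connect(void) { return 1; }
static int db_close(void)   { return 0; }

static const MockExport clientExports[] = {
    { "db_connect", (GenericProc) db_connect },
    { "db_close",   (GenericProc) db_close   },
};

/* Resolve symc names against the export table. Each procPtrs[i] receives
 * the matching address, or NULL when the name is not exported. The call
 * itself still succeeds (0 stands in for TCL_OK). */
static int MockLoadFile(const MockExport *exports, int exportCount,
                        int symc, const char *symbols[], void *procPtrs[])
{
    assert(exports != NULL && symbols != NULL && procPtrs != NULL);
    for (int i = 0; i < symc; i++) {
        procPtrs[i] = NULL;
        for (int j = 0; j < exportCount; j++) {
            if (strcmp(symbols[i], exports[j].name) == 0) {
                /* Function-to-void* conversion is assumed to work, as it
                 * does on POSIX systems and Windows. */
                procPtrs[i] = (void *) exports[j].proc;
                break;
            }
        }
    }
    return 0;
}

/* A caller in the style of a database driver: one request, three names,
 * one of which ("db_connect_v2") this library version does not provide.
 * Returns how many of the three names were resolved. */
static int ResolveDriverSymbols(void *procPtrs[3])
{
    const char *names[3] = { "db_connect", "db_connect_v2", "db_close" };
    int resolved = 0;

    MockLoadFile(clientExports, 2, 3, names, procPtrs);
    for (int i = 0; i < 3; i++) {
        if (procPtrs[i] != NULL) {
            resolved++;
        }
    }
    return resolved;
}
```

A caller casts each non-NULL entry back to the function type it expects before calling through it, and treats a NULL entry as "not available in this library".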

The TclpFindSymbol call shall be renamed Tcl_FindSymbol and exported in the external Stubs. Its call signature shall be:

 EXTERN void *Tcl_FindSymbol(
     Tcl_Interp *interp,
     Tcl_LoadHandle *loadHandle,
     const char *symbol);

This call searches for the given symbol in the already-loaded library identified by loadHandle. If the symbol is found, a pointer to its memory address is returned. Otherwise, NULL is returned, and an error message is left in interp (if interp is not NULL).

A new call, Tcl_FSUnloadFile, shall be introduced and exported in the external Stubs. Its call signature shall be:

 EXTERN int Tcl_FSUnloadFile(
     Tcl_Interp *interp,
     Tcl_LoadHandle *loadHandle);

This call unloads the library identified by loadHandle. It differs from the [unload] command in that no 'unload' procedure is called; the library is simply unloaded without first being given a chance to clean itself up. (This function is a lower-level interface used by [unload] as part of doing its work.) The return value is either TCL_OK or TCL_ERROR; when TCL_ERROR is returned, an appropriate message is left in the result of interp (if interp is not NULL).

Internals

The Tcl_LoadHandle object shall be represented internally by a structure, declared in <tclInt.h>, looking like:

 struct Tcl_LoadHandle_ {
     ClientData clientData;      /* Client data is the load handle in the
                                  * native filesystem if a module was loaded
                                  * there, or an opaque pointer to a structure
                                  * for further bookkeeping on load-from-VFS
                                  * and load-from-memory */
     TclFindSymbolProc *findSymbolProcPtr;
                                 /* Procedure that resolves symbols in a
                                  * loaded module */
     Tcl_FSUnloadFileProc *unloadFileProcPtr;
                                 /* Procedure that unloads a loaded module */
 };

The TclFindSymbolProc and Tcl_FSUnloadFileProc data types are declared to be functions of the same type signature as Tcl_FindSymbol and Tcl_FSUnloadFile.
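The reason for storing the two procedures in the handle is that the generic entry points can then forward to whichever filesystem loaded the module. The following is a minimal self-contained sketch of that dispatch, not the Tcl implementation: every Sketch* and Mem* name is hypothetical, the interp argument is left out for brevity, and the "module" is a plain struct exporting a single data symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the handle layout: client data plus the two
 * procedures that know how to operate on it. */
typedef struct SketchHandle SketchHandle;
typedef void *SketchFindSymbolProc(SketchHandle *handle, const char *symbol);
typedef int SketchUnloadFileProc(SketchHandle *handle);

struct SketchHandle {
    void *clientData;
    SketchFindSymbolProc *findSymbolProcPtr;
    SketchUnloadFileProc *unloadFileProcPtr;
};

/* A pretend "load-from-memory" backend. Its bookkeeping structure is what
 * clientData points at. */
typedef struct {
    int loaded;
    int answer;         /* the single symbol this pretend module exports */
} MemModule;

static void *MemFindSymbol(SketchHandle *handle, const char *symbol)
{
    MemModule *mod = handle->clientData;
    if (mod->loaded && strcmp(symbol, "answer") == 0) {
        return &mod->answer;
    }
    return NULL;
}

static int MemUnloadFile(SketchHandle *handle)
{
    MemModule *mod = handle->clientData;
    mod->loaded = 0;
    return 0;           /* stands in for TCL_OK */
}

/* Generic entry points: they know nothing about the backend and simply
 * call through the procedures recorded in the handle. */
static void *SketchFindSymbol(SketchHandle *handle, const char *symbol)
{
    assert(handle != NULL && handle->findSymbolProcPtr != NULL);
    return handle->findSymbolProcPtr(handle, symbol);
}

static int SketchUnloadFile(SketchHandle *handle)
{
    assert(handle != NULL && handle->unloadFileProcPtr != NULL);
    return handle->unloadFileProcPtr(handle);
}
```

Under this arrangement a caller needs only the handle; whether the module came from the native filesystem, a VFS, or memory is hidden behind the two procedure pointers.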

Virtual file systems that implement the loadFileProc are responsible for ensuring that the Tcl_LoadHandle that the loadFileProc returns conforms to this convention. As far as the author has been able to determine, no non-Core filesystem provides anything but NULL for the loadFileProc. Certainly, tclvfs and trofs do not. Most other virtual filesystems layer atop tclvfs.

Discussion

At least one reviewer suggested that Tcl_LoadFile should fail if any symbol is not found, rather than returning NULL and continuing. The rationale was that the code for the simple case (load a library and resolve a specific symbol) is simpler. Further inspection, however, has revealed that the simple case may actually be the unusual one. Consider, for instance, what the [load] command does. It tries to resolve four symbols (Foo_Init, Foo_SafeInit, Foo_Unload and Foo_SafeUnload). Three of the four are optional. Certainly this one case is simpler if all the work can be accomplished in one call, with simple checks for NULL symbols. The intended uses in TDBC are, for the most part, somewhat similar, with different versions of database libraries providing subtly different APIs (that the drivers are prepared to handle).
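The [load] case can be sketched as follows. This is an illustration only: MockResolve is a hypothetical stand-in for a loaded library that happens to export Foo_Init and Foo_Unload, the SYM_* indices are invented here, and treating Foo_Init as the one required symbol is an assumption of the sketch. All four names are resolved in one pass, and the caller then applies its own policy with plain NULL checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library contents: only Foo_Init and Foo_Unload exist.
 * The markers are dummy objects whose addresses play the role of the
 * resolved symbols. */
static int fooInitMarker;
static int fooUnloadMarker;

static void *MockResolve(const char *symbol)
{
    if (strcmp(symbol, "Foo_Init") == 0) {
        return &fooInitMarker;
    }
    if (strcmp(symbol, "Foo_Unload") == 0) {
        return &fooUnloadMarker;
    }
    return NULL;
}

enum { SYM_INIT, SYM_SAFE_INIT, SYM_UNLOAD, SYM_SAFE_UNLOAD, SYM_COUNT };

/* Resolve all four names, leaving NULL for the missing ones, then decide
 * what is fatal. Returns 0 on success and -1 if the required symbol
 * (assumed here to be Foo_Init) is absent. */
static int ResolveLoadSymbols(void *procPtrs[SYM_COUNT])
{
    static const char *const names[SYM_COUNT] = {
        "Foo_Init", "Foo_SafeInit", "Foo_Unload", "Foo_SafeUnload"
    };

    assert(procPtrs != NULL);
    for (int i = 0; i < SYM_COUNT; i++) {
        procPtrs[i] = MockResolve(names[i]);
    }
    return (procPtrs[SYM_INIT] != NULL) ? 0 : -1;
}
```

If the lookup instead failed as soon as any name was missing, a caller in this position would have to issue four separate requests and interpret four separate failures.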

Reference Implementation

A reference implementation is being prepared. When it is complete, it will be uploaded to the SourceForge patch manager for inspection, and this TIP will be updated to indicate where to find it.

License

This file is explicitly released to the public domain and the author explicitly disclaims all rights under copyright law.

