NeoWebScript™ Theory of Operations

Apache Module Info

Programming information on the Apache API, by Robert S. Thau

General Theory

C-level module startup

If you put mod_neoscript.c (or mod_neoinclude.c) in your Configuration file and run Configure, it'll put our module into the build for your webserver.

Note -- you will also need our log module, mod_log_neo.c, for estimate_hits_per_hour to work.

Here's what happens in our module startup C code:

At startup of the webserver, when it is still a single thread (the parent thread), init_neoincludes runs.

It creates a Tcl interpreter, records a start time for the server, runs Tcl_Init to initialize the Tcl interpreter, Tclx_Init to add Extended Tcl, and Tcl_InitExtensions to add in our webserver-oriented extensions (NeoWebScript™).

...and then it sources init.tcl from your server root's conf directory.

Tcl-level module startup

init.tcl loads in a number of standard functions from conf/common.tcl from the server root directory.

These functions can be included by external Tcl programs that you write to administer your NeoWebScript™ databases.

Next the NeoWebScript™ procs are defined. Procs that run in the trusted interpreter in response to requests received from the untrusted interpreter have names that begin with SAFE_.

SAFE commands typically manipulate data within the untrusted interpreter using the child interpreter access functions provided by Tcl.

Finally procs are defined to handle requests to execute code in response to the server-side neoscript requests seen in webpages. These are called things like handle_server_side_variable, handle_server_side_expr, etc.

Processing a Webpage

When a webpage is requested, if it matches the criteria for a server side include and a neoscript directive is found in the page, the C handle_neoscript routine is called from the C code in the module that sends parsed content.

We make use of the request_config data in the request record structure. This allows each module to set a pointer and find it later. We follow the request_rec structure from ours back to the main one (if we aren't the main one) to find a non-NULL pointer for our module in the request_config structure. If there isn't one to be found, it is the first invocation of NeoWebScript™ for this page.
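The walk described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: mock_request is a hypothetical stand-in for Apache's request_rec, and its neo_config field stands in for our module's slot in request_config. The names find_neo_config and neo_config do not come from the module source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for Apache's request_rec:
 * only the two fields this walk needs. */
typedef struct mock_request {
    struct mock_request *main;  /* enclosing request, NULL at the top level */
    void *neo_config;           /* our module's pointer, NULL if never set */
} mock_request;

/* Walk from r back toward the main request and return the first
 * non-NULL module pointer. A NULL result means NeoWebScript has not
 * yet been invoked for this page. */
static void *find_neo_config(const mock_request *r)
{
    assert(r != NULL);
    for (; r != NULL; r = r->main) {
        if (r->neo_config != NULL)
            return r->neo_config;
    }
    return NULL;
}
```

Calling find_neo_config on an included page whose enclosing page already has a pointer set returns that pointer, so the included page can find the interpreter that was created earlier.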

If it's the first use of NeoWebScript™ for this page, or if this page is being included in another page but is not owned by the same owner as the other page, we create a safe interpreter, set it into the request_config structure for our module, register for the cleanup routine to destroy the interpreter, and configure and propagate the variables to the safe interpreter.
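The rule for when a fresh safe interpreter is needed can be written as a small predicate. This is a sketch, not the module's code: the page_state struct and needs_new_interp are hypothetical, and comparing numeric owner ids is an assumption about how ownership is determined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what the decision needs to know. */
typedef struct {
    void *interp;               /* interpreter found via request_config, or NULL */
    int is_included;            /* nonzero if this page is included in another page */
    unsigned int owner;         /* owner id of this page (assumed numeric) */
    unsigned int parent_owner;  /* owner id of the including page */
} page_state;

/* Return 1 when a new safe interpreter must be created:
 * either no interpreter exists yet for this page, or the page is
 * included from a page that belongs to a different owner. */
static int needs_new_interp(const page_state *p)
{
    assert(p != NULL);
    if (p->interp == NULL)
        return 1;
    if (p->is_included && p->owner != p->parent_owner)
        return 1;
    return 0;
}
```

Keeping the ownership test separate from the first-use test reflects the intent of the text: an included page owned by someone else never shares the including page's interpreter.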

Tcl_request_rec is a global pointer to the current request_rec being processed. We save the previous value on the stack while we're executing send_parsed_content, and put it back afterwards. That way Tcl procedures don't all have to pass around a magic cookie to find the request rec, yet multiple interpreters can exist and code in existing interpreters can be invoked for subordinate includes when the criteria for doing that are met.
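The save-and-restore pattern can be illustrated as follows. Only the name Tcl_request_rec comes from the text above. The mock_request type and the enter_request and leave_request helpers are hypothetical, introduced here to make the pattern visible in isolation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Apache's request_rec. */
typedef struct mock_request {
    const char *uri;
} mock_request;

/* Global pointer to the request currently being processed.
 * NULL whenever no NeoWebScript page is being handled. */
static mock_request *Tcl_request_rec = NULL;

/* Make r the current request and return the previous value, which
 * the caller keeps in a local (stack) variable. */
static mock_request *enter_request(mock_request *r)
{
    mock_request *saved = Tcl_request_rec;
    assert(r != NULL);
    Tcl_request_rec = r;
    return saved;
}

/* Put back whatever was current before the matching enter_request. */
static void leave_request(mock_request *saved)
{
    Tcl_request_rec = saved;
}
```

Because each caller holds its own saved value, nested includes unwind in the right order: leaving an inner include makes the outer page current again, and leaving the top-level page returns the global to NULL.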

If it's the first use of NeoWebScript™ for this page, then if a safe interpreter previously existed, it is destroyed. A new safe interpreter is created, and various environment variables, server variables, etc, are exported to it by propagate_vars_to_neoscript.

handle_neoscript builds up the command to be executed. It sets up to call handle_neoscript_request (which is written in Tcl). The first argument is the name of the safe interpreter. The second is the key of the key-value pair read from the server-side-include. Last comes the value of the key-value pair.

It is most important that neither the key nor the value is ever evaluated within the trusted interpreter, or the user will have defeated the protection mechanisms provided by the safe interpreter.

Since we build up the command to be executed as a list, the key and value will be quoted to make a valid list element, so they won't be evaluated by the trusted interpreter when handle_neoscript_request is called.

We then evaluate the constructed command. If there is an error, we send the error result to the webpage being constructed.

Finally we restore the Tcl_request_rec global request rec pointer with whatever was in there before. It is initialized to NULL at the start, and should automatically be restored to that when the top level page of NeoWebScript™ code completes.

handle_neoscript_request

This command is responsible for actually processing embedded NeoWebScript™ requests from within webpages. It is defined in conf/init.tcl.

It switches on the tag, which can currently be return, code, eval, var or expr. Anything else is treated as an error.

Valid tags are handled using handle_server_side_return, handle_server_side_eval, handle_server_side_variable, and handle_server_side_expr. These routines are called with the name of the safe interpreter and the code (or variable, or expression) to evaluate. The code is evaluated within the safe interpreter. Errors are caught and traced back.

These routines were defined as procs by init.tcl, and they use the interp eval command to do the right thing to the untrusted interpreter to cause code to be executed, results to be returned, etc.

devel.tcl

devel.tcl is used by people who are modifying NeoWebScript™ by changing code that runs in the trusted code base. This should only be done by people who really know what they're doing, because errors here can compromise the security of your webserver.

If debugging is set to 1 in init.tcl, then every time a page begins NeoWebScript™ execution, devel.tcl is sourced in, and devel_setup is executed. This makes it easy to work on new functions, without having to restart the webserver all the time.

If debugging is set to 0 in init.tcl, then devel.tcl is loaded at server startup time only, although devel_setup is still run each time an interpreter is set up to do something to a page.

Unknown Packages and Procs in Safe Interpreters

As of version 2.0, there are better ways to support package require and unknown procedures within the safe interpreter. Previously, only unknown procedures could be resolved, and they were resolved by providing access to the source command (to a restricted list of directories) among others. In this way, an essentially standard unknown and auto_loading procedure could take place within the safe interpreter. There has been no previous support for package require.

Unknown Procedures

Borrowing from Michael McClennan's Itcl2.1 implementation of unknown, the mechanism for resolving calls to unknown procs in the safe interpreter has been extended so that the master interpreter has first crack at resolving unknown procs by defining the proc, defining an alias, or allowing the normal auto_loading to be attempted.

In the master interpreter, the global variable SafePaths is defined as the following:

$tcl_library
$tclx_library
SERVER_ROOT/neoscript-tcl
the directory of the script

Thus the user has access to Tcl routines defined in any of these directories, especially the directory containing the script, if a tclIndex file is present. This is the last resort for resolving an unknown proc, as the master interpreter is given first crack at it. SafePaths is generated for the slave interpreter in the procedure setup_slave_interp_unknown and may be changed at the discretion of the system administrator. Since one should not be able to subvert the safeness of a safe interpreter via Tcl code, a valid alternative to $tcl_library and $tclx_library would be to concat in $auto_path instead.

The Tcl proc SAFE_autoload is given first chance to resolve a command for the safe interpreter. It checks two global arrays, the safe_proc_cache and the safe_alias_cache. These arrays contain the proc or alias definition to be evaled into the safe interpreter. If it does not have a valid definition handy, SAFE_autoload returns a 0, which tells unknown in the safe interpreter to continue looking.

Resolving Package Requests

There are three ways that a package require command in the safe interpreter is fulfilled:

1. During the setup of the slave, a package ifneeded command can be evaled into the slave giving a specific script to execute when the package is requested.

2. During webserver startup, the global array SafeIfNeeded may be initialized with code to be executed in the event of a package require.

The Tcl proc SAFE_pkgUnknown is provided to handle requests for unknown packages. It first checks the SafeIfNeeded array to see if code exists for providing the package. If it exists, eval is used to execute the code within the context of the SAFE_pkgUnknown proc. It is important to remember that it is not executed in the global context. For example, to provide a package "Special" which consists of Tcl procedure definitions contained in some_file.tcl, SafeIfNeeded(Special) might be set to the following:

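# Read the procedure definitions in the trusted interpreter.
set code [read_file some_file.tcl]
# Pass the file contents themselves to the safe interpreter; a braced
# $code would be looked up there, where no such variable exists.
$safeInterp eval $code
# package provide needs a version to register the package;
# 1.0 is a placeholder, not a value from the original.
$safeInterp eval {package provide Special 1.0}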

Note that the package version, if any, is ignored. The variables webenv and safeInterp are available.

3. If no script is provided by setting SafeIfNeeded(packagename), then SAFE_pkgUnknown attempts to provide the package by first performing the package require command in the master interpreter. If the require command extends auto_path, the change is appended to SafePaths. Then [info loaded] is scanned for the package name and, if found, the load command is used to load the package into the safe interpreter. The loaded package's initialization must call Tcl_PkgProvide(3) to signify success to the safe interpreter.