High-Performance ACGIs in C

Ken Urquhart

Asynchronous Common Gateway Interface (ACGI) programs allow
Macintosh HTTP servers to do external processing tasks ranging
from custom HTML forms processing to controlling hardware
devices. ACGIs are usually written in AppleScript (which limits
them to handling only one server request at a time).
High-performance ACGIs, ones that are capable of handling
multiple simultaneous requests, need to be written in a high-level
language like C. The resulting ACGI will work with any HTTP server
that supports the WebSTAR WWW Apple event suite.

Now that you've got your HTTP server up and running on your Macintosh, people are
flocking to your Web site by the thousands. The only problem is that you've written all
of your Asynchronous Common Gateway Interface programs (ACGIs) in AppleScript and
their performance is leaving much to be desired. You know you should be writing your
ACGIs in C for speed, but you think that will be a lot of work.

Well, have I got news for you! A full-blown, multithreaded, high-performance ACGI
program for use with Macintosh HTTP servers is easier to write than you think. If
you've worked through one of the introductory Macintosh programming books, you
already know just about everything you need to.

When all is said and done, an ACGI is little more than a simple, Apple event-aware
application that knows how to process Apple events in threads. Most of the work is
concentrated in decoding the Apple event parameters that make up each server request.
Hopefully you won't feel so overwhelmed by ACGIs written in C (or any other
high-level language) after you've read this article, and you can get on with using them
to hot-rod your Web site!

I've made writing an ACGI easier for you by providing a generic ACGI program, which
accompanies this article on this issue's CD and develop's Web site. I designed the
program (which I'll be referring to as an ACGI "shell") in such a way that you can
create your own ACGIs just by customizing a handful of routines. The messy details of
accepting multiple requests from an HTTP server, and then handling each request in
its own thread of execution, are taken care of for you. The program even relieves you
of the burden of URL-decoding the post and search arguments (including breaking up
all of the name=value pairs and translating them from the ISO-8859 Latin-1
character encoding used by most browsers into the standard Macintosh Roman
encoding).
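
To give a feel for what the shell is doing on your behalf, here's a minimal sketch of URL-decoding one argument string in portable C. The names URLDecode and HexValue are made up for this illustration and aren't taken from the shell's source, and the Latin-1 to Macintosh Roman translation step is left out entirely.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c isn't one. */
static int HexValue(int c)
{
   if (c >= '0' && c <= '9')
      return (c - '0');
   c = tolower(c);
   if (c >= 'a' && c <= 'f')
      return (c - 'a' + 10);
   return (-1);
}

/* Hypothetical sketch: decode src into dst, which must have room for
   strlen(src) + 1 bytes. A '+' becomes a space and %XX becomes the
   byte with that hex value. Malformed escapes are copied unchanged.
   Returns the length of the decoded string. */
static size_t URLDecode(const char *src, char *dst)
{
   size_t   n = 0;
   int      hi = -1, lo = -1;

   while (*src != '\0') {
      if (*src == '+') {
         dst[n++] = ' ';
         src++;
      } else if (*src == '%'
            && (hi = HexValue((unsigned char) src[1])) >= 0
            && (lo = HexValue((unsigned char) src[2])) >= 0) {
         dst[n++] = (char) (hi * 16 + lo);
         src += 3;
      } else {
         dst[n++] = *src++;
      }
   }
   dst[n] = '\0';
   return (n);
}
```

A decoder like this is normally applied to each name and each value separately, after the argument string has been split on & and =, so that an encoded %26 or %3D inside a value isn't mistaken for a separator.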

I've also provided a rich set of convenience routines that perform the following tasks:

I've tried to provide enough support to make it possible for you to forget most of the
details of interacting with an HTTP server and concentrate on writing the code needed
to implement your custom form processing.

The ACGI shell program, compiled under CodeWarrior as a PowerPC application with
no optimizations, takes up a little under 42K on disk (not including custom code that
you must add to process your requests). Memory requirements are dictated by the
number of concurrent requests you want to handle and how much stack space you
allocate to each running thread. In a typical case, the shell should provide uniform
response to about five to ten concurrent requests in a 1 MB memory footprint.

WHAT'S AN ACGI?

Before I can tell you what an ACGI is, I need to explain what a CGI is. This requires a
bit of background on what HTTP servers are all about.

WHAT'S A CGI?

HTTP servers are designed to do one thing and to do it very well: respond to requests
from Web browsers. If the request is for a file that resides somewhere in the server's
directory tree, the server locates the file, reads its contents, and then sends the
information back to the browser. Other requests such as image map or form processing
are handed off to auxiliary programs that communicate with the server by using the
Common Gateway Interface (CGI) protocol. When the server receives a request that
must be handled by a CGI program, the server starts up the CGI (if it wasn't already
running) and passes it the request. The CGI is responsible for parsing and decoding the
request parameters, processing them, and then composing the HTML response. The
server takes care of returning the response to the requesting browser.

Being a computer program, a CGI can readily interact with databases, transaction
processing systems, or even connected serial devices to process a given request. So
CGIs allow your Web site to serve up a wide variety of dynamic information.

The structure of a CGI program is dictated by the HTTP server and by the operating
system. The first Macintosh HTTP server was MacHTTP, written by Chuck Shotton. He
used Apple events for server/CGI communication and defined a special event suite
(WWW) for this purpose. He later extended this suite, adding several more
parameters, when he wrote WebSTAR -- the commercial version of MacHTTP. His
suite has become the de facto standard for server/CGI interaction on the Macintosh. As
such, you can be sure that most other Macintosh HTTP servers will support it.

          Copies of Chuck Shotton's Macintosh HTTP servers, both a fully
          functional copy of MacHTTP and a time-limited copy of WebSTAR, are
          available at http://www.starnine.com/software/software.html.*

WebSTAR-like servers use custom Apple events to communicate with CGIs and can call
them either synchronously or asynchronously.

Asynchronous calls are almost always preferable for a popular Web site that's
receiving several connection requests a second.

SO NOW WILL YOU TELL ME WHAT AN ACGI IS?

An ACGI is a CGI that's called asynchronously by the HTTP server (you're surprised to
hear this?). Furthermore, when an ACGI is written to handle each request in a
separate thread of execution (enabling it to deal with multiple requests
simultaneously), it's referred to as a threaded ACGI.

To write a threaded ACGI for the Macintosh, you need to understand the following:

While it would be just about impossible to describe each of these points in detail in one
short article, I do provide brief overviews as I talk about the functions of the ACGI
shell.

          For more information on writing a threaded ACGI, refer to the book
          Planning and Managing Web Sites on the Macintosh: The Complete Guide to
          WebSTAR and MacHTTP, which covers this topic in detail and is a good general
          reference. Chapters 10 through 15 provide a wealth of information,
          especially Chapter 13, "Writing CGI Applications," and Chapter 15,
          "Developing CGIs in C."*

Like other threaded ACGI solutions (described in "Other Techniques for Developing a
Threaded ACGI"), my technique uses cooperative threads as opposed to preemptive
threads. This allows you to call any Toolbox routine you want when you're carrying out
your form processing. Preemptive threads currently have many Toolbox calling
restrictions (see the article "Concurrent Programming With the Thread Manager" in
develop Issue 17).

          ______________________________

       OTHER TECHNIQUES FOR DEVELOPING A THREADED ACGI

          Processing Apple events in threads has been dealt with by several authors, and
          there are a variety of solutions available.

          The first solution was presented by Steve Sisak in late 1994 in his MacTech
          Magazine article "Adding Threads to Sprocket." His AEThreads library allows
          you to choose which Apple events to process in threads and gives you complete
          control over all thread creation parameters.

          A second, rather different approach can be found in the source code for the
          Mail Tools ACGI written by Jon Norstad (available
          at http://charlotte.acns.nwu.edu/mailtools/techinfo.html).

          Greg Anderson, in his article "Futures: Don't Wait Forever" in develop Issue
          22, presented a third solution involving a predispatch Apple event handler
          that transparently threaded all Apple events.

          John O'Fallon described a fourth method in his MacTech article "Writing CGI
          Applications in C." In 1996, Grant Neufeld came up with a fifth solution in
          conjunction with his CGI framework in his MacTech article "Threading Apple
          Events."

          Not wishing to break with this long tradition, the program described in this
          article presents yet a sixth variation on the theme.

          ______________________________

THE STRUCTURE OF THE ACGI SHELL

Just as there are many ways of writing a Macintosh application, there are many ways
to write an ACGI shell. I've taken the simplest possible approach and avoided using an
application framework like MacApp or PowerPlant. My ACGI shell is written in plain C
and consists of three logically separate code sections:

The code is split into two source files (acgi.c and www.c), two include files (acgi.h and
www.h), and one resource file (acgi.rsrc). The main application and the convenience
routines are located in acgi.c, while the routines that you'll need to customize are in
www.c. The include file acgi.h contains the public prototypes for the convenience
functions you can call from www.c, while the include file www.h contains the function
prototypes and data structure definitions used by routines in both source files.

THE ROUTINES YOU NEED TO CUSTOMIZE

The file www.c contains six routines that you'll need to customize to implement your
own custom form processing. Four routines are called exactly once by the main
program while the ACGI is running. A fifth routine is called at idle time in the main
event loop, while the last one is called to process each HTTP request.

WWWGETLOGNAME

When the ACGI starts up, one of the first things the main program does is open a log
file to write progress messages to. It gets the name of the file by calling this routine:

char *WWWGetLogName(void);

Customizing WWWGetLogName allows you to specify the name of the log file. All you
typically need to do is write something like this:

char *WWWGetLogName(void)
{
   return "acgi.log";
}

The one gotcha here is that I've used ANSI file I/O routines to simplify the program
code. So you must always be sure to return a valid ANSI filename (a plain filename
fewer than 31 characters long with no full or partial Macintosh file path prepended to
it). Note that some Macintosh ANSI libraries will allow filenames prefixed by partial
paths as long as the total length of the string is no longer than 255 characters.
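
If you build the log name at run time rather than hard-coding it, you might want to check it before returning it. Here's a minimal sketch that takes the limits quoted above literally; the name IsPlainLogName is hypothetical and isn't part of the shell.

```c
#include <string.h>

/* Hypothetical check: returns 1 if name is nonempty, is fewer than
   31 characters long, and contains no colon (the Macintosh path
   separator). Returns 0 otherwise. */
static int IsPlainLogName(const char *name)
{
   size_t   len = strlen(name);

   return (len > 0 && len < 31 && strchr(name, ':') == NULL);
}
```

With this check, "acgi.log" passes, while "HD:Logs:acgi.log" and ":acgi.log" are both rejected because they carry a full or partial path.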

WWWGETHTMLPAGES

After the log file is opened, the main program will ask you to build four HTML error
pages that are returned to the HTTP server when one of these general errors occurs:

The routine you use to construct your pages is as follows:

void WWWGetHTMLPages(Handle refused, Handle tooBusy,
   Handle noMemory, Handle unexpectedError);

The main program passes in four handles. Each handle contains a standard HTTP
response header, and you're responsible for appending whatever HTML text you want
for the error pages. This allows you to control the "look and feel" of the error
messages returned by your ACGI. Perhaps the simplest approach here is to put the
HTML error pages into text files located in the same directory as your ACGI and then
append them to the handles with the convenience routine HTMLAppendFile:

void WWWGetHTMLPages(Handle refused, Handle tooBusy,
      Handle noMemory, Handle unexpectedError)
{
   HTMLAppendFile(refused, "acgiRefused.html");
   HTMLAppendFile(tooBusy, "acgiTooBusy.html");
   HTMLAppendFile(noMemory, "acgiNoMemory.html");
   HTMLAppendFile(unexpectedError, "acgiUnexpected.html");
}

Other convenience routines allow you to read the text from string and text resources,
so you have some flexibility here. The idea behind WWWGetHTMLPages is to allow you
to create your HTML error pages early in the initialization phase so that they'll always
be available for use.

WWWINIT

After the main program has completed its initialization steps, you're given a chance to
carry out any private initialization you need to do before beginning form processing.
This might include calling the ACGI runtime-tuning routines, initializing your own
global variables, reading resources into memory, building HTML template pages, or
opening connections to external databases and other computers. The prototype is

OSErr WWWInit(void);

If you run into problems during your initialization, simply return a nonzero code. The
main program checks the return code and immediately quits to the Finder when the
code is nonzero.

If you have no special initialization to do, you could write this routine as follows:

OSErr WWWInit(void)
{
   return (noErr);
}

WWWQUIT

When the main program exits its main event loop, it calls this next routine to give you
one last chance to clean up after yourself (close files, database connections, and so on):

void WWWQuit(void);

If you don't need to do any cleaning up, you can write something as simple as this:

void WWWQuit(void) { }

WWWPERIODICTASK

The main program allows you to carry out idle-time processing by calling the
following routine at the end of each pass through the main event loop:

OSErr WWWPeriodicTask(void);

This is where you'd place code to check that connections to other computers are still
alive or carry out any background processing initiated by previous server requests. If
you have no idle-time processing, you could write the following:

OSErr WWWPeriodicTask(void)
{
   return (noErr);
}

The main program checks the return code from this routine and, if the code is nonzero,
quits to the Finder (after trying to gracefully abort all currently running threads).

WWWPROCESS

The last routine you must customize is the one that processes a server request:

OSErr WWWProcess(WWWRequest request);

When the HTTP server sends the ACGI a request through an Apple event, the main
program creates a new thread and passes the Apple event data into the thread. The
thread extracts the request data from the Apple event and packs it into a private data
structure. The thread then calls WWWProcess, passing a pointer to the private data
structure in the request parameter. You extract information from the data structure
with the convenience routines (described later).

If you need to abort the processing of a request, you can return one of the four error
codes errWWWRefused, errWWWTooBusy, errWWWNoMemory, and
errWWWUnexpected. These cause the corresponding HTML error pages that you built
in the routine WWWGetHTMLPages to be returned to the server.

THE MAIN PROGRAM

As mentioned previously, the main program is a simple Macintosh application --
simpler than most of the programs described in introductory Macintosh programming
books. It's important to remember that an ACGI is meant to interact with HTTP
servers, not live users. It doesn't need any windows, complex menus, or even an About
box. Its purpose in life is to respond to Apple events and not mouse clicks or
keystrokes.

Furthermore, you cannot assume that a human will always be watching the server
screen, ready to react to dialog boxes or alerts. If an ACGI runs into trouble, it should
try to recover as best it can and keep going. For example, if a required external
database shuts down, an ACGI might return an "out of service" response to each request
until the database comes back online. If an ACGI runs out of memory, it might simply
quit and allow the HTTP server to launch a fresh copy of it the next time a request
comes in. Hopefully, that would cure the problem in the short term.

An efficient, low-overhead ACGI is therefore a windowless, Apple event-aware
program that posts no alerts or dialogs. It implements only the Apple and File menus.
For simplicity, the About item in the Apple menu does nothing except show the name of
the ACGI (although there's nothing to stop you from implementing an About box if you
want to). The File menu contains the single item Quit. A log file is used to record all
informational, error, and debugging messages.

As shown in Listing 1, the main program starts by calling ACGIInit to set itself up.
Then it runs the main event loop, calling ACGIEvent to process each new event, until
the global gDone flag is set and all threads have completed. The program then cleans up
after itself by calling ACGIQuit.

______________________________

Listing 1. The ACGI main program

// Include files and function prototypes
...

static Boolean         gDone = false;
static unsigned long   gThreads = 0;
static long            gThreadSleep = 4;
static long            gIdleSleep = 0x7FFFFFFF;
static long            gWNEDelta = 8;

void main(void)
{
   EventRecord     theEvent;
   long            sleep;
   unsigned long   nextWNE;

   ACGIInit();
   while (!gDone || gThreads > 0) {
      if (gThreads > 0)
         sleep = gThreadSleep;
      else
         sleep = gIdleSleep;
      if (WaitNextEvent(everyEvent, &theEvent, sleep, nil))
         ACGIEvent(&theEvent);
      nextWNE = TickCount() + gWNEDelta;
      do {
         YieldToAnyThread();
      } while (TickCount() <= nextWNE);
      ACGIPeriodicTask();
   }
   ACGIQuit();
}

______________________________

THREADS AND THE MAIN EVENT LOOP

The presence of threads affects the main event loop shown in Listing 1 in three ways.
First, the loop doesn't exit as long as there are active threads. This ensures that all
threads processing HTTP server requests complete their work before the ACGI shuts
down. Second, there are two different sleep times for WaitNextEvent: gThreadSleep
when threads are running and gIdleSleep when they're not. We need idle time to give
the threads a chance to run. This means we should use a rather small value for sleep
when gThreads is greater than 0. On the other hand, when there are no outstanding
requests, we should set sleep to a large value to avoid wasting CPU time. The exception
to this rule is when you have periodic tasks, in which case you should call
ACGISetSleeps in WWWInit to set gIdleSleep to get the idle time you need.

Third, there's the inner loop that repeatedly calls YieldToAnyThread. This routine
causes the Thread Manager to turn control over to the oldest running thread. This
thread keeps control until it too calls YieldToAnyThread to turn control over to the
next running thread. This continues until the newest thread calls YieldToAnyThread and
control returns to the main event loop (see "Concurrent Programming With the
Thread Manager" in develop Issue 17).
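
The ordering described above can be modeled in a few lines of portable C. This is only a toy simulation under simplifying assumptions, not Thread Manager code: the Task structure and RunOnePass are invented for the illustration, and each "thread" is assumed to do exactly one slice of work before it yields.

```c
/* Hypothetical stand-in for a cooperative thread. */
typedef struct Task {
   int   id;          /* assigned in creation order: lower is older */
   int   stepsLeft;   /* slices of work left before the task is done */
} Task;

/* Models one YieldToAnyThread call made from the main event loop:
   every unfinished task, oldest first, runs one slice and then
   yields. The ids of the tasks that ran are stored in order[], and
   the number of tasks that ran is returned. Control is back in the
   "main loop" when the function returns. */
static int RunOnePass(Task tasks[], int count, int order[])
{
   int   i, ran = 0;

   for (i = 0; i < count; i++) {
      if (tasks[i].stepsLeft > 0) {
         tasks[i].stepsLeft--;
         order[ran++] = tasks[i].id;
      }
   }
   return (ran);
}
```

Given three tasks needing 2, 1, and 3 slices, successive passes run tasks 1, 2, 3, then 1, 3, then 3 alone. A task that took many slices without yielding would hold up everything behind it, which is why frequent yields matter.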

It's important to call YieldToAnyThread frequently inside your request-processing
code, usually after you complete a logical step in your processing and at least once
every 1 to 2 ticks of the Macintosh clock (1 tick = 1/60th of a second). Don't bother
putting your calls to YieldToAnyThread inside a timed loop as we did in the main event
loop. Just call it often throughout your code: it's a very low overhead call. The secret to
uniform response time to all requests is not to allow any one thread to hog the CPU.

YieldToAnyThread is enclosed in a timed loop to give threads enough time to do useful
work when running on a Power Macintosh. Currently, there's a context switch from
native PowerPC mode to 680x0 emulation mode when WaitNextEvent is called. In
addition, historical reasons guarantee that WaitNextEvent always waits at least 1 tick
before it returns. Calling YieldToAnyThread only once per pass through the main event
loop means that threads would get time only once every 1/60th of a second and a lot of
useful CPU time would be wasted in mode switches. The timed loop could result in a
thousandfold performance increase -- without noticeably affecting other applications
-- for ACGIs running compute-bound threads that frequently yielded.

THE INITIALIZATION ROUTINE ACGIINIT

ACGIInit carries out seven distinct steps to get the ACGI going:

  1. Initialize the Toolbox.
  2. Get the name of the log file by calling WWWGetLogName and then open it.
  3. Check to see that both Apple events and the Thread Manager are present.
  4. Set up the menu bar.
  5. Install the Apple event handlers.
  6. Call WWWGetHTMLPages to build the four generic HTML error pages.
  7. Call WWWInit to initialize your processing environment.

If ACGIInit runs into trouble, it calls ACGIFatal to write an error message to the log
file and quit. If you run into trouble in WWWInit you should write a meaningful error
message to the log with ACGILog and return a nonzero result code. ACGIInit will write
the code to the log and then quit.

THE LOGGING ROUTINES ACGILOG AND ACGIFATAL

Two routines that write zero-terminated strings to the log -- ACGILog and ACGIFatal
-- are shown in Listing 2. In these routines, gLog is an ANSI FILE* variable that's
local to the source file acgi.c. It points to the open log file.

______________________________

Listing 2. Logging routines

void ACGILog(char *msg)
{
   DateTimeRec   dt;
   ThreadID      theThread;

   if (gLog == NULL)
      return;
   GetTime(&dt);
   GetCurrentThread(&theThread);
   fprintf(gLog, "%4d/%02d/%02d\t%02d:%02d:%02d\t%010lu\t%s\n",
      dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second,
      theThread, msg);
   fflush(gLog);
}

void ACGIFatal(char *reason)
{
   if (gLog != NULL) {
      ACGILog(reason);
      ACGILog("That was a fatal error...shut down.");
   }
   ExitToShell();
}

 

______________________________

ACGILog prefixes each message with the date and time and the ID number of the thread
it was called in. The items are tab-separated so that you can later import the log into a
spreadsheet and sort it by date, time, or thread ID. This can be useful when you're
trying to debug an ACGI or gather statistics based on the messages you wrote into the
log during processing. ACGIFatal calls ACGILog to write its message to the log and then
quits the program immediately without waiting for running threads to complete. It's
meant to be called only from within ACGIInit.
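
To see what one log line looks like, here's a portable sketch that applies the format string from Listing 2 to values you pass in. FormatLogLine is a hypothetical stand-in written for this illustration; the real ACGILog gets the date from GetTime and the thread ID from GetCurrentThread.

```c
#include <stdio.h>

/* Hypothetical stand-in for the formatting step of ACGILog. Builds
   one tab-separated log line in buf and returns the snprintf result
   (the length the full line needs, excluding the terminator). */
static int FormatLogLine(char *buf, size_t size,
   int year, int month, int day, int hour, int minute, int second,
   unsigned long threadID, const char *msg)
{
   return (snprintf(buf, size,
      "%4d/%02d/%02d\t%02d:%02d:%02d\t%010lu\t%s\n",
      year, month, day, hour, minute, second, threadID, msg));
}
```

The date, time, and thread ID fields are zero-padded to fixed widths, so the lines also sort correctly as plain text.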

PERIODIC TASKS AND THE TERMINATION ROUTINE

ACGIPeriodicTask runs periodic tasks by calling your WWWPeriodicTask routine and
then checking for a nonzero result code (in which case it writes the code to the log and,
if the code is positive, sets gDone to true). The termination routine ACGIQuit is the last
routine called by the main program. It shuts down processing by calling your
WWWQuit routine and then closes the log.

EVENT HANDLING IN THE MAIN EVENT LOOP

Since an ACGI is basically a simple Macintosh application with no windows, no About
box, and only the Apple menu and File menu (which supports the single item Quit),
you don't have to worry about activate and update events, and suspend/resume events
only need to set the cursor to an arrow. Keystrokes are important only if they're
Command-key equivalents that might represent a menu selection. This limited event
handling is carried out entirely in the routine ACGIEvent and its small support routine
DoMenu (for menu and Command-key handling). ACGILog is used to report any errors
that are encountered.

ACGIEvent doesn't need to do any special processing at this level to handle threaded
Apple events. It just calls AEProcessAppleEvent like any other application. Details of
the threading process are hidden away in the Apple event handler that's called in
response to HTTP server requests.

APPLE EVENT SUPPORT IN THE ACGI

The ACGI must support the four core Apple events and the custom event sent by HTTP
servers and must be able to process HTTP events in threads. Here are the details of
how the ACGI shell implements the required Apple events and the threading of the
server requests.

SUPPORTING CORE APPLE EVENTS

Any application that supports Apple events must support the four core events (Open
Application, Open Document, Print Document, and Quit Application), as well as any
custom Apple events needed for communication with other programs. Because the ACGI
doesn't have any documents, doesn't do any printing, and does all the application
initialization before accepting the first Apple event, it can deal with the four core
events with the single handler HandleAECore:

#define kQuitCoreEvent 1
#define kOtherCoreEvent 0
static pascal OSErr HandleAECore(AppleEvent *event,
   AppleEvent *reply, long refCon)
{
   if (refCon == kQuitCoreEvent)
      gDone = true;
   return (noErr);
}

The ACGI sets the handler reference constant, refCon, to kOtherCoreEvent for the
'oapp', 'odoc', and 'pdoc' events and to kQuitCoreEvent for the 'quit' event. When the
handler is called, it simply returns noErr if the refCon is kOtherCoreEvent and sets
gDone to true if the refCon is kQuitCoreEvent.

THREADING HTTP SERVER REQUESTS

The WWW Apple event class defines a single event ID ('sdoc') to pass requests to ACGI
programs. This is the event that the ACGI shell responds to. To handle multiple server
requests at once, the ACGI must process each request in its own thread of execution.

This leads to some complications in the code because the Apple Event Manager was
designed to have only one event active at any given time. To process multiple Apple
events in threads, the ACGI will have to suspend each new Apple event in the main
thread of execution, put each suspended event into its own thread for processing, and
then let each thread resume its suspended Apple event at the end of processing so that
replies are returned to the HTTP server.

The one catch here is that when an event is suspended, the pointers to the event and
reply data structures become invalid. The ACGI must therefore make copies of the
event and reply data structures (and not just the pointers) before suspending an event.
These copies of the AEDescs are passed into the thread for processing.

So, the processing flow for threading HTTP server requests is as follows:

  1. ACGIInit makes HandleSDOC the handler for HTTP server requests.
  2. The main event loop (running in the main thread) receives an HTTP
    server request and calls AEProcessAppleEvent as usual.
  3. HandleSDOC (also running in the main thread) receives the Apple event.
  4. If there are too many threads running or the ACGI is refusing connections,
    the handler immediately returns an HTML page indicating that the server
    request cannot be processed. Otherwise, the handler allocates a handle called
    params to hold copies of the Apple event and its reply. Note that the complete
    data structures must be copied, not just the pointers to them, because the
    pointers become invalid when the event is suspended.
  5. HandleSDOC creates a new thread and passes params into it. If the thread
    cannot be created, params is disposed of and the error code is returned.
  6. HandleSDOC increments the count of running threads and then suspends
    the current Apple event and returns. The main event loop is now free to accept
    another server request.
  7. The main event loop regains control and calls YieldToAnyThread almost
    immediately. Each processing thread is given time to run, and control
    eventually passes to the new thread.
  8. The new thread begins life by calling SDOCThread. This routine makes
    local copies of the suspended Apple event and its reply and then disposes of the
    params handle that was passed to it by HandleSDOC.
  9. SDOCThread extracts parameters from the Apple event, URL-decodes
    them, and then calls WWWProcess to process the server request.
    WWWProcess calls YieldToAnyThread frequently to give time to other threads
    and to allow the main thread to accept new Apple events. When WWWProcess
    finishes, it returns a handle containing the HTML response page.
  10. SDOCThread places the response into its copy of the Apple event reply and
    then resumes execution of the suspended event. The event in this thread is now
    considered complete. You're guaranteed that no other Apple event will be
    "current" at this time because HandleSDOC suspends each new event before any
    of the processing threads are given time to run.
  11. The thread decrements the global counter gThreads and then returns
    (causing the thread to be disposed of).

With this processing flow as a guide, the associated code practically writes itself. The
HandleSDOC routine is shown in Listing 3.

______________________________

Listing 3. Handling HTTP server requests

static unsigned long   gMaxThreads = 10;
static Boolean         gRefusing = false;
static long            gThreadStackSize = 0;
static ThreadOptions   gThreadOptions
                          = kCreateIfNeeded | kFPUNotNeeded;
typedef struct AEParams {
   AppleEvent   event;
   AppleEvent   reply;
} AEParams;

void SDOCThread(void *threadParam);
OSErr ACGIReturnHandle(AppleEvent *reply, Handle h);

pascal OSErr HandleSDOC(AppleEvent *event, AppleEvent *reply,
   long refCon)
{
   AEParams**   params;
   ThreadID     newThreadID;
   OSErr        err;

   // [1]   Too many threads already running?
   if (gThreads >= gMaxThreads)
      return (ACGIReturnHandle(reply, gHTMLTooBusy));
   
   // [2]   Should we handle this request?
   if (gDone || gRefusing)
      return (ACGIReturnHandle(reply, gHTMLRefused));

   // [3]   OK to run...make copies of event and reply.
   params = (AEParams**) NewHandle(sizeof(AEParams));
   if (params == nil)
      return (errAEEventNotHandled);
   (*params)->event = *event;   // Copy the data structures...
   (*params)->reply = *reply;   // ...not just the pointers to them!
       
   // [4]   Create the thread, passing in the copies of event and
   //       reply.
   err = NewThread(kCooperativeThread,
         (ThreadEntryProcPtr) SDOCThread, (void*) params,
         gThreadStackSize, gThreadOptions, nil, &newThreadID);
   if (err != noErr) {
      DisposeHandle((Handle) params);
      return (err);
   }

   // [5]   Increment the count of running threads and then suspend
   //       the current event so that we can accept new events.
   gThreads++;
   return (AESuspendTheCurrentEvent(event));
}

______________________________

Global variables guide the actions of HandleSDOC. The maximum number of concurrent
processing threads is controlled by gMaxThreads. You can get and set this value with
the convenience routines ACGIGetMaxThreads and ACGISetMaxThreads. If gRefusing is
true, the handler will return the HTML page stored in gHTMLRefused and not process
the event (you build this page in your custom routine WWWGetHTMLPages). You set
gRefusing by calling ACGIRefuse. If you're really concerned about heap fragmentation,
you might want to create a pool of preallocated threads during initialization with the
number of threads in the pool equal to gMaxThreads. Threads are recycled into the
pool, limiting fragmentation. This is the approach taken by Grant Neufeld in his ACGI
framework (see "Threading Apple Events" in the April 1996 issue of MacTech
Magazine).

The globals gThreadStackSize and gThreadOptions give you control over how threads
are created. The convenience routines ACGIGetThreadParams and ACGISetThreadParams
allow you to get and set their values. The default stack size of 0 causes the Thread
Manager to allocate a 24K stack to each thread. (Thread creation options are described
in detail in "Concurrent Programming With the Thread Manager" in develop Issue 17.)

If your WWWProcess routine (or any routine that it calls) uses a lot of stack space
for local variables, you might have to increase the thread stack size. You should do this
in your WWWInit routine. You'll know if you're running out of stack space in your
ACGI because your server computer will usually lock up when a running thread's stack
overflows the heap space allocated to it. So remember, if your server keeps freezing
up or bombing, and you don't think your code is the problem, try increasing the stack
size allocated to your threads and then increase the ACGI memory allocation by roughly
the increase in stack size multiplied by your chosen value of gMaxThreads.
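
The memory adjustment is simple arithmetic, sketched here as a small helper. The name ExtraMemoryForStacks is hypothetical, and the result is only the rough estimate described above.

```c
/* Hypothetical helper: extra application memory (in bytes) to add
   when the per-thread stack grows from oldStack to newStack bytes
   and up to maxThreads threads can run at once. */
static unsigned long ExtraMemoryForStacks(unsigned long oldStack,
   unsigned long newStack, unsigned long maxThreads)
{
   if (newStack <= oldStack)
      return (0);
   return ((newStack - oldStack) * maxThreads);
}
```

For example, going from the default 24K stack to a 48K stack with gMaxThreads at 10 calls for about 240K of additional memory (24 x 1024 x 10 = 245,760 bytes).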

          The Thread Manager has routines that check how much stack space a
          given thread is using. You could therefore write a debugging macro that logs
          the stack space remaining before calling YieldToAnyThread. This could be
          useful in isolating where the problem is after the crash -- but it wouldn't
          actually stop the thread from exhausting its stack space because that happens
          between yields.*

HTTP REQUEST PROCESSING

Each thread created by HandleSDOC won't start running until the main event loop calls
YieldToAnyThread. When it's time for a new thread to run, the Thread Manager saves
the state of the thread that just yielded, sets up the new thread's environment, and then
calls SDOCThread. This routine is where all the real work of the ACGI takes place --
and where your custom processing routine WWWProcess is invoked.

SDOCThread is the longest and most complicated routine in the ACGI. It's responsible
for extracting all request parameters, URL-decoding the search and post arguments,
packing the parameters into a WWWRequest data structure, calling WWWProcess to
process the request, placing the HTML response page into the server reply, and then
resuming the Apple event to send the reply back to the server.

Before looking at the code, it's a good idea to go over exactly what's packed into the
'sdoc' Apple event. A client browser asks the server to run an ACGI either by
referencing the ACGI's URL or by submitting an HTML form that specifies the ACGI as
its action.

A direct reference is just the URL of the ACGI:

http://www.test.com/test.acgi

To invoke an ACGI as the action of a form, you need to write HTML code like this:

<FORM METHOD=GET ACTION="http://www.test.com/test.acgi">
   ...form input items...
</FORM>

or similar code for METHOD=POST. In both cases, you can supply extra arguments to
the ACGI by adding them to the end of the URL like this:

http://www.test.com/test.acgi$path_args?search_args

The path arguments are everything between the dollar sign ($) and the question mark
(?), while the search arguments are everything following the question mark. The
order of the $ and the ? is important. If you put the ? before the $, everything
following the ? (including the $ and what comes after it) is considered part of the
search arguments.
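
The splitting rule can be sketched in a few lines of portable C. Keep in mind that the HTTP server does this work before the ACGI ever sees the request; the routine below (SplitACGIArgs, a name I made up) only illustrates the rule and assumes the caller supplies two buffers of the same size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the path and search arguments of url into
   the caller's buffers. Both buffers hold bufLen bytes and are always
   left NUL-terminated (long arguments are truncated). */
static void SplitACGIArgs(const char *url, char *pathArgs,
                          char *searchArgs, size_t bufLen)
{
    const char *query, *dollar, *end;
    size_t      n;

    assert(url != NULL && pathArgs != NULL && searchArgs != NULL);
    assert(bufLen > 0);
    pathArgs[0] = '\0';
    searchArgs[0] = '\0';

    /* Everything after the first '?' is the search arguments. */
    query = strchr(url, '?');
    end = (query != NULL) ? query : url + strlen(url);
    if (query != NULL) {
        strncpy(searchArgs, query + 1, bufLen - 1);
        searchArgs[bufLen - 1] = '\0';
    }

    /* A '$' starts the path arguments only if it precedes the '?'. */
    dollar = memchr(url, '$', (size_t)(end - url));
    if (dollar != NULL) {
        n = (size_t)(end - dollar - 1);
        if (n > bufLen - 1) n = bufLen - 1;
        memcpy(pathArgs, dollar + 1, n);
        pathArgs[n] = '\0';
    }
}
```

Given test.acgi$a/b?x=1, this yields path arguments of a/b and search arguments of x=1. Given test.acgi?x=1$a/b, the path arguments are empty and the search arguments are x=1$a/b.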

When you're using forms, you can specify a method of either GET or POST. All of your
form's input variables are URL-encoded. If you specify GET, the input variables are
tacked onto the end of the search arguments; if you use POST, they're placed into a
separate parameter called the post arguments and sent separately.

URL encoding isn't particularly fancy. All it means is that the input field names and
field values are written out as name=value pairs, and all such pairs are placed into one
long parameter with each pair separated from the next by an ampersand (&). All
spaces in the original input variables are replaced by plus signs (+) and any special
characters are replaced by their ISO-8859 Latin-1 hexadecimal equivalents in the
form %xx (where xx represents the two hex digits identifying the character).
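
Here's a minimal sketch of undoing that encoding for a single name or value. It isn't the shell's ACGIURLDecode; the names HexValue and URLDecodeInPlace are mine, and the sketch leaves out the Latin-1 to Macintosh character translation. A malformed % sequence is copied through unchanged, which is my choice rather than anything the encoding rules require.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c isn't one. */
static int HexValue(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decodes s in place ('+' becomes a space, %xx
   becomes the byte with hex value xx) and returns the decoded length.
   The result is never longer than the input, so no extra storage is
   needed. */
static size_t URLDecodeInPlace(char *s)
{
    char *src = s, *dst = s;
    int   hi, lo;

    assert(s != NULL);
    while (*src != '\0') {
        if (*src == '+') {
            *dst++ = ' ';
            src++;
        } else if (*src == '%'
                   && (hi = HexValue((unsigned char) src[1])) >= 0
                   && (lo = HexValue((unsigned char) src[2])) >= 0) {
            *dst++ = (char) (hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;      /* Ordinary (or malformed) text. */
        }
    }
    *dst = '\0';
    return (size_t) (dst - s);
}
```

Decoding Ken+Urquhart%21 in place leaves the 13-character string "Ken Urquhart!" in the buffer.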

Any or all of these arguments (if present), along with a series of parameters that
describe the client browser and the server, are placed into the 'sdoc' Apple event and
sent to the ACGI by the HTTP server. Each parameter is identified in the Apple event by
4-character keyword names. The ACGI passes these keyword names to the Apple Event
Manager to extract the various parameters.

          For a full description of the keywords, refer to Planning and Managing
          Web Sites on the Macintosh: The Complete Guide to WebSTAR and MacHTTP,
          Chapters 13 and 15.*

The five most important keywords to be aware of are as follows:

The path, search, and post arguments hold the data that makes up a request. The
browser name lets you decide which HTML features you might want to include in your
response page. For example, you might not want to use the latest HTML features of
Netscape Navigator™ in your response page if the browser name says that the client is
an old version of Mosaic that doesn't understand tables and frames.

Most of the code in SDOCThread (excerpted in Listing 4) deals with extracting
parameters from the event and then breaking up the search and post arguments into
name=value pairs.

______________________________

Listing 4. The SDOCThread routine

static void SDOCThread(void *threadParam)
{
   WWWRequest   request;
   Size         spaceNeeded, responseSize;
   OSErr        err;
   
   // [1]   Copy event and reply to local storage.
   AEParams** params = (AEParams**) threadParam;
   AppleEvent event = (*params)->event;
   AppleEvent reply = (*params)->reply;

   DisposeHandle((Handle) params);

   // [2]   Initialize request structure.
   memset(&request, 0, sizeof(request));
   
   // [3]   Allocate storage for params/args.
   spaceNeeded = ACGIParamSize(&event);
   request.storage = NewHandleClear(spaceNeeded);
   if (request.storage == nil) {
      char   msg[128];

      sprintf(msg, "SDOCThread: no storage memory: %ld bytes.",
         (long) spaceNeeded);
      ACGILog(msg);
      err = errWWWNoMemory;      // Done: returns gHTMLNoMemory.
      gDone = true;
      goto Done;
   }
   HLockHi(request.storage);

   // [4]   Copy params/args into position.
   err = ACGICopyArgs(&event, &request);
   if (err != noErr) goto Done;
   
   // [5]   Decode URL-encoded search and post arguments.
   if (strlen(*request.storage + (long) request.searchArgs) > 0) {
      err = ACGIURLDecode(
               *request.storage + (long) request.searchArgs,
               &request.searchNum, &request.searchNames,
               &request.searchValues);
      if (err != noErr) goto Done;
   }
   if (strlen(*request.storage + (long) request.postArgs) > 0) {
      err = ACGIURLDecode(*request.storage + (long) request.postArgs,
               &request.postNum, &request.postNames,
               &request.postValues);
      if (err != noErr) goto Done;
   }
   HUnlock(request.storage);

   // [6]   Allocate HTML response.
   request.response = NewHandleClear(gHTTPHeaderLen);
   if (request.response == nil) {
      gDone = true;
      err = errWWWNoMemory;      // Done: returns gHTMLNoMemory.
      goto Done;
   }
   BlockMoveData(gHTTPHeader, *request.response, gHTTPHeaderLen);

   // [7]   Call the custom processor.
   err = WWWProcess(&request);
   
   // [8]   Put the response into the reply and resume the Apple
   //       event.
Done:
   if (request.storage != nil) DisposeHandle(request.storage);
   if (request.searchNames != nil)
      DisposeHandle(request.searchNames);
   if (request.searchValues != nil)
      DisposeHandle(request.searchValues);
   if (request.postNames != nil) DisposeHandle(request.postNames);
   if (request.postValues != nil) DisposeHandle(request.postValues);
   
   responseSize = (request.response != nil)
                     ? GetHandleSize(request.response) : 0;
   if (err == noErr && request.response != nil
         && responseSize > gHTTPHeaderLen)
      err = ACGIReturnHandle(&reply, request.response);
   else
      switch (err) {
         case errWWWNoMemory:
            err = ACGIReturnHandle(&reply, gHTMLNoMemory);
            break;
         case errWWWRefused:
            err = ACGIReturnHandle(&reply, gHTMLRefused);
            break;
         case errWWWTooBusy:
            err = ACGIReturnHandle(&reply, gHTMLTooBusy);
            break;
         case errWWWUnexpected:
            err = ACGIReturnHandle(&reply, gHTMLUnexpectedError);
            break;
         default:
            err = ACGIReturnHandle(&reply, gHTMLUnexpectedError);
            break;
      }
   if (request.response != nil) DisposeHandle(request.response);
   
   // [9]   Put error code into the Apple event (if needed).
   if (err != noErr) {
      long errorResult = err;      // Must be long integer.
      AEPutParamPtr(&reply, keyErrorNumber, typeLongInteger,
         &errorResult, sizeof(long));
   }
   // [10]   Resume the event, decrement running thread count, write
   //        to the log.
   AEResumeTheCurrentEvent(&event, &reply,
                  (AEEventHandlerUPP) kAENoDispatch, 0);
   gThreads--;
   ACGILog("Done.");
   return;
}

______________________________

The only item passed to your custom WWWProcess routine is a pointer to the
WWWRequestRecord. You access the items stored in the record using the convenience
routines that are defined later.

EXTRACTING PARAMETERS FROM THE APPLE EVENT

The routines ACGIParamSize and ACGICopyArgs repeatedly call the Apple Event
Manager to get the size and the text of each parameter. ACGICopyArgs moves each
successive parameter into the request.storage handle in the WWWRequestRecord
data structure (see acgi.h). It also places the offset of each parameter, relative to the
start of the handle, into corresponding pointer variables in request. Because most
parameters are only 10 to 100 bytes in length, it seemed far more efficient to pack
them all into a single handle. This avoids the overhead of making multiple calls to the
Memory Manager to allocate one handle for each parameter and then make multiple
calls to HLock and HUnlock when manipulating the parameters during processing.

          Another way of storing the parameters is to place each parameter into
          its own handle. See Jon Norstad's Mail Tools code (available
          at http://charlotte.acns.nwu.edu/mailtools/techinfo.html) for an example of
          this other approach.*

All parameters are stored as text strings, even the connection ID (a long integer).
Missing or empty parameters are stored as zero-length strings so that the ACGI can
handle requests from HTTP servers that only partially implement the full WWW
Apple event suite (there's no guarantee a given server program will pass your ACGI all
the parameters defined in the suite). You can get the numeric value of any parameter
by calling the convenience routine HTTPGetLong.
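
The single-block layout is easy to picture with a small portable sketch. This isn't the shell's code: it uses malloc instead of a Memory Manager handle, and the names PackedParams, PackParams, GetPackedParam, and kNumParams are all mine. The idea is the same, though: every string lives in one block, missing parameters become empty strings, and each parameter is found through its offset from the start of the block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { kNumParams = 3 };      /* Hypothetical; the real suite has more. */

typedef struct PackedParams {
    char   *storage;              /* One block holding every string. */
    size_t  offset[kNumParams];   /* Start of each string in storage. */
} PackedParams;

/* Packs the strings into one block. A NULL entry is stored as an empty
   string. Returns 0 on success or -1 if the allocation fails. */
static int PackParams(PackedParams *p,
                      const char *const params[kNumParams])
{
    size_t total = 0, pos = 0, len;
    int    i;

    assert(p != NULL && params != NULL);
    for (i = 0; i < kNumParams; i++)
        total += (params[i] != NULL ? strlen(params[i]) : 0) + 1;

    p->storage = malloc(total);
    if (p->storage == NULL) return -1;

    for (i = 0; i < kNumParams; i++) {
        len = (params[i] != NULL) ? strlen(params[i]) : 0;
        if (len > 0) memcpy(p->storage + pos, params[i], len);
        p->storage[pos + len] = '\0';
        p->offset[i] = pos;       /* Offset, not pointer. */
        pos += len + 1;
    }
    return 0;
}

/* Turns a stored offset back into a C string pointer. */
static const char *GetPackedParam(const PackedParams *p, int i)
{
    assert(p != NULL && i >= 0 && i < kNumParams);
    return p->storage + p->offset[i];
}
```

Storing offsets rather than pointers is what makes this layout safe with a relocatable handle: the offsets stay valid even if the block moves, and you only need a real pointer while the block is locked. A numeric parameter stored this way can be converted with strtol when it's needed.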

DECODING URL-ENCODED POST ARGUMENTS

The search and the post argument strings are URL-decoded by the routine
ACGIURLDecode following the prescription outlined in Chapter 13 of Planning and
Managing Web Sites on the Macintosh: The Complete Guide to WebSTAR and MacHTTP.

The routine begins by counting all of the name=value pairs in the given string by
looking for & separators. Two handles are then allocated to hold the char* pointers.
The string is then scanned, and the offset of each argument name and its associated
value are recorded in the arrays. Finally, the routine ACGIDecodeCStr is called to
convert each name=value pair from ISO-8859 Latin-1 encoding to the standard
Macintosh Roman encoding. The conversion table used by the popular Netscape
Navigator browser is employed here for compatibility. If you want to substitute
another 256-character translation table, you'll need to replace the ID=1000 'xlat'
resource located in the resource file acgi.rsrc.
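
The counting and scanning steps might look something like the following sketch. It's a simplified stand-in for ACGIURLDecode, not a copy of it: CountPairs and SplitPairs are names I've made up, the caller supplies plain arrays instead of handles, and the character-set translation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the name=value pairs: one more than the number of '&'
   separators, or zero for an empty string. */
static size_t CountPairs(const char *args)
{
    size_t n;

    assert(args != NULL);
    if (*args == '\0') return 0;
    for (n = 1; *args != '\0'; args++)
        if (*args == '&') n++;
    return n;
}

/* Splits args in place and records the offset of each name and value.
   names and values must each have room for CountPairs(args) entries,
   counted before this call. Returns the number of pairs found. */
static size_t SplitPairs(char *args, size_t *names, size_t *values)
{
    size_t n = 0, i = 0;
    int    more;

    assert(args != NULL && names != NULL && values != NULL);
    if (args[0] == '\0') return 0;

    do {
        int sawEquals = 0;

        names[n] = i;
        while (args[i] != '\0' && args[i] != '&') {
            if (args[i] == '=' && !sawEquals) {
                args[i] = '\0';          /* Terminate the name. */
                values[n] = i + 1;
                sawEquals = 1;
            }
            i++;
        }
        if (!sawEquals) values[n] = i;   /* No '=': empty value. */
        more = (args[i] == '&');
        args[i] = '\0';                  /* Terminate the value. */
        i++;
        n++;
    } while (more);
    return n;
}
```

Each name and value would then be URL-decoded separately. Splitting before decoding matters, because a decoded value may legitimately contain an & or an = character.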

CONVENIENCE ROUTINES

There are three sets of convenience routines that allow you to extract parameters from
a server request, build your HTML response page, and fine-tune the runtime
performance of the ACGI.

PARAMETER AND ARGUMENT EXTRACTION ROUTINES

Seven routines, identified by the prefix "HTTP," can be used to extract parameters or
post and search arguments from the WWWRequestRecord that's passed to the
WWWProcess routine. The enumeration WWWParameter contains the name by which
an individual parameter must be referenced:

typedef enum WWWParameter {
   p_path_args = 0,
   p_username,
   p_password,
   p_from_user,
   p_client_address,
   p_server_name,
   p_server_port,
   p_script_name,
   p_content_type,
   p_referer,
   p_user_agent,
   p_action,
   p_action_path,
   p_method,
   p_client_ip,
   p_full_request,
   p_connection_id
} WWWParameter;

Following are descriptions of the routines.

 

Boolean HTTPLockParams(WWWRequest r);

Locks down the request parameters. Several items in the WWWRequestRecord are
stored as handles and must be locked down before the ACGI can access them.
HTTPLockParams locks the items down for you and HTTPUnlockParams (below)
releases them. It might be a good idea to unlock your parameters before calling
YieldToAnyThread.

Convenience routines that return const char* pointers to parameters implicitly call
HTTPLockParams to lock down the WWWRequestRecord before they return the
pointers. Note that the request record remains locked when the routines return. The
routines that copy parameters and arguments into the character strings you pass in
will lock the request record while they're copying the information and then unlock it
before they return (but only if the data structure wasn't already locked on entry).

void HTTPUnlockParams(WWWRequest r);

Unlocks the request parameters.

const char *HTTPGetParam(WWWRequest r, WWWParameter par);

Gets a pointer to one of the parameter strings. This leaves r locked.

Boolean HTTPGetLong(WWWRequest r, WWWParameter par, long *i);

Gets the integer value of a parameter. The result is returned in i. The routine returns
false if the parameter is not an integer.

Boolean HTTPCopyParam(WWWRequest r, WWWParameter par, char *result,
   long len, long *actualLen);

Copies the parameter text into the character variable result. The length of result is
in len; the actual length of the parameter is returned in actualLen. The routine
returns false if the parameter identifier par is invalid.

long HTTPGetNumSrchArgs(WWWRequest r);
long HTTPGetNumPostArgs(WWWRequest r);

Gets the number of search or post arguments.

Boolean HTTPGetSrchArgAt(WWWRequest r, long index, char *name,
   long nameLen, long *actualNameLen, char *value, long valueLen,
   long *actualValueLen);
Boolean HTTPGetPostArgAt(WWWRequest r, long index, char *name,
   long nameLen, long *actualNameLen, char *value, long valueLen,
   long *actualValueLen);

Gets a search or post argument by absolute position. index is between 1 and the total
number of such arguments. name receives the name of the argument at position
index, and value receives the value. The lengths of the character arrays name and
value are in nameLen and valueLen. The actual lengths of the items are returned in
actualNameLen and actualValueLen. The routine returns false if index is out of
range.

Boolean HTTPGetSrchArgCount(WWWRequest r, char *name,
   long *numValues);
Boolean HTTPGetPostArgCount(WWWRequest r, char *name,
   long *numValues);

Gets the number of search or post arguments that have the field name name. The
number is returned in numValues. The routine returns false if there's no search or
post argument called name.

const char *HTTPGetMultipleSrchArg(WWWRequest r, char *name,
   long index);
const char *HTTPGetMultiplePostArg(WWWRequest r, char *name,
   long index);

Tries to get the instance index of a multivalued search or post argument. The routine
returns an empty string if index is out of range or if name doesn't exist. The routine
leaves r locked on exit. index starts at 1.

Boolean HTTPGetLongMultipleSrchArg(WWWRequest r, char *name,
   long index, long *i);
Boolean HTTPGetLongMultiplePostArg(WWWRequest r, char *name,
   long index, long *i);

Gets the integer value of the instance index of a multivalued search or post argument
called name. The routine returns the value in i, and returns false if index is out of
range or the argument is not an integer. index starts at 1.

Boolean HTTPCopyMultipleSrchArg(WWWRequest r, char *name, long index,
   char *value, long len, long *actualLen);
Boolean HTTPCopyMultiplePostArg(WWWRequest r, char *name, long index,
   char *value, long len, long *actualLen);

Copies the contents of the instance index of a multivalued search or post argument
called name. The routine returns text in value. The length of the value string is in
len; the actual length of the value string is returned in actualLen. The routine
returns false if index is out of range or name doesn't exist. index starts at 1.
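
If the idea of a multivalued argument is new to you (think of a form with several checkboxes that share one field name), the following sketch shows the lookup logic. It's not the shell's implementation: the ArgList structure and both routine names are hypothetical stand-ins, and I'm assuming field names are compared exactly, case included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ArgList {       /* Hypothetical stand-in for the */
    long          count;       /* decoded search or post arguments. */
    const char  **names;
    const char  **values;
} ArgList;

/* Counts the arguments whose field name matches name. */
static long CountArgsNamed(const ArgList *a, const char *name)
{
    long i, n = 0;

    assert(a != NULL && name != NULL);
    for (i = 0; i < a->count; i++)
        if (strcmp(a->names[i], name) == 0) n++;
    return n;
}

/* Returns the value of the index-th argument called name (index starts
   at 1), or an empty string if there is no such instance. */
static const char *GetMultipleArg(const ArgList *a, const char *name,
                                  long index)
{
    long i, seen = 0;

    assert(a != NULL && name != NULL);
    for (i = 0; i < a->count; i++)
        if (strcmp(a->names[i], name) == 0 && ++seen == index)
            return a->values[i];
    return "";
}
```

With the arguments color=red, size=L, and color=blue, the count for color is 2, instance 2 of color is "blue", and asking for instance 3 returns an empty string.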

HTML PAGE COMPOSITION ROUTINES

There are ten routines, all prefixed with "HTML," to help you compose the HTML
response pages. The routines that allow you to append different types of data to the
response page are shown in Table 1; the handle to the response page is obtained by
calling HTMLGetResponseHandle.

Handle HTMLGetResponseHandle(WWWRequest r);

Gets the handle to the HTML response page.

OSErr HTMLClearPage(Handle r);

Clears the current response page (except for the HTTP header) and starts over.

______________________________

Table 1. Routines that append data to the HTML response page

Routine

OSErr HTMLAppendHandle(Handle r, Handle h);

OSErr HTMLAppendTEXT(Handle r, long iTEXTResID);

OSErr HTMLAppendString(Handle r, long iSTRResID);

OSErr HTMLAppendIndString(Handle r, long iSTRResID, long index);

OSErr HTMLAppendFile(Handle r, char *localFileName);

OSErr HTMLAppendCString(Handle r, char *cString);

OSErr HTMLAppendPString(Handle r, StringPtr pString);

OSErr HTMLAppendBuffer(Handle r, char *buffer, long len);

______________________________
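
All of the append routines boil down to the same operation: grow the response and copy new bytes onto the end. The sketch below shows that operation in portable C, using realloc in place of a resizable handle. The Page structure and the routine names are hypothetical and aren't the shell's own code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Page {      /* Hypothetical stand-in for the response. */
    char   *data;
    size_t  len;
} Page;

/* Appends len bytes to the page. Returns 0 on success or -1 if memory
   runs out, in which case the page is left exactly as it was. */
static int PageAppendBuffer(Page *p, const char *buf, size_t len)
{
    char *grown;

    assert(p != NULL && (buf != NULL || len == 0));
    if (len == 0) return 0;
    grown = realloc(p->data, p->len + len);
    if (grown == NULL) return -1;
    memcpy(grown + p->len, buf, len);
    p->data = grown;
    p->len += len;
    return 0;
}

/* Appends a C string (without its terminating NUL). */
static int PageAppendCString(Page *p, const char *s)
{
    assert(s != NULL);
    return PageAppendBuffer(p, s, strlen(s));
}
```

Funneling every variant through one buffer-append routine keeps the out-of-memory handling in a single place, which is handy when each call can fail.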

ACGI RUNTIME-TUNING ROUTINES

There are 13 routines that allow you to fine-tune the runtime behavior of the ACGI
without having to modify the code in acgi.c or directly set global variables.

void ACGIShutdown(void)

Shuts down the ACGI as soon as all current threads are finished.

Boolean ACGIIsShuttingDown(void)

Tests whether the ACGI is shutting down.

Boolean ACGIRefuse(Boolean refuse)

Sets whether to accept or reject requests.

unsigned long ACGIGetRunningThreads(void)

Gets the number of active threads.

unsigned long ACGIGetMaxThreads(void)
void ACGISetMaxThreads(unsigned long newThreads)

Gets or sets the maximum number of threads allowed to run at the same time.

void ACGIGetSleeps(long *whenThreads, long *whenIdle)
void ACGISetSleeps(long whenThreads, long whenIdle)

Gets or sets the sleep settings.

long ACGIGetWNEDelta(void)
void ACGISetWNEDelta(long newDelta)

Gets or sets the time between calls to WaitNextEvent.

void ACGIGetThreadParams(Size *stack, ThreadOptions *options);
void ACGISetThreadParams(Size stack, ThreadOptions options);

Gets or sets the thread stack size and creation options.

const char *ACGIGetHTTPHeader(void)

Gets a pointer to the standard HTTP header.

CUSTOMIZABLE ROUTINES

The six customizable routines in www.c allow you to adapt the ACGI shell to suit your
needs. I've supplied simple, straightforward samples of the routines in the file www.c.

 

The default version of the WWWProcess routine is shown in Listing 5. It returns a
page that displays all of the HTTP server request parameters in a nicely formatted
table. Note the use of the YIELD macro here. It provides a convenient way of yielding to
other threads and automatically aborting should the ACGI signal that it wants to quit.

______________________________

Listing 5. The default version of WWWProcess

#define YIELD() do { YieldToAnyThread(); \
                     if (ACGIIsShuttingDown()) \
                        return (errWWWRefused); } while (0)

OSErr WWWProcess(WWWRequest request)
{
   Handle   r = HTMLGetResponseHandle(request);
   char     s[1024], name[512], value[512];
   long     len, i, n, iName, iValue;
   Boolean  gotOne;
   OSErr    err;

   // Build a table to display the WebSTAR request parameters.
   err = HTMLAppendPString(r,
      "\p<HTML><HEAD><TITLE>ACGI</TITLE></HEAD>\r\n");
   YIELD();
   err = HTMLAppendCString(r,
      "<BODY><H1>ACGI Parameters</H1><TABLE BORDER=0>");
   YIELD();
   err = HTMLAppendCString(r,
      "<TR><TD ALIGN=RIGHT NOWRAP><B>Path "
      "arguments:</B></TD><TD>");
   YIELD();

   if (HTTPCopyParam(request, p_path_args, s, 1023, &len))
      err = HTMLAppendCString(r, s);
   YIELD();

   ...  // and so on, for all the other parameters

   // Now show all the search arguments.
   err = HTMLAppendCString(r,
      "</TD></TR><TR><TD ALIGN=RIGHT NOWRAP VALIGN=TOP>"
      "<B>Search Arguments:</B></TD><TD>");
   YIELD();

   n = HTTPGetNumSrchArgs(request);
   if (n > 0) {
      for (i = 1; i <= n; i++) {
         gotOne = HTTPGetSrchArgAt(request, i, name, 511, &iName,
                                 value, 511, &iValue);
         if (gotOne) {
            if (i > 1)
               err = HTMLAppendCString(r, "<BR>");
            err = HTMLAppendCString(r, name);
            err = HTMLAppendCString(r, " = ");
            err = HTMLAppendCString(r, value);
         }
         YIELD();
      }
   }
   else
      err = HTMLAppendCString(r, "(none)");

   ...   // and similarly for the post arguments

   err = HTMLAppendCString(r,
      "</UL></TD></TR></TABLE>\r\n</BODY>\r\n</HTML>\r\n");
   return (err);
}

______________________________

OVER TO YOU

That's about it for writing threaded, high-performance ACGIs in C. I bet you thought it
was a lot more difficult than this, didn't you?

A threaded ACGI written in a high-level language offers a significant performance
increase compared to an equivalent ACGI written in AppleScript. If you've been using
AppleScript exclusively to do your HTML form processing, I hope this article will
whet your appetite to try something a bit more daring. It's time to kick your Web site
into high gear and move it over into the fast lane!


KEN URQUHART received his Ph.D. in physics in 1989 and has been dividing his
time between physics and computer science ever since. Ken's work has taken him and
his wife from North America to Japan and back again. Their cats (who travel with
them wherever they go) have been extremely good sports about international travel.
Ken's pretty sure the cats understand English perfectly well -- they're simply
choosing to ignore him unless they want food, body heat, or the litter box cleaned.*

Thanks to our technical reviewers Kevin Arnold, Steve Sisak, and Michelle Wyner.*