MACINTOSH DEBUGGING: A WEIRD JOURNEY INTO THE BELLY
OF THE BEAST

BO3B JOHNSON AND FRED HUXHAM

ADAPTED FROM THEIR TALK AT THE WWDC BY DAVE
JOHNSON

Macintosh debugging is a strange and difficult task. This article provides a collection
of tried-and-true debugging techniques Bo3b and Fred discussed at Apple's Worldwide
Developers Conference in May 1991. These techniques can ease your debugging woes
and make your life a lot simpler. They're guaranteed to help you find your bugs earlier
on, saving you hours of suffering.

The first thing you should know is that debugging is hard. Drinking gallons of
Mountain Dew won't help much, nor will seeking magic formulas or spreading fresh
goat entrails around your keyboard and chanting. The only way to get better at it is to
do it a lot, and even then it's still hard. What we're going to talk about are a number of
techniques that will make debugging a little bit easier.

Notice that the title of this article is "Macintosh Debugging" and not
"Macintosh Debuggers." We're not going to do a comparative review of debuggers.
We're not going to show you how to use them. In fact, we recommend that you buy and
use all the ones described here. Each has useful features that the others don't have.
Which you use most often is up to you--pick one as your main debugger and really get
to know it, but keep all of them around.

The main Macintosh debuggers are

• MacsBug from Apple
• TMON (we often refer to version 2.8.4 as Old TMON) and TMON
  Professional (version 3.0, called TMON Pro for short) from Icom
  Simulations, Inc.
• The Debugger from Jasik Designs (we'll call it "Jasik's debugger" here,
  because Steve Jasik wrote it, and that's what everybody calls it in
  conversation)

We'll touch on many of the individual features of these debuggers in this article.

The hardest bugs to find are those that are not reproducible. If you have a crashing bug
that can be reproduced 100 percent of the time, you're well on your way to fixing it.
But a bug that crashes your application only once every few hours, at seemingly
random times . . . well, that kind can take days or weeks to find. Often the ultimate
failure of a program is caused by code that executed long ago. Tracing back to find the
real problem can be difficult and extremely time consuming. The techniques we show
you in this article will help turn most of your random bugs into completely
reproducible ones. These techniques are designed to make your software crash or to
otherwise alert you as close as possible to where your code is doing something wrong.

We explain what each technique is, why it works, and any gotchas you need to be aware
of. Then we tell you how to turn it on or invoke it and list some of the common
Macintosh programming errors it will catch. Finally, we show a code sample or two.
The code samples were chosen for a number of reasons:

• The errors in many of them are subtle. We couldn't tell what was wrong
  with some of them after not looking at them for a couple of months, and we
  wrote them in the first place.
• The mistakes are common. We've seen people make these same mistakes
  time and time again.
• They're short. They had to fit on one slide at our Worldwide Developers
  Conference presentation.

So, on to our first technique . . . .

SET $0 TO $50FFC001

The basic idea here is that the number $0 comes up a lot when things go wrong on the
Macintosh. When you try to allocate memory or read in a resource, and it fails, what
gets returned is $0. Programs should always check to see that when they ask for
something from the Toolbox, they actually get it.

Programs that don't check and use $0 as an address or try to execute from there are
asking for trouble. The code will often work without crashing, but presumably it's not
doing what it was meant to do, since there isn't anything down there that even remotely
resembles resources or data in a program.

Why $50FFC001? Old TMON used this number when we turned on Discipline (more on
Discipline later). This fine number has the following characteristics:

• Used as a pointer (address), $50FFC001 is in funny space on all
  Macintosh computers--that is, it's in I/O space, which is currently just
  blank. Any relative addresses close by are going to be in I/O space as well, so
  positive or negative offsets from that as a base will crash, too. These types of
  offset are common when referencing globals or record fields.
• When used as an address, it will cause a bus error on 68020, '030, and
  '040 machines. Because there's no RAM there, and no device to respond, the
  hardware returns a bus error, crashing the program at the exact instruction.
  Without this handy number, you not only won't crash, you won't even know the
  bug exists (for a while . . . ).
• On 68000 machines, $50FFC001 will cause an address error because it's
  an odd number. This also stops the offending code at the exact line that has a
  bug.
• If the program tries to execute the code at memory location $0, it will
  crash with an illegal instruction, since $50FF is not a valid opcode. This is
  nice when you accidentally JSR to $0 and the program tries to run from there.
  Those low-memory vectors are certainly not code but don't usually cause a
  crash until much later.
• It's easy to recognize because it doesn't look like any normal number. If a
  program uses memory location $0 as a source for data, this funny number
  will be copied into data structures. If you see it in a valid data structure
  someplace else, you know there's a bug lurking in the program that's getting
  data from $0 instead of from where it should.

Many different funny bus error numbers can be used. Take your pick.
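
The properties listed above are easy to check mechanically. What follows is only a sketch in portable C, not Toolbox code; the helper names (is_odd_address, first_opcode_word, check_bus_error_value) are ours, invented for illustration. It confirms that the value is odd, which is what triggers the address error on a 68000, and that the first word the processor would fetch when executing from $0 is $50FF.

```c
#include <assert.h>
#include <stdint.h>

/* The bus error number discussed in the text. */
#define BUS_ERROR_VALUE 0x50FFC001UL

/* An odd address is what produces the address error on a 68000. */
int is_odd_address(uint32_t addr)
{
    return (addr & 1u) != 0;
}

/* The first 16-bit word fetched when executing from $0. The 68K family
   is big-endian, so the high half of the long comes first. */
uint16_t first_opcode_word(uint32_t value)
{
    return (uint16_t)(value >> 16);
}

/* Self-check of the two properties the technique relies on. */
void check_bus_error_value(void)
{
    assert(is_odd_address(BUS_ERROR_VALUE));
    assert(first_opcode_word(BUS_ERROR_VALUE) == 0x50FF);
}
```

If you pick a different funny number, a check like this is a quick way to make sure it still has the properties you're counting on.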

AVAILABILITY
You can find various programs that set up memory location $0 in this helpful way, or
you can build your own.

• EvenBetterBusError (included on the Developer CD Series disc) is a
  simple INIT that sets memory location $0 to $50FFC001. It also installs a
  VBL to make sure no one changes it.
• Under System 7, the thing that used to be MultiFinder (now the Process
  Manager) takes whatever is in memory location $0 when it starts up and
  saves it. Periodically it stuffs that saved number back in. If it were a bus
  error number at system startup (from an INIT, say), that number would be
  refreshed very nicely. With MacsBug, it would be easy to build a dcmd that
  stuffs memory location $0 during MacsBug INIT, and MultiFinder would then
  save and restore that number.
• Jasik's debugger has a flag that allows you to turn the option on or off.
• Old TMON will set up the bus error number when Discipline is turned on.
• TMON Pro has a script command, NastyO, that will also do this.
• You can put code in your main event loop that stuffs the bus error number
  into memory location $0. Be sure to remove it before you ship.

ERRORS CAUGHT
The most obvious catch using this technique is the inadvertent use of NIL handles (or
pointers). NIL handles can come back from the Resource Manager and the Memory
Manager during failed calls. If a program is being sloppy and not checking errors, it's
easy to fall into using a NIL handle, and this technique will flush it out. A double
dereference of a NIL handle will crash the computer. Something like

newArf := aHandle^^.arf;

will crash if aHandle is $0 and we've installed this nice bus error number.

This technique will tell when a program inadvertently jumps off to $0 as a place to
execute code, which can happen from misaligned stacks or from trying to execute a
purged code resource.

By watching for the funny numbers to show up in data structures, you can find out
when NIL pointers are being used as the source for data. This is surely not what was
meant, and they're easy to find when a distinctive number points them out. These uses
won't crash the computer, of course.

CODE SAMPLE

theHandle = GetResource('dumb', 1);
aChar = **theHandle;

This is easy: the GetResource call may fail. If the 'dumb' resource isn't around,
theHandle becomes NIL. Dereferencing theHandle when it's NIL is a bug, since aChar
ends up being some wacko number out of ROM in the normal case (ROMBase at $0) and
cannot be assumed to be what was desired. This bus error technique will crash the
computer right at the **theHandle, pointing out the lack of error checking.

HEAP SCRAMBLE AND PURGE

With this option on, all movable blocks of memory (handles) are moved, and all
purgeable blocks are purged, whenever memory can be moved or purged--which is
different from moving and purging memory whenever it needs to be moved or purged.
This technique is excellent at forcing those once-a-month crashing bugs to crash
more often--like all the time. You should run your entire program with this option
on, in combination with the bus error technique, using all program features and really
putting it through its paces. You'll be glad you did. Because this debugger option
simulates running under low-memory conditions all the time, it stress-tests the
program's memory usage.

AVAILABILITY
All the debuggers have this option, but the one most worth using is in Old TMON and
TMON Pro, since it implements both scramble (moving memory) and purge. MacsBug
and Jasik's debugger both have scramble, but they're too slow, and neither has a purge
option.

ERRORS CAUGHT
This technique will catch improper usage of dereferenced or purgeable handles,
mistakes that fall into the "easy to make, hard to discover" category. The technique
will also catch blocks that are overwritten accidentally, since there's an implicit heap
check each time the heap is scrambled. Warning: The bugs you find may not be yours.

CODE SAMPLE

aPicture = GetPicture(1);
FailNil(aPicture);
aPtr = NewPtr(500000);
FailNil(aPtr);
aRect = (**aPicture).picFrame;
DrawPicture(aPicture, &aRect);

Here, if the picture is purgeable, it might be purged to make room in the heap for the
large pointer allocated next. This would make aRect garbage, and DrawPicture wouldn't
work as intended, probably drawing nothing. Here's a similar example in Pascal:

aPicture := GetPicture(kResNum);
FailNil(aPicture);
WITH aPicture^^ DO
BEGIN
    aPtr := NewPtr(500000);
    FailNil(aPtr);
    aRect := picFrame;
END; {WITH}

Here, even if the picture isn't purged, the NewPtr call might move it, invalidating the
WITH statement and resulting, again, in a bad aRect.

ZAPPING HANDLES

The idea here is to trash disposed memory at the time it's disposed of in order to catch
subsequent use of the free blocks. The technique fills disposed memory with bus error
numbers, so that if you attempt to use disposed memory later, the program will crash.
A related option is MPW Pascal's -u option, which initializes local and global
variables to $7267.

AVAILABILITY
This technique is implemented as a part of Jasik's Discipline option and is also a dcmd,
available on the Developer CD Series disc, for TMON Pro or MacsBug. You can also just
write it into your program by writing bottleneck routines for disposing of memory
(such as MyDisposHandle, MyDisposPtr) that fill blocks with bus error numbers just
before freeing them. The problem with this is that memory freed by other calls
(ReleaseResource, for instance) isn't affected. We recommend the dcmd or Jasik's
Discipline.
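
If you do write your own bottleneck routines, the idea looks roughly like the following. This is a portable C sketch rather than Toolbox code: zap_block and my_dispose are hypothetical names, free stands in for the real dispose call, and the caller passes the block size because portable C can't ask the allocator for it (on the Macintosh, GetHandleSize or GetPtrSize would supply it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill a block with the bus error number $50FFC001, byte by byte in
   big-endian order as on a 68K, so that any pointer later read from the
   dead block is the funny number. The volatile pointer keeps an
   optimizing compiler from discarding stores to memory about to be freed. */
void zap_block(void *block, size_t size)
{
    static const unsigned char pattern[4] = { 0x50, 0xFF, 0xC0, 0x01 };
    volatile unsigned char *bytes = block;
    size_t i;

    assert(block != NULL);
    for (i = 0; i < size; i++)
        bytes[i] = pattern[i % 4];
}

/* Hypothetical bottleneck routine: trash the block, then free it. */
void my_dispose(void *block, size_t size)
{
    if (block == NULL)
        return;
    zap_block(block, size);
    free(block);
}
```

As the text says, a bottleneck like this sees only the memory your own code disposes of, which is why the dcmd or Jasik's Discipline is still the better choice.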

ERRORS CAUGHT
This technique will catch reusing deallocated memory or disposing of memory in the
wrong order. It can also catch uninitialized variables, since after you've been running
it for a while, much of the free memory in the heap will be filled with bus error
numbers.

CODE SAMPLE

SetWRefCon(aWindowPtr, (long)aHandle);
. . .
DisposeWindow(aWindowPtr);
DisposHandle((Handle) GetWRefCon(aWindowPtr));

The GetWRefCon will work on a disposed window, but it's definitely a bug. Zapping the
handles sets the refCon to a bus error number, forcing the DisposHandle call to fail.

CHECKSUM $0

Once again, we're dealing with the address $0. This technique, however, is sort of the
opposite of the first one: it catches writing to $0 rather than reading or executing
from it.

AVAILABILITY
This one is easy: you can set up a checksum so that you'll drop into the debugger
whenever the value at $0 changes. All the debuggers have a way to do this. Also,
EvenBetterBusError sets up a VBL to detect if $0 changes, but since VBL tasks don't
run very often (relative to the CPU, anyway), you'll probably be far away in your
code by the time it notices. It's still much better than nothing, though, since knowing
the bug exists is the first step toward fixing it.

Note that on the IIci the Memory Manager itself changes $0, so you'll get spurious
results. EvenBetterBusError knows about this and ignores it.
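
The checksum idea itself is simple enough to sketch. The code below is portable C and only illustrates the principle; it isn't how any of the debuggers or EvenBetterBusError actually implement it. MemWatch, range_checksum, watch_init, and watch_changed are names we made up. A periodic caller (a VBL task, say, or your main event loop) would call watch_changed and drop into the debugger when it returns nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watch record: a memory range plus the checksum last seen. */
typedef struct {
    const unsigned char *start;
    size_t length;
    uint32_t checksum;
} MemWatch;

/* Rotate-and-XOR checksum. Like any checksum it can in principle miss a
   change that happens to collide, but it catches ordinary stomping. */
uint32_t range_checksum(const unsigned char *start, size_t length)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < length; i++)
        sum = ((sum << 1) | (sum >> 31)) ^ start[i];
    return sum;
}

/* Start watching a range by recording its current checksum. */
void watch_init(MemWatch *w, const void *start, size_t length)
{
    assert(w != NULL && start != NULL);
    w->start = start;
    w->length = length;
    w->checksum = range_checksum(w->start, length);
}

/* Returns 1 if the range changed since the last check, then re-arms. */
int watch_changed(MemWatch *w)
{
    uint32_t now = range_checksum(w->start, w->length);
    int changed = (now != w->checksum);

    w->checksum = now;
    return changed;
}
```

The less often you call the check, the farther your code will be from the offending write when you find out about it.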

ERRORS CAUGHT
The errors caught by this technique are much the same as those caught by the first
technique, except that this one catches writes rather than reads. This way, if your code
tries to write to address $0 (by dereferencing a NIL handle or pointer), you'll know.

CODE SAMPLE

aPtr = NewPtr(kBuffSize);
BlockMove(anotherPtr, aPtr, kBuffSize);

This one's pretty obvious: if the NewPtr call fails, aPtr will be NIL, and the BlockMove
will stomp all over low memory. If kBuffSize is big enough, this will take you right
out, trashing all your low-memory vectors and your debugger, too.

DISCIPLINE

Discipline is a debugger feature that checks for bogus parameters to Toolbox calls. It
would of course be nice if the Toolbox itself did more error checking, but for
performance reasons it can't. (Be forewarned that some versions of the system have
errors that Discipline will catch.) Discipline is the perfect development-time test. It
catches all those stupid mistakes you make when typing your code that somehow get
past the compiler and may persist for some time before you discover them. It can
literally save you hours tracking down foolish parameter bugs that should never have
happened in the first place.

AVAILABILITY
Old TMON has an early version of Discipline, but there are no checks for Color
QuickDraw calls or later system calls, so its usefulness is limited. There is an INIT
version of Discipline (on the Developer CD Series disc with MacsBug) that works in
conjunction with MacsBug or TMON Pro that's quite usable, if slow and clunky. Jasik's
version of Discipline is far and away the best; use it if you can.

ERRORS CAUGHT
As you'd expect, Discipline catches Toolbox calls with bad arguments, like bogus
handles, and also sometimes catches bad environment states, like trying to draw into a
bad grafPort.

CODE SAMPLE

aHandle = GetResource('dumb', 1);
FailNil(aHandle);
. . .
DisposHandle(aHandle);

The problem here is that a resource handle has to be thrown away with
ReleaseResource, not DisposHandle. Otherwise, the Resource Manager will get confused
since the resource map won't be properly updated. Sometime later (maybe much
later) Very Bad Things will happen.

32-BIT MEMORY MODE

Running in full 32-bit mode in System 7 forces the Memory Manager and the program
counter to use full 32-bit addresses: this is something new on the Macintosh. The
old-style (24-bit) Memory Manager used the top byte of handles to store the block
attributes (whether or not the handle was locked, purgeable, and so forth). By running
your program in 32-bit mode, you'll flush out any code that mucks with the top bits of
an address, for any reason, accidentally or on purpose. In the past, many programs
examined or modified block attributes directly. This is a bad idea. Use the Toolbox calls
HGetState and HSetState to get and set block attributes.

AVAILABILITY
You get 32-bit memory mode with System 7, of course! You use the Memory cdev to
turn on 32-bit addressing, available only on machines that have 32-bit-clean ROMs
(Macintosh IIfx, IIci, IIsi). You should also install more than 8 MB of RAM and launch
your application first, so that it goes into memory that requires 32-bit addressing
(within the 8 MB area, addresses use only 24 bits). We also recommend using TMON's
heap scramble in 32-bit mode, since the block headers are different.

ERRORS CAUGHT
You can inadvertently mess up addresses in a bunch of ways. Obviously, any code that
makes assumptions about block structures is suspect. Doing signed math on pointers is
another one that comes up pretty often. Any messing with the top bytes of addresses can
get you into big trouble, jumping off into weird space, where you have no business.

CODE SAMPLE

aHandle = (Handle) ((long) aHandle | 0x80000000);

Naturally, this method of locking a handle is not a good idea, since in 32-bit mode the
locked bit isn't even there. Use HLock or HSetState; they'll do the right thing.
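
To see why the trick got by in 24-bit mode and breaks in 32-bit mode, here's a small model in portable C. It's a sketch under a simplifying assumption: we model 24-bit mode as "only the low 24 bits of an address matter" and 32-bit mode as "every bit matters." The function names are ours, not Toolbox calls.

```c
#include <assert.h>
#include <stdint.h>

#define LOCK_BIT 0x80000000UL   /* the high bit the sample sets */

/* Model of 24-bit mode: only the low 24 bits select memory. */
uint32_t address_in_24_bit_mode(uint32_t addr)
{
    return addr & 0x00FFFFFFUL;
}

/* Model of 32-bit mode: every bit is part of the address. */
uint32_t address_in_32_bit_mode(uint32_t addr)
{
    return addr;
}

/* Returns 1 if setting the flag bit leaves the effective address alone. */
int flag_is_harmless(uint32_t addr, int mode_32_bit)
{
    uint32_t flagged;

    assert((addr & 0xFF000000UL) == 0);   /* start from a clean address */
    flagged = (uint32_t)(addr | LOCK_BIT);
    if (mode_32_bit)
        return address_in_32_bit_mode(flagged) == address_in_32_bit_mode(addr);
    return address_in_24_bit_mode(flagged) == address_in_24_bit_mode(addr);
}
```

In this model flag_is_harmless(0x00123456, 0) is true and flag_is_harmless(0x00123456, 1) is false: the same OR that was invisible in 24-bit mode now sends you 2 gigabytes away from where you meant to be.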

FULL COMPILER AND LINKER WARNINGS

Always develop your code with full warnings on. When you're compiling and linking
your program, any number of errors or warnings will be emitted. The errors are for
things that are just plain wrong, so you'll have to fix those immediately. Warnings,
however, indicate things that aren't absolutely wrong, but certainly are questionable
as far as the compiler or linker is concerned.

We think you should fix every problem as soon as a warning first appears, even if
there's "nothing wrong" with the code. If you leave the warnings in, little by little
they'll pile up, and pretty soon you'll have pages full of warnings spewing out every
time you do a build. You know you won't read through them every time. You'll probably
just redirect the warnings to a file you never look at so that your worksheet won't be
sullied. Then the one warning that will cause a problem will sneak right by you, and
much later you'll find out that the totally nasty, hard-to-find bug that you finally
corrected was one the compiler warned you about a month ago. To avoid this painful
experience, deal with the warnings when they appear, even if they're false alarms.

AVAILABILITY
Use the compiler and linker options that turn on full warnings:

• MPW C++: The "-w2" option turns on the maximum compiler warnings.
• MPW C: Use "-warnings full" ("-w2" does the same thing). In addition,
  the "-r" option will warn you if you call a function with no definition.
• MPW Linker: The "-msg keyword" option controls the linker warnings.
  Keyword is one or more of these: dup, which enables warnings about duplicate
  symbols; multiple, which enables multiple warnings on undefined references
  to a label (you can thus find all the undefined references in one link); and
  warn, which enables warnings.
• THINK C: Because the compile is stopped when a warning is encountered,
  it forces you to fix all warnings. Some people like this; others don't. We do,
  but you decide. Be sure that "Check Pointer Types" is turned on in the
  compiler options.
• Pascal: Most of the things that cause warnings in C are automatically
  enforced.

If you're coding in C, it's also a good idea to prototype all your routines. This avoids
silly errors.

ERRORS CAUGHT
The compiler and linker will tell you about lots of things. Some examples are

• the use of uninitialized variables (which is a real bug)
• bad function arguments
• unused variables (these confuse the code and may be real bugs)
• argument mismatches (probably bugs)
• signed math overflow

In C++, overriding operator new without overriding operator delete is probably a bug
and unintentional. Even if a warning is caused by something intentional, fix it so that
the warning won't appear.

CODE SAMPLE

#define kMagicNumber 12345
. . .
short result;
result = kMagicNumber*99;

The problem with this code is that the multiplication is overflowing a 16-bit short
value. If you have full compiler warnings on, the MPW compiler will let you know this
with the following warning:

### Warning 276 This assignment may lose some significant bits

MEMORY PROTECTION

This is something you've always wanted: a way to get a protected memory model for the
Macintosh. With memory protection on, memory accesses outside the application's
RAM space would be caught as illegal, giving you the chance to find bad program
assumptions and wild references. Only Jasik's debugger has this feature now.

The protected mode is only partly successful, though, since the Macintosh has nothing
that resembles a standard operating system. The problems stem from how programs
are expected to run, in that references to some low-memory globals are OK, and code
and data share the same address space. Given the anarchy in the system, the way Jasik
set it up is to allow protection of applications only. The protected mode also protects
the CODE resources in the application from being overwritten.

Although this protected mode is not as good as having the OS support protected memory
spaces, it's still a giant leap ahead in terms of finding bugs in your programs. By
catching these stray references during development, you can be assured that the user
won't get random crashes because of your program. This is an ideal development tool
for catching latent bugs that don't often show up. Who knows what a write to a random
spot in memory may hit? Sometimes you're just lucky, and those random "stomper"
bugs remain benign, but more often they're insidiously nasty.

AVAILABILITY
This tool is currently implemented only in Jasik's debugger. The memory protection is
implemented using the MMU, and it slows down the machine by around 20 percent. It's
a mixed blessing, since it will crash on any number of spurious errors--use it
anyway.

ERRORS CAUGHT
If the application writes to low memory or to the system heap, it's probably not what
was desired. A few cases could be deemed necessary, but in general, any references
outside the application heap space are considered suspect. Certainly, modifying system
variables is not a common task that applications need to support. This memory
protection will catch those specific references and give you the chance to be sure that
they're valid and necessary.

Writing to I/O space or screen RAM is another problem this technique will catch.
Writing directly to the screen is bad form, and only tacky programs (and games,
which must do it) stoop this low. Even HyperCard writes directly to the screen;
please don't emulate it. Some specialized programs could make an argument for writing
to I/O space, since they may have a device they need to use up there. This protection
will catch those references and point out a logically superior approach, which is to
build a driver to interface to that space, instead of accessing it directly.

CODE SAMPLE

*((long*) 0x16A) = aLong;

The low-memory global Ticks is being modified. Writing to low-memory globals is a
Very Bad Thing to do. This will be caught by memory protection.

LEAKS

A memory leak occurs when a program allocates a block of memory with either
NewHandle or NewPtr (or even with Pascal New or C malloc, both of which turn into
NewPtr at a lower level), but that block is never disposed of, and the reference to it is
lost or written over. If a program does this often enough, it will run out of RAM and
probably crash. This leads to the famous statement: "Properly written Macintosh
programs will run for hours, even days, without crashing"--a standing joke in
Developer Technical Support for so long we've forgotten the original source.
Naturally, if the program is leaking in the main event loop, it will crash sooner than
if it leaks from some rare operation. If it leaks at all, it will ultimately fail and crash
some poor user.

AVAILABILITY
A simple technique that all debuggers support can tell you whether or not the program
is leaking. Do a Heap Total and check the amount of free space and purgeable space
that's available. Run the program through its paces and then see if the amount of free
space plus purgeable space has dropped. If it has, try again, under the assumption
that the program might have loaded some code or other data the first time around. If
it's still smaller, it's likely to be a leak. This approach, of course, only shows that
you have a leak; tracking it down is the hard part. But, hey, you can't start tracking
till you know it's there.

There's a dcmd called Leaks (on the Developer CD Series disc) that runs under both
TMON Pro and MacsBug. The basic premise is to watch all the memory allocations to
see if they get disposed of correctly. Leaks patches the traps NewHandle, DisposHandle,
NewPtr, and DisposPtr. When a new handle or pointer is allocated on the heap, Leaks
saves the address into an internal buffer. When the corresponding DisposHandle or
DisposPtr comes by, Leaks looks it up in the list and, if it finds the same address,
dumps that record as having been properly disposed of. Now all those records on the
Leaks list that didn't have the corresponding dispose are candidate memory leaks. The
Macintosh has a lot of fairly dynamic data, so Leaks often ends up getting a number of
things on its list that haven't been disposed of but are not actually leaks. They're just
first-time data, or loaded resources. To avoid false alarms, the Leaks dcmd requires
that you perform the operation under question three times, in order to get three or
more items in its list that are similar in size and allocated from the same place in the
program. An operation can be as simple or complex as desired, since every memory
allocation is watched. An example of an operation to watch is to choose New from a
menu and then choose Close, under the assumption that those are complementary
functions. If you do this three times in a row with Leaks turned on, anything that
Leaks coughs out will very likely be a memory leak for that operation.

The dcmd saves a shortened stack crawl of where the memory is being allocated, so that
potential leaks can be found back in the source code.

One problem with Leaks as a dcmd is that if it's installed as part of the TMON Pro
startup, it patches the traps using a tail patch. Tail patches are bad, since they disable
bug fixes the system may have installed on those traps. This could cause a bug to show
up in your program that isn't there in an unpatched system. It's still probably worth
the risk, given the functionality Leaks can provide. The problem doesn't exist with
MacsBug, since the traps are patched by the dcmd before the system patches them.

A vastly superior way around this problem is to provide the Leaks functionality as
debugging code, instead of relying on an external tool. By writing an intermediate
routine that acts as a "wrapper" around any memory allocations your program does,
you can watch all the handles and pointers go by, do your own list management to know
when the list should be empty, and dump out the information when it isn't. By
wrapping those allocations, you avoid patching traps (always a good idea). Be sure to
watch for secondary allocations, such as GetResource/DetachResource pairs. You may
still want to run Leaks when you notice memory being lost that your wrappers don't
catch.
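
A wrapper of this kind can be very small. The following is a sketch in portable C, with malloc and free standing in for NewPtr and DisposPtr; the names (tracked_alloc, tracked_free, tracked_live_count) and the fixed-size table are our own choices for illustration, not anything from the Leaks dcmd. Each allocation is recorded with a label for its call site, each dispose removes the record, and whatever is left when the list should be empty is a candidate leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 256

/* One record per live allocation; `where` labels the call site. */
typedef struct {
    void *block;
    size_t size;
    const char *where;
} AllocRecord;

static AllocRecord g_records[MAX_TRACKED];
static size_t g_live_count = 0;

/* Wrapper around the real allocator that remembers each block. */
void *tracked_alloc(size_t size, const char *where)
{
    void *block;

    if (g_live_count == MAX_TRACKED)
        return NULL;                  /* table full: treat as a failure */
    block = malloc(size);
    if (block != NULL) {
        g_records[g_live_count].block = block;
        g_records[g_live_count].size = size;
        g_records[g_live_count].where = where;
        g_live_count++;
    }
    return block;
}

/* Wrapper around the real dispose call that forgets the block. */
void tracked_free(void *block)
{
    size_t i;

    if (block == NULL)
        return;
    for (i = 0; i < g_live_count; i++) {
        if (g_records[i].block == block) {
            g_records[i] = g_records[g_live_count - 1];
            g_live_count--;
            free(block);
            return;
        }
    }
    assert(!"tracked_free: block was never allocated or was freed twice");
}

/* Number of blocks allocated through the wrapper and not yet freed. */
size_t tracked_live_count(void)
{
    return g_live_count;
}
```

As a bonus, the assertion in tracked_free catches double disposes and disposes of blocks the wrapper never saw. Check tracked_live_count before and after an operation such as New followed by Close; if it grows, the records still on the list tell you where to look.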

ERRORS CAUGHT
Potential memory leaks, but you knew that already.

CODE SAMPLE

anIcon := GetCIcon(kIconId);
PlotCIcon(aRect, anIcon);
DisposHandle(Handle (anIcon));

This orphans any number of handles, because the GetCIcon call will create several
extra handles for pixMaps and color tables. This is an easy error to make, since the
GetCIcon returns a CIconHandle, which seems a lot like a PicHandle. A PicHandle is a
single handle, though, and a CIconHandle is a number of pieces. Always use the
corresponding dispose call for a given data structure. In this case, the appropriate call
is DisposCIcon.

STRESS ERROR HANDLING

Here the goal is to see how the program deals with less than perfect situations. Your
program won't always have enough RAM or disk space to run smoothly, and it's best to
plan for it. The first step is to write the code defensively, so that any potential error
conditions are caught and handled in the code. If you don't put in the error-handling
code, you're writing software that never expects to be stressed, which is an
unreasonable assumption on the Macintosh.

AVAILABILITY
Try running the program in a memory-critical mode, where it doesn't have enough
RAM even to start up. Users can get into this unfortunate situation by changing the
application's partition size. Rather than crash, the program should put up an alert to
tell users what went wrong, and then bail out gracefully. Try running with just enough RAM to start
up, but not enough to open documents. Be sure the program doesn't crash and does give
the user some feedback. Try running in situations where there isn't enough RAM to edit
a document, and make sure it handles them. What happens if you get a memory-low
message, and you try to save? If you can't save, the user will be annoyed. What
happens when you try to print?

Run your program on a locked disk, and try to save files on the locked disk. The errors
you get back should be handled in a nice way, giving the user some feedback. This will
often find assumptions in the code, like, "I'm sure it will always be run from a hard
disk."

To see if you handle disk-full errors in a nice way, be sure to try a disk that has
varying amounts of free space left. Here again, if you've only ever tested on a big, old,
empty hard disk, it may shock you to find out that your users are running on a
double-floppy-disk Macintosh SE and aren't too happy that disk-full errors crash the
program. A particularly annoying common error is saving over a file on the disk. Some
programs will delete the old file first and then try to save. If a disk-full error occurs,
the old copy of the data has been deleted, leaving the user in a precarious state. Don't
force a user to switch disks, but allow the opportunity.
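
One common way to avoid the delete-then-save trap is to write the new data to a temporary file and replace the original only after the write has succeeded. The sketch below shows the idea in portable C using the standard library rather than the File Manager; safe_save is a hypothetical routine, and it assumes that rename replaces an existing file, as POSIX specifies (not every platform does). The trade-off is that the disk briefly needs room for both copies.

```c
#include <assert.h>
#include <stdio.h>

/* Write the new contents to a temporary file first, and replace the
   original only after every byte has been written. Returns 0 on success;
   if the write fails, the original file is left untouched. */
int safe_save(const char *path, const char *temp_path,
              const void *data, size_t length)
{
    FILE *f;
    int ok;

    assert(path != NULL && temp_path != NULL);
    f = fopen(temp_path, "wb");
    if (f == NULL)
        return -1;
    ok = (length == 0 || fwrite(data, 1, length, f) == length);
    if (fclose(f) != 0)       /* buffered data can fail here on a full disk */
        ok = 0;
    if (!ok) {
        remove(temp_path);    /* clean up; the old file is still intact */
        return -1;
    }
    /* Assumption: rename() replaces an existing file (POSIX behavior). */
    if (rename(temp_path, path) != 0) {
        remove(temp_path);
        return -1;
    }
    return 0;
}
```

If the disk fills up partway through, the user still has the old copy and can be offered the chance to save somewhere else.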

Especially with the advent of System 7, you should see how your program handles the
volume permissions of AppleShare. Since any Macintosh can now be an AppleShare
server, you can definitely expect to see permission errors added to the list of possible
disk errors. Try saving files into folders you don't have permission to access, and see
if the program handles the error properly.

ERRORS CAUGHT
Inappropriate error handling, unnecessary crashes, lack of robustness, and general
unfriendliness.

CODE SAMPLE

i := 0;
REPEAT
    i := i + 1;
    WITH pb DO
    BEGIN
        ioNamePtr := NIL;
        ioVRefnum := 0;
        ioDirID := 0;
        ioFDirIndex := i;
    END;
    err := PBGetCatInfo (@pb, False);
UNTIL err <> noErr;

This sample is trying to enumerate all files and directories inside a particular
directory by calling PBGetCatInfo until it gets an error. (Note that this sample does
one very important thing: initializing the ioNamePtr field to NIL to keep it from
returning a string at some random place in memory.) The problem with this loop is
that it assumes that any error it finds is the loop termination case. For an
AppleShare volume, you may get something as simple as a permission error for a
directory you don't have access to. This is probably not the end of the entire search,
but the code will bail out. This bug would be found by trying the program with an
AppleShare volume. The appropriate end case would be to look for the exact error of
fnfErr instead or, better, to add the permErr to the conditional.

MULTIPLE CONFIGURATION TESTS

This technique goes beyond merely finding the crash-and-burn bugs to help ensure
that the program will run in situations that weren't originally expected. Just fixing
crash-and-burn bugs is for amateurs. Professional software developers want their
programs to be as bug-free as possible. As a step toward this higher level of quality,
testing in multiple configurations can give you more confidence that you haven't made
faulty assumptions about the system. The idea is to try the program on a number of
machines in different configurations, looking for combinations that cause unexpected
results.

AVAILABILITY
Multiple configuration tests should use the Macintosh Plus as the low-end machine to
be sure that the program runs on 68000-based machines and on ones that have a lot of
trap patches. Some of the code the system supports is not available, like Color
QuickDraw. If you use anything like that, you will crash with an unimplemented trap
number error, ID=12. The Macintosh Plus is a good target for performance testing as
well, since it's the slowest machine you might expect to run on. Its small screen can
also point out problems that your users might see in the user interface. For example,
some programs use up so much menu bar space that they run off the edge of the screen.
That might not be noticed until you run the program on a machine with a small screen.
If your program specifically doesn't support low-end machines, you should still put in
a test for them and warn the user. Crashing on a low-end machine is unacceptable,
especially when all you needed was a simple check.

Naturally, the multiple configurations include a Macintosh II-class machine to be sure
that assumptions about memory are caught. Because most development is done on
Macintosh II computers, this case will likely be handled as part of the initial testing.
It's virtually certain that your program will be used on a Macintosh II by some users.

Using multiple monitors on a single system can point out some window- or
screen-related assumptions. The current version of the old 512 x 342 fixed-size bug
is the assumption that the MainGDevice is the only monitor in the system. Testing with
multiple monitors will point out that although sometimes the main device is black and
white, there's a color device in the system. Should your users have to change the main
device and reboot just to run your program in color?

By testing the program within a color environment, even if it doesn't use color, you'll
find any assumptions about how color might be used or the way bitmaps look. It's a
rare (albeit lame) program that gets to choose the exact Macintosh it should run on.

Try the program under Virtual Memory to see if there are built-in assumptions
regarding memory.

Use the program under both System 6 and 7. If the program requires System 7, but a
user runs it under System 6, it should put up an alert and definitely not crash. For the
short term, it's obvious that you cannot assume all users will have either one system
or the other. The number of fundamental differences between the systems is
sufficiently large that the only way to gain confidence that the program will behave
properly is to run it under both systems. Some bugs that were never caught under
System 6 may now show up under System 7. The bugs may even be in your code, with
implicit assumptions about how some Toolbox call works.
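
The "simple check" at startup might look something like the following sketch. It's plain C with the machine description passed in as a hypothetical MachineInfo record; on a real Macintosh you'd fill that in from Gestalt or SysEnvirons. The version encoding (0x0700 for System 7.0) and all the names here are assumptions made for illustration.

```c
/* Hypothetical snapshot of the machine, filled in by the caller. */
typedef struct {
    unsigned short systemVersion;   /* assumed BCD-style, e.g. 0x0607 for 6.0.7 */
    int            hasColorQD;      /* nonzero if Color QuickDraw is present */
} MachineInfo;

typedef enum {
    kCanRun = 0,
    kNeedsNewerSystem,
    kNeedsColorQD
} StartupVerdict;

/* Decides whether the program can run on this machine. The caller puts
   up an alert and quits cleanly for any verdict other than kCanRun. */
StartupVerdict CheckConfiguration(const MachineInfo *info,
                                  unsigned short minSystem,
                                  int needsColorQD)
{
    if (info->systemVersion < minSystem)
        return kNeedsNewerSystem;
    if (needsColorQD && !info->hasColorQD)
        return kNeedsColorQD;
    return kCanRun;
}
```

Keeping the decision separate from the alert makes it easy to exercise every verdict without owning every machine.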

Doing a set of functionality tests on these various types of systems will ensure that you
can handle the most common variations of a Macintosh. Tests of this form will give you
a better feeling for the limits of your program and the situations it can handle
gracefully. There's usually no drawback to getting a user's-eye view of your program.

There is a tool called Virtual User (APDA #M0987LL/B) that can help a lot with these
kinds of tests. It allows you to script user interactions so that they can be replayed
over and over, and it can execute scripts on other machines remotely, over AppleTalk.
So, for instance, you could write a script that puts your program through its paces,
and then automatically execute that script simultaneously on lots of differently
configured Macintosh systems.

ERRORS CAUGHT
As discussed above, this technique attempts to flush out any assumptions your code
makes about the environment it's running in: color capabilities, screen size, speed,
system software version, and so on.

CODE SAMPLE

void Hoohah(void)
{
    long localArray[2500];

    . . .
}

Naturally, this little array is stack hungry and will consume 10K of stack. On a
Macintosh II machine, this is OK, as the default stack is 24K. On the Macintosh Plus,
the stack is only 8K, so when you write into this array you will be writing over the
heap, most likely causing a problem. This type of easy-to-code bug may not be caught
until testing on a different machine. Merely because the code doesn't crash on your
machine doesn't mean it's correct.
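
One common way out is to move the big array off the stack and into the heap, and to check that the allocation worked. The sketch below uses malloc so that it runs anywhere; in a Macintosh program you would more likely use a Memory Manager call such as NewPtr. The function name and the work it does on the array are made up for the example.

```c
#include <stdlib.h>

#define kArrayCount 2500

/* Fills a 2500-element scratch array that lives in the heap instead of
   on the stack, then returns the sum of its contents. Returns -1 if the
   allocation fails, so the caller can recover instead of crashing. */
long HoohahOnHeap(void)
{
    long *scratch = malloc(kArrayCount * sizeof *scratch);
    long sum = 0;
    int i;

    if (scratch == NULL)
        return -1;                  /* out of memory: report it */

    for (i = 0; i < kArrayCount; i++)
        scratch[i] = i;
    for (i = 0; i < kArrayCount; i++)
        sum += scratch[i];

    free(scratch);
    return sum;
}
```

The stack cost of this version is a pointer and a couple of integers, whichever machine it lands on.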

ASSERTS

Asserts are debugging code that you put in to alert you whenever a condition you expect
to be true turns out to be false. They're used to flag unexpected or "can't happen"
situations that your code could run into. Asserts are used only during development and
testing; they'll be compiled out of the final code to avoid a speed hit.

AVAILABILITY
You could write a function called ASSERT that takes a result code and drops into the
debugger if the result is false--or, better yet, writes text to a debugging window. In
MPW, you can use the __FILE__ and __LINE__ macros to keep track of the location in
the source code. Another thing to check for is bogus parameters to calls, sort of like
Discipline. Basically, you want to check any old thing that will help you ensure
consistency and accuracy in your code, the more the merrier, as long as the asserts
don't "fire" all the time. Fix the bugs pointed out by an assert, or toughen up the
assert, but don't turn it off. If you just can't stand writing code to check every possible
error, temporarily put in asserts for the ones that will "never" happen. If an assert
goes off, you'd better add some error-handling code.

The following sample code shows one way to implement ASSERT.

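#include <stdio.h>

/* Reports a failed assertion and drops into the debugger. */
void dbgAssert(const char* filename, int line);

#if DEBUG
#define ASSERT(what) do \
    { if(!(what)) dbgAssert(__FILE__,__LINE__); } while(0)
#else
#define ASSERT(what) ((void)0)
#endif

void dbgAssert(const char* filename, int line)
{
    char msg[256];

    sprintf(msg, "Assertion failed # %s: %d", filename, line);
    debugstr(msg);    /* debugstr takes a C string, so no cast is needed */
}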

In this example, ASSERT is defined by a C macro. If DEBUG is true, the macro expands
to a block of code that checks the argument passed to ASSERT. If the argument is false,
the macro calls the function dbgAssert, passing it the filename and line number on
which the ASSERT occurs. If DEBUG is false, the macro ASSERT expands to nothing.
Making the definition of ASSERT dependent on a DEBUG flag simplifies the task of
compiling ASSERTs out of final code.
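
If you'd rather log than drop into the debugger, a variation along these lines may be useful. This is only a sketch: dbgAssertLog, gLastAssert, gAssertCount, and SafeDivide are hypothetical names, and the "debugging window" is just a buffer here. It also records the text of the failed expression, which the version above doesn't.

```c
#include <stdio.h>

#ifndef DEBUG
#define DEBUG 1
#endif

/* Hypothetical debugging log: the most recent failure and a count. */
static char gLastAssert[256];
static int  gAssertCount = 0;

/* Records a failed assertion instead of stopping in the debugger. */
static void dbgAssertLog(const char *expr, const char *filename, int line)
{
    gAssertCount++;
    snprintf(gLastAssert, sizeof gLastAssert,
             "Assertion failed: %.100s # %.100s: %d", expr, filename, line);
}

#if DEBUG
#define ASSERT(what) do \
    { if (!(what)) dbgAssertLog(#what, __FILE__, __LINE__); } while (0)
#else
#define ASSERT(what) ((void)0)
#endif

/* Example caller: a "can't happen" check on a parameter. The real
   error handling stays in even when ASSERT is compiled out. */
int SafeDivide(int numerator, int denominator)
{
    ASSERT(denominator != 0);
    if (denominator == 0)
        return 0;
    return numerator / denominator;
}
```

Note that the assert doesn't replace the error check in SafeDivide; it just makes sure you hear about the bad call during development.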

ERRORS CAUGHT
This technique catches all sorts of errors, depending, of course, on how you implement
it. Logic errors, unanticipated end cases that show up in actual use, and situations that
the code is not expecting are some of the possibilities.

CODE SAMPLE

numResources = Count1Resources('PICT');
for(i=1; i<=numResources; i++) {
    theResource = Get1IndResource('PICT', i);
    ASSERT(theResource != nil);
    RmveResource(theResource);
}

The problem here is that the code doesn't account for the fact that Get1IndResource
always starts at the beginning of the available resources. So the first time through, we
get the resource with index 1, and we remove it. The next time through, we ask for
resource 2, but since we removed the resource at the front of the list, we get what
used to be resource 3; we've skipped one. The upshot is that only half the resources are
removed, and then Get1IndResource fails. This is a great example of a "never fail"
situation failing. The ASSERT will catch this one nicely; otherwise, you might not know
about it for a long time. The solution is to always ask for the first resource.
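
To watch the skipping happen without touching a real resource file, here's a small simulation in plain C. FakeResList, FakeGetInd, and FakeRemove are hypothetical stand-ins that mimic the behavior described above: indexes are 1-based, and removing an entry renumbers everything after it. The fixed loop differs from the buggy one only in the index it asks for.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list of indexed resources. */
typedef struct {
    int    ids[16];
    size_t count;
} FakeResList;

/* Returns the entry at 1-based index, or NULL if there is none
   (the analogue of Get1IndResource returning nil). */
static int *FakeGetInd(FakeResList *list, size_t index)
{
    if (index < 1 || index > list->count)
        return NULL;
    return &list->ids[index - 1];
}

/* Removes an entry; the entries after it shift down by one. */
static void FakeRemove(FakeResList *list, int *entry)
{
    size_t i = (size_t)(entry - list->ids);

    for (; i + 1 < list->count; i++)
        list->ids[i] = list->ids[i + 1];
    list->count--;
}

/* Buggy: asks for index i on each pass. Returns the number of failed
   lookups, which is where the ASSERT would have fired. */
size_t RemoveAllBuggy(FakeResList *list)
{
    size_t n = list->count, failures = 0, i;

    for (i = 1; i <= n; i++) {
        int *entry = FakeGetInd(list, i);
        if (entry == NULL) { failures++; continue; }
        FakeRemove(list, entry);
    }
    return failures;
}

/* Fixed: always asks for the first remaining entry. */
size_t RemoveAllFixed(FakeResList *list)
{
    size_t n = list->count, failures = 0, i;

    for (i = 1; i <= n; i++) {
        int *entry = FakeGetInd(list, 1);
        if (entry == NULL) { failures++; continue; }
        FakeRemove(list, entry);
    }
    return failures;
}
```

Starting from six entries, the buggy loop leaves every other one behind and fails on the last three lookups; the fixed loop empties the list with no failures.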

TRACE

Trace is a compiler option that causes a subroutine call to be inserted at the beginning
and end of each of your functions. You have to implement the two routines (%__BP and
%__EP), and then the compiler inserts a JSR %__BP just after the LINK instruction
and a JSR %__EP just before UNLK. This gives you a hook into every procedure that's
compiled, which can be extremely useful. Like asserts, trace is debugging code and will
be compiled out of the final version.

AVAILABILITY
Trace is available in all the MPW compilers and in THINK Pascal. THINK C's profiler
can be configured and used in the same sort of way.

ERRORS CAUGHT
By being able to watch every call in your program as it's made, you can more easily
spot inefficiencies in your segmentation and your call chain: If two often-called
routines live in different segments, under low-memory situations you may be
swapping code to disk constantly. If you're redrawing your window 12 times during an
update event, you could probably snug things up a little and gain some performance.
You can watch the stack depth change, monitor memory usage and free space, and so on.
Think up specific flow-of-control questions to ask and then tailor your routines to
answer them. Expect to generate far more data than you can look at. Really get to know
your program. Go wild.
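
As a rough idea of what your two routines might do, the sketch below counts calls and tracks the deepest nesting seen. It's plain C, so the hooks are called by hand; TraceEnter and TraceExit are hypothetical stand-ins for the %__BP and %__EP routines that the compiler would call for you, and Factorial is just something to trace.

```c
/* Bookkeeping kept by the hypothetical trace hooks. */
static int  gDepth = 0;      /* current call depth */
static int  gMaxDepth = 0;   /* deepest nesting seen so far */
static long gCalls = 0;      /* total number of routine entries */

/* Stand-in for %__BP: runs on entry to every traced routine. */
static void TraceEnter(void)
{
    gCalls++;
    if (++gDepth > gMaxDepth)
        gMaxDepth = gDepth;
}

/* Stand-in for %__EP: runs just before every traced routine returns. */
static void TraceExit(void)
{
    gDepth--;
}

/* Example traced routine: a recursive factorial. */
long Factorial(int n)
{
    long result;

    TraceEnter();
    result = (n <= 1) ? 1 : n * Factorial(n - 1);
    TraceExit();
    return result;
}
```

From here it's a short step to logging routine names, checking free memory on every entry, or whatever else answers the question you're asking.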

CODE SAMPLE

PROCEDURE HooHah;
VAR
    localArray: ARRAY[1..2500] OF LongInt;
BEGIN
    . . .
END; {HooHah}

Once again, we're building a stack that's too big for a Macintosh Plus. The stack sniffer
will catch it eventually, but since VBL tasks don't run very often, you may be far away
by then. Trace could watch for it at each JSR and catch it immediately.

USEFUL COMBINATIONS

All these techniques are powerful by themselves, but they're even better when used in
combination. Use them as early and as often as you can. Some of them are a bit of
trouble, but that smidgen of extra work is paid back many times over in the time saved
by not having to track down the stupid bugs. Use them throughout development, right
up to the end. Many bugs show up through interactions that only begin near the end of
the process. Diligent use of these techniques is guaranteed to find many of the easy
bugs, so you can spend your time finding the hard ones, which is much more
interesting and worthwhile.

OK, now armed to the teeth with useful techniques, you're ready to stomp bugs. You
know what to look for and how to flush them out. But you know what? Debugging is still
hard.

THE INSIDE STORY OF THE DEBUGGER

BY STEVE JASIK

WHY WRITE A DEBUGGER
Since I didn't have the right connections for selling illegal drugs, I had to consider the
alternative of selling legal addictive drugs to Macintosh developers.

OK, seriously, I wanted to learn about the 68000 architecture. Given my experience
writing compilers and code generators for superscalar RISC mainframes, I decided to
write a disassembler for and on the Macintosh. I introduced my first product,
MacNosy, in January 1985. It allowed a fair number of developers to discover the
innards of the Macintosh ROMs, as well as to curse at me for its original TTY interface.

Unhappy with the state of Macintosh debuggers, I decided to write one of my own, using
MacNosy as a foundation. The resulting product, The Debugger, made its international
debut in London in November 1986. Since then, it's been expanded into a system
debugger (it runs at INIT time and is available to debug any process) and has gained
an incremental linker for MPW-compiled programs, among other features.

THE MACINTOSH INTERFACE
The Debugger uses the Macintosh user interface, or at least my interpretation of it.
The windows, menus, dialogs, and text processing are standard for the Macintosh.

The only real problem was the switch in context. I had to swap in all of low memory
($0 to $1E00 on a Macintosh II-class machine). This may appear to be a bit
expensive, but in comparison with the screen swap, which is a minimum of 22K on a
small-screened Macintosh, it's trivial. The biggest problem in this area is that some
of the values have to be "cross-fertilized" between worlds, and many of the
low-memory globals are not documented.

Using the Macintosh interface became a royal pain as the System 7 group extended the
system in such a way that the basic ROM code assumed the existence of a Layer Manager
and MultiFinder functions. In many cases, I had to "unpatch" the standard code and
substitute my own in order to keep The Debugger functional.

MMU PROTECTION
MMU protection was initially designed so that The Debugger would try to protect the
system from destruction no matter what program was running. As we implemented the
design, we found that this goal was impossible because many of the applications (MPW
Shell, ResEdit, Finder) diddled with the system heap. I ended up protecting the rest of
the system only when an application that's being debugged is running.

EASE OF USE
Users have had an influence on the design and feature set in The Debugger. For
example, the initial version of the watchpoint (memory watch) command was very
simple. When a user pointed out the usefulness of an auto reset feature in the
command, we added it.

I've tried to use simple commands for the most frequently performed operations in The
Debugger. The idea has been to make common things easy to do. Some of the more
complicated operations are difficult to keep simple, as the scripting capability is
limited. SADE, in contrast, has an extensive scripting capability but is cumbersome to
use.

TMON, THEN AND NOW

BY WALDEMAR HORWAT

The first version of TMON was released in late 1984. TMON was a summer project for
me at TMQ Software when I was a junior in high school. I wrote it because I was
dreaming about a one-Macintosh debugger (MacsBug required a terminal at the time)
that had a direct-manipulation user interface. Direct manipulation meant more than
just having windows--it meant you would be able to change memory or registers
simply by typing over your values, assemble instructions by typing in a disassembly
window, and so on.

THE ORIGINAL TMON
Memory constraints of the Macintosh 128K forced me to write TMON entirely in
assembly language--the original version used only 16K plus a little additional
memory to save the screen. TMON used its own windowing system to avoid reentrancy
problems with debugging programs that call the system. TMON also included a "User
Area," a block of code that could extend TMON. The source code was provided for the
standard user areas, and Darin Adler took great advantage of this facility to add
numerous features to TMON in his Extended User Area.

Writing TMON took a little ingenuity. I didn't have anything that could debug it, so I
wrote the entire program, assembled it, ran it on a Macintosh, and watched it crash.
After a couple of dozen builds, I got it to display its menu bar on the screen. By about
build 100, I had a usable memory dump window that I could then use to debug the rest
of TMON.

TMON PRO
Improving a program written entirely in tight assembly language designed for a
Macintosh 128K became intractable, so I switched to MPW C++. Version 3.0 of TMON
(TMON Pro) is written half in assembly language and half in C++. Using C++ turned
out to be one of the best ways to debug a program: C++ features such as constructors
and destructors prevented a lot of pesky programming errors. The downside of using a
high-level language is that code size grows explosively--TMON 3.0's code is about ten
times larger than TMON 2.8's.

When writing TMON 3.0, I reevaluated earlier design decisions. I opted to continue to
concentrate on debugging at the assembly language level for two reasons. First, there
are many bugs that can arise on a Macintosh that pure source-level debuggers can't
handle. Second, I find that I use TMON at least as much for learning about the Macintosh
as I do for debugging.

I sometimes wish I could use the Macintosh windows in TMON. Nevertheless, I decided
to remain with TMON's custom windows for reasons of safety. Until the Macintosh has a
real reentrant multitasking system that can switch to another task at any point in the
code, writing such a debugger would either make it prone to crashing if it was entered
at the wrong time or require the debugger to be more dependent on undocumented
operating system internals than I like.

I found that writing TMON 3.0 was much harder and took much longer than writing the
original TMON. Part of this was due to the second-system effect--the product just
kept on growing over time. Nevertheless, I also found that writing TMON 3.0 was
difficult because of the loss of the Macintosh "standard." There are now over a dozen
Macintosh models, using the 68000 through the 68040, some with third-party
accelerators, various ROM versions, 24- and 32-bit mode, virtual memory, several
versions of the operating system, and numerous INITs, patches, video cards, and other
configuration options. These options present unique challenges to a low-level debugger
such as TMON, which must include special code for many of them.

Despite the frustration, I think that writing TMON was worth it--it made many
developers' lives easier. I plan to continue to evolve TMON in the future and
incorporate suggestions for improvements.

A WORD TO THE WISE FROM FRED

What we've described in this article are a number of tools for doing Macintosh
software development. Some of you are about to say, "Oh, those sound really great, but
I don't have time to use them--I'm about to ship," or whatever. I'd like to tell you a
story that a man of sound advice, Jim Reekes, told me: A young boy walked into a room
and saw a man pushing a nail into the wall with his finger. The boy asked him, "Hey,
mister, why don't you go next door and get a hammer?" The man replied, "I don't have
time." So the boy went next door, got a hammer, and came back. The man was still
pushing the nail into the wall with his finger. So the boy hit the man in the head with
the hammer, killed him, and took the nail.

BO3B JOHNSON AND FRED HUXHAM didn't want a bio, except to say that they are
cohosts of "Lunch with Bo3b and Fred." We also feel compelled to tell you that in
Bo3b's name, the "3" is silent. *

THIRD-PARTY COMPATIBILITY TEST LAB Apple maintains a Third-Party
Compatibility Test Lab for the use of Apple Associates and Partners. The Lab features
many preconfigured domestic and international systems, extensive networking
capabilities, support from staff engineers, and so on. If you're an Apple Associate or
Partner, and you'd like to make a test-session appointment or get more information,
contact Carol Lockwood at (408)974-5065 or AppleLink LOCKWOOD1. Or you can
write to Apple Third-Party Test Lab, Apple Computer, Inc., 20525 Mariani Avenue
M/S 35-BD, Cupertino, CA 95014.*

RELATED READING Debugging Macintosh Software with MacsBug by Konstantin
Othmer and Jim Straus (Addison-Wesley, 1991) and How to Write Macintosh
Software by Scott Knaster (Hayden Books, 1988).*

THANKS TO OUR TECHNICAL REVIEWERS Jim Friedlander, Pete Helme, Jim
Reekes*
