The new PCI-based Power Macintosh computers bring with them a subset of the
functionality to be offered by the next generation of I/O architecture. New support for
device drivers makes it possible to develop forward-compatible drivers for PCI
devices, while at the same time making them much easier to write and eliminating
their dependence on the Macintosh Toolbox. Key features of the new driver model are
described in this article and illustrated by the accompanying sample PCI device driver.
Writing Macintosh device drivers has always been something of a black art. Details of
how to do it are hidden in obscure places in the documentation and often discovered only
by developers willing to disassemble Macintosh ROMs and system files. But this art
that's flourished for more than a decade is about to get a lot less arcane.
The PCI-based Power Macintosh computers are the first of a new generation of
computers with support for a driver model that's independent of the 68000 processor
family and the Macintosh Toolbox. Existing 680x0 drivers will continue to work on
the PCI machines (although this may not be true for future systems); a third-party
NuBus™ adapter enables the use of existing hardware devices and drivers without
change. But drivers for PCI hardware devices must be written in accordance with the
driver model supported in the new system software release, which makes them
simpler to develop and maintain.
This article will give you an overview of the new device driver model, without
attempting to cover everything (which would fill a book and already has). After
discussing key features, it suggests how you might go about converting an existing
driver to drive a PCI device. The remainder of the article looks at some of the
individual parts of a forward-compatible PCI device driver. The sample code excerpted
here and included in its entirety on this issue's CD offers a complete device driver that
illustrates most of the features of the new driver model. Of course, you won't be able to
use the driver without the hardware, and you'll need updated headers and libraries to
recompile it.
How to write device drivers for PCI-based Macintosh computers is explained in
detail in Designing PCI Cards and Drivers for Power Macintosh Computers.*
The following list of features will give you some idea of the rationale behind the move
away from a device driver architecture that's served the Macintosh operating system
for more than a decade. Some of these features address problems of the old
architecture, while some anticipate new requirements.
A simplified set of driver services independent of the Macintosh Toolbox
The existing Device Manager design is closely tied to specific features of the
Macintosh Toolbox. The new system software release supports only a small set of
driver services, which are independent of the Toolbox and are limited to just those
things that drivers need to do; they don't let drivers display dialogs, open files, read
resources, or draw on the screen. This greatly simplifies both the driver's task (the
driver interacts only with the actual hardware) and the operating system's task (the
OS needn't have a file system or screen available when starting up drivers).
Independence from the 68000 processor family
The old device driver architecture is highly dependent on specific features of the
680x0 processor architecture. For example, the way code segments are organized and
the conventions for passing parameters depend on the 680x0 architecture and make
the old driver code different from other code modules. This means that drivers can't be
written in native PowerPC code -- or must make use of computationally expensive
mixed-mode switches.
Also, in the 680x0 architecture, critical sections and atomic operations use
assembly-language sequences to disable interrupts. The PowerPC processor has a
completely different interrupt structure, effectively making these techniques
impossible to transport directly to native PowerPC code.
In the new system software, support for the driver model is independent of any
particular processor, hiding processor-specific requirements in operating system
libraries. Drivers can be compiled into native PowerPC code and can be written in a
high-level language such as C. Because they're standard PowerPC code fragments, they
aren't bound by the segment size limitations of the 680x0 architecture; they can be
created with standard compilers and debugged with the Macintosh two-machine
debugger.
A more flexible configuration facility
Driver configuration in the old architecture requires the ability to read resources
from a parameter file, or from a 6-byte nonvolatile RAM area indexed by NuBus slot.
These ad hoc configuration mechanisms based on the Resource Manager, File Manager,
and Slot Manager are replaced in the new system software by a more flexible
configuration facility that's used throughout the system.
Drivers use a systemwide name registry for configuration. Each device has an entry in
the Name Registry containing properties pertinent to that device. Device drivers can
also store and retrieve private properties. Device configuration programs (control
panels and utility applications) should use the registry to set and retrieve device
parameters.
System-independent device configuration
Devices can use Open Firmware to provide operating system configuration as well as
system-independent bootstrap device drivers. Open Firmware is an
architecture-independent IEEE standard for hardware devices based on the FORTH
language. When the system is started up, it executes functions stored in each device's
expansion ROM that provide parameters to the system. A device can also provide FORTH
code to allow the system to execute I/O operations on the device. This means a card can
be used to bootstrap an operating system without having operating system-specific
code in its expansion ROM.
Open Firmware and the bootstrap process are described in detail in IEEE
document 1275 -- 1994 Standard for Boot (Initialization, Configuration) Firmware.*
Grouping by family
Drivers are grouped into general families, and family-specific libraries simplify
their common tasks. Currently, four families are defined: video, communications,
SCSI (through SCSI Manager 4.3), and NDRV (a catch-all for other devices, such as
data acquisition hardware). The sample code is for a device driver in the NDRV family.
Direct support for important capabilities
The existing Device Manager doesn't directly support certain capabilities, such as
concurrent I/O (required by network devices) and driver replacement. Driver
writers who need these capabilities have had to implement them independently, which
is difficult, error-prone, and often dependent on a particular operating system
release. The new system software supports these capabilities in a consistent manner.
A choice of storage
Drivers can be stored in the hardware expansion ROM or in a file of type 'ndrv' in the
Extensions folder. A later driver version stored in this folder can replace an earlier
version stored in the hardware expansion ROM.
Forward compatibility
Device drivers written for the new system software will run without modification
under Copland, the new generation of the Mac OS forthcoming from Apple, if they use
only the restricted system programming interface and follow the addressing guidelines
in Designing PCI Cards and Drivers for Power Macintosh Computers.
For more on Copland, see "Copland: The Mac OS Moves Into the Future" in this
issue of develop.*
To illustrate how you'd go about converting an existing device driver to drive a PCI
device, let's suppose you've developed a document scanner with an optical character
recognition (OCR) facility. The document scanner is currently controlled by a NuBus
board that you designed, and you're building a PCI board to support the scanner on
future Macintosh machines.
A useful way to approach the conversion effort is to conceptualize the device driver as
consisting of three generally independent layers:
At the same time, you might also organize the code in each of these three layers into the
following functional groups:
Let's look at what you would do to each of these layers and groups.
First, you would throw out the high-level component in your driver that interacts
with the Device Manager and replace it with the considerably simpler request
processing of the new system software release. You would need to add support for the
Initialize, Finalize, Superseded, and Replace commands (discussed later), as they have
no direct counterpart in the existing Device Manager. You would also need to revise the
way you complete an I/O request: instead of storing values in 68000 registers and
jumping to jIODone, your driver would call IOCommandIsComplete.
The mid-level component in your driver would include scanner management and, in
particular, OCR algorithms. These algorithms comprise the intelligence that sets your
product apart from its competition. To convert your driver to a PCI device driver, you
would recompile (or rewrite) the algorithms for the PowerPC processor. If the
algorithms were in 68000 assembly language, you could get started by making
mixed-mode calls between the new driver and the existing functions; however, this
won't work with Copland, and I would recommend "going native" as soon as possible.
You would replace the low-level bus interface that manipulates registers on a NuBus
card with code that manipulates PCI registers. Because this is specific to a particular
hardware device, it won't be discussed in this article, but the sample driver on the CD
shows you how to access PCI device registers.
You would also create Open Firmware boot code to allow your card to be recognized
during system initialization. Because the new driver model doesn't use Macintosh
Toolbox services, you would have to redesign your driver to (1) use the Name
Registry for configuration instead of resources and parameter files, and (2) use the
new timer services, replacing any dependency on the accRun PBControl call (the
sample code shows how to call timer services, although it's not discussed here).
How your new driver code would look will become clearer in the next sections, where
we examine key parts of the sample device driver. To get the whole picture, see the
sample driver in its entirety on the CD.
The remainder of this article introduces a number of new operating system functions,
as well as a few new libraries, managers, and such. "A Glossary of New Operating
System Terms" will help you navigate through the new territory.
A GLOSSARY OF NEW OPERATING SYSTEM TERMS
CheckpointIO. A function that releases memory that had been configured by
PrepareMemoryForIO.
DoDriverIO. A function provided by the driver that carries out all device driver
tasks. When you build a driver, it must export this function to the Device Manager.
DriverDescription. An information block named TheDriverDescription that the
Driver Loader Library uses to connect a device driver with its associated hardware.
When you build a driver, it must export this block to the Driver Loader Library.
Driver Loader Library. A library of functions used by the Device Manager to
locate and initialize all drivers. It uses the DriverDescription structure to match a
driver with the hardware actually present on a machine.
Driver Services Library. A family-independent library of driver services
limited to just those things that drivers need to do.
Expansion Bus Manager. A library that provides access to PCI configuration
registers.
GetInterruptFunctions. A function that retrieves the current interrupt service
functions established for this device.
GetLogicalPageSize. A function that retrieves the size of the physical page.
Normally called once when the driver is initialized.
InstallInterruptFunctions. A function that replaces the current interrupt
functions with functions specific to this device driver.
IOCommandIsComplete. A function that completes the current request by returning
the final status to the caller, calling an I/O completion routine if provided, and
starting the next transfer if necessary.
MemAllocatePhysicallyContiguous. A function that allocates a contiguous block
of memory whose address can be passed, as a single unit, to a hardware device. This is
essential for frame buffers and similar memory areas that must be accessed by both
the CPU and an external device.
Name Registry. A database that organizes all system configuration information. Each
device's entry in the registry contains a set of properties that can be accessed with
RegistryPropertyGet and RegistryPropertyGetSize.
PoolAllocateResident. A function that allocates and optionally clears memory in the
system's resident pool. This replaces NewPtrSys, which isn't available to
forward-compatible PCI device drivers.
PoolDeallocate. A function that frees memory allocated by PoolAllocateResident.
PrepareMemoryForIO. A function that converts a logical address range to a set of
physical addresses and configures as much as possible of the corresponding physical
memory space for subsequent direct memory access.
QueueSecondaryInterrupt. A function that runs a secondary interrupt service
routine at a noninterrupt level.
RegistryPropertyGet, RegistryPropertyGetSize. Functions that retrieve,
respectively, the contents and the size of a property, given its name and a value that
identifies the current Name Registry entity.
Software task. An independently scheduled software module that can call driver
services, including PrepareMemoryForIO. Software tasks can be used to replace
time-based processing that previously used the PBControl accRun service.
SynchronizeIO. A function that executes the processor I/O synchronize (eieio)
instruction.
Now we'll look at key pieces of the sample driver, starting with the code for
configuration and control. As mentioned earlier, the sample driver is a member of the
NDRV family. To the operating system, an NDRV driver is a PowerPC code fragment
containing two exported symbols: TheDriverDescription and DoDriverIO. (Although all
drivers have a TheDriverDescription structure, the particular driver family they
belong to determines which other exported symbols are required.)
TheDriverDescription is a static structure, shown in Listing 1, that provides
information to the operating system about the device that this driver controls. The
driver will be loaded only if the device is present. TheDriverDescription also indicates
whether the driver is controlled by a family interface (such as Open Transport for the
communications family) and specifies the driver name to be used by operating system
functions to refer to it. The Driver Loader extracts TheDriverDescription from the
code fragment before the driver executes; thus it must be statically initialized.
Listing 1. TheDriverDescription
DriverDescription TheDriverDescription = {
    /* This section lets the Driver Loader identify the structure
       version. */
    kTheDescriptionSignature,
    kInitialDriverDescriptor,
    /* This section identifies the PCI hardware. It also ensures
       that the correct revision is loaded. */
    "\pMyPCIDevice",                        /* Hardware name */
    kMyPCIRevisionID, kMyVersionMinor,
    kMyVersionStage, kMyVersionRevision,
    /* These flags control when the driver is loaded and opened,
       and control Device Manager operation. They also name the
       driver to the operating system. */
    (   (1 * kDriverIsLoadedUponDiscovery)  /* Load at system startup */
      | (1 * kDriverIsOpenedUponLoad)       /* Open when loaded */
      | (0 * kDriverIsUnderExpertControl)   /* No special family expert */
      | (0 * kDriverIsConcurrent)           /* Driver isn't concurrent */
      | (0 * kDriverQueuesIOPB)             /* No internal IOPB queue */
    ),
    "\pMyDriverName",                       /* PBOpen name */
    0, 0, 0, 0, 0, 0, 0, 0,                 /* For future use */
    /* This is a vector of operating system information, preceded by
       an element count (here, only one service is provided). */
    1,                                      /* Number of OS services */
    kServiceTypeNdrvDriver,                 /* This is an NDRV driver */
    kNdrvTypeIsGeneric,                     /* Not a special type */
    kVersionMajor, kVersionMinor,           /* NumVersion information */
    kVersionStage, kVersionRevision
};
DoDriverIO is a single function called with five parameters to perform all driver
services (see Table 1). The overall organization of the driver thus is very simple, as
shown in Listing 2.
Table 1. DoDriverIO parameters
| Parameter Type | Usage |
| addressSpaceID | Used for operating system memory management. Currently, only one address space is supported; future systems will support multiple address spaces. |
| ioCommandID | Uniquely identifies this I/O request. The driver passes it back to the operating system when the request completes. |
| ioCommandContents | Varies depending on the ioCommandCode value. For example, for Read, Write, Control, Status, and KillIO commands, it's a pointer to a ParamBlockRec. |
| ioCommandCode | Defines the type of I/O request. |
| ioCommandKind | Specifies whether the command is synchronous or asynchronous, and whether it's immediate. |
Listing 2. DoDriverIO
OSErr DoDriverIO(AddressSpaceID addressSpaceID,
                 IOCommandID ioCommandID,
                 IOCommandContents ioCommandContents,
                 IOCommandCode ioCommandCode,
                 IOCommandKind ioCommandKind)
{
    OSErr   status;
    switch (ioCommandCode) {
        case kInitializeCommand:
            status = DriverInitialize(ioCommandContents.initialInfo);
            break;
        case kFinalizeCommand:
            status = DriverFinalize(ioCommandContents.finalInfo);
            break;
        case kSupersededCommand:
            status =
                DriverSuperseded(ioCommandContents.supersededInfo);
            break;
        case kReplaceCommand:
            status = DriverReplace(ioCommandContents.replaceInfo);
            break;
        case kOpenCommand:
            status = DriverOpen(ioCommandContents.pb);
            break;
        case kCloseCommand:
            status = DriverClose(ioCommandContents.pb);
            break;
        case kReadCommand:
            status = DriverRead(addressSpaceID, ioCommandID,
                        ioCommandKind, ioCommandContents.pb);
            break;
        case kWriteCommand:
            status = DriverWrite(addressSpaceID, ioCommandID,
                        ioCommandKind, ioCommandContents.pb);
            break;
        case kControlCommand:
            status = DriverControl(addressSpaceID, ioCommandID,
                        ioCommandKind,
                        (CntrlParam *) ioCommandContents.pb);
            break;
        case kStatusCommand:
            status = DriverStatus(addressSpaceID, ioCommandID,
                        ioCommandKind,
                        (CntrlParam *) ioCommandContents.pb);
            break;
        case kKillIOCommand:
            status = DriverKillIO();
            break;
        default:
            /* Unknown command code: don't leave status
               uninitialized. */
            status = paramErr;
            break;
    }
    /* Force a valid result for immediate commands. Other commands
       return noErr if the operation completes asynchronously. */
    if ((ioCommandKind & kImmediateIOCommandKind) == 0) {
        if (status == kIOBusyStatus)    /* Our "in progress" value */
            status = noErr;             /* I/O will complete later */
        else
            /* To prevent a subtle race condition, the driver must
               not store final status in the caller's parameter
               block. This prevents a problem where the caller can
               reuse the parameter block before the caller's
               completion routine is called. */
            status = IOCommandIsComplete(ioCommandID, status);
    }
    return (status);
}
The driver must ensure that immediate operations (those that must complete without
delay) return directly to the caller and that completed synchronous and asynchronous
requests call IOCommandIsComplete. (The sample driver handler functions return the
final status if they handled the request, and a private value, kIOBusyStatus, if an
asynchronous interrupt will eventually complete the operation.)
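The completion rule at the end of Listing 2 can be exercised on its own. What follows is a minimal stand-alone sketch, not code from the sample driver: the Sketch types, the constant values, and MockIOCommandIsComplete are hypothetical stand-ins chosen so that the fragment compiles without the driver headers, and the mock simply assumes that the completion call hands back the status it was given.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real types and constants come from the
   driver headers and will have different values. */
typedef int16_t  SketchErr;
typedef uint32_t SketchCommandKind;

enum {
    kSketchNoErr         = 0,
    kSketchBusyStatus    = 1,   /* private "in progress" value */
    kSketchImmediateKind = 0x4  /* assumed flag bit for immediate commands */
};

int       gCompletionCalls = 0; /* how often the mock was called */
SketchErr gLastCompleted   = 0; /* last status passed to the mock */

/* Mock of IOCommandIsComplete: records the final status and (by
   assumption) returns it unchanged. */
SketchErr MockIOCommandIsComplete(uint32_t commandID, SketchErr status)
{
    (void) commandID;
    gCompletionCalls++;
    gLastCompleted = status;
    return status;
}

/* Apply the rule from Listing 2 to a handler's result. */
SketchErr FinishRequest(uint32_t commandID, SketchCommandKind kind,
                        SketchErr handlerStatus)
{
    if ((kind & kSketchImmediateKind) != 0)
        return handlerStatus;       /* immediate: return directly */
    if (handlerStatus == kSketchBusyStatus)
        return kSketchNoErr;        /* an interrupt completes it later */
    return MockIOCommandIsComplete(commandID, handlerStatus);
}
```

An immediate request never reaches the completion call, a busy handler result is reported as noErr, and every other result goes through the completion call exactly once.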
In the sample driver, individual subroutines carry out the functions. I'll describe the
administration routines first, then the process of carrying out an I/O operation.
Currently, drivers perform all of their initialization when called with PBOpen and
generally ignore PBClose. The new system software provides six commands for
initialization and termination, as shown in Table 2. Since drivers are code fragments,
they can also use the Code Fragment Manager initialization and termination routines,
although this probably isn't necessary.
For details on the Code Fragment Manager, see Inside Macintosh: PowerPC
System Software.*
Table 2. Driver commands for initializing and terminating
| ioCommandCode Value | Usage |
| kInitializeCommand | Carries out normal initialization. Called once when the driver is first loaded. |
| kReplaceCommand | Indicates that this driver is replacing a currently loaded driver for the device (for example, a ROM driver is being replaced by a driver loaded from the system disk). |
| kOpenCommand | Begins servicing of device requests. |
| kCloseCommand | Stops servicing of device requests. |
| kSupersededCommand | Indicates that this driver will be replaced by another. |
| kFinalizeCommand | Shuts down the device and releases all resources. Called once just before the driver is to be unloaded. |
When you look at the sample driver, you'll see that most of the work is done by Replace
and Superseded, with Open and Close having no function there.
Here are the tasks that a driver needs to perform when initialized, whether by
Initialize or Replace:
Listing 3 shows how to extract the physical addresses of your device and use the
"AAPL,address" property to get the corresponding logical addresses. Unlike address
space assignments on NuBus machines, where the slot number directly corresponds to
the device's 32-bit address range, PCI address space assignments are dynamic. Devices
define a set of registers, and the system initialization process (Open Firmware) uses
this information, together with information about buses and PCI bridges, to bind the
device to its 32-bit physical address range. (Actually, although addresses use 32 bits,
the low 23 bits select the physical address, while the high 9 bits select between main
memory and PCI bus address spaces. The device driver uses the logical address to
reference device registers.) Open Firmware code updates the Name Registry to show
the device's binding. Note that the driver must search for the required address register
and can't rely on any particular address being in a specific location within the
property.
Listing 3. Fetching the device's logical address range
typedef struct AssignedAddress {
    UInt32  cellHi;                 /* Address type */
    UInt32  cellMid;
    UInt32  cellLow;
    UInt32  sizeHi;
    UInt32  sizeLow;
} AssignedAddress, *AssignedAddressPtr;
#define kAssignedAddressProperty    "assigned-addresses"
#define kAAPLAddressProperty        "AAPL,address"
#define kIOMemSelectorMask          0x03000000
#define kIOSpaceSelector            0x01000000
#define kMemSpaceSelector           0x02000000
#define kDeviceRegisterMask         0x000000FF
OSErr GetDeviceAddress(UInt32 selector, UInt32 deviceRegister,
                       LogicalAddress *logicalAddress)
{
    OSErr                   status;
    RegPropertyValueSize    size;
    AssignedAddressPtr      addressPtr;
    LogicalAddress          *logicalAddressVector;
    int                     nAddresses, i;
    UInt32                  cellHi;
    addressPtr = NULL;
    logicalAddressVector = NULL;
    status = GetThisProperty(kAssignedAddressProperty,
                (RegPropertyValue *) &addressPtr, &size);
                /* See Listing 6. */
    if (status == noErr) {
        /* GetThisProperty returned a vector of assigned-address
           records. Search the vector for the desired address
           type. */
        status = paramErr;      /* Presume "no such address." */
        nAddresses = size / sizeof (AssignedAddress);
        for (i = 0; i < nAddresses; i++) {
            cellHi = addressPtr[i].cellHi;
            if ((cellHi & kIOMemSelectorMask) == selector
             && (cellHi & kDeviceRegisterMask) == deviceRegister) {
                if (addressPtr[i].sizeLow == 0)
                    /* Open Firmware was unable to assign an address
                       to this memory area. We must return an error
                       to prevent the driver from starting up (status
                       is still paramErr). */
                    break;
                /* This is the desired address space. Find the
                   corresponding LogicalAddress by resolving the
                   "AAPL,address" property. We want the i'th
                   LogicalAddress in the vector. */
                status = GetThisProperty(kAAPLAddressProperty,
                            (LogicalAddress *) &logicalAddressVector,
                            &size);
                if (status == noErr) {
                    nAddresses = size / sizeof (LogicalAddress);
                    if (i < nAddresses)
                        *logicalAddress = logicalAddressVector[i];
                    else status = paramErr;
                }
                break;          /* Exit the for loop. */
            }   /* Check for the requested register. */
        }   /* Loop over all address spaces. */
        DisposeThisProperty((RegPropertyValue *) &addressPtr);
        DisposeThisProperty
            ((RegPropertyValue *) &logicalAddressVector);
    }   /* If we found our "assigned-addresses" property */
    return (status);
}
When the driver reads the "assigned-addresses" property, it looks at the address type
(I/O or memory) and may also need to examine other information to make sure the
address range is appropriate. For example, a device may have two memory address
ranges -- one for the device's registers and a separate range for its on-card
firmware. The GetDeviceAddress function in Listing 3 uses the register number to
determine which of several address ranges to use, but this may not work for all
hardware. This function also resolves the logical address range that corresponds to the
device's physical address range using an Apple-specific property that records device
logical addresses. This is important for devices that require I/O cycles: using the
logical address lets the driver treat these devices as if they used normal memory
addresses, eliminating the overhead of the Expansion Bus Manager routines.
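The matching step in GetDeviceAddress can be tried without a Name Registry. The sketch below reuses the mask values from Listing 3 but is otherwise hypothetical: the SketchAssignedAddress type, the FindAssignedAddress helper, and the three sample records (including their cellHi values, addresses, and sizes) are invented for illustration and don't describe any real card.

```c
#include <stddef.h>
#include <stdint.h>

/* Mask values copied from Listing 3. */
#define kIOMemSelectorMask   0x03000000u
#define kIOSpaceSelector     0x01000000u
#define kMemSpaceSelector    0x02000000u
#define kDeviceRegisterMask  0x000000FFu

/* Same layout as AssignedAddress, using standard C integer types. */
typedef struct SketchAssignedAddress {
    uint32_t cellHi, cellMid, cellLow, sizeHi, sizeLow;
} SketchAssignedAddress;

/* Invented sample records. */
const SketchAssignedAddress kSampleVector[3] = {
    { 0x81008010u, 0, 0x00000400u, 0, 0x00000100u }, /* I/O, reg 0x10 */
    { 0x82008014u, 0, 0x80800000u, 0, 0x00001000u }, /* memory, reg 0x14 */
    { 0x82008018u, 0, 0x00000000u, 0, 0x00000000u }  /* memory, unassigned */
};

/* Return the index of the record that matches both the address-space
   selector and the register number, or -1 if there is no such record
   or no address was assigned to it (sizeLow is zero). */
int FindAssignedAddress(const SketchAssignedAddress *vector, size_t count,
                        uint32_t selector, uint32_t deviceRegister)
{
    size_t i;
    for (i = 0; i < count; i++) {
        uint32_t cellHi = vector[i].cellHi;
        if ((cellHi & kIOMemSelectorMask) == selector
         && (cellHi & kDeviceRegisterMask) == deviceRegister)
            return (vector[i].sizeLow != 0) ? (int) i : -1;
    }
    return -1;
}
```

The index that comes back is the same index Listing 3 uses to pick the matching entry out of the "AAPL,address" vector, which is why the function returns a position rather than the record itself.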
Listing 4 shows how a driver might use the Expansion Bus Manager to enable a device
to become bus-master and respond to either memory or I/O accesses. It also shows how
to read a device register with the DeviceProbe function. While the actual values are
specific to the NCR 53C825 chip, the technique is generally useful. Note that the
command word was changed using a read-modify-write sequence.
Listing 4. Checking for the correct hardware device
OSErr InitializeMyHardware(void)
{
    OSErr   status;
    UInt8   ctest3;
    UInt16  commandWord;
    status = ExpMgrConfigReadWord(
                &gDeviceEntry,          /* kInitializeCommand param */
                (LogicalAddress) 0x04,  /* Command register */
                &commandWord);          /* Current chip values */
    if (status == noErr)
        status = ExpMgrConfigWriteWord(
                    &gDeviceEntry,          /* kInitializeCommand param */
                    (LogicalAddress) 0x04,  /* Command register */
                    commandWord | 0x0147);  /* New chip values */
    if (status == noErr)
        status = DeviceProbe(
                    gDeviceBaseAddress + 0x9B,
                                        /* Chip Test 3 register */
                    &ctest3,            /* Store value here */
                    k8BitAccess);
    if (status == noErr && (ctest3 & 0xF0) != 0x20)
        status = paramErr;              /* Wrong chip revision */
    return (status);
}
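The constant 0x0147 written to the command register in Listing 4 is easier to read when it's broken into the individual bits of the standard PCI command register. The following sketch does that; the kSketch names and the EnableCommandBits helper are hypothetical and aren't taken from the sample driver or from Apple's headers.

```c
#include <stdint.h>

/* Bits of the PCI configuration command register (offset 0x04).
   The names are invented for this sketch. */
enum {
    kSketchCmdIOSpace     = 0x0001, /* respond to I/O accesses */
    kSketchCmdMemorySpace = 0x0002, /* respond to memory accesses */
    kSketchCmdBusMaster   = 0x0004, /* device may act as bus master */
    kSketchCmdParityError = 0x0040, /* parity error response */
    kSketchCmdSERREnable  = 0x0100  /* SERR# driver enable */
};

/* The set of bits that Listing 4 turns on. */
#define kSketchCommandEnables                                        \
    (kSketchCmdIOSpace | kSketchCmdMemorySpace | kSketchCmdBusMaster \
     | kSketchCmdParityError | kSketchCmdSERREnable)

/* The "modify" step of the read-modify-write sequence: turn on the
   enables without disturbing bits that are already set. */
uint16_t EnableCommandBits(uint16_t currentCommandWord)
{
    return (uint16_t) (currentCommandWord | kSketchCommandEnables);
}
```

Because the new value is ORed into the value just read, any bit that system initialization had already set in the command word survives the update.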
The code for initializing the interrupt service routine, including connecting the
primary interrupt service routine to the operating system, is shown in Listing 5. This
code installs a single interrupt handler; if your device supports multiple interrupts
(for example, if it supports several serial lines), you may want to use the new
interrupt management routines in the Driver Services Library to build a hierarchy of
interrupt service routines.
Listing 5. Initializing the interrupt service routine
#define kInterruptSetProperty   "driver-ist"
OSErr InitializeInterruptServiceRoutine(void)
{
    OSErr                   status;
    OSStatus                osStatus;
    RegPropertyValueSize    size;
    InterruptSetMember      *interruptSetMember;
    status = GetThisProperty(kInterruptSetProperty,
                (RegPropertyValue *) &interruptSetMember, &size);
    if (status == noErr) {
        if (size < sizeof (InterruptSetMember)) {
            DisposeThisProperty
                ((RegPropertyValue *) &interruptSetMember);
            status = paramErr;
        }
    }
    if (status == noErr) {
        /* We have the interrupt set ID and member number. Save the
           current interrupt set and get the current functions for
           this interrupt set. */
        gInterruptSetMember = *interruptSetMember; /* Save globally */
        DisposeThisProperty
            ((RegPropertyValue *) &interruptSetMember);
        osStatus = GetInterruptFunctions(gInterruptSetMember.setID,
                        gInterruptSetMember.member,
                        &gOldInterruptSetRefCon,
                        &gOldInterruptServiceFunction,
                        &gOldInterruptEnableFunction,
                        &gOldInterruptDisableFunction);
        if (osStatus != noErr)
            status = paramErr;
    }
    if (status == noErr) {
        /* We have the information we need. Install our own interrupt
           handler function. If successful, call the old enabler to
           enable interrupts (we don't install a private enabler). */
        osStatus = InstallInterruptFunctions(
                        gInterruptSetMember.setID,
                        gInterruptSetMember.member,
                        NULL,       /* No refCon */
                        DriverInterruptServiceRoutine,
                                    /* See Listing 11. */
                        NULL,       /* No new enable function */
                        NULL);      /* No new disable function */
        if (osStatus != noErr)
            status = paramErr;
    }
    if (status == noErr)
        (*gOldInterruptEnableFunction)(gInterruptSetMember,
            gOldInterruptSetRefCon);
    return (status);
}
Interrupt management routines are described in Chapter 9 of Designing PCI
Cards and Drivers for Power Macintosh Computers.*
GetThisProperty (Listing 6) is a generic utility function that retrieves a property
from the Name Registry, storing its contents in the system's resident memory pool.
This is useful for retrieving configuration information. The driver must, of
course, return the memory to the pool when it's no longer needed, using
DisposeThisProperty, also shown in Listing 6.
Listing 6. Retrieving properties from the Name Registry
OSErr GetThisProperty(RegPropertyNamePtr regPropertyName,
                      RegPropertyValue *resultPropertyValuePtr,
                      RegPropertyValueSize *resultPropertySizePtr)
{
    OSErr                   status;
    RegPropertyValueSize    size;
    *resultPropertyValuePtr = NULL;
    status = RegistryPropertyGetSize(
                &gDeviceEntry,      /* kInitializeCommand param */
                regPropertyName,
                &size);
    if (status == noErr) {
        *resultPropertyValuePtr =
            (RegPropertyValue) PoolAllocateResident(size, FALSE);
        if (*resultPropertyValuePtr == NULL)
            status = memFullErr;
    }
    if (status == noErr) {
        status = RegistryPropertyGet(
                    &gDeviceEntry,  /* kInitializeCommand param */
                    regPropertyName,
                    *resultPropertyValuePtr,
                    &size);
        if (status != noErr)
            DisposeThisProperty(resultPropertyValuePtr);
    }
    if (status == noErr)
        *resultPropertySizePtr = size;      /* Success! */
    return (status);
}
/* DisposeThisProperty disposes of a property that was obtained by
   calling GetThisProperty. Note that applications would call
   DisposePtr instead of PoolDeallocate. */
void DisposeThisProperty(RegPropertyValue *regPropertyValuePtr)
{
    if (*regPropertyValuePtr != NULL) {
        PoolDeallocate(*regPropertyValuePtr);
        *regPropertyValuePtr = NULL;
    }
}
Applications can use the functions in Listing 6 but must replace calls to
PoolAllocateResident and PoolDeallocate with calls to NewPtr and DisposePtr. The
latter aren't available to PCI device drivers. *
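The two-call pattern in Listing 6 (ask for the size, allocate, then fetch the contents) can be tried outside the Mac OS. The sketch below is a stand-alone model under loud assumptions: the one-entry table stands in for the Name Registry, malloc and free stand in for the pool functions, the contents of the sample "driver-ist" entry are invented, and every Sketch name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* A one-entry table standing in for the Name Registry. */
typedef struct SketchProperty {
    const char *name;
    const void *value;
    size_t      size;
} SketchProperty;

static const unsigned char kSketchIst[8] = { 0, 0, 0, 1, 0, 0, 0, 5 };
static const SketchProperty kSketchRegistry[] = {
    { "driver-ist", kSketchIst, sizeof kSketchIst }   /* invented contents */
};

static const SketchProperty *SketchFind(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof kSketchRegistry / sizeof kSketchRegistry[0]; i++)
        if (strcmp(kSketchRegistry[i].name, name) == 0)
            return &kSketchRegistry[i];
    return NULL;
}

/* Stand-in for RegistryPropertyGetSize. */
int SketchPropertyGetSize(const char *name, size_t *size)
{
    const SketchProperty *p = SketchFind(name);
    if (p == NULL)
        return -1;
    *size = p->size;
    return 0;
}

/* Stand-in for RegistryPropertyGet: copies into the caller's buffer. */
int SketchPropertyGet(const char *name, void *buffer, size_t *size)
{
    const SketchProperty *p = SketchFind(name);
    if (p == NULL || *size < p->size)
        return -1;
    memcpy(buffer, p->value, p->size);
    *size = p->size;
    return 0;
}

void SketchDisposeThisProperty(void **valuePtr)
{
    if (*valuePtr != NULL) {
        free(*valuePtr);            /* stands in for PoolDeallocate */
        *valuePtr = NULL;
    }
}

int SketchGetThisProperty(const char *name, void **valuePtr, size_t *sizePtr)
{
    size_t size = 0;
    int    status;
    *valuePtr = NULL;
    status = SketchPropertyGetSize(name, &size);
    if (status == 0) {
        *valuePtr = malloc(size);   /* stands in for PoolAllocateResident */
        if (*valuePtr == NULL)
            status = -1;
    }
    if (status == 0) {
        status = SketchPropertyGet(name, *valuePtr, &size);
        if (status != 0)
            SketchDisposeThisProperty(valuePtr);
    }
    if (status == 0)
        *sizePtr = size;
    return status;
}
```

On any failure the caller's pointer is left NULL, so calling the dispose function afterward is always safe; this is the property Listing 3 relies on when it disposes of both vectors unconditionally.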
There are two parts to starting an asynchronous I/O operation: the driver must carry
out the operations unique to the particular hardware device and it must configure
memory so that hardware direct memory access (DMA) operations can take place.
Completing an operation requires responding to hardware interrupts, updating user
parameter block fields, selecting the proper status code, and calling
IOCommandIsComplete to inform the Device Manager that the driver has finished with
this I/O request. The sequence for a complete, but somewhat simplified, I/O
transaction might be as follows:
This sequence represents an idealized and somewhat simplified situation. For example,
display frame buffers generally don't interrupt when written to but might interrupt
at the end of a display cycle.
I won't say much about the Read, Write, Control, Status, and KillIO handlers: they
carry out tasks that are specific to the particular driver. Often, they initiate an
operation that will be completed by a device hardware interrupt. Control and Status
handlers must process PBControl csCode = 43 (driverGestalt) requests. These provide
a systematic way to query device capabilities and are also used for power management.
KillIO replaces the PBControl csCode = 1 (killCode) used for desk accessories; it stops
all pending I/O requests.
Before jumping into the complexities of PrepareMemoryForIO and interrupt service, I
need to mention one small task: setting and reading values in the device registers.
SETTING AND READING DEVICE REGISTER VALUES
The PCI bus architecture gives hardware developers two methods for setting and
reading values in the device registers: memory-mapped I/O and I/O cycle operations
(described in more detail in "Methods of I/O Organization"). A device advertises its I/O
organization through bits in its configuration register and by providing a
PCI-standardized "reg" property. When the system starts up, it assigns each device a
range of physical addresses in the system's 32-bit physical address space. The driver
can retrieve the device's physical addresses by resolving the "assigned-addresses"
property and can use the Apple-specific "AAPL,address" property to translate the
values in an "assigned-addresses" property to logical addresses, as was shown in
Listing 3. Your driver should use these values when accessing your device's registers.
Ranges of logical addresses are assigned to PCI bus memory and I/O cycles; thus, your
driver can perform I/O cycles without calling operating system functions.
For example, the sample driver's hardware device has a test register (byte) at
offset 0xCC from the start of its memory base address. Suppose the logical address
retrieved by GetDeviceAddress was stored in the global gDeviceBaseAddress, defined as
volatile UInt8 *gDeviceBaseAddress;
The driver could then read the test register with
testRegister = gDeviceBaseAddress[0xCC];
The volatile keyword is important, as it prevents the compiler from removing what
appear to be unnecessary operations. Drivers will also need to call the SynchronizeIO
function in the Driver Services Library to force the PowerPC processor to flush its
data pipeline. While the sample device driver appears to use only memory operations,
the PCI hardware issues either memory or I/O addresses depending on the particular
logical address reference. To issue I/O addresses, your device driver would have to
retrieve the "AAPL,address" property shown in Listing 3.
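As a minimal sketch of the register access just described, the fragment below wraps byte-wide reads and writes in two small helpers. It's written so that it can be compiled anywhere: the register block is simulated with an ordinary array, standard C integer types stand in for UInt8 and UInt32, and the names ReadRegister8, WriteRegister8, gFakeRegisterBlock, and kRegisterBlockSize are assumptions made for illustration. Only the 0xCC offset and gDeviceBaseAddress come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define kTestRegisterOffset 0xCCu   /* Offset given in the article      */
#define kRegisterBlockSize  0x100u  /* Assumed size, illustration only  */

/* Stand-in for the device's register block. In a real driver the
   pointer below would hold the logical address found at startup. */
static uint8_t gFakeRegisterBlock[kRegisterBlockSize];
static volatile uint8_t *gDeviceBaseAddress = gFakeRegisterBlock;

/* Read one byte-wide register. The volatile qualifier forces the
   compiler to perform the access every time. */
static uint8_t ReadRegister8(uint32_t offset)
{
    assert(offset < kRegisterBlockSize);
    return gDeviceBaseAddress[offset];
}

/* Write one byte-wide register. A real driver would follow the store
   with SynchronizeIO; that call is left out of this portable sketch. */
static void WriteRegister8(uint32_t offset, uint8_t value)
{
    assert(offset < kRegisterBlockSize);
    gDeviceBaseAddress[offset] = value;
}
```

Funneling every access through helpers like these gives you one place to add the synchronization call and one place to look when a register misbehaves.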
While byte accesses are straightforward, word (16-bit) and long word (32-bit)
accesses are more complex. This is because the PCI bus is little-endian (the address of
a multibyte entity is the address of the low-order byte), whereas the Mac OS and the
PowerPC chip are big-endian (the address of a multibyte entity is the address of the
high-order byte). To access 16-bit and 32-bit data, then, your driver must swap
bytes in memory, either by using the PowerPC lwbrx instruction or by calling the
library functions EndianSwap16Bit or EndianSwap32Bit. The Expansion Bus
Manager routines handle "endian swapping" internally. Failing to swap bytes was the
most frequent error when I wrote the sample driver; you would be wise to check this
thoroughly in your code.
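To make the byte-order issue concrete, here is a small portable sketch. Swap16 and Swap32 are hypothetical stand-ins for the library's EndianSwap16Bit and EndianSwap32Bit (their names and signatures are assumptions, not the library's), and ReadLittleEndian32 shows an alternative that assembles a little-endian value one byte at a time, so it gives the same answer on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value. */
static uint16_t Swap16(uint16_t value)
{
    return (uint16_t) ((value << 8) | (value >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t Swap32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24)
         | ((value & 0x0000FF00u) << 8)
         | ((value & 0x00FF0000u) >> 8)
         | ((value & 0xFF000000u) >> 24);
}

/* Read a little-endian 32-bit register byte by byte. The low-order
   byte lives at the lowest address, as on the PCI bus. */
static uint32_t ReadLittleEndian32(const volatile uint8_t *base,
                                   size_t offset)
{
    assert(base != NULL);
    return  (uint32_t) base[offset]
         | ((uint32_t) base[offset + 1] << 8)
         | ((uint32_t) base[offset + 2] << 16)
         | ((uint32_t) base[offset + 3] << 24);
}
```

Swapping twice returns the original value, which makes a handy self-check when you're unsure whether a particular access path has already swapped the data for you.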
PREPARING THE MEMORY
Before starting a DMA operation, the operating system must ensure that the data
accessed by the operation is in physical memory and that any data in the processor
cache has been written to memory. This is done with the PrepareMemoryForIO and
CheckpointIO routines. Because the process is complex, I'll break it down into smaller
pieces to describe it. Let's assume your driver will prepare two areas: a permanent
shared-memory area used to communicate with the device (this could be used for a
display frame buffer) and a request-specific area used for a single I/O request.
Listing 7. Preparing a shared memory area
IOPreparationTable gSharedIOTable;
LogicalAddress     gSharedAreaPtr;

OSErr PrepareSharedArea(
        AddressSpaceID addressSpaceID)  /* DoDriverIO parameter */
{
    OSErr     status;
    ItemCount mapEntriesNeeded;

    gSharedAreaPtr =
        MemAllocatePhysicallyContiguous(kSharedAreaSize, TRUE);
    if (gSharedAreaPtr == NULL)
        return (memFullErr);
    gSharedIOTable.options =
        ( kIOIsInput                /* Device writes to memory. */
        | kIOIsOutput               /* Device reads from memory. */
        | kIOLogicalRanges          /* Input is logical addresses. */
        | kIOShareMappingTables );  /* Share tables with kernel. */
    gSharedIOTable.addressSpace = addressSpaceID;
    gSharedIOTable.firstPrepared = 0;
    gSharedIOTable.logicalMapping = NULL;  /* We don't want this. */
    /* Describe the area we're preparing and allocate a mapping
       table. */
    gSharedIOTable.rangeInfo.range.base = gSharedAreaPtr;
    gSharedIOTable.rangeInfo.range.length = kSharedAreaSize;
    mapEntriesNeeded =
        GetMapEntryCount(gSharedAreaPtr, kSharedAreaSize);
    /* Record the table size, as Listing 10 does for its table. */
    gSharedIOTable.mappingEntryCount = mapEntriesNeeded;
    gSharedIOTable.physicalMapping = PoolAllocateResident(
        (mapEntriesNeeded * sizeof (PhysicalAddress)), TRUE);
    if (gSharedIOTable.physicalMapping == NULL)
        status = memFullErr;
    else
        status = PrepareMemoryForIO(&gSharedIOTable);
    if (status == noErr)
        status = CheckPhysicalMapping(&gSharedIOTable,
                                      kSharedAreaSize);
    return (status);
}
Preparing the shared area is fairly straightforward: your driver allocates a physical
mapping table, initializes an IOPreparationTable, and calls PrepareMemoryForIO.
Listing 7 shows how to prepare a shared area and Listing 8 shows several related
utility routines. Because PrepareSharedArea allocates memory for its physical
mapping table, it must be called when your driver is initialized. Note that
GetLogicalPageSize, used in several routines, returns a systemwide constant value; a
production device driver would call it once, storing the value in a global variable.
Listing 8. PrepareMemoryForIO utilities
/* Return the number of PhysicalMappingTable entries that will be
   needed to describe this memory area. */
ItemCount GetMapEntryCount(void *areaAddress,
                           ByteCount areaLength)
{
    ByteCount normalizedLength;
    UInt32    theArea;

    theArea = (UInt32) areaAddress;
    normalizedLength = PageBaseAddress(theArea + areaLength - 1)
                     - PageBaseAddress(theArea);
    /* Add one so that the page holding the first byte is counted. */
    return ((normalizedLength / GetLogicalPageSize()) + 1);
}

/* Check that the entire area was prepared and that all physical
   memory is contiguous. */
OSErr CheckPhysicalMapping(IOPreparationTable *ioTable,
                           ByteCount areaLength)
{
    ItemCount i;
    OSErr     status;

    if (areaLength != ioTable->lengthPrepared)
        status = paramErr;  /* Didn't prepare the entire area. */
    else {
        status = noErr;
        for (i = 0; i < ioTable->mappingEntryCount - 1; i++) {
            if (NextPageBaseAddress(ioTable->physicalMapping[i])
                    != ioTable->physicalMapping[i + 1]) {
                status = paramErr;
                /* Area isn't physically contiguous. */
                break;
            }
        }
    }
    return (status);
}

/* Return the start of the physical page that follows the page
   containing this physical address. */
PhysicalAddress NextPageBaseAddress(PhysicalAddress theAddress)
{
    UInt32 result;

    result = PageBaseAddress
        (((UInt32) theAddress) + GetLogicalPageSize());
    return ((PhysicalAddress) result);
}

/* Return the start of the physical page containing this address. */
UInt32 PageBaseAddress(UInt32 theAddress)
{
    return (theAddress & ~(GetLogicalPageSize() - 1));
}
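The page arithmetic behind these utilities is easy to get wrong by one, so it's worth checking in isolation. The following is a portable sketch rather than driver code: it assumes a fixed 4096-byte page (a real driver would ask GetLogicalPageSize), uses standard C integer types, and every name beginning with Sketch is hypothetical. SketchPageCount counts every page an area touches, including the one holding its first byte, and SketchIsContiguous applies the same test as CheckPhysicalMapping to an array of mapping entries.

```c
#include <assert.h>
#include <stdint.h>

#define kSketchPageSize 4096u   /* Assumed page size for this sketch */

/* Start of the page containing this address. */
static uint32_t SketchPageBase(uint32_t address)
{
    return address & ~(kSketchPageSize - 1u);
}

/* Number of pages touched by the bytes from address through
   address + length - 1. */
static uint32_t SketchPageCount(uint32_t address, uint32_t length)
{
    uint32_t firstPage, lastPage;

    assert(length > 0);
    firstPage = SketchPageBase(address);
    lastPage  = SketchPageBase(address + length - 1u);
    return ((lastPage - firstPage) / kSketchPageSize) + 1u;
}

/* Return 1 if each entry is immediately followed in physical memory
   by the next entry's page, 0 otherwise. Only the first entry may
   point into the middle of a page. */
static int SketchIsContiguous(const uint32_t *entries, uint32_t count)
{
    uint32_t i;

    for (i = 0; i + 1u < count; i++) {
        if (SketchPageBase(entries[i]) + kSketchPageSize
                != entries[i + 1u])
            return 0;
    }
    return 1;
}
```

Note that a two-byte transfer can need two mapping entries if it straddles a page boundary. That's the worst case a driver has to allow for when it sizes a mapping table in advance.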
To prepare a request-specific user area, your driver will initialize an
IOPreparationTable with the procedure shown in Listing 9. Since your driver can be
called from an I/O completion routine, it can't allocate a physical mapping table for
each I/O request. Instead, your initialization procedure will allocate a
maximum-length mapping table.
To process an I/O request, the driver initializes the options and I/O range and then
calls PrepareMemoryForIO and, after I/O completion, CheckpointIO. How to prepare a
single request is shown in Listing 10. You call CheckpointIO to complete your use of
the buffer in the interrupt service routine, as shown later in Listing 11.
Listing 9. Initializing a request-specific IOPreparationTable
IOPreparationTable gRequestIOTable;
ItemCount          gRequestMapEntries;

OSErr InitializeRequestIOTable(void)
{
    OSErr     status;
    ByteCount mapTableSize;

    /* Compute the worst-case number of map entries: a transfer
       that starts on the last byte of a page. */
    gRequestMapEntries =
        GetMapEntryCount((void *) (GetLogicalPageSize() - 1),
                         kDriverMaxTransferLength);
    mapTableSize = (gRequestMapEntries * sizeof (PhysicalAddress));
    gRequestIOTable.physicalMapping =
        PoolAllocateResident(mapTableSize, TRUE);
    status = (gRequestIOTable.physicalMapping != NULL)
        ? noErr : memFullErr;
    return (status);
}
A production device driver must extend the algorithm in Listing 10 to handle two more
complex cases:
The solution to both of these problems is partial preparation. Your driver provides a
physical mapping table of reasonable size. PrepareMemoryForIO prepares as much as
possible and your driver uses the firstPrepared and lengthPrepared fields to navigate
the physical mapping table. When your driver has performed all I/O in a partial
preparation, it calls PrepareMemoryForIO again to prepare the next segment. So the
overall, somewhat simplified, algorithm is as follows:
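What follows is only a sketch of how such a loop might be organized, not the sample driver's code. It runs on any C compiler because SketchPrepare is a hypothetical stand-in for PrepareMemoryForIO: it "prepares" as many pages as fit in a deliberately small mapping table. The SketchPrepTable structure keeps just the fields the loop needs, and the way firstPrepared and lengthPrepared are used here is an assumption about how a driver would step through a request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define kSketchPageSize   4096u
#define kSketchMapEntries 4u    /* Deliberately small mapping table */

/* Hypothetical, much-reduced stand-in for IOPreparationTable. */
typedef struct {
    uint32_t base;            /* Start of the caller's buffer      */
    uint32_t length;          /* Total bytes requested             */
    uint32_t firstPrepared;   /* Offset at which this pass starts  */
    uint32_t lengthPrepared;  /* Bytes covered by this pass        */
} SketchPrepTable;

/* Stand-in for PrepareMemoryForIO: covers as many pages as the
   mapping table can describe, starting at firstPrepared. */
static void SketchPrepare(SketchPrepTable *table)
{
    uint32_t start     = table->base + table->firstPrepared;
    uint32_t remaining = table->length - table->firstPrepared;
    uint32_t windowEnd = (start & ~(kSketchPageSize - 1u))
                       + kSketchMapEntries * kSketchPageSize;
    uint32_t available = windowEnd - start;

    table->lengthPrepared =
        (remaining < available) ? remaining : available;
}

/* Walk an entire request in partial preparations. Returns the number
   of passes and stores the total bytes handled in *bytesDone. */
static uint32_t SketchTransferAll(uint32_t base, uint32_t length,
                                  uint32_t *bytesDone)
{
    SketchPrepTable table = { base, length, 0, 0 };
    uint32_t passes = 0;

    assert(bytesDone != NULL);
    *bytesDone = 0;
    while (table.firstPrepared < table.length) {
        SketchPrepare(&table);
        /* A driver would start the DMA for lengthPrepared bytes
           here, wait for the interrupt, and checkpoint the memory. */
        *bytesDone += table.lengthPrepared;
        table.firstPrepared += table.lengthPrepared;
        passes++;
    }
    return passes;
}
```

With a four-entry table, a page-aligned 40K request takes three passes. In a real driver each pass would be started from the completion of the previous one rather than from a simple loop.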
Listing 10. Using the request-specific IOPreparationTable
OSErr PrepareIORequest(AddressSpaceID addressSpaceID,
                       LogicalAddress userBufferPtr,
                       ByteCount      userCount)
{
    OSErr     status;
    ItemCount mapEntriesNeeded;

    gRequestIOTable.options =
        ( kIOIsInput                /* Device writes to memory. */
        | kIOLogicalRanges          /* Input is logical addresses. */
        | kIOShareMappingTables );  /* Share tables with kernel. */
    gRequestIOTable.addressSpace = addressSpaceID;
    gRequestIOTable.firstPrepared = 0;
    gRequestIOTable.logicalMapping = NULL;  /* We don't want this. */
    /* Store the user parameters in the IOPreparationTable. */
    gRequestIOTable.rangeInfo.range.base = userBufferPtr;
    gRequestIOTable.rangeInfo.range.length = userCount;
    mapEntriesNeeded = GetMapEntryCount(userBufferPtr, userCount);
    if (mapEntriesNeeded > gRequestMapEntries)
        status = paramErr;
    else {
        gRequestIOTable.mappingEntryCount = mapEntriesNeeded;
        status = PrepareMemoryForIO(&gRequestIOTable);
    }
    if (status == noErr)
        status = CheckPhysicalMapping(&gRequestIOTable, userCount);
    return (status);
}
THE INTERRUPT SERVICE ROUTINE
When the hardware device completes a request, it interrupts the PowerPC processor.
The operating system kernel fields the interrupt and searches an interrupt service
tree to find a function that's been registered to handle that interrupt. A driver has
established this function by calling InstallInterruptFunctions, as was shown in Listing
5.
A driver's interrupt service routine is generally broken into two parts: a primary
routine that handles immediate operations and a secondary routine that completes the
operation, releases any system resources held by PrepareMemoryForIO, and calls
IOCommandIsComplete. (Note that some drivers will have no secondary routine.)
Secondary interrupt routines are serialized: they always run to completion before the
system calls them again. However, they don't block other devices from interrupting
the system. This greatly simplifies device driver design, as the secondary interrupt
routine can manage the driver's internal queues without the significant overhead that
blocking all processor interrupts would require. Device drivers may need more
complex processing than can be accomplished with primary and secondary interrupt
routines. For example, a CD-ROM driver needs to check for disk insertion
periodically. Also, all drivers need to handle virtual memory paging. To accomplish
this, a driver can create a software task -- an independent function that's scheduled at
a time when all system services are available. Interrupt service and timer completion
routines can schedule software tasks when necessary.
Listing 11 shows an extremely simplified interrupt service routine to familiarize you
with this organization. DriverInterruptServiceRoutine, the primary routine, stores
the hardware completion status and then queues a secondary interrupt routine to
complete the operation. The secondary interrupt routine completes the I/O request by
checkpointing the memory that was prepared before the transfer started. It then
passes final completion status back to the operating system kernel.
Listing 11. A simplified interrupt service routine
InterruptSetMember DriverInterruptServiceRoutine(
        InterruptSetMember interruptSetMember,  /* Unused here */
        void               *refCon,             /* Unused here */
        UInt32             theInterruptCount)   /* Unused here */
{
    OSErr status;
    UInt8 driverStatus;

    /* Retrieve the operation status from the device. This is
       fiction: a real device will be much more complex. */
    driverStatus = gDeviceBaseAddress[kDeviceStatusRegister];
    if (driverStatus == <device is not interrupting>)
        return (kISRIsNotComplete);
    if (driverStatus == kDeviceStatusOK)
        status = noErr;
    else
        status = ioErr;
    /* The operation is (presumably) complete. Queue a secondary
       interrupt task that will release all memory and return the
       final status to the caller. We'll ignore an error from
       QueueSecondaryInterrupt. */
    (void) QueueSecondaryInterrupt(
        DriverSecondaryInterruptRoutine,
        NULL,             /* No exception handler */
        (void *) status,  /* Operation ioResult */
        NULL);            /* No p2 parameter */
    return (kISRIsComplete);
}

OSStatus DriverSecondaryInterruptRoutine(
        void *p1,  /* Has ioResult value */
        void *p2)  /* Unused */
{
    IOPreparationID ioPreparationID;  /* Request I/O prep ID */

    /* Copy operation-specific values (such as the number of bytes
       transferred) into the caller's parameter block. */
    gCurrentParmBlkPtr->ioActCount = <device-specific value>;
    ioPreparationID = gRequestIOTable.preparationID;
    if (ioPreparationID != kInvalidID) {
        gRequestIOTable.preparationID = kInvalidID;
        (void) CheckpointIO(ioPreparationID, kNilOptions);
    }
    /* IOCommandIsComplete is the only function that should set the
       ioResult field. */
    IOCommandIsComplete(gIOCommandID, (OSErr) p1);
    return (noErr);
}
This sample doesn't use the interrupt set member number, the refCon, or the
interrupt count, which are needed for interrupt service routines that handle several
devices (for example, in the case of a hardware device that controls several serial
lines). Also, to simplify this sample, I'm presuming that all information is stored in
driver globals. A better organization would make use of a "per-request" data structure
that encapsulates all information needed for a single user I/O request (such as
PBRead); this greatly simplifies the driver organization when you want to extend the
driver to support multiple simultaneous requests (concurrent I/O).
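As one possible shape for such a per-request structure, the sketch below keeps everything about a request in a single record and links pending records into a first-in, first-out queue. It's plain C with no system calls, and all of the names and fields (SketchRequest, SketchQueue, and so on) are hypothetical; a real driver would store its IOCommandID, parameter block pointer, and IOPreparationTable in the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-request record; the fields are illustrative. */
typedef struct SketchRequest {
    struct SketchRequest *next;   /* Link for the pending queue   */
    uint32_t commandID;           /* Would hold the IOCommandID   */
    void    *buffer;              /* Caller's buffer              */
    uint32_t requestCount;        /* Bytes requested              */
    uint32_t actualCount;         /* Bytes actually transferred   */
    int      status;              /* Final status for the caller  */
} SketchRequest;

typedef struct {
    SketchRequest *head;
    SketchRequest *tail;
} SketchQueue;

/* Add a request to the end of the pending queue. */
static void SketchEnqueue(SketchQueue *queue, SketchRequest *request)
{
    assert(queue != NULL && request != NULL);
    request->next = NULL;
    if (queue->tail == NULL)
        queue->head = request;
    else
        queue->tail->next = request;
    queue->tail = request;
}

/* Remove and return the oldest pending request, or NULL if none. */
static SketchRequest *SketchDequeue(SketchQueue *queue)
{
    SketchRequest *request;

    assert(queue != NULL);
    request = queue->head;
    if (request != NULL) {
        queue->head = request->next;
        if (queue->head == NULL)
            queue->tail = NULL;
        request->next = NULL;
    }
    return request;
}

/* What a secondary interrupt routine might do: take the oldest
   request and record its results before reporting completion. */
static SketchRequest *SketchCompleteNext(SketchQueue *queue,
                                         int status,
                                         uint32_t actualCount)
{
    SketchRequest *request = SketchDequeue(queue);

    if (request != NULL) {
        request->status = status;
        request->actualCount = actualCount;
    }
    return request;
}
```

Because secondary interrupt routines are serialized, a queue like this could be manipulated there without extra locking; any code that touches it at other times would still need protection.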
There's a lot of material here -- and a lot more that I haven't discussed. Still, this
should give you a good overview of the new driver services and how they work
together. While this may be overwhelming if you've never written a device driver
before, those of you who have (for any operating system) will be happy to note how
much isn't here: no assembly language, no dependencies on the strange quirks of the
Mac OS, and all hardware dependencies either hidden from you or limited to your
device's specific needs.
METHODS OF I/O ORGANIZATION
Memory-mapped I/O and I/O cycle operations represent two ways of designing a
computer architecture.
Using memory-mapped I/O, device hardware responds to normal memory operations
in a particular range of addresses. For example, PDP-11 computers without memory
management hardware reserved 8K for peripheral hardware registers, limiting the
memory available to programs to 56K.
I/O cycle operations effectively place external devices in an independent address space.
This gives programs additional memory but requires special instructions to access
peripheral devices. The Intel 80x86 series uses this organization.
To the programmer, memory-mapped I/O has the advantage of allowing direct device
operations without special instructions, making it relatively easy to write device
drivers in high-level languages. As bus widths and memory size limitations have
eased, the inability to use part of the address space for programs has become less of an
issue.
Apple's PCI-based machines use only memory-mapped I/O. However, the bus interface
hardware generates PCI I/O cycles for a subset of the physical address space.
MARTIN MINOW recently sneaked away to England from his job at Apple for a (too)
brief vacation. The high point was at the Kew Bridge Steam Museum outside of London,
where he stood inside the oldest, or perhaps the largest, working steam engine in the
world. The four-story-high, 50-foot-long engine was used to pump water from the
Thames for more than 100 years and is now the centerpiece of a large collection of
working steam engines. And speaking of working, Martin's been doing too much of it
and already needs another vacation. *
Thanks to our technical reviewers Jano Banks, Holly Knight, Wayne Meretsky, Tom
Saulpaugh, and George Towner. *