Chapter 10. ISA Device Drivers

10.1. Synopsis

This chapter introduces the issues relevant to writing a driver for an ISA device. The pseudo-code presented here is rather detailed and reminiscent of the real code but is still only pseudo-code. It avoids the details irrelevant to the subject of the discussion. The real-life examples can be found in the source code of real drivers. In particular the drivers ep and aha are good sources of information.

10.2. Basic Information

A typical ISA driver would need the following include files:

#include <sys/module.h>
#include <sys/bus.h>
#include <machine/bus.h>
#include <machine/resource.h>
#include <sys/rman.h>

#include <isa/isavar.h>
#include <isa/pnpvar.h>

They describe the things specific to the ISA and generic bus subsystem.

The bus subsystem is implemented in an object-oriented fashion, its main structures are accessed by associated method functions.

The list of bus methods implemented by an ISA driver is like one for any other bus. For a hypothetical driver named "xxx" they would be:

  • static void xxx_isa_identify (driver_t *, device_t); Normally used for bus drivers, not device drivers. But for ISA devices this method may have special use: if the device provides some device-specific (non-PnP) way to auto-detect devices this routine may implement it.

  • static int xxx_isa_probe (device_t dev); Probe for a device at a known (or PnP) location. This routine can also accommodate device-specific auto-detection of parameters for partially configured devices.

  • static int xxx_isa_attach (device_t dev); Attach and initialize device.

  • static int xxx_isa_detach (device_t dev); Detach device before unloading the driver module.

  • static int xxx_isa_shutdown (device_t dev); Execute shutdown of the device before system shutdown.

  • static int xxx_isa_suspend (device_t dev); Suspend the device before the system goes to the power-save state. May also abort transition to the power-save state.

  • static int xxx_isa_resume (device_t dev); Resume the device activity after return from power-save state.

xxx_isa_probe() and xxx_isa_attach() are mandatory, the rest of the routines are optional, depending on the device’s needs.

The driver is linked to the system with the following set of descriptions.

    /* table of supported bus methods */
    static device_method_t xxx_isa_methods[] = {
        /* list all the bus method functions supported by the driver */
        /* omit the unsupported methods */
        DEVMETHOD(device_identify,  xxx_isa_identify),
        DEVMETHOD(device_probe,     xxx_isa_probe),
        DEVMETHOD(device_attach,    xxx_isa_attach),
        DEVMETHOD(device_detach,    xxx_isa_detach),
        DEVMETHOD(device_shutdown,  xxx_isa_shutdown),
        DEVMETHOD(device_suspend,   xxx_isa_suspend),
        DEVMETHOD(device_resume,    xxx_isa_resume),

	DEVMETHOD_END
    };

    static driver_t xxx_isa_driver = {
        "xxx",
        xxx_isa_methods,
        sizeof(struct xxx_softc),
    };

    static devclass_t xxx_devclass;

    DRIVER_MODULE(xxx, isa, xxx_isa_driver, xxx_devclass,
        load_function, load_argument);

Here struct xxx_softc is a device-specific structure that contains private driver data and descriptors for the driver’s resources. The bus code automatically allocates one softc descriptor per device as needed.

If the driver is implemented as a loadable module then load_function() is called to do driver-specific initialization or clean-up when the driver is loaded or unloaded and load_argument is passed as one of its arguments. If the driver does not support dynamic loading (in other words it must always be linked into the kernel) then these values should be set to 0 and the last definition would look like:

 DRIVER_MODULE(xxx, isa, xxx_isa_driver,
       xxx_devclass, 0, 0);

If the driver is for a device which supports PnP then a table of supported PnP IDs must be defined. The table consists of a list of PnP IDs supported by this driver and human-readable descriptions of the hardware types and models having these IDs. It looks like:

    static struct isa_pnp_id xxx_pnp_ids[] = {
        /* a line for each supported PnP ID */
        { 0x12345678,   "Our device model 1234A" },
        { 0x12345679,   "Our device model 1234B" },
        { 0,        NULL }, /* end of table */
    };

If the driver does not support PnP devices it still needs an empty PnP ID table, like:

    static struct isa_pnp_id xxx_pnp_ids[] = {
        { 0,        NULL }, /* end of table */
    };

10.3. device_t Pointer

device_t is the pointer type for the device structure. Here we consider only the methods interesting from the device driver writer’s standpoint. The methods to manipulate values in the device structure are:

  • device_t device_get_parent(dev) Get the parent bus of a device.

  • driver_t device_get_driver(dev) Get pointer to its driver structure.

  • char *device_get_name(dev) Get the driver name, such as "xxx" for our example.

  • int device_get_unit(dev) Get the unit number (units are numbered from 0 for the devices associated with each driver).

  • char *device_get_nameunit(dev) Get the device name including the unit number, such as "xxx0", "xxx1" and so on.

  • char *device_get_desc(dev) Get the device description. Normally it describes the exact model of device in human-readable form.

  • device_set_desc(dev, desc) Set the description. This makes the device description point to the string desc which may not be deallocated or changed after that.

  • device_set_desc_copy(dev, desc) Set the description. The description is copied into an internal dynamically allocated buffer, so the string desc may be changed afterwards without adverse effects.

  • void *device_get_softc(dev) Get pointer to the device descriptor (struct xxx_softc) associated with this device.

  • u_int32_t device_get_flags(dev) Get the flags specified for the device in the configuration file.

A convenience function device_printf(dev, fmt, …​) may be used to print the messages from the device driver. It automatically prepends the unitname and colon to the message.

The device_t methods are implemented in the file kern/bus_subr.c.

10.4. Configuration File and the Order of Identifying and Probing During Auto-Configuration

The ISA devices are described in the kernel configuration file like:

device xxx0 at isa? port 0x300 irq 10 drq 5
       iomem 0xd0000 flags 0x1 sensitive

The values of port, IRQ and so on are converted to the resource values associated with the device. They are optional, depending on the device’s needs and abilities for auto-configuration. For example, some devices do not need DRQ at all and some allow the driver to read the IRQ setting from the device configuration ports. If a machine has multiple ISA buses the exact bus may be specified in the configuration line, like isa0 or isa1, otherwise the device would be searched for on all the ISA buses.

sensitive is a resource requesting that this device must be probed before all non-sensitive devices. It is supported but does not seem to be used in any current driver.

For legacy ISA devices in many cases the drivers are still able to detect the configuration parameters. But each device to be configured in the system must have a config line. If two devices of some type are installed in the system but there is only one configuration line for the corresponding driver, ie:

device xxx0 at isa?
then only one device will be configured.

But for the devices supporting automatic identification by the means of Plug-n-Play or some proprietary protocol one configuration line is enough to configure all the devices in the system, like the one above or just simply:

device xxx at isa?

If a driver supports both auto-identified and legacy devices and both kinds are installed at once in one machine then it is enough to describe in the config file the legacy devices only. The auto-identified devices will be added automatically.

When an ISA bus is auto-configured the events happen as follows:

All the drivers' identify routines (including the PnP identify routine which identifies all the PnP devices) are called in random order. As they identify the devices they add them to the list on the ISA bus. Normally the drivers' identify routines associate their drivers with the new devices. The PnP identify routine does not know about the other drivers yet so it does not associate any with the new devices it adds.

The PnP devices are put to sleep using the PnP protocol to prevent them from being probed as legacy devices.

The probe routines of non-PnP devices marked as sensitive are called. If probe for a device went successfully, the attach routine is called for it.

The probe and attach routines of all non-PNP devices are called likewise.

The PnP devices are brought back from the sleep state and assigned the resources they request: I/O and memory address ranges, IRQs and DRQs, all of them not conflicting with the attached legacy devices.

Then for each PnP device the probe routines of all the present ISA drivers are called. The first one that claims the device gets attached. It is possible that multiple drivers would claim the device with different priority; in this case, the highest-priority driver wins. The probe routines must call ISA_PNP_PROBE() to compare the actual PnP ID with the list of the IDs supported by the driver and if the ID is not in the table return failure. That means that absolutely every driver, even the ones not supporting any PnP devices must call ISA_PNP_PROBE(), at least with an empty PnP ID table to return failure on unknown PnP devices.

The probe routine returns a positive value (the error code) on error, zero or negative value on success.

The negative return values are used when a PnP device supports multiple interfaces. For example, an older compatibility interface and a newer advanced interface which are supported by different drivers. Then both drivers would detect the device. The driver which returns a higher value in the probe routine takes precedence (in other words, the driver returning 0 has highest precedence, returning -1 is next, returning -2 is after it and so on). In result the devices which support only the old interface will be handled by the old driver (which should return -1 from the probe routine) while the devices supporting the new interface as well will be handled by the new driver (which should return 0 from the probe routine). If multiple drivers return the same value then the one called first wins. So if a driver returns value 0 it may be sure that it won the priority arbitration.

The device-specific identify routines can also assign not a driver but a class of drivers to the device. Then all the drivers in the class are probed for this device, like the case with PnP. This feature is not implemented in any existing driver and is not considered further in this document.

As the PnP devices are disabled when probing the legacy devices they will not be attached twice (once as legacy and once as PnP). But in case of device-dependent identify routines it is the responsibility of the driver to make sure that the same device will not be attached by the driver twice: once as legacy user-configured and once as auto-identified.

Another practical consequence for the auto-identified devices (both PnP and device-specific) is that the flags can not be passed to them from the kernel configuration file. So they must either not use the flags at all or use the flags from the device unit 0 for all the auto-identified devices or use the sysctl interface instead of flags.

Other unusual configurations may be accommodated by accessing the configuration resources directly with functions of families resource_query_*() and resource_*_value(). Their implementations are located in kern/subr_bus.c. The old IDE disk driver i386/isa/wd.c contains examples of such use. But the standard means of configuration must always be preferred. Leave parsing the configuration resources to the bus configuration code.

10.5. Resources

The information that a user enters into the kernel configuration file is processed and passed to the kernel as configuration resources. This information is parsed by the bus configuration code and transformed into a value of structure device_t and the bus resources associated with it. The drivers may access the configuration resources directly using functions resource_* for more complex cases of configuration. However, generally this is neither needed nor recommended, so this issue is not discussed further here.

The bus resources are associated with each device. They are identified by type and number within the type. For the ISA bus the following types are defined:

  • SYS_RES_IRQ - interrupt number

  • SYS_RES_DRQ - ISA DMA channel number

  • SYS_RES_MEMORY - range of device memory mapped into the system memory space

  • SYS_RES_IOPORT - range of device I/O registers

The enumeration within types starts from 0, so if a device has two memory regions it would have resources of type SYS_RES_MEMORY numbered 0 and 1. The resource type has nothing to do with the C language type, all the resource values have the C language type unsigned long and must be cast as necessary. The resource numbers do not have to be contiguous, although for ISA they normally would be. The permitted resource numbers for ISA devices are:

          IRQ: 0-1
          DRQ: 0-1
          MEMORY: 0-3
          IOPORT: 0-7

All the resources are represented as ranges, with a start value and count. For IRQ and DRQ resources the count would normally be equal to 1. The values for memory refer to the physical addresses.

Three types of activities can be performed on resources:

  • set/get

  • allocate/release

  • activate/deactivate

Setting sets the range used by the resource. Allocation reserves the requested range that no other driver would be able to reserve it (and checking that no other driver reserved this range already). Activation makes the resource accessible to the driver by doing whatever is necessary for that (for example, for memory it would be mapping into the kernel virtual address space).

The functions to manipulate resources are:

  • int bus_set_resource(device_t dev, int type, int rid, u_long start, u_long count)

    Set a range for a resource. Returns 0 if successful, error code otherwise. Normally, this function will return an error only if one of type, rid, start or count has a value that falls out of the permitted range.

    • dev - driver’s device

    • type - type of resource, SYS_RES_*

    • rid - resource number (ID) within type

    • start, count - resource range

  • int bus_get_resource(device_t dev, int type, int rid, u_long *startp, u_long *countp)

    Get the range of resource. Returns 0 if successful, error code if the resource is not defined yet.

  • u_long bus_get_resource_start(device_t dev, int type, int rid) u_long bus_get_resource_count (device_t dev, int type, int rid)

    Convenience functions to get only the start or count. Return 0 in case of error, so if the resource start has 0 among the legitimate values it would be impossible to tell if the value is 0 or an error occurred. Luckily, no ISA resources for add-on drivers may have a start value equal to 0.

  • void bus_delete_resource(device_t dev, int type, int rid)

    Delete a resource, make it undefined.

  • struct resource * bus_alloc_resource(device_t dev, int type, int *rid, u_long start, u_long end, u_long count, u_int flags)

    Allocate a resource as a range of count values not allocated by anyone else, somewhere between start and end. Alas, alignment is not supported. If the resource was not set yet it is automatically created. The special values of start 0 and end ~0 (all ones) means that the fixed values previously set by bus_set_resource() must be used instead: start and count as themselves and end=(start+count), in this case if the resource was not defined before then an error is returned. Although rid is passed by reference it is not set anywhere by the resource allocation code of the ISA bus. (The other buses may use a different approach and modify it).

Flags are a bitmap, the flags interesting for the caller are:

  • RF_ACTIVE - causes the resource to be automatically activated after allocation.

  • RF_SHAREABLE - resource may be shared at the same time by multiple drivers.

  • RF_TIMESHARE - resource may be time-shared by multiple drivers, i.e., allocated at the same time by many but activated only by one at any given moment of time.

  • Returns 0 on error. The allocated values may be obtained from the returned handle using methods rhand_*().

  • int bus_release_resource(device_t dev, int type, int rid, struct resource *r)

  • Release the resource, r is the handle returned by bus_alloc_resource(). Returns 0 on success, error code otherwise.

  • int bus_activate_resource(device_t dev, int type, int rid, struct resource *r) int bus_deactivate_resource(device_t dev, int type, int rid, struct resource *r)

  • Activate or deactivate resource. Return 0 on success, error code otherwise. If the resource is time-shared and currently activated by another driver then EBUSY is returned.

  • int bus_setup_intr(device_t dev, struct resource *r, int flags, driver_intr_t *handler, void *arg, void **cookiep) int bus_teardown_intr(device_t dev, struct resource *r, void *cookie)

  • Associate or de-associate the interrupt handler with a device. Return 0 on success, error code otherwise.

  • r - the activated resource handler describing the IRQ

    flags - the interrupt priority level, one of:

    • INTR_TYPE_TTY - terminals and other likewise character-type devices. To mask them use spltty().

    • (INTR_TYPE_TTY | INTR_TYPE_FAST) - terminal type devices with small input buffer, critical to the data loss on input (such as the old-fashioned serial ports). To mask them use spltty().

    • INTR_TYPE_BIO - block-type devices, except those on the CAM controllers. To mask them use splbio().

    • INTR_TYPE_CAM - CAM (Common Access Method) bus controllers. To mask them use splcam().

    • INTR_TYPE_NET - network interface controllers. To mask them use splimp().

    • INTR_TYPE_MISC - miscellaneous devices. There is no other way to mask them than by splhigh() which masks all interrupts.

When an interrupt handler executes all the other interrupts matching its priority level will be masked. The only exception is the MISC level for which no other interrupts are masked and which is not masked by any other interrupt.

  • handler - pointer to the handler function, the type driver_intr_t is defined as void driver_intr_t(void *)

  • arg - the argument passed to the handler to identify this particular device. It is cast from void* to any real type by the handler. The old convention for the ISA interrupt handlers was to use the unit number as argument, the new (recommended) convention is using a pointer to the device softc structure.

  • cookie[p] - the value received from setup() is used to identify the handler when passed to teardown()

A number of methods are defined to operate on the resource handlers (struct resource *). Those of interest to the device driver writers are:

  • u_long rman_get_start(r) u_long rman_get_end(r) Get the start and end of allocated resource range.

  • void *rman_get_virtual(r) Get the virtual address of activated memory resource.

10.6. Bus Memory Mapping

In many cases data is exchanged between the driver and the device through the memory. Two variants are possible:

(a) memory is located on the device card

(b) memory is the main memory of the computer

In case (a) the driver always copies the data back and forth between the on-card memory and the main memory as necessary. To map the on-card memory into the kernel virtual address space the physical address and length of the on-card memory must be defined as a SYS_RES_MEMORY resource. That resource can then be allocated and activated, and its virtual address obtained using rman_get_virtual(). The older drivers used the function pmap_mapdev() for this purpose, which should not be used directly any more. Now it is one of the internal steps of resource activation.

Most of the ISA cards will have their memory configured for physical location somewhere in range 640KB-1MB. Some of the ISA cards require larger memory ranges which should be placed somewhere under 16MB (because of the 24-bit address limitation on the ISA bus). In that case if the machine has more memory than the start address of the device memory (in other words, they overlap) a memory hole must be configured at the address range used by devices. Many BIOSes allow configuration of a memory hole of 1MB starting at 14MB or 15MB. FreeBSD can handle the memory holes properly if the BIOS reports them properly (this feature may be broken on old BIOSes).

In case (b) just the address of the data is sent to the device, and the device uses DMA to actually access the data in the main memory. Two limitations are present: First, ISA cards can only access memory below 16MB. Second, the contiguous pages in virtual address space may not be contiguous in physical address space, so the device may have to do scatter/gather operations. The bus subsystem provides ready solutions for some of these problems, the rest has to be done by the drivers themselves.

Two structures are used for DMA memory allocation, bus_dma_tag_t and bus_dmamap_t. Tag describes the properties required for the DMA memory. Map represents a memory block allocated according to these properties. Multiple maps may be associated with the same tag.

Tags are organized into a tree-like hierarchy with inheritance of the properties. A child tag inherits all the requirements of its parent tag, and may make them more strict but never more loose.

Normally one top-level tag (with no parent) is created for each device unit. If multiple memory areas with different requirements are needed for each device then a tag for each of them may be created as a child of the parent tag.

The tags can be used to create a map in two ways.

First, a chunk of contiguous memory conformant with the tag requirements may be allocated (and later may be freed). This is normally used to allocate relatively long-living areas of memory for communication with the device. Loading of such memory into a map is trivial: it is always considered as one chunk in the appropriate physical memory range.

Second, an arbitrary area of virtual memory may be loaded into a map. Each page of this memory will be checked for conformance to the map requirement. If it conforms then it is left at its original location. If it is not then a fresh conformant "bounce page" is allocated and used as intermediate storage. When writing the data from the non-conformant original pages they will be copied to their bounce pages first and then transferred from the bounce pages to the device. When reading the data would go from the device to the bounce pages and then copied to their non-conformant original pages. The process of copying between the original and bounce pages is called synchronization. This is normally used on a per-transfer basis: buffer for each transfer would be loaded, transfer done and buffer unloaded.

The functions working on the DMA memory are:

  • int bus_dma_tag_create(bus_dma_tag_t parent, bus_size_t alignment, bus_size_t boundary, bus_addr_t lowaddr, bus_addr_t highaddr, bus_dma_filter_t *filter, void *filterarg, bus_size_t maxsize, int nsegments, bus_size_t maxsegsz, int flags, bus_dma_tag_t *dmat)

    Create a new tag. Returns 0 on success, the error code otherwise.

    • parent - parent tag, or NULL to create a top-level tag.

    • alignment - required physical alignment of the memory area to be allocated for this tag. Use value 1 for "no specific alignment". Applies only to the future bus_dmamem_alloc() but not bus_dmamap_create() calls.

    • boundary - physical address boundary that must not be crossed when allocating the memory. Use value 0 for "no boundary". Applies only to the future bus_dmamem_alloc() but not bus_dmamap_create() calls. Must be power of 2. If the memory is planned to be used in non-cascaded DMA mode (i.e., the DMA addresses will be supplied not by the device itself but by the ISA DMA controller) then the boundary must be no larger than 64KB (64*1024) due to the limitations of the DMA hardware.

    • lowaddr, highaddr - the names are slightly misleading; these values are used to limit the permitted range of physical addresses used to allocate the memory. The exact meaning varies depending on the planned future use:

      • For bus_dmamem_alloc() all the addresses from 0 to lowaddr-1 are considered permitted, the higher ones are forbidden.

      • For bus_dmamap_create() all the addresses outside the inclusive range [lowaddr; highaddr] are considered accessible. The addresses of pages inside the range are passed to the filter function which decides if they are accessible. If no filter function is supplied then all the range is considered unaccessible.

      • For the ISA devices the normal values (with no filter function) are:

        lowaddr = BUS_SPACE_MAXADDR_24BIT

        highaddr = BUS_SPACE_MAXADDR

    • filter, filterarg - the filter function and its argument. If NULL is passed for filter then the whole range [lowaddr, highaddr] is considered unaccessible when doing bus_dmamap_create(). Otherwise the physical address of each attempted page in range [lowaddr; highaddr] is passed to the filter function which decides if it is accessible. The prototype of the filter function is: int filterfunc(void *arg, bus_addr_t paddr). It must return 0 if the page is accessible, non-zero otherwise.

    • maxsize - the maximal size of memory (in bytes) that may be allocated through this tag. In case it is difficult to estimate or could be arbitrarily big, the value for ISA devices would be BUS_SPACE_MAXSIZE_24BIT.

    • nsegments - maximal number of scatter-gather segments supported by the device. If unrestricted then the value BUS_SPACE_UNRESTRICTED should be used. This value is recommended for the parent tags, the actual restrictions would then be specified for the descendant tags. Tags with nsegments equal to BUS_SPACE_UNRESTRICTED may not be used to actually load maps, they may be used only as parent tags. The practical limit for nsegments seems to be about 250-300, higher values will cause kernel stack overflow (the hardware can not normally support that many scatter-gather buffers anyway).

    • maxsegsz - maximal size of a scatter-gather segment supported by the device. The maximal value for ISA device would be BUS_SPACE_MAXSIZE_24BIT.

    • flags - a bitmap of flags. The only interesting flag is:

      • BUS_DMA_ALLOCNOW - requests to allocate all the potentially needed bounce pages when creating the tag.

    • dmat - pointer to the storage for the new tag to be returned.

  • int bus_dma_tag_destroy(bus_dma_tag_t dmat)

    Destroy a tag. Returns 0 on success, the error code otherwise.

    dmat - the tag to be destroyed.

  • int bus_dmamem_alloc(bus_dma_tag_t dmat, void** vaddr, int flags, bus_dmamap_t *mapp)

    Allocate an area of contiguous memory described by the tag. The size of memory to be allocated is tag’s maxsize. Returns 0 on success, the error code otherwise. The result still has to be loaded by bus_dmamap_load() before being used to get the physical address of the memory.

    • dmat - the tag

    • vaddr - pointer to the storage for the kernel virtual address of the allocated area to be returned.

    • flags - a bitmap of flags. The only interesting flag is:

      • BUS_DMA_NOWAIT - if the memory is not immediately available return the error. If this flag is not set then the routine is allowed to sleep until the memory becomes available.

    • mapp - pointer to the storage for the new map to be returned.

  • void bus_dmamem_free(bus_dma_tag_t dmat, void *vaddr, bus_dmamap_t map)

    Free the memory allocated by bus_dmamem_alloc(). At present, freeing of the memory allocated with ISA restrictions is not implemented. Due to this the recommended model of use is to keep and re-use the allocated areas for as long as possible. Do not lightly free some area and then shortly allocate it again. That does not mean that bus_dmamem_free() should not be used at all: hopefully it will be properly implemented soon.

    • dmat - the tag

    • vaddr - the kernel virtual address of the memory

    • map - the map of the memory (as returned from bus_dmamem_alloc())

  • int bus_dmamap_create(bus_dma_tag_t dmat, int flags, bus_dmamap_t *mapp)

    Create a map for the tag, to be used in bus_dmamap_load() later. Returns 0 on success, the error code otherwise.

    • dmat - the tag

    • flags - theoretically, a bit map of flags. But no flags are defined yet, so at present it will be always 0.

    • mapp - pointer to the storage for the new map to be returned

  • int bus_dmamap_destroy(bus_dma_tag_t dmat, bus_dmamap_t map)

    Destroy a map. Returns 0 on success, the error code otherwise.

    • dmat - the tag to which the map is associated

    • map - the map to be destroyed

  • int bus_dmamap_load(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf, bus_size_t buflen, bus_dmamap_callback_t *callback, void *callback_arg, int flags)

    Load a buffer into the map (the map must be previously created by bus_dmamap_create() or bus_dmamem_alloc()). All the pages of the buffer are checked for conformance to the tag requirements and for those not conformant the bounce pages are allocated. An array of physical segment descriptors is built and passed to the callback routine. This callback routine is then expected to handle it in some way. The number of bounce buffers in the system is limited, so if the bounce buffers are needed but not immediately available the request will be queued and the callback will be called when the bounce buffers will become available. Returns 0 if the callback was executed immediately or EINPROGRESS if the request was queued for future execution. In the latter case the synchronization with queued callback routine is the responsibility of the driver.

    • dmat - the tag

    • map - the map

    • buf - kernel virtual address of the buffer

    • buflen - length of the buffer

    • callback, callback_arg - the callback function and its argument

      The prototype of callback function is: void callback(void *arg, bus_dma_segment_t *seg, int nseg, int error)

    • arg - the same as callback_arg passed to bus_dmamap_load()

    • seg - array of the segment descriptors

    • nseg - number of descriptors in array

    • error - indication of the segment number overflow: if it is set to EFBIG then the buffer did not fit into the maximal number of segments permitted by the tag. In this case only the permitted number of descriptors will be in the array. Handling of this situation is up to the driver: depending on the desired semantics it can either consider this an error or split the buffer in two and handle the second part separately

      Each entry in the segments array contains the fields:

    • ds_addr - physical bus address of the segment

    • ds_len - length of the segment

  • void bus_dmamap_unload(bus_dma_tag_t dmat, bus_dmamap_t map)

    unload the map.

    • dmat - tag

    • map - loaded map

  • void bus_dmamap_sync (bus_dma_tag_t dmat, bus_dmamap_t map, bus_dmasync_op_t op)

    Synchronise a loaded buffer with its bounce pages before and after physical transfer to or from device. This is the function that does all the necessary copying of data between the original buffer and its mapped version. The buffers must be synchronized both before and after doing the transfer.

    • dmat - tag

    • map - loaded map

    • op - type of synchronization operation to perform:

    • BUS_DMASYNC_PREREAD - before reading from device into buffer

    • BUS_DMASYNC_POSTREAD - after reading from device into buffer

    • BUS_DMASYNC_PREWRITE - before writing the buffer to device

    • BUS_DMASYNC_POSTWRITE - after writing the buffer to device

As of now PREREAD and POSTWRITE are null operations but that may change in the future, so they must not be ignored in the driver. Synchronization is not needed for the memory obtained from bus_dmamem_alloc().

Before calling the callback function from bus_dmamap_load() the segment array is stored in the stack. And it gets pre-allocated for the maximal number of segments allowed by the tag. As a result of this the practical limit for the number of segments on i386 architecture is about 250-300 (the kernel stack is 4KB minus the size of the user structure, size of a segment array entry is 8 bytes, and some space must be left). Since the array is allocated based on the maximal number this value must not be set higher than really needed. Fortunately, for most of hardware the maximal supported number of segments is much lower. But if the driver wants to handle buffers with a very large number of scatter-gather segments it should do that in portions: load part of the buffer, transfer it to the device, load next part of the buffer, and so on.

Another practical consequence is that the number of segments may limit the size of the buffer. If all the pages in the buffer happen to be physically non-contiguous then the maximal supported buffer size for that fragmented case would be (nsegments * page_size). For example, if a maximal number of 10 segments is supported then on i386 maximal guaranteed supported buffer size would be 40K. If a higher size is desired then special tricks should be used in the driver.

If the hardware does not support scatter-gather at all or the driver wants to support some buffer size even if it is heavily fragmented then the solution is to allocate a contiguous buffer in the driver and use it as intermediate storage if the original buffer does not fit.

Below are the typical call sequences when using a map depend on the use of the map. The characters → are used to show the flow of time.

For a buffer which stays practically fixed during all the time between attachment and detachment of a device:

bus_dmamem_alloc → bus_dmamap_load → …​use buffer…​ → → bus_dmamap_unload → bus_dmamem_free

For a buffer that changes frequently and is passed from outside the driver:

          bus_dmamap_create ->
          -> bus_dmamap_load -> bus_dmamap_sync(PRE...) -> do transfer ->
          -> bus_dmamap_sync(POST...) -> bus_dmamap_unload ->
          ...
          -> bus_dmamap_load -> bus_dmamap_sync(PRE...) -> do transfer ->
          -> bus_dmamap_sync(POST...) -> bus_dmamap_unload ->
          -> bus_dmamap_destroy

When loading a map created by bus_dmamem_alloc() the passed address and size of the buffer must be the same as used in bus_dmamem_alloc(). In this case it is guaranteed that the whole buffer will be mapped as one segment (so the callback may be based on this assumption) and the request will be executed immediately (EINPROGRESS will never be returned). All the callback needs to do in this case is to save the physical address.

A typical example would be:

          static void
        alloc_callback(void *arg, bus_dma_segment_t *seg, int nseg, int error)
        {
          *(bus_addr_t *)arg = seg[0].ds_addr;
        }

          ...
          int error;
          struct somedata {
            ....
          };
          struct somedata *vsomedata; /* virtual address */
          bus_addr_t psomedata; /* physical bus-relative address */
          bus_dma_tag_t tag_somedata;
          bus_dmamap_t map_somedata;
          ...

          error=bus_dma_tag_create(parent_tag, alignment,
           boundary, lowaddr, highaddr, /*filter*/ NULL, /*filterarg*/ NULL,
           /*maxsize*/ sizeof(struct somedata), /*nsegments*/ 1,
           /*maxsegsz*/ sizeof(struct somedata), /*flags*/ 0,
           &tag_somedata);
          if(error)
          return error;

          error = bus_dmamem_alloc(tag_somedata, &vsomedata, /* flags*/ 0,
             &map_somedata);
          if(error)
             return error;

          bus_dmamap_load(tag_somedata, map_somedata, (void *)vsomedata,
             sizeof (struct somedata), alloc_callback,
             (void *) &psomedata, /*flags*/0);

Looks a bit long and complicated but that is the way to do it. The practical consequence is: if multiple memory areas are allocated always together it would be a really good idea to combine them all into one structure and allocate as one (if the alignment and boundary limitations permit).

When loading an arbitrary buffer into the map created by bus_dmamap_create() special measures must be taken to synchronize with the callback in case it would be delayed. The code would look like:

          {
           int s;
           int error;

           s = splsoftvm();
           error = bus_dmamap_load(
               dmat,
               dmamap,
               buffer_ptr,
               buffer_len,
               callback,
               /*callback_arg*/ buffer_descriptor,
               /*flags*/0);
           if (error == EINPROGRESS) {
               /*
                * Do whatever is needed to ensure synchronization
                * with callback. Callback is guaranteed not to be started
                * until we do splx() or tsleep().
                */
              }
           splx(s);
          }

Two possible approaches for the processing of requests are:

  1. If requests are completed by marking them explicitly as done (such as the CAM requests) then it would be simpler to put all the further processing into the callback driver which would mark the request when it is done. Then not much extra synchronization is needed. For the flow control reasons it may be a good idea to freeze the request queue until this request gets completed.

  2. If requests are completed when the function returns (such as classic read or write requests on character devices) then a synchronization flag should be set in the buffer descriptor and tsleep() called. Later when the callback gets called it will do its processing and check this synchronization flag. If it is set then the callback should issue a wakeup. In this approach the callback function could either do all the needed processing (just like the previous case) or simply save the segments array in the buffer descriptor. Then after callback completes the calling function could use this saved segments array and do all the processing.

10.7. DMA

The Direct Memory Access (DMA) is implemented in the ISA bus through the DMA controller (actually, two of them but that is an irrelevant detail). To make the early ISA devices simple and cheap the logic of the bus control and address generation was concentrated in the DMA controller. Fortunately, FreeBSD provides a set of functions that mostly hide the annoying details of the DMA controller from the device drivers.

The simplest case is for the fairly intelligent devices. Like the bus master devices on PCI they can generate the bus cycles and memory addresses all by themselves. The only thing they really need from the DMA controller is bus arbitration. So for this purpose they pretend to be cascaded slave DMA controllers. And the only thing needed from the system DMA controller is to enable the cascaded mode on a DMA channel by calling the following function when attaching the driver:

void isa_dmacascade(int channel_number)

All the further activity is done by programming the device. When detaching the driver no DMA-related functions need to be called.

For the simpler devices things get more complicated. The functions used are:

  • int isa_dma_acquire(int chanel_number)

    Reserve a DMA channel. Returns 0 on success or EBUSY if the channel was already reserved by this or a different driver. Most of the ISA devices are not able to share DMA channels anyway, so normally this function is called when attaching a device. This reservation was made redundant by the modern interface of bus resources but still must be used in addition to the latter. If not used then later, other DMA routines will panic.

  • int isa_dma_release(int chanel_number)

    Release a previously reserved DMA channel. No transfers must be in progress when the channel is released (in addition the device must not try to initiate transfer after the channel is released).

  • void isa_dmainit(int chan, u_int bouncebufsize)

    Allocate a bounce buffer for use with the specified channel. The requested size of the buffer can not exceed 64KB. This bounce buffer will be automatically used later if a transfer buffer happens to be not physically contiguous or outside of the memory accessible by the ISA bus or crossing the 64KB boundary. If the transfers will be always done from buffers which conform to these conditions (such as those allocated by bus_dmamem_alloc() with proper limitations) then isa_dmainit() does not have to be called. But it is quite convenient to transfer arbitrary data using the DMA controller. The bounce buffer will automatically care of the scatter-gather issues.

    • chan - channel number

    • bouncebufsize - size of the bounce buffer in bytes

  • void isa_dmastart(int flags, caddr_t addr, u_int nbytes, int chan)

    Prepare to start a DMA transfer. This function must be called to set up the DMA controller before actually starting transfer on the device. It checks that the buffer is contiguous and falls into the ISA memory range, if not then the bounce buffer is automatically used. If bounce buffer is required but not set up by isa_dmainit() or too small for the requested transfer size then the system will panic. In case of a write request with bounce buffer the data will be automatically copied to the bounce buffer.

  • flags - a bitmask determining the type of operation to be done. The direction bits B_READ and B_WRITE are mutually exclusive.

    • B_READ - read from the ISA bus into memory

    • B_WRITE - write from the memory to the ISA bus

    • B_RAW - if set then the DMA controller will remember the buffer and after the end of transfer will automatically re-initialize itself to repeat transfer of the same buffer again (of course, the driver may change the data in the buffer before initiating another transfer in the device). If not set then the parameters will work only for one transfer, and isa_dmastart() will have to be called again before initiating the next transfer. Using B_RAW makes sense only if the bounce buffer is not used.

  • addr - virtual address of the buffer

  • nbytes - length of the buffer. Must be less or equal to 64KB. Length of 0 is not allowed: the DMA controller will understand it as 64KB while the kernel code will understand it as 0 and that would cause unpredictable effects. For channels number 4 and higher the length must be even because these channels transfer 2 bytes at a time. In case of an odd length the last byte will not be transferred.

  • chan - channel number

  • void isa_dmadone(int flags, caddr_t addr, int nbytes, int chan)

    Synchronize the memory after device reports that transfer is done. If that was a read operation with a bounce buffer then the data will be copied from the bounce buffer to the original buffer. Arguments are the same as for isa_dmastart(). Flag B_RAW is permitted but it does not affect isa_dmadone() in any way.

  • int isa_dmastatus(int channel_number)

    Returns the number of bytes left in the current transfer to be transferred. In case the flag B_READ was set in isa_dmastart() the number returned will never be equal to zero. At the end of transfer it will be automatically reset back to the length of buffer. The normal use is to check the number of bytes left after the device signals that the transfer is completed. If the number of bytes is not 0 then something probably went wrong with that transfer.

  • int isa_dmastop(int channel_number)

    Aborts the current transfer and returns the number of bytes left untransferred.

10.8. xxx_isa_probe

This function probes if a device is present. If the driver supports auto-detection of some part of device configuration (such as interrupt vector or memory address) this auto-detection must be done in this routine.

As for any other bus, if the device cannot be detected or is detected but failed the self-test or some other problem happened then it returns a positive value of error. The value ENXIO must be returned if the device is not present. Other error values may mean other conditions. Zero or negative values mean success. Most of the drivers return zero as success.

The negative return values are used when a PnP device supports multiple interfaces. For example, an older compatibility interface and a newer advanced interface which are supported by different drivers. Then both drivers would detect the device. The driver which returns a higher value in the probe routine takes precedence (in other words, the driver returning 0 has highest precedence, one returning -1 is next, one returning -2 is after it and so on). In result the devices which support only the old interface will be handled by the old driver (which should return -1 from the probe routine) while the devices supporting the new interface as well will be handled by the new driver (which should return 0 from the probe routine).

The device descriptor struct xxx_softc is allocated by the system before calling the probe routine. If the probe routine returns an error the descriptor will be automatically deallocated by the system. So if a probing error occurs the driver must make sure that all the resources it used during probe are deallocated and that nothing keeps the descriptor from being safely deallocated. If the probe completes successfully the descriptor will be preserved by the system and later passed to the routine xxx_isa_attach(). If a driver returns a negative value it can not be sure that it will have the highest priority and its attach routine will be called. So in this case it also must release all the resources before returning and if necessary allocate them again in the attach routine. When xxx_isa_probe() returns 0 releasing the resources before returning is also a good idea and a well-behaved driver should do so. But in cases where there is some problem with releasing the resources the driver is allowed to keep resources between returning 0 from the probe routine and execution of the attach routine.

A typical probe routine starts with getting the device descriptor and unit:

         struct xxx_softc *sc = device_get_softc(dev);
          int unit = device_get_unit(dev);
          int pnperror;
          int error = 0;

          sc->dev = dev; /* link it back */
          sc->unit = unit;

Then check for the PnP devices. The check is carried out by a table containing the list of PnP IDs supported by this driver and human-readable descriptions of the device models corresponding to these IDs.

        pnperror=ISA_PNP_PROBE(device_get_parent(dev), dev,
        xxx_pnp_ids); if(pnperror == ENXIO) return ENXIO;

The logic of ISA_PNP_PROBE is the following: If this card (device unit) was not detected as PnP then ENOENT will be returned. If it was detected as PnP but its detected ID does not match any of the IDs in the table then ENXIO is returned. Finally, if it has PnP support and it matches on of the IDs in the table, 0 is returned and the appropriate description from the table is set by device_set_desc().

If a driver supports only PnP devices then the condition would look like:

          if(pnperror != 0)
              return pnperror;

No special treatment is required for the drivers which do not support PnP because they pass an empty PnP ID table and will always get ENXIO if called on a PnP card.

The probe routine normally needs at least some minimal set of resources, such as I/O port number to find the card and probe it. Depending on the hardware the driver may be able to discover the other necessary resources automatically. The PnP devices have all the resources pre-set by the PnP subsystem, so the driver does not need to discover them by itself.

Typically the minimal information required to get access to the device is the I/O port number. Then some devices allow to get the rest of information from the device configuration registers (though not all devices do that). So first we try to get the port start value:

 sc->port0 = bus_get_resource_start(dev,
        SYS_RES_IOPORT, 0 /*rid*/); if(sc->port0 == 0) return ENXIO;

The base port address is saved in the structure softc for future use. If it will be used very often then calling the resource function each time would be prohibitively slow. If we do not get a port we just return an error. Some device drivers can instead be clever and try to probe all the possible ports, like this:

          /* table of all possible base I/O port addresses for this device */
          static struct xxx_allports {
              u_short port; /* port address */
              short used; /* flag: if this port is already used by some unit */
          } xxx_allports = {
              { 0x300, 0 },
              { 0x320, 0 },
              { 0x340, 0 },
              { 0, 0 } /* end of table */
          };

          ...
          int port, i;
          ...

          port =  bus_get_resource_start(dev, SYS_RES_IOPORT, 0 /*rid*/);
          if(port !=0 ) {
              for(i=0; xxx_allports[i].port!=0; i++) {
                  if(xxx_allports[i].used || xxx_allports[i].port != port)
                      continue;

                  /* found it */
                  xxx_allports[i].used = 1;
                  /* do probe on a known port */
                  return xxx_really_probe(dev, port);
              }
              return ENXIO; /* port is unknown or already used */
          }

          /* we get here only if we need to guess the port */
          for(i=0; xxx_allports[i].port!=0; i++) {
              if(xxx_allports[i].used)
                  continue;

              /* mark as used - even if we find nothing at this port
               * at least we won't probe it in future
               */
               xxx_allports[i].used = 1;

              error = xxx_really_probe(dev, xxx_allports[i].port);
              if(error == 0) /* found a device at that port */
                  return 0;
          }
          /* probed all possible addresses, none worked */
          return ENXIO;

Of course, normally the driver’s identify() routine should be used for such things. But there may be one valid reason why it may be better to be done in probe(): if this probe would drive some other sensitive device crazy. The probe routines are ordered with consideration of the sensitive flag: the sensitive devices get probed first and the rest of the devices later. But the identify() routines are called before any probes, so they show no respect to the sensitive devices and may upset them.

Now, after we got the starting port we need to set the port count (except for PnP devices) because the kernel does not have this information in the configuration file.

         if(pnperror /* only for non-PnP devices */
         && bus_set_resource(dev, SYS_RES_IOPORT, 0, sc->port0,
         XXX_PORT_COUNT)<0)
             return ENXIO;

Finally allocate and activate a piece of port address space (special values of start and end mean "use those we set by bus_set_resource()"):

          sc->port0_rid = 0;
          sc->port0_r = bus_alloc_resource(dev, SYS_RES_IOPORT,
          &sc->port0_rid,
              /*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

          if(sc->port0_r == NULL)
              return ENXIO;

Now having access to the port-mapped registers we can poke the device in some way and check if it reacts like it is expected to. If it does not then there is probably some other device or no device at all at this address.

Normally drivers do not set up the interrupt handlers until the attach routine. Instead they do probes in the polling mode using the DELAY() function for timeout. The probe routine must never hang forever, all the waits for the device must be done with timeouts. If the device does not respond within the time it is probably broken or misconfigured and the driver must return error. When determining the timeout interval give the device some extra time to be on the safe side: although DELAY() is supposed to delay for the same amount of time on any machine it has some margin of error, depending on the exact CPU.

If the probe routine really wants to check that the interrupts really work it may configure and probe the interrupts too. But that is not recommended.

          /* implemented in some very device-specific way */
          if(error = xxx_probe_ports(sc))
              goto bad; /* will deallocate the resources before returning */

The function xxx_probe_ports() may also set the device description depending on the exact model of device it discovers. But if there is only one supported device model this can be as well done in a hardcoded way. Of course, for the PnP devices the PnP support sets the description from the table automatically.

          if(pnperror)
              device_set_desc(dev, "Our device model 1234");

Then the probe routine should either discover the ranges of all the resources by reading the device configuration registers or make sure that they were set explicitly by the user. We will consider it with an example of on-board memory. The probe routine should be as non-intrusive as possible, so allocation and check of functionality of the rest of resources (besides the ports) would be better left to the attach routine.

The memory address may be specified in the kernel configuration file or on some devices it may be pre-configured in non-volatile configuration registers. If both sources are available and different, which one should be used? Probably if the user bothered to set the address explicitly in the kernel configuration file they know what they are doing and this one should take precedence. An example of implementation could be:

          /* try to find out the config address first */
          sc->mem0_p = bus_get_resource_start(dev, SYS_RES_MEMORY, 0 /*rid*/);
          if(sc->mem0_p == 0) { /* nope, not specified by user */
              sc->mem0_p = xxx_read_mem0_from_device_config(sc);

          if(sc->mem0_p == 0)
                  /* can't get it from device config registers either */
                  goto bad;
          } else {
              if(xxx_set_mem0_address_on_device(sc) < 0)
                  goto bad; /* device does not support that address */
          }

          /* just like the port, set the memory size,
           * for some devices the memory size would not be constant
           * but should be read from the device configuration registers instead
           * to accommodate different models of devices. Another option would
           * be to let the user set the memory size as "msize" configuration
           * resource which will be automatically handled by the ISA bus.
           */
           if(pnperror) { /* only for non-PnP devices */
              sc->mem0_size = bus_get_resource_count(dev, SYS_RES_MEMORY, 0 /*rid*/);
              if(sc->mem0_size == 0) /* not specified by user */
                  sc->mem0_size = xxx_read_mem0_size_from_device_config(sc);

              if(sc->mem0_size == 0) {
                  /* suppose this is a very old model of device without
                   * auto-configuration features and the user gave no preference,
                   * so assume the minimalistic case
                   * (of course, the real value will vary with the driver)
                   */
                  sc->mem0_size = 8*1024;
              }

              if(xxx_set_mem0_size_on_device(sc) < 0)
                  goto bad; /* device does not support that size */

              if(bus_set_resource(dev, SYS_RES_MEMORY, /*rid*/0,
                      sc->mem0_p, sc->mem0_size)<0)
                  goto bad;
          } else {
              sc->mem0_size = bus_get_resource_count(dev, SYS_RES_MEMORY, 0 /*rid*/);
          }

Resources for IRQ and DRQ are easy to check by analogy.

If all went well then release all the resources and return success.

          xxx_free_resources(sc);
          return 0;

Finally, handle the troublesome situations. All the resources should be deallocated before returning. We make use of the fact that before the structure softc is passed to us it gets zeroed out, so we can find out if some resource was allocated: then its descriptor is non-zero.

          bad:

          xxx_free_resources(sc);
          if(error)
                return error;
          else /* exact error is unknown */
              return ENXIO;

That would be all for the probe routine. Freeing of resources is done from multiple places, so it is moved to a function which may look like:

static void
           xxx_free_resources(sc)
              struct xxx_softc *sc;
          {
              /* check every resource and free if not zero */

              /* interrupt handler */
              if(sc->intr_r) {
                  bus_teardown_intr(sc->dev, sc->intr_r, sc->intr_cookie);
                  bus_release_resource(sc->dev, SYS_RES_IRQ, sc->intr_rid,
                      sc->intr_r);
                  sc->intr_r = 0;
              }

              /* all kinds of memory maps we could have allocated */
              if(sc->data_p) {
                  bus_dmamap_unload(sc->data_tag, sc->data_map);
                  sc->data_p = 0;
              }
               if(sc->data) { /* sc->data_map may be legitimately equal to 0 */
                  /* the map will also be freed */
                  bus_dmamem_free(sc->data_tag, sc->data, sc->data_map);
                  sc->data = 0;
              }
              if(sc->data_tag) {
                  bus_dma_tag_destroy(sc->data_tag);
                  sc->data_tag = 0;
              }

              ... free other maps and tags if we have them ...

              if(sc->parent_tag) {
                  bus_dma_tag_destroy(sc->parent_tag);
                  sc->parent_tag = 0;
              }

              /* release all the bus resources */
              if(sc->mem0_r) {
                  bus_release_resource(sc->dev, SYS_RES_MEMORY, sc->mem0_rid,
                      sc->mem0_r);
                  sc->mem0_r = 0;
              }
              ...
              if(sc->port0_r) {
                  bus_release_resource(sc->dev, SYS_RES_IOPORT, sc->port0_rid,
                      sc->port0_r);
                  sc->port0_r = 0;
              }
          }

10.9. xxx_isa_attach

The attach routine actually connects the driver to the system if the probe routine returned success and the system had chosen to attach that driver. If the probe routine returned 0 then the attach routine may expect to receive the device structure softc intact, as it was set by the probe routine. Also if the probe routine returns 0 it may expect that the attach routine for this device shall be called at some point in the future. If the probe routine returns a negative value then the driver may make none of these assumptions.

The attach routine returns 0 if it completed successfully or error code otherwise.

The attach routine starts just like the probe routine, with getting some frequently used data into more accessible variables.

          struct xxx_softc *sc = device_get_softc(dev);
          int unit = device_get_unit(dev);
          int error = 0;

Then allocate and activate all the necessary resources. As normally the port range will be released before returning from probe, it has to be allocated again. We expect that the probe routine had properly set all the resource ranges, as well as saved them in the structure softc. If the probe routine had left some resource allocated then it does not need to be allocated again (which would be considered an error).

          sc->port0_rid = 0;
          sc->port0_r = bus_alloc_resource(dev, SYS_RES_IOPORT,  &sc->port0_rid,
              /*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

          if(sc->port0_r == NULL)
               return ENXIO;

          /* on-board memory */
          sc->mem0_rid = 0;
          sc->mem0_r = bus_alloc_resource(dev, SYS_RES_MEMORY,  &sc->mem0_rid,
              /*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

          if(sc->mem0_r == NULL)
                goto bad;

          /* get its virtual address */
          sc->mem0_v = rman_get_virtual(sc->mem0_r);

The DMA request channel (DRQ) is allocated likewise. To initialize it use functions of the isa_dma*() family. For example:

isa_dmacascade(sc→drq0);

The interrupt request line (IRQ) is a bit special. Besides allocation the driver’s interrupt handler should be associated with it. Historically in the old ISA drivers the argument passed by the system to the interrupt handler was the device unit number. But in modern drivers the convention suggests passing the pointer to structure softc. The important reason is that when the structures softc are allocated dynamically then getting the unit number from softc is easy while getting softc from the unit number is difficult. Also this convention makes the drivers for different buses look more uniform and allows them to share the code: each bus gets its own probe, attach, detach and other bus-specific routines while the bulk of the driver code may be shared among them.

          sc->intr_rid = 0;
          sc->intr_r = bus_alloc_resource(dev, SYS_RES_MEMORY,  &sc->intr_rid,
                /*start*/ 0, /*end*/ ~0, /*count*/ 0, RF_ACTIVE);

          if(sc->intr_r == NULL)
              goto bad;

          /*
           * XXX_INTR_TYPE is supposed to be defined depending on the type of
           * the driver, for example as INTR_TYPE_CAM for a CAM driver
           */
          error = bus_setup_intr(dev, sc->intr_r, XXX_INTR_TYPE,
              (driver_intr_t *) xxx_intr, (void *) sc, &sc->intr_cookie);
          if(error)
              goto bad;

If the device needs to make DMA to the main memory then this memory should be allocated like described before:

          error=bus_dma_tag_create(NULL, /*alignment*/ 4,
              /*boundary*/ 0, /*lowaddr*/ BUS_SPACE_MAXADDR_24BIT,
              /*highaddr*/ BUS_SPACE_MAXADDR, /*filter*/ NULL, /*filterarg*/ NULL,
              /*maxsize*/ BUS_SPACE_MAXSIZE_24BIT,
              /*nsegments*/ BUS_SPACE_UNRESTRICTED,
              /*maxsegsz*/ BUS_SPACE_MAXSIZE_24BIT, /*flags*/ 0,
              &sc->parent_tag);
          if(error)
              goto bad;

          /* many things get inherited from the parent tag
           * sc->data is supposed to point to the structure with the shared data,
           * for example for a ring buffer it could be:
           * struct {
           *   u_short rd_pos;
           *   u_short wr_pos;
           *   char    bf[XXX_RING_BUFFER_SIZE]
           * } *data;
           */
          error=bus_dma_tag_create(sc->parent_tag, 1,
              0, BUS_SPACE_MAXADDR, 0, /*filter*/ NULL, /*filterarg*/ NULL,
              /*maxsize*/ sizeof(* sc->data), /*nsegments*/ 1,
              /*maxsegsz*/ sizeof(* sc->data), /*flags*/ 0,
              &sc->data_tag);
          if(error)
              goto bad;

          error = bus_dmamem_alloc(sc->data_tag, &sc->data, /* flags*/ 0,
              &sc->data_map);
          if(error)
               goto bad;

          /* xxx_alloc_callback() just saves the physical address at
           * the pointer passed as its argument, in this case &sc->data_p.
           * See details in the section on bus memory mapping.
           * It can be implemented like:
           *
           * static void
           * xxx_alloc_callback(void *arg, bus_dma_segment_t *seg,
           *     int nseg, int error)
           * {
           *    *(bus_addr_t *)arg = seg[0].ds_addr;
           * }
           */
          bus_dmamap_load(sc->data_tag, sc->data_map, (void *)sc->data,
              sizeof (* sc->data), xxx_alloc_callback, (void *) &sc->data_p,
              /*flags*/0);

After all the necessary resources are allocated the device should be initialized. The initialization may include testing that all the expected features are functional.

          if(xxx_initialize(sc) < 0)
               goto bad;

The bus subsystem will automatically print on the console the device description set by probe. But if the driver wants to print some extra information about the device it may do so, for example:

        device_printf(dev, "has on-card FIFO buffer of %d bytes\n", sc->fifosize);

If the initialization routine experiences any problems then printing messages about them before returning error is also recommended.

The final step of the attach routine is attaching the device to its functional subsystem in the kernel. The exact way to do it depends on the type of the driver: a character device, a block device, a network device, a CAM SCSI bus device and so on.

If all went well then return success.

          error = xxx_attach_subsystem(sc);
          if(error)
              goto bad;

          return 0;

Finally, handle the troublesome situations. All the resources should be deallocated before returning an error. We make use of the fact that before the structure softc is passed to us it gets zeroed out, so we can find out if some resource was allocated: then its descriptor is non-zero.

          bad:

          xxx_free_resources(sc);
          if(error)
              return error;
          else /* exact error is unknown */
              return ENXIO;

That would be all for the attach routine.

10.10. xxx_isa_detach

If this function is present in the driver and the driver is compiled as a loadable module then the driver gets the ability to be unloaded. This is an important feature if the hardware supports hot plug. But the ISA bus does not support hot plug, so this feature is not particularly important for the ISA devices. The ability to unload a driver may be useful when debugging it, but in many cases installation of the new version of the driver would be required only after the old version somehow wedges the system and a reboot will be needed anyway, so the efforts spent on writing the detach routine may not be worth it. Another argument that unloading would allow upgrading the drivers on a production machine seems to be mostly theoretical. Installing a new version of a driver is a dangerous operation which should never be performed on a production machine (and which is not permitted when the system is running in secure mode). Still, the detach routine may be provided for the sake of completeness.

The detach routine returns 0 if the driver was successfully detached or the error code otherwise.

The logic of detach is a mirror of the attach. The first thing to do is to detach the driver from its kernel subsystem. If the device is currently open then the driver has two choices: refuse to be detached or forcibly close and proceed with detach. The choice used depends on the ability of the particular kernel subsystem to do a forced close and on the preferences of the driver’s author. Generally the forced close seems to be the preferred alternative.

          struct xxx_softc *sc = device_get_softc(dev);
          int error;

          error = xxx_detach_subsystem(sc);
          if(error)
              return error;

Next the driver may want to reset the hardware to some consistent state. That includes stopping any ongoing transfers, disabling the DMA channels and interrupts to avoid memory corruption by the device. For most of the drivers this is exactly what the shutdown routine does, so if it is included in the driver we can just call it.

xxx_isa_shutdown(dev);

And finally release all the resources and return success.

          xxx_free_resources(sc);
          return 0;

10.11. xxx_isa_shutdown

This routine is called when the system is about to be shut down. It is expected to bring the hardware to some consistent state. For most of the ISA devices no special action is required, so the function is not really necessary because the device will be re-initialized on reboot anyway. But some devices have to be shut down with a special procedure, to make sure that they will be properly detected after soft reboot (this is especially true for many devices with proprietary identification protocols). In any case disabling DMA and interrupts in the device registers and stopping any ongoing transfers is a good idea. The exact action depends on the hardware, so we do not consider it here in any detail.

10.12. xxx_intr

The interrupt handler is called when an interrupt is received which may be from this particular device. The ISA bus does not support interrupt sharing (except in some special cases) so in practice if the interrupt handler is called then the interrupt almost for sure came from its device. Still, the interrupt handler must poll the device registers and make sure that the interrupt was generated by its device. If not it should just return.

The old convention for the ISA drivers was getting the device unit number as an argument. This is obsolete, and the new drivers receive whatever argument was specified for them in the attach routine when calling bus_setup_intr(). By the new convention it should be the pointer to the structure softc. So the interrupt handler commonly starts as:

          static void
          xxx_intr(struct xxx_softc *sc)
          {

It runs at the interrupt priority level specified by the interrupt type parameter of bus_setup_intr(). That means that all the other interrupts of the same type as well as all the software interrupts are disabled.

To avoid races it is commonly written as a loop:

          while(xxx_interrupt_pending(sc)) {
              xxx_process_interrupt(sc);
              xxx_acknowledge_interrupt(sc);
          }

The interrupt handler has to acknowledge interrupt to the device only but not to the interrupt controller, the system takes care of the latter.


Last modified on: August 21, 2022 by Colin Percival