illumos Kernel and Device Tree
A device driver needs to work transparently as an integral part of the operating system. Understanding how the kernel works is a prerequisite for learning about device drivers. This chapter provides an overview of the illumos kernel and device tree. For an overview of how device drivers work, see Overview of illumos Device Drivers.
This chapter provides information on the following subjects:
What Is the Kernel?
Overview of the Device Tree
2.1. What Is the Kernel?
The illumos kernel is a program that manages system resources. The kernel insulates applications from the system hardware and provides them with essential system services such as input/output (I/O) management, virtual memory, and scheduling. The kernel consists of object modules that are dynamically loaded into memory when needed.
The illumos kernel can be divided logically into two parts. The first part, referred to as the kernel, manages file systems, scheduling, and virtual memory. The second part, referred to as the I/O subsystem, manages the physical components.
The kernel provides a set of interfaces for applications to use that are accessible through system calls. System calls are documented in section 2 of the Reference Manual Collection (see Intro(2)). Some system calls are used to invoke device drivers to perform I/O. Device drivers are loadable kernel modules that manage data transfers while insulating the rest of the kernel from the device hardware. To be compatible with the operating system, device drivers need to be able to accommodate such features as multithreading, virtual memory addressing, and both 32-bit and 64-bit operation.
The following figure illustrates the kernel. The kernel modules handle system calls from application programs. The I/O modules communicate with hardware.
The kernel provides access to device drivers through the following features:
Device-to-driver mapping. The kernel maintains the device tree. Each node in the tree represents a virtual or a physical device. The kernel binds each node to a driver by matching the device node name with the set of drivers installed in the system. The device is made accessible to applications only if there is a driver binding.
DDI/DKI interfaces. DDI/DKI (Device Driver Interface/Driver-Kernel Interface) interfaces standardize interactions between the driver and the kernel, the device hardware, and the boot/configuration software. These interfaces keep the driver independent from the kernel and improve the driver's portability across successive releases of the operating system on a particular machine.
LDI. The LDI (Layered Driver Interface) is an extension of the DDI/DKI. The LDI enables a kernel module to access other devices in the system. The LDI also enables you to determine which devices are currently being used by the kernel. See Layered Driver Interface (LDI).
2.1.1. Multithreaded Execution Environment
The illumos kernel is multithreaded. On a multiprocessor machine, multiple kernel threads can run kernel code concurrently. Kernel threads can also be preempted by other kernel threads at any time.
The multithreading of the kernel imposes some additional restrictions on device drivers. For more information on multithreading considerations, see Multithreading. Device drivers must be coded to run as needed at the request of many different threads. For each thread, a driver must handle contention problems from overlapping I/O requests.
2.1.2. Virtual Memory
A complete overview of the illumos virtual memory system is beyond the scope of this book, but two virtual memory terms of special importance are used when discussing device drivers: virtual address and address space.
Virtual address. A virtual address is an address that is mapped by the memory management unit (MMU) to a physical hardware address. All addresses directly accessible by the driver are kernel virtual addresses. Kernel virtual addresses refer to the kernel address space.
Address space. An address space is a set of virtual address segments. Each segment is a contiguous range of virtual addresses. Each user process has an address space called the user address space. The kernel has its own address space, called the kernel address space.
2.1.3. Devices as Special Files
Devices are represented in the file system by special files. In illumos, these files reside in the /devices directory hierarchy.
Special files can be of type block or character. The type indicates which kind of device driver operates the device.
Drivers can be implemented to operate on both types. For example, disk drivers export a character interface for use by utilities such as mkfs(1), and a block interface for use by the file system.
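As a minimal user-space sketch, the type of a special file can be checked with the POSIX stat(2) call and the S_ISCHR and S_ISBLK macros. The helper name special_file_type is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: report whether the file at path is a character
 * special file, a block special file, or something else.
 * Returns NULL if stat(2) fails (for example, if the path does not exist).
 */
static const char *
special_file_type(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* stat(2) follows symbolic links, so links into /devices resolve. */
    if (stat(path, &st) != 0)
        return (NULL);
    if (S_ISCHR(st.st_mode))
        return ("character");
    if (S_ISBLK(st.st_mode))
        return ("block");
    return ("other");
}
```

For example, special_file_type("/dev/null") would be expected to report a character special file on most UNIX-like systems.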
Associated with each special file is a device number (dev_t). A device number consists of a major number and a minor number. The major number identifies the device driver associated with the special file. The minor number is created and used by the device driver to further identify the special file. Usually, the minor number is an encoding that is used to identify which device instance the driver should access and which type of access should be performed. For example, the minor number can identify a tape device used for backup and can specify that the tape needs to be rewound when the backup operation is complete.
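The tape example can be sketched as a simple bit encoding. The layout below is an assumption made for illustration only (bit 0 as a rewind flag, the remaining bits as the instance number), and the xt_ names are hypothetical. Each real driver defines its own minor number encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 = rewind when done, upper bits = instance. */
#define XT_REWIND_BIT   0x1u
#define XT_INST_SHIFT   1

/* Build a minor number from an instance and an access-type flag. */
static uint32_t
xt_make_minor(uint32_t instance, bool rewind_when_done)
{
    return ((instance << XT_INST_SHIFT) |
        (rewind_when_done ? XT_REWIND_BIT : 0u));
}

/* Recover the device instance that the driver should access. */
static uint32_t
xt_minor_to_instance(uint32_t minor)
{
    return (minor >> XT_INST_SHIFT);
}

/* Recover the access type: should the tape be rewound afterward? */
static bool
xt_minor_rewinds(uint32_t minor)
{
    return ((minor & XT_REWIND_BIT) != 0);
}
```

With this layout, two special files for the same tape instance differ only in the flag bit, so the driver can tell from the minor number alone whether to rewind.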
2.1.4. DDI/DKI Interfaces
In System V Release 4 (SVR4), the interface between device drivers and the rest of the UNIX kernel was standardized as the DDI/DKI. The DDI/DKI is documented in section 9 of the Reference Manual Collection. Section 9E documents driver entry points, section 9F documents driver-callable functions, and section 9S documents kernel data structures used by device drivers. See Intro(9E), Intro(9F), and Intro(9S).
The DDI/DKI is intended to standardize and document all interfaces between device drivers and the rest of the kernel. In addition, the DDI/DKI enables source and binary compatibility for drivers on any machine that runs illumos, regardless of the processor architecture, whether SPARC or x86. Drivers that use only kernel facilities that are part of the DDI/DKI are known as DDI/DKI-compliant device drivers.
The DDI/DKI enables you to write platform-independent device drivers for any machine that runs illumos. These binary-compatible drivers enable you to more easily integrate third-party hardware and software into any machine that runs illumos. The DDI/DKI is architecture independent, which enables the same driver to work across a diverse set of machine architectures.
Platform independence is accomplished by the design of DDI in the following areas:
Dynamic loading and unloading of modules
Accessing the device space from the kernel or a user process, that is, register mapping and memory mapping
Accessing kernel or user process space from the device using DMA services
Managing device properties
2.2. Overview of the Device Tree
Devices in illumos are represented as a tree of interconnected device information nodes. The device tree describes the configuration of loaded devices for a particular machine.
2.2.1. Device Tree Components
The system builds a tree structure that contains information about the devices connected to the machine at boot time. The device tree can also be modified by dynamic reconfiguration operations while the system is in normal operation. The tree begins at the root device node, which represents the platform.
Below the root node are the branches of the device tree. A branch consists of one or more bus nexus devices and a terminating leaf device.
A bus nexus device provides bus mapping and translation services to subordinate devices in the device tree. PCI-to-PCI bridges, PCMCIA adapters, and SCSI HBAs are all examples of nexus devices. The discussion of writing drivers for nexus devices is limited to the development of SCSI HBA drivers (see SCSI Host Bus Adapter Drivers).
Leaf devices are typically peripheral devices such as disks, tapes, network adapters, frame buffers, and so forth. Leaf device drivers export the traditional character driver interfaces and block driver interfaces. The interfaces enable user processes to read data from and write data to either storage or communication devices.
The system goes through the following steps to build the tree:
1. The CPU is initialized and searches for firmware.
2. The main firmware (OpenBoot, Basic Input/Output System (BIOS), or Bootconf) initializes and creates the device tree with known or self-identifying hardware.
3. When the main firmware finds compatible firmware on a device, the main firmware initializes the device and retrieves the device's properties.
4. The firmware locates and boots the operating system.
5. The kernel starts at the root node of the tree, searches for a matching device driver, and binds that driver to the device.
6. If the device is a nexus, the kernel looks for child devices that have not been detected by the firmware. The kernel adds any child devices to the tree below the nexus node.
7. The kernel repeats the process from Step 5 until no further device nodes need to be created.
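The kernel's part of this process can be pictured as a depth-first walk over the tree. The following is a simplified user-space sketch, not kernel code: struct node, find_driver, and bind_tree are hypothetical names, matching is by node name only, and the discovery of additional children by a nexus driver is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device node; not the kernel's own structure. */
struct node {
    const char  *name;      /* node name */
    const char  *driver;    /* bound driver name, or NULL if unbound */
    struct node *child;     /* first child (non-NULL only for a nexus) */
    struct node *sibling;   /* next node with the same parent */
};

/* Return the matching entry of the NULL-terminated installed list. */
static const char *
find_driver(const char *name, const char *const *installed)
{
    for (size_t i = 0; installed[i] != NULL; i++) {
        if (strcmp(installed[i], name) == 0)
            return (installed[i]);
    }
    return (NULL);
}

/*
 * Walk the tree depth-first starting at n. Bind each node whose name
 * matches an installed driver, then descend into its children.
 * Returns the number of nodes that were bound.
 */
static int
bind_tree(struct node *n, const char *const *installed)
{
    int bound = 0;

    for (; n != NULL; n = n->sibling) {
        n->driver = find_driver(n->name, installed);
        if (n->driver != NULL)
            bound++;
        bound += bind_tree(n->child, installed);
    }
    return (bound);
}
```

Nodes without a matching driver stay in the tree with no driver bound, which corresponds to the "driver not attached" entries that prtconf(1M) reports.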
Each driver exports a device operations structure dev_ops(9S) to define the operations that the device driver can perform. The device operations structure contains function pointers for generic operations such as attach(9E), detach(9E), and getinfo(9E). The structure also contains a pointer to a set of operations specific to bus nexus drivers and a pointer to a set of operations specific to leaf drivers.
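The idea of an operations structure can be illustrated with a table of function pointers. The sketch below is a user-space mock with hypothetical names (mock_dev_ops and the xx_ functions). It does not reproduce the real field layout or function signatures, which are documented in dev_ops(9S).

```c
#include <assert.h>
#include <stddef.h>

/* Mock leaf-specific and nexus-specific operation sets. */
struct mock_leaf_ops {
    int (*lo_open)(int instance);
};

struct mock_nexus_ops {
    int (*no_map)(int instance, unsigned long offset);
};

/* Mock operations structure: generic entry points plus two pointers. */
struct mock_dev_ops {
    int (*mo_attach)(int instance);
    int (*mo_detach)(int instance);
    const struct mock_leaf_ops  *mo_leaf_ops;   /* leaf operations */
    const struct mock_nexus_ops *mo_nexus_ops;  /* NULL for a leaf driver */
};

static int xx_attached;     /* number of attached instances */

static int
xx_attach(int instance)
{
    (void) instance;
    xx_attached++;
    return (0);
}

static int
xx_detach(int instance)
{
    (void) instance;
    xx_attached--;
    return (0);
}

/* Opening succeeds only after at least one instance is attached. */
static int
xx_open(int instance)
{
    (void) instance;
    return (xx_attached > 0 ? 0 : -1);
}

static const struct mock_leaf_ops xx_leaf_ops = { xx_open };

/* A leaf driver fills in the leaf operations and leaves nexus ops NULL. */
static const struct mock_dev_ops xx_dev_ops = {
    .mo_attach = xx_attach,
    .mo_detach = xx_detach,
    .mo_leaf_ops = &xx_leaf_ops,
    .mo_nexus_ops = NULL
};
```

The caller never names xx_attach directly. It reaches every operation through the table, which is what lets one framework drive many different drivers.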
The tree structure creates a parent-child relationship between nodes. This parent-child relationship is the key to architectural independence. When a leaf or bus nexus driver requires a service that is architecturally dependent in nature, that driver requests its parent to provide the service. This approach enables drivers to function regardless of the architecture of the machine or the processor. A typical device tree is shown in the following figure.
The nexus nodes can have one or more children. The leaf nodes represent individual devices.
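The way a request travels up the parent-child chain can be sketched as follows. This is only an analogy under stated assumptions: struct dnode and map_through_parents are hypothetical, and the "service" is reduced to each ancestor adding a base offset to an address. Real bus mapping is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: each nexus applies its own base offset. */
struct dnode {
    const char          *name;
    const struct dnode  *parent;    /* NULL at the root node */
    unsigned long       base;       /* offset applied on behalf of children */
};

/*
 * Translate a device-relative address by passing it to each ancestor in
 * turn, up to the root. The device itself needs no knowledge of how its
 * ancestors perform the translation.
 */
static unsigned long
map_through_parents(const struct dnode *dev, unsigned long addr)
{
    assert(dev != NULL);

    for (const struct dnode *p = dev->parent; p != NULL; p = p->parent)
        addr += p->base;
    return (addr);
}
```

Because the leaf only ever talks to its parent, the same leaf code would work unchanged under a different set of ancestors with different offsets.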
2.2.2. Displaying the Device Tree
The device tree can be displayed in three ways:
The libdevinfo library provides interfaces to access the contents of the device tree programmatically.
The prtconf(1M) command displays the complete contents of the device tree.
The /devices hierarchy is a representation of the device tree. Use the ls(1) command to view the hierarchy.
/devices displays only devices that have drivers configured into the system. The prtconf(1M) command shows all device nodes regardless of whether a driver for the device exists on the system.
The libdevinfo library provides interfaces for accessing all public device configuration data. See the libdevinfo(3LIB) man page for a list of interfaces.
The following excerpted prtconf(1M) command example displays all the devices in the system.
System Configuration: Sun Microsystems sun4u
Memory size: 128 Megabytes
System Peripherals (Software Nodes):
SUNW,Ultra-5_10
    packages (driver not attached)
        terminal-emulator (driver not attached)
        deblocker (driver not attached)
        obp-tftp (driver not attached)
        disk-label (driver not attached)
        SUNW,builtin-drivers (driver not attached)
        sun-keyboard (driver not attached)
        ufs-file-system (driver not attached)
    chosen (driver not attached)
    openprom (driver not attached)
        client-services (driver not attached)
    options, instance #0
    aliases (driver not attached)
    memory (driver not attached)
    virtual-memory (driver not attached)
    pci, instance #0
        pci, instance #0
            ebus, instance #0
                auxio (driver not attached)
                power, instance #0
                SUNW,pll (driver not attached)
                se, instance #0
                su, instance #0
                su, instance #1
                ecpp (driver not attached)
                fdthree, instance #0
                eeprom (driver not attached)
                flashprom (driver not attached)
                SUNW,CS4231 (driver not attached)
            network, instance #0
            SUNW,m64B (driver not attached)
            ide, instance #0
                disk (driver not attached)
                cdrom (driver not attached)
                dad, instance #0
                sd, instance #15
        pci, instance #1
            pci, instance #0
                pci108e,1000 (driver not attached)
                SUNW,hme, instance #1
                SUNW,isptwo, instance #0
                    sd (driver not attached)
                    st (driver not attached)
                    sd, instance #0 (driver not attached)
                    sd, instance #1 (driver not attached)
                    sd, instance #2 (driver not attached)
                    ...
    SUNW,UltraSPARC-IIi (driver not attached)
    SUNW,ffb, instance #0
    pseudo, instance #0
The /devices hierarchy provides a namespace that represents the device tree. Following is an abbreviated listing of the /devices namespace. The sample output corresponds to the example device tree and prtconf(1M) output shown previously.
/devices
/devices/pseudo
/devices/pci@1f,0:devctl
/devices/SUNW,ffb@1e,0:ffb0
/devices/pci@1f,0
/devices/pci@1f,0/pci@1,1
/devices/pci@1f,0/pci@1,1/SUNW,m64B@2:m640
/devices/pci@1f,0/pci@1,1/ide@3:devctl
/devices/pci@1f,0/pci@1,1/ide@3:scsi
/devices/pci@1f,0/pci@1,1/ebus@1
/devices/pci@1f,0/pci@1,1/ebus@1/power@14,724000:power_button
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:a
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:b
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:0,hdlc
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:1,hdlc
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:a,cu
/devices/pci@1f,0/pci@1,1/ebus@1/se@14,400000:b,cu
/devices/pci@1f,0/pci@1,1/ebus@1/ecpp@14,3043bc:ecpp0
/devices/pci@1f,0/pci@1,1/ebus@1/fdthree@14,3023f0:a
/devices/pci@1f,0/pci@1,1/ebus@1/fdthree@14,3023f0:a,raw
/devices/pci@1f,0/pci@1,1/ebus@1/SUNW,CS4231@14,200000:sound,audio
/devices/pci@1f,0/pci@1,1/ebus@1/SUNW,CS4231@14,200000:sound,audioctl
/devices/pci@1f,0/pci@1,1/ide@3
/devices/pci@1f,0/pci@1,1/ide@3/sd@2,0:a
/devices/pci@1f,0/pci@1,1/ide@3/sd@2,0:a,raw
/devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a
/devices/pci@1f,0/pci@1,1/ide@3/dad@0,0:a,raw
/devices/pci@1f,0/pci@1
/devices/pci@1f,0/pci@1/pci@2
/devices/pci@1f,0/pci@1/pci@2/SUNW,isptwo@4:devctl
/devices/pci@1f,0/pci@1/pci@2/SUNW,isptwo@4:scsi
2.2.3. Binding a Driver to a Device
In addition to constructing the device tree, the kernel determines the drivers that are used to manage the devices.
Binding a driver to a device refers to the process by which the system selects a driver to manage a particular device. The binding name is the name that links a driver to a unique device node in the device information tree. For each device in the device tree, the system attempts to choose a driver from a list of installed drivers.
Each device node has an associated name property. This property can be assigned either by an external agent, such as the PROM, during system boot or from a driver.conf configuration file. In either case, the name property represents the node name assigned to a device in the device tree. The node name is the name visible in /devices and listed in the prtconf(1M) output.
A device node can have an associated compatible property as well. The compatible property contains an ordered list of one or more possible driver names or driver aliases for the device.
The system uses both the compatible and the name properties to select a driver for the device. The system first attempts to match the contents of the compatible property, if the compatible property exists, to a driver on the system. Beginning with the first driver name on the compatible property list, the system attempts to match the driver name to a known driver on the system. Each entry on the list is processed until the system either finds a match or reaches the end of the list.
If the contents of either the name property or the compatible property match a driver on the system, then that driver is bound to the device node. If no match is found, no driver is bound to the device node.
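The selection order described above can be sketched in user-space C. This is an illustration under stated assumptions rather than the kernel's implementation: select_binding and driver_installed are hypothetical names, driver aliases are ignored, and the name property is tried only after the compatible list has been exhausted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name appears in the NULL-terminated installed list. */
static int
driver_installed(const char *name, const char *const *installed)
{
    for (size_t i = 0; installed[i] != NULL; i++) {
        if (strcmp(installed[i], name) == 0)
            return (1);
    }
    return (0);
}

/*
 * Choose a binding name for a device node. Each entry of the compatible
 * list (which may be NULL if the property does not exist) is tried in
 * order, then the node name. Returns NULL if no driver matches.
 */
static const char *
select_binding(const char *name, const char *const *compatible,
    const char *const *installed)
{
    assert(name != NULL && installed != NULL);

    if (compatible != NULL) {
        for (size_t i = 0; compatible[i] != NULL; i++) {
            if (driver_installed(compatible[i], installed))
                return (compatible[i]);
        }
    }
    if (driver_installed(name, installed))
        return (name);
    return (NULL);
}
```

For a node named display whose compatible list contains SUNW,ffb, and a system with the SUNW,ffb driver installed, this sketch selects SUNW,ffb as the binding name.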
Generic Device Names
Some devices specify a generic device name as the value for the name property. Generic device names describe the function of a device without actually identifying a specific driver for the device. For example, a SCSI host bus adapter might have a generic device name of scsi. An Ethernet device might have a generic device name of ethernet.
The compatible property enables the system to determine alternate driver names for devices with a generic device name, for example, glm for scsi HBA device drivers or hme for ethernet device drivers.
Devices with generic device names are required to supply a compatible property.
For a complete description of generic device names, see the IEEE 1275 Open Firmware Boot Standard.
The following figure shows a device node with a specific device name. The driver binding name SUNW,ffb is the same name as the device node name.
The following figure shows a device node with the generic device name display. The driver binding name SUNW,ffb is the first name on the compatible property driver list that matches a driver on the system driver list. In this case, display is a generic device name for frame buffers.