Debugging, Testing, and Tuning Device Drivers
This chapter presents an overview of the various tools that are provided to assist with testing, debugging, and tuning device drivers. This chapter provides information on the following subjects:
- Testing Drivers – Testing a driver can potentially impair a system's ability to function. Use of both serial connections and alternate kernels helps facilitate recovery from crashes.
- Debugging Tools – Integral debugging facilities enable you to exercise and observe driver features conveniently without having to run a separate debugger.
- Tuning Drivers – illumos provides facilities for measuring the performance of device drivers. Writing kernel statistics structures for your device exports continuous statistics as the device is running. Once you have identified an area for performance improvement, the DTrace dynamic instrumentation tool can help pinpoint the problem more precisely.
22.1. Testing Drivers
To avoid data loss and other problems, you should take special care when testing a new device driver. This section discusses various testing strategies. For example, setting up a separate system that you control through a serial connection is the safest way to test a new driver. You can load test modules with various kernel variable settings to test performance under different kernel conditions. Should your system crash, you should be prepared to restore backup data, analyze any crash dumps, and rebuild the device directory.
22.1.1. Enable the Deadman Feature to Avoid a Hard Hang
If your system is in a hard hang, then you cannot break into the debugger. If you enable the deadman feature, the system panics instead of hanging indefinitely. You can then use the kmdb(1) kernel debugger to analyze your problem.
The deadman feature checks every second whether the system clock is updating. If the system clock is not updating, then you are in an indefinite hang. If the system clock has not been updated for 50 seconds, the deadman feature induces a panic and puts you in the debugger.
- Make sure you are capturing crash images with dumpadm(1M).
- Set the snooping variable in the /etc/system file. See the system(4) man page for information on the /etc/system file.
set snooping=1
- Reboot the system so that the /etc/system file is read again and the snooping setting takes effect.
Note that any zones on your system inherit the deadman setting as well.
If your system hangs while the deadman feature is enabled, you should see output similar to the following example on your console:
panic[cpu1]/thread=30018dd6cc0: deadman: timed out after 9 seconds of
clock inactivity
panic: entering debugger (continue to save dump)
Inside the debugger, use the ::cpuinfo command to investigate why the clock interrupt was not able to fire and advance the system time.
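The deadman check described in this section can be modeled in user space. The following Python sketch is not the kernel implementation. It only illustrates the logic of a once-per-second watchdog that compares a clock value against the value seen at the previous check. The function name `deadman_check`, the `timeout_s` parameter, and the representation of the clock as a list of per-second samples are assumptions made for illustration.

```python
def deadman_check(clock_samples, timeout_s=50):
    """Return the check index at which a deadman-style watchdog would fire.

    clock_samples: clock value observed at each one-second check
                   (a hypothetical stand-in for the system clock).
    timeout_s:     seconds of clock inactivity tolerated before "panicking".
    Returns the index of the check that trips the watchdog, or None.
    """
    stalled_for = 0
    last_seen = None
    for second, value in enumerate(clock_samples):
        if last_seen is not None and value == last_seen:
            stalled_for += 1      # clock did not advance since the last check
        else:
            stalled_for = 0       # clock advanced: reset the countdown
        last_seen = value
        if stalled_for >= timeout_s:
            return second
    return None


# A clock that ticks normally, and one that stops updating after 3 checks.
healthy = list(range(10))
hung = [0, 1, 2] + [2] * 60
print(deadman_check(healthy))
print(deadman_check(hung))
```

The countdown resets whenever the clock advances, so only an uninterrupted stall of `timeout_s` seconds triggers the simulated panic.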
22.1.2. Testing With a Serial Connection
Using a serial connection is a good way to test drivers. Use the tip(1) command to make a serial connection between a host system and a test system. With this approach, the tip window on the host console is used as the console of the test machine. See the tip(1) man page for additional information.
- Interactions with the test system and kernel debuggers can be monitored. For example, the window can keep a log of the session for use if the driver crashes the test system.
- The test machine can be accessed remotely by logging into a tip host machine and using tip(1) to connect to the test machine.
Although a tip connection and a second machine are not required to debug an illumos device driver, this technique is still recommended.
Connect the host system to the test machine using serial port A on both machines.
This connection must be made with a null modem cable.
On the host system, make sure there is an entry in /etc/remote for the connection. See the remote(4) man page for details.
The terminal entry must match the serial port that is used. illumos comes with the correct entry for serial port B, but a terminal entry must be added for serial port A:
debug:\
        :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:
The baud rate must be set to 9600.
In a shell window on the host, run tip(1) and specify the name of the entry:
% tip debug
connected
The shell window is now a tip window with a connection to the console of the test machine.
Do not use STOP-A for SPARC machines or F1-A for x86 architecture machines on the host machine to stop the test machine. This action actually stops the host machine. To send a break to the test machine, type ~# in the tip window. Commands such as ~# are recognized only if these characters are the first characters on the line. If the command has no effect, press either the Return key or Control-U.
Setting Up a Target System on the SPARC Platform
A quick way to set up the test machine on the SPARC platform is to unplug the keyboard before turning on the machine. The machine then automatically uses serial port A as the console.
Another way to set up the test machine is to use boot PROM commands to make serial port A the console. On the test machine, at the boot PROM ok prompt, direct console I/O to the serial line. To make the test machine always come up with serial port A as the console, set the input-device and output-device environment variables.
ok setenv input-device ttya
ok setenv output-device ttya
The eeprom command can also be used to make serial port A the console. As superuser, execute the following commands to make the input-device and output-device parameters point to serial port A. The following example demonstrates the eeprom command.
# eeprom input-device=ttya
# eeprom output-device=ttya
The eeprom commands cause the console to be redirected to serial port A at each subsequent system boot.
Setting Up a Target System on the x86 Platform
On x86 platforms, use the eeprom command to make serial port A the console. This procedure is the same as the SPARC platform procedure. See Setting Up a Target System on the SPARC Platform. The eeprom command causes the console to switch to serial port A (COM1) during reboot.
x86 machines do not transfer console control to the tip connection until an early stage in the boot process unless the BIOS supports console redirection to a serial port. In SPARC machines, the tip connection maintains console control throughout the boot process.
22.1.3. Setting Up Test Modules
The system(4) file in the /etc directory enables you to set the value of kernel variables at boot time. With kernel variables, you can toggle different behaviors in a driver and take advantage of debugging features that are provided by the kernel. The kernel variables moddebug and kmem_flags, which can be very useful in debugging, are discussed later in this section. See also Enable the Deadman Feature to Avoid a Hard Hang.
Changes to kernel variables after boot are unreliable, because /etc/system is read only once when the kernel boots. After this file is modified, the system must be rebooted for the changes to take effect. If a change in the file causes the system not to work, boot with the ask (-a) option. Then specify /dev/null as the system file.
Kernel variables cannot be relied on to be present in subsequent releases.
Setting Kernel Variables
The set command in the /etc/system file changes the value of module or kernel variables. To set module variables, specify the module name and the variable:
set module_name:variable=value
For example, to set the variable test_debug in a driver that is named myTest, use set as follows:
set myTest:test_debug=1
To set a variable that is exported by the kernel itself, omit the module name.
You can also use a bitwise OR operation to set a value, for example:
set moddebug | 0x80000000
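To make the difference between the assignment form and the bitwise OR form concrete, the following Python sketch models how the two kinds of set lines would change a variable's value. This is a user-space illustration only, not the kernel's /etc/system parser. The function name `apply_set`, the dictionary of variables, and the starting value of moddebug are all hypothetical.

```python
def apply_set(variables, line):
    """Apply one /etc/system-style 'set' line to a dict of variables.

    Only the two forms shown in this section are modeled:
      set [module:]variable=value     (assignment)
      set [module:]variable | value   (bitwise OR into the current value)
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "set" or not rest:
        raise ValueError("not a set line: %r" % line)
    if "|" in rest:
        name, _, value = rest.partition("|")
        name = name.strip()
        # OR the new bits into whatever value is already present.
        variables[name] = variables.get(name, 0) | int(value.strip(), 0)
    elif "=" in rest:
        name, _, value = rest.partition("=")
        # Plain assignment replaces the previous value.
        variables[name.strip()] = int(value.strip(), 0)
    else:
        raise ValueError("unsupported operator in: %r" % line)
    return variables


settings = {"moddebug": 0x10}            # hypothetical starting value
apply_set(settings, "set myTest:test_debug=1")
apply_set(settings, "set moddebug | 0x80000000")
print({name: hex(value) for name, value in settings.items()})
```

With the OR form, bits that were already set in the variable are preserved, whereas the assignment form overwrites them.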
Loading and Unloading Test Modules
The commands modload(1M), modunload(1M), and modinfo(1M) can be used to add test modules, which is a useful technique for debugging and stress-testing drivers. These commands are generally not needed in normal operation, because the kernel automatically loads needed modules and unloads unused modules. The moddebug kernel variable works with these commands to provide information and set controls.
Using the modload Command
Use modload(1M) to force a module into memory. The modload command verifies that the driver has no unresolved references when that driver is loaded. Loading a driver does not necessarily mean that the driver can attach. When a driver loads successfully, the driver's _info(9E) entry point is called. The attach entry point is not necessarily called.
Using the modinfo Command
Use modinfo(1M) to confirm that the driver is loaded.
$ modinfo
 Id Loadaddr   Size Info Rev Module Name
  6 101b6000    732   -   1  obpsym (OBP symbol callbacks)
  7 101b65bd  1acd0 226   1  rpcmod (RPC syscall)
  7 101b65bd  1acd0 226   1  rpcmod (32-bit RPC syscall)
  7 101b65bd  1acd0   1   1  rpcmod (rpc interface str mod)
  8 101ce8dd  74600   0   1  ip (IP STREAMS module)
  8 101ce8dd  74600   3   1  ip (IP STREAMS device)
...
$ modinfo | grep mydriver
169 781a8d78   13fb   0   1  mydriver (Test Driver 1.5)
The number in the Info field is the major number that has been chosen for the driver. The modunload(1M) command can be used to unload a module if the module ID is provided. The module ID is found in the left column of modinfo output.
Sometimes a driver does not unload as expected after a modunload is issued, because the driver is determined to be busy. This situation occurs when the driver fails detach(9E), either because the driver really is busy, or because the detach entry point is implemented incorrectly.
Using modunload
To remove all of the currently unused modules from memory, run modunload(1M) with a module ID of 0:
# modunload -i 0
Setting the moddebug Kernel Variable
The moddebug kernel variable controls the module loading process. The possible values of moddebug are:
- 0x80000000 – Prints messages to the console when loading or unloading modules.
- 0x40000000 – Gives more detailed error messages.
- 0x20000000 – Prints more detail when loading or unloading, such as including the address and size.
- 0x00001000 – No auto-unloading drivers. The system does not attempt to unload the device driver when the system resources become low.
- 0x00000080 – No auto-unloading streams. The system does not attempt to unload the STREAMS module when the system resources become low.
- 0x00000010 – No auto-unloading of kernel modules of any type.
- 0x00000001 – If running with kmdb, moddebug causes a breakpoint to be executed and a return to kmdb immediately before each module's _init routine is called. This setting also generates additional debug messages when the module's _info and _fini routines are executed.
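Because the moddebug values listed above are individual bits, several of them can be combined into one setting. The following Python sketch shows one way to decode a combined value back into the behaviors from the list. The table `MODDEBUG_BITS` and the function `describe_moddebug` are hypothetical helpers written for this illustration. They are not part of any illumos tool, and the descriptions are paraphrased from the list above.

```python
# Bit values and meanings are taken from the list in this section.
MODDEBUG_BITS = {
    0x80000000: "print messages when loading or unloading modules",
    0x40000000: "give more detailed error messages",
    0x20000000: "print address and size detail when loading or unloading",
    0x00001000: "no auto-unloading of drivers",
    0x00000080: "no auto-unloading of STREAMS modules",
    0x00000010: "no auto-unloading of kernel modules of any type",
    0x00000001: "break into kmdb before each module's _init routine",
}


def describe_moddebug(value):
    """Split a moddebug value into known behaviors and leftover bits."""
    known_mask = 0
    for bit in MODDEBUG_BITS:
        known_mask |= bit
    known = [text for bit, text in MODDEBUG_BITS.items() if value & bit]
    unknown = value & ~known_mask     # bits not covered by the list above
    return known, unknown


# Combine load messages with "no auto-unloading of any module type".
combined = 0x80000000 | 0x00000010
behaviors, leftover = describe_moddebug(combined)
print(hex(combined))
for text in behaviors:
    print(" -", text)
print("unrecognized bits:", hex(leftover))
```

A value such as the one computed here could then be placed in /etc/system with a set moddebug line, as described in Setting Kernel Variables.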
Setting kmem_flags Debugging Flags
The kmem_flags kernel variable enables debugging features in the kernel's memory allocator. Set kmem_flags to 0xf to enable the allocator's debugging features. These features include runtime checks to find the following code conditions:
- Writing to a buffer after the buffer is freed
- Using memory before the memory is initialized
- Writing past the end of a buffer
The Modular Debugger Guide describes how to use the kernel memory allocator to analyze such problems.
Testing and developing with kmem_flags set to 0xf can help detect latent memory corruption bugs. Because setting kmem_flags to 0xf changes the internal behavior of the kernel memory allocator, you should thoroughly test without kmem_flags as well.
22.1.4. Avoiding Data Loss on a Test System
A driver bug can sometimes render a system incapable of booting. By taking precautions, you can avoid system reinstallation in this event, as described in this section.
Back Up Critical System Files
A number of driver-related system files are difficult, if not impossible, to reconstruct. Files such as /etc/name_to_major, /etc/driver_aliases, /etc/driver_classes, and /etc/minor_perm can be corrupted if the driver crashes the system during installation. See the add_drv(1M) man page.
To be safe, make a backup copy of the root file system after the test machine is in the proper configuration. If you plan to modify the /etc/system file, make a backup copy of the file before making modifications.
To avoid rendering a system inoperable, you should boot from a copy of the kernel and associated binaries rather than from the default kernel.
Make a copy of the drivers in /platform/*.
# cp -r /platform/`uname -i`/kernel /platform/`uname -i`/kernel.test
Place the driver module in /platform/`uname -i`/kernel.test/drv.
Boot the alternate kernel instead of the default kernel.
After you have created and stored the alternate kernel, you can boot this kernel in a number of ways.
- You can boot the alternate kernel by rebooting:
# reboot -- kernel.test/unix
- On a SPARC-based system, you can also boot from the PROM:
ok boot kernel.test/sparcv9/unix
To boot with the kmdb debugger, use the -k option as described in Getting Started With the Modular Debugger.
- On an x86-based system, when the Select (b)oot or (i)nterpreter: message is displayed in the boot process, type the following:
boot kernel.test/unix
The following example demonstrates booting with an alternate kernel.
ok boot kernel.test/sparcv9/unix
Rebooting with command: boot kernel.test/sparcv9/unix
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
args: kernel.test/sparcv9/unix
Alternatively, the module path can be changed by booting with the ask (-a) option. This option results in a series of prompts for configuring the boot method.
ok boot -a
Rebooting with command: boot -a
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
args: -a
Enter filename [kernel/sparcv9/unix]: kernel.test/sparcv9/unix
Enter default directory for modules
[/platform/sun4u/kernel.test /kernel /usr/kernel]: <CR>
Name of system file [etc/system]: <CR>
SunOS Release 5.10 Version Generic 64-bit
Copyright 1983-2002 Sun Microsystems, Inc. All rights reserved.
root filesystem type [ufs]: <CR>
Enter physical name of root device
[/sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a]: <CR>
Consider Alternative Back-Up Plans
If the system is attached to a network, the test machine can be added as a client of a server. If a problem occurs, the system can be booted from the network. The local disks can then be mounted, and any fixes can be made. Alternatively, the system can be booted directly from the illumos system CD-ROM.
Another way to recover from disaster is to have another bootable root file system. Use format(1M) to make a partition that is the exact size of the original. Then use dd(1M) to copy the bootable root file system. After making a copy, run fsck(1M) on the new file system to ensure its integrity.
Subsequently, if the system cannot boot from the original root partition, boot the backup partition. Use dd(1M) to copy the backup partition onto the original partition. You might have a situation where the system cannot boot even though the root file system is undamaged. For example, the damage might be limited to the boot block or the boot program. In such a case, you can boot from the backup partition with the ask (-a) option. You can then specify the original file system as the root file system.
Capture System Crash Dumps
When a system panics, the system writes an image of kernel memory to the dump device. The dump device is by default the most suitable swap device. The dump is a system crash dump, similar to core dumps generated by applications. On rebooting after a panic, savecore(1M) checks the dump device for a crash dump. If a dump is found, savecore makes a copy of the kernel's symbol table, which is called unix.n. The savecore utility then dumps a core file that is called vmcore.n in the core image directory. By default, the core image directory is /var/crash/machine_name. If /var/crash has insufficient space for a core dump, the system displays the needed space but does not actually save the dump. The mdb(1) debugger can then be used on the core dump and the saved kernel.
In most illumos distributions, crash dumps are enabled by default. The dumpadm(1M) command is used to configure system crash dumps. Use the dumpadm command to verify that crash dumps are enabled and to determine the location of core files that have been saved.
You can prevent the savecore utility from filling the file system. Add a file that is named minfree to the directory in which the dumps are to be saved. In this file, specify the number of kilobytes to remain free after savecore has run. If insufficient space is available, the core file is not saved.
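The minfree rule described above amounts to a simple space check. The following Python sketch models that check as stated in this section; it is not the savecore implementation. The function names `parse_minfree` and `would_save_dump`, and the sizes used in the example, are hypothetical.

```python
def parse_minfree(text):
    """Parse the contents of a minfree file: a number of kilobytes."""
    return int(text.strip())


def would_save_dump(free_kb, dump_kb, minfree_kb=0):
    """Return True if saving the dump still leaves minfree_kb free.

    free_kb:    free space in the dump directory's file system, in KB
    dump_kb:    space needed for the unix.n and vmcore.n files, in KB
    minfree_kb: kilobytes that must remain free after the dump is saved
    """
    return free_kb - dump_kb >= minfree_kb


# Example: keep at least 500 MB free on a file system with 2 GB available.
minfree_kb = parse_minfree("512000\n")
print(would_save_dump(free_kb=2097152, dump_kb=1048576, minfree_kb=minfree_kb))
print(would_save_dump(free_kb=2097152, dump_kb=1800000, minfree_kb=minfree_kb))
```

In the second call the dump itself would fit, but saving it would leave less than the reserved amount, so the model reports that the core file is not saved.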
22.1.5. Recovering the Device Directory
Damage to the /devices and /dev directories can occur if the driver crashes during attach(9E). If either directory is damaged, you can rebuild the directory by booting the system and running fsck(1M) to repair the damaged root file system. The root file system can then be mounted. Recreate the /devices and /dev directories by running devfsadm(1M) and specifying the /devices directory on the mounted disk.
The following example shows how to repair a damaged root file system on a SPARC system. In this example, the damaged disk is /dev/dsk/c0t3d0s0, and an alternate boot disk is /dev/dsk/c0t1d0s0.
ok boot disk1
...
Rebooting with command: boot kernel.test/sparcv9/unix
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@31,0:a File and \
args: kernel.test/sparcv9/unix
...
# fsck /dev/dsk/c0t3d0s0
** /dev/dsk/c0t3d0s0
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1478 files, 9922 used, 29261 free
(141 frags, 3640 blocks, 0.4% fragmentation)
# mount /dev/dsk/c0t3d0s0 /mnt
# devfsadm -r /mnt
A fix to the /devices and /dev directories can allow the system to boot while other parts of the system are still corrupted. Such repairs are only a temporary fix to save information, such as system crash dumps, before reinstalling the system.
22.2. Debugging Tools
- The kmdb(1) kernel debugger provides typical runtime debugger facilities, such as breakpoints, watch points, and single-stepping. The kmdb debugger supersedes kadb, which was available in previous releases. The commands that were previously available from kadb are used in kmdb, in addition to new functionality. Where kadb could only be loaded at boot time, kmdb can be loaded at any time. The kmdb debugger is preferred for live, interactive debugging due to its execution controls.
- The mdb(1) modular debugger is more limited than kmdb as a real-time debugger, but mdb has rich facilities for postmortem debugging.
The kmdb and mdb debuggers mostly share the same user interface. Many debugging techniques therefore can be applied with the same commands in both tools. Both debuggers support macros, dcmds, and dmods. A dcmd (pronounced dee-command) is a routine in the debugger that can access any of the properties of the current target program. A dcmd can be dynamically loaded at runtime. A dmod, which is short for debugger module, is a package of dcmds that can be loaded to provide non-standard behavior.
Both mdb and kmdb are backward-compatible with legacy debuggers such as adb and kadb. The mdb debugger can execute all of the macros that are available to kmdb as well as any legacy user-defined macros for adb. See the Modular Debugger Guide for information about where to find standard macro sets.
22.2.1. Postmortem Debugging
Postmortem analysis offers numerous advantages to driver developers. More than one developer can examine a problem in parallel. Multiple instances of the debugger can be used simultaneously on a single crash dump. The analysis can be performed offline so that the crashed system can be returned to service, if possible. Postmortem analysis enables the use of user-developed debugger functionality in the form of dmods. Dmods can bundle functionality that would be too memory-intensive for real-time debuggers, such as kmdb.
When a system panics while kmdb is loaded, control is passed to the debugger for immediate investigation. If kmdb does not seem appropriate for analyzing the current problem, a good strategy is to use :c to continue execution and save the crash dump. When the system reboots, you can perform postmortem analysis with mdb on the saved crash dump. This process is analogous to debugging an application crash from a process core file.
In earlier versions of the Solaris operating system, adb(1) was the recommended tool for postmortem analysis. In the current illumos releases, mdb(1) is the recommended tool for postmortem analysis. The mdb feature set surpasses the set of commands from the legacy crash(1M) utility. The crash utility is no longer available in illumos.
22.2.2. Using the kmdb Kernel Debugger
The kmdb debugger is an interactive kernel debugger that provides the following capabilities:
- Control of kernel execution
- Inspection of the kernel state
- Live modifications to the code
This section assumes that you are already familiar with the kmdb debugger. The focus in this section is on kmdb capabilities that are useful in device driver design. To learn how to use kmdb in detail, refer to the kmdb(1) man page and to the Modular Debugger Guide. If you are familiar with kadb, refer to the kadb(1M) man page for the major differences between kadb and kmdb.
The kmdb debugger can be loaded and unloaded at will. Instructions for loading and unloading kmdb are in the Modular Debugger Guide. For safety and convenience, booting with an alternate kernel is highly encouraged. The boot process is slightly different between the SPARC platform and the x86 platform, as described in this section.
By default, kmdb uses the CPU ID as the prompt when kmdb is running. In the examples in this chapter, [0] is used as the prompt unless otherwise noted.
Booting kmdb With an Alternate Kernel on the SPARC Platform
Use either of the following commands to boot a SPARC system with both kmdb and an alternate kernel:
boot kmdb -D kernel.test/sparcv9/unix
boot kernel.test/sparcv9/unix -k
Booting kmdb With an Alternate Kernel on the x86 Platform
Use either of the following commands to boot an x86 system with both kmdb and an alternate kernel:
b kmdb -D kernel.test/unix
b kernel.test/unix -k
Setting Breakpoints in kmdb
Use the bp command to set a breakpoint, as shown in the following example.
[0]> myModule`myBreakpointLocation::bp
If the target module has not been loaded, then an error message that indicates this condition is displayed, and the breakpoint is not created. In this case you can use a deferred breakpoint. A deferred breakpoint activates automatically when the specified module is loaded. Set a deferred breakpoint by specifying the target location after the bp command. The following example demonstrates a deferred breakpoint.
[0]> ::bp myModule`myBreakpointLocation
For more information on using breakpoints, see the Modular Debugger Guide. You can also get help by typing either of the following two lines:
> ::help bp
> ::bp dcmd
kmdb Macros for Driver Developers
The kmdb(1) debugger supports macros that can be used to display kernel data structures. Use $M to display kmdb macros. Macros are used in the form:
[ address ] $<macroname
Neither the information displayed by these macros nor the format in which the information is displayed constitutes an interface. Therefore, the information and format can change at any time.
The kmdb macros in the following table are particularly useful to developers of device drivers. For convenience, legacy macro names are shown where applicable.
Dcmd | Legacy Macro | Description |
---|---|---|
 |  | Print a summary of a device node |
 | devinfo.parent | Walk the ancestors of a device node |
 | devinfo.sibling | Walk the siblings of a device node |
 | devinfo.minor | Print the minor nodes that correspond to the given device node |
 |  | Print the name of a device that is bound to a given device node. |
 |  | Print the device nodes that are bound to a given device node or major number. |
The ::devinfo dcmd displays a node state that can have one of the following values:
- DS_ATTACHED – The driver's attach(9E) routine returned successfully.
- DS_BOUND – The node is bound to a driver, but the driver's probe(9E) routine has not yet been called.
- DS_INITIALIZED – The parent nexus has assigned a bus address for the driver. The implementation-specific initializations have been completed. The driver's probe(9E) routine has not yet been called at this point.
- DS_LINKED – The device node has been linked into the kernel's device tree, but the system has not yet found a driver for this node.
- DS_PROBED – The driver's probe(9E) routine returned successfully.
- DS_READY – The device is fully configured.
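The node states above are listed alphabetically, but their descriptions imply a progression from a newly linked node to a fully configured device. The following Python sketch arranges the states in that assumed lifecycle order so that a state read from ::devinfo output can be compared against a milestone. The class name `NodeState`, the numeric values, and the helper functions are illustrative assumptions. They are not the kernel's definitions and should not be relied on as an interface.

```python
from enum import IntEnum


class NodeState(IntEnum):
    # Assumed lifecycle order, inferred from the state descriptions.
    # The numbers are placeholders chosen only to make comparisons work.
    DS_LINKED = 1
    DS_BOUND = 2
    DS_INITIALIZED = 3
    DS_PROBED = 4
    DS_ATTACHED = 5
    DS_READY = 6


def probe_has_succeeded(state):
    """True once the driver's probe(9E) routine has returned successfully."""
    return state >= NodeState.DS_PROBED


def attach_has_succeeded(state):
    """True once the driver's attach(9E) routine has returned successfully."""
    return state >= NodeState.DS_ATTACHED


# Interpret a state name as it might be read from ::devinfo output.
state = NodeState["DS_INITIALIZED"]
print(state.name, probe_has_succeeded(state), attach_has_succeeded(state))
```

A node that is stuck in DS_INITIALIZED, for example, has a driver bound to it but has not yet passed probe(9E), which narrows down where to look in the driver.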
22.2.3. Using the mdb Modular Debugger
The mdb(1) modular debugger can be applied to the following types of targets:
- Live operating system components
- Operating system crash dumps
- User processes
- User process core dumps
- Object files
The mdb debugger provides sophisticated debugging support for analyzing kernel problems. This section provides an overview of mdb features. For a complete discussion of mdb, refer to the Modular Debugger Guide.
Although mdb can be used to alter live kernel state, mdb lacks the kernel execution control that is provided by kmdb. As a result, kmdb is preferred for runtime debugging. The mdb debugger is used more for static situations.
The prompt for mdb is >.
Getting Started With the Modular Debugger
The mdb debugger provides an extensive programming API for implementing debugger modules so that driver developers can implement custom debugging support. The mdb debugger also provides many usability features, such as command-line editing, command history, an output pager, and online help.
The adb macros should no longer be used. That functionality has largely been superseded by the dcmds in mdb.
The mdb debugger provides a rich set of modules and dcmds. With these tools, you can debug the illumos kernel, any associated modules, and device drivers. These facilities enable you to perform tasks such as:
- Formulate complex debugging queries
- Locate all the memory allocated by a particular thread
- Print a visual picture of a kernel STREAM
- Determine what type of structure a particular address refers to
- Locate leaked memory blocks in the kernel
- Analyze memory to locate stack traces
- Assemble dcmds into modules called dmods for creating customized operations
To get started, switch to the crash directory and type mdb, specifying a system crash dump, as illustrated in the following example.
% cd /var/crash/testsystem
% ls
bounds unix.0 vmcore.0
% mdb unix.0 vmcore.0
Loading modules: [ unix krtld genunix ufs_log ip usba s1394 cpc nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.10 Generic (sun4u)
panic message: zero
dump content: kernel pages only
When mdb responds with the > prompt, you can run commands.
To examine the running kernel on a live system, run mdb from the system prompt as follows.
# mdb -k
Loading modules: [ unix krtld genunix ufs_log ip usba s1394 ptm cpc ipc nfs ]
> ::status
debugging live kernel (64-bit) on testsystem
operating system: 5.10 Generic (sun4u)
22.2.4. Useful Debugging Tasks With kmdb and mdb
This section provides examples of useful debugging tasks. The tasks in this section can be performed with either mdb or kmdb unless specifically noted. This section assumes a basic knowledge of the use of kmdb and mdb. Note that the information presented here is dependent on the type of system used. A Sun Blade 100 workstation running the 64-bit kernel was used to produce these examples.
Because irreversible destruction of data can result from modifying data in kernel structures, you should exercise extreme caution. Do not modify or rely on data in structures that are not part of the illumos DDI. See the Intro(9S) man page for information on structures that are part of the illumos DDI.
Exploring System Registers With kmdb
The kmdb debugger can display machine registers as a group or individually. To display all registers as a group, use $r as shown in the following example.
[0]> $r
g0 0                                 l0 0
g1 100130a4 debug_enter              l1 edd00028
g2 10411c00 tsbmiss_area+0xe00       l2 10449c90
g3 10442000 ti_statetbl+0x1ba        l3 1b
g4 3000061a004                       l4 10474400 ecc_syndrome_tab+0x80
g5 0                                 l5 3b9aca00
g6 0                                 l6 0
g7 2a10001fd40                       l7 0
o0 0                                 i0 0
o1 c                                 i1 10449e50
o2 20                                i2 0
o3 300006b2d08                       i3 10
o4 0                                 i4 0
o5 0                                 i5 b0
sp 2a10001b451                       fp 2a10001b521
o7 1001311c debug_enter+0x78         i7 1034bb24 zsa_xsint+0x2c4
y 0
tstate: 1604 (ccr=0x0, asi=0x0, pstate=0x16, cwp=0x4)
pstate: ag:0 ie:1 priv:1 am:0 pef:1 mm:0 tle:0 cle:0 mg:0 ig:0
winreg: cur:4 other:0 clean:7 cansave:1 canrest:5 wstate:14
tba 0x10000000
pc edd000d8 edd000d8: ta %icc,%g0 + 125
npc edd000dc edd000dc: nop
The debugger exports each register value to a variable with the same name as the register. If you read the variable, the current value of the register is returned. If you write to the variable, the value of the associated machine register is changed. The following example reads the %eax register on an x86 machine, sets the register to 0, confirms the new value, and then restores the original value.
[0]> <eax=K
c1e6e0f0
[0]> 0>eax
[0]> <eax=K
0
[0]> c1e6e0f0>eax
If you need to inspect the registers of a different processor, you can use the ::cpuregs dcmd. The ID of the processor to be examined can be supplied as either the address to the dcmd or as the value of the -c option, as shown in the following example.
[0]> 0::cpuregs
%cs = 0x0158 %eax = 0xc1e6e0f0 kmdbmod`kaif_dvec
%ds = 0x0160 %ebx = 0x00000000
The following example switches from processor 0 to processor 3 on a SPARC machine. The %g3 register is inspected and then cleared. To confirm the new value, %g3 is read again.
[0]> 3::switch
[3]> <g3=K
24
[3]> 0>g3
[3]> <g3
0
Detecting Kernel Memory Leaks
The ::findleaks dcmd provides powerful, efficient detection of memory leaks in kernel crash dumps. The full set of kernel-memory debugging features must be enabled for ::findleaks to be effective. For more information, see Setting kmem_flags Debugging Flags. Run ::findleaks during driver development and testing to detect code that leaks memory, thus wasting kernel resources. See the Modular Debugger Guide for a complete discussion of ::findleaks.
Code that leaks kernel memory can render the system vulnerable to denial-of-service attacks.
Writing Debugger Commands With mdb
The mdb debugger provides a powerful API for implementing debugger facilities that you customize to debug your driver. The Modular Debugger Guide explains the programming API in detail.
The SUNWmdbdm package installs sample mdb source code in the directory /usr/demo/mdb. You can use mdb to automate lengthy debugging chores or help to validate that your driver is behaving properly. You can also package your mdb debugging modules with your driver product. With packaging, these facilities are available to service personnel at a customer site.
Obtaining Kernel Data Structure Information
The illumos kernel provides data type information in structures that can be inspected with either kmdb or mdb.
The kmdb and mdb dcmds can be used only with objects that contain compressed symbolic debugging information that has been designed for use with mdb. This information is currently available only for certain illumos kernel modules. The SUNWzlib package must be installed to process the symbolic debugging information.
The following example demonstrates how to display the data in the scsi_pkt structure.
> 7079ceb0::print -t 'struct scsi_pkt'
{
    opaque_t pkt_ha_private = 0x7079ce20
    struct scsi_address pkt_address = {
        struct scsi_hba_tran *a_hba_tran = 0x70175e68
        ushort_t a_target = 0x6
        uchar_t a_lun = 0
        uchar_t a_sublun = 0
    }
    opaque_t pkt_private = 0x708db4d0
    int (*)() *pkt_comp = sd_intr
    uint_t pkt_flags = 0
    int pkt_time = 0x78
    uchar_t *pkt_scbp = 0x7079ce74
    uchar_t *pkt_cdbp = 0x7079ce64
    ssize_t pkt_resid = 0
    uint_t pkt_state = 0x37
    uint_t pkt_statistics = 0
    uchar_t pkt_reason = 0
}
The size of a data structure can be useful in debugging. Use the ::sizeof dcmd to obtain the size of a structure, as shown in the following example.
> ::sizeof struct scsi_pkt
sizeof (struct scsi_pkt) = 0x58
The address of a specific member within a structure is also useful in debugging. Several methods are available for determining a member's address.
Use the ::offsetof dcmd to obtain the offset for a given member of a structure, as in the following example.
> ::offsetof struct scsi_pkt pkt_state
offsetof (struct pkt_state) = 0x48
Use the ::print dcmd with the -a option to display the addresses of all members of a structure, as in the following example.
> ::print -a struct scsi_pkt
{
    0 pkt_ha_private
    8 pkt_address {
        ...
    }
    18 pkt_private
    ...
}
If an address is specified with ::print in conjunction with the -a option, the absolute address for each member is displayed.
> 10000000::print -a struct scsi_pkt
{
    10000000 pkt_ha_private
    10000008 pkt_address {
        ...
    }
    10000018 pkt_private
    ...
}
The ::print, ::sizeof, and ::offsetof dcmds enable you to debug problems when your driver interacts with the illumos kernel.
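The quantities that these dcmds report correspond to what sizeof and offsetof compute in C, which can be handy for cross-checking. The following is a minimal sketch that uses a made-up structure, demo_pkt, rather than the real struct scsi_pkt, so the values it produces are not the ones shown in the mdb output. The names demo_addr, demo_pkt, and member_addr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structures for illustration; NOT the kernel's scsi_pkt. */
struct demo_addr {
    void    *a_tran;
    uint16_t a_target;
    uint8_t  a_lun;
    uint8_t  a_sublun;
};

struct demo_pkt {
    void            *pkt_ha_private;
    struct demo_addr pkt_address;
    void            *pkt_private;
    uint32_t         pkt_state;
};

/*
 * Absolute address of a member = base address of the structure plus the
 * member's offset. This is the arithmetic behind the per-member addresses
 * that are displayed when ::print -a is given a starting address.
 */
static uintptr_t
member_addr(uintptr_t base, size_t offset)
{
    assert(offset < sizeof (struct demo_pkt));
    return (base + offset);
}
```

For example, member_addr(0x10000000, offsetof(struct demo_pkt, pkt_address)) gives the address at which pkt_address begins for a demo_pkt located at 0x10000000.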
This facility provides access to raw kernel data structures. You can examine any structure whether or not that structure appears as part of the DDI. Therefore, you should refrain from relying on any data structure that is not explicitly part of the DDI.
These dcmds should be used only with objects that contain compressed symbolic debugging information that has been designed for use with mdb. Symbolic debugging information is currently available for certain illumos kernel modules only. The SUNWzlib (32-bit) or SUNWzlibx (64-bit) decompression software must be installed to process the symbolic debugging information. The kmdb debugger can process symbolic type data with or without the SUNWzlib or SUNWzlibx packages.
Obtaining Device Tree Information
The mdb debugger provides the ::prtconf dcmd for displaying the kernel device tree. The output of the ::prtconf dcmd is similar to the output of the prtconf(1M) command.
> ::prtconf
300015d3e08 SUNW,Sun-Blade-100
    300015d3c28 packages (driver not attached)
        300015d3868 SUNW,builtin-drivers (driver not attached)
        300015d3688 deblocker (driver not attached)
        300015d34a8 disk-label (driver not attached)
        300015d32c8 terminal-emulator (driver not attached)
        300015d30e8 obp-tftp (driver not attached)
        300015d2f08 dropins (driver not attached)
        300015d2d28 kbd-translator (driver not attached)
        300015d2b48 ufs-file-system (driver not attached)
    300015d3a48 chosen (driver not attached)
    300015d2968 openprom (driver not attached)
You can display an individual node by applying a dcmd such as ::devinfo to the node's address, as shown in the following example.
> 300015d3e08::devinfo
300015d3e08 SUNW,Sun-Blade-100
    System properties at 0x300015abdc0:
        name='relative-addressing' type=int items=1
            value=00000001
        name='MMU_PAGEOFFSET' type=int items=1
            value=00001fff
        name='MMU_PAGESIZE' type=int items=1
            value=00002000
        name='PAGESIZE' type=int items=1
            value=00002000
    Driver properties at 0x300015abe00:
        name='pm-hardware-state' type=string items=1
            value='no-suspend-resume'
Use ::prtconf to see where your driver has attached in the device tree, and to display device properties. You can also specify the verbose (-v) flag to ::prtconf to display the properties for each device node, as follows.
> ::prtconf -v
DEVINFO NAME
300015d3e08 SUNW,Sun-Blade-100
    System properties at 0x300015abdc0:
        name='relative-addressing' type=int items=1
            value=00000001
        name='MMU_PAGEOFFSET' type=int items=1
            value=00001fff
        name='MMU_PAGESIZE' type=int items=1
            value=00002000
        name='PAGESIZE' type=int items=1
            value=00002000
    Driver properties at 0x300015abe00:
        name='pm-hardware-state' type=string items=1
            value='no-suspend-resume'
...
        300015ce798 pci10b9,5229, instance #0
            Driver properties at 0x300015ab980:
                name='target2-dcd-options' type=any items=4
                    value=00.00.00.a4
                name='target1-dcd-options' type=any items=4
                    value=00.00.00.a2
                name='target0-dcd-options' type=any items=4
                    value=00.00.00.a4
Another way to locate instances of your driver is the ::devbindings dcmd. Given a driver name, the command displays a list of all instances of the named driver, as demonstrated in the following example.
> ::devbindings dad
300015ce3d8 ide-disk (driver not attached)
300015c9a60 dad, instance #0
    System properties at 0x300015ab400:
        name='lun' type=int items=1
            value=00000000
        name='target' type=int items=1
            value=00000000
        name='class_prop' type=string items=1
            value='ata'
        name='type' type=string items=1
            value='ata'
        name='class' type=string items=1
            value='dada'
...
300015c9880 dad, instance #1
    System properties at 0x300015ab080:
        name='lun' type=int items=1
            value=00000000
        name='target' type=int items=1
            value=00000002
        name='class_prop' type=string items=1
            value='ata'
        name='type' type=string items=1
            value='ata'
        name='class' type=string items=1
            value='dada'
Retrieving Driver Soft State Information
A common problem when debugging a driver is retrieving the soft state for a particular driver instance. The soft state is allocated with the ddi_soft_state_zalloc(9F) routine. The driver can obtain the soft state through ddi_get_soft_state(9F). The name of the soft state pointer is the first argument to ddi_soft_state_init(9F). With the name, you can use mdb to retrieve the soft state for a particular driver instance through the ::softstate dcmd:
> *bst_state::softstate 0x3
702b7578
In this case, ::softstate is used to fetch the soft state for instance 3 of the bst sample driver. This pointer references a bst_soft structure that is used by the driver to track state for this instance.
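To make the lookup concrete, the following userland sketch models the idea behind soft state: a table of per-instance pointers that is indexed by instance number. The demo_ names and the fixed table size are assumptions made for illustration. This is not the kernel's implementation of ddi_soft_state_zalloc(9F) or ddi_get_soft_state(9F).

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_INSTANCES 8   /* hypothetical limit for this sketch */

/* Hypothetical per-instance state, standing in for a driver's soft state. */
struct demo_soft {
    int instance;
    int open_count;
};

/* Hypothetical state handle: one slot per instance, NULL when unallocated. */
struct demo_state {
    struct demo_soft *slots[DEMO_MAX_INSTANCES];
};

/*
 * Allocate zeroed state for an instance and record its instance number.
 * Returns 0 on success, -1 if the instance is out of range, already
 * allocated, or memory is unavailable.
 */
static int
demo_soft_zalloc(struct demo_state *st, int instance)
{
    assert(st != NULL);
    if (instance < 0 || instance >= DEMO_MAX_INSTANCES ||
        st->slots[instance] != NULL)
        return (-1);
    st->slots[instance] = calloc(1, sizeof (struct demo_soft));
    if (st->slots[instance] == NULL)
        return (-1);
    st->slots[instance]->instance = instance;
    return (0);
}

/* Return the state for an instance, or NULL if none was allocated. */
static struct demo_soft *
demo_get_soft(const struct demo_state *st, int instance)
{
    assert(st != NULL);
    if (instance < 0 || instance >= DEMO_MAX_INSTANCES)
        return (NULL);
    return (st->slots[instance]);
}
```

In this model, demo_get_soft(&st, 3) plays the role that ::softstate 0x3 plays in the debugger: given the state handle and an instance number, it yields the pointer to that instance's structure.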
Modifying Kernel Variables
You can use both kmdb and mdb to modify kernel variables or other kernel state. Kernel state modification with mdb should be done with care, because mdb does not stop the kernel before making modifications. Groups of modifications can be made atomically by using kmdb, because kmdb stops the kernel before allowing access by the user. The mdb debugger is capable of making single atomic modifications only.
Use the write format that matches the size of the variable being modified. The write formats are:
-
w – Writes the lowest 2 bytes of the value of each expression to the target beginning at the location specified by dot
-
W – Writes the lowest 4 bytes of the value of each expression to the target beginning at the location specified by dot
-
Z – Writes the complete 8 bytes of the value of each expression to the target beginning at the location specified by dot
Use the ::sizeof dcmd to determine the size of the variable to be modified.
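The write formats differ only in how many low-order bytes of the expression value reach the target. The following userland sketch imitates that truncation. The helper demo_write and its width argument are hypothetical names used for illustration and are not part of mdb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Store the lowest `width` bytes (2, 4, or 8) of `value` at `target`,
 * leaving any bytes beyond `width` untouched. Hypothetical helper that
 * mimics the effect of a 2-byte, 4-byte, or 8-byte write.
 */
static void
demo_write(void *target, uint64_t value, size_t width)
{
    assert(target != NULL);
    assert(width == 2 || width == 4 || width == 8);
    if (width == 2) {
        uint16_t v = (uint16_t)value;   /* keep the low 16 bits only */
        memcpy(target, &v, sizeof (v));
    } else if (width == 4) {
        uint32_t v = (uint32_t)value;   /* keep the low 32 bits only */
        memcpy(target, &v, sizeof (v));
    } else {
        memcpy(target, &value, sizeof (value));
    }
}
```

A width that is larger than the variable overwrites whatever follows the variable in memory, which is why checking the size first matters.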
The following example overwrites the value of moddebug with the value 0x80000000.
> moddebug/W 0x80000000
moddebug: 0 = 0x80000000
22.3. Tuning Drivers
illumos provides kernel statistics structures so that you can implement counters for your driver. The DTrace facility enables you to analyze performance in real time. This section presents the following topics on device performance:
-
Kernel Statistics – illumos provides a set of data structures and functions for capturing performance statistics in the kernel. Kernel statistics (called kstats) enable your driver to export continuous statistics while the system is running. The kstat data is handled programmatically by using the kstat functions.
-
DTrace for Dynamic Instrumentation – DTrace enables you to add instrumentation to your driver dynamically so that you can perform tasks like analyzing the system and measuring performance. DTrace takes advantage of predefined kstat structures.
22.3.1. Kernel Statistics
To assist in performance tuning, the illumos kernel provides the kstat(3KSTAT) facility. The kstat facility provides a set of functions and data structures for device drivers and other kernel modules to export module-specific kernel statistics.
A kstat is a data structure for recording quantifiable aspects of a device's usage. Kstats are kept in a null-terminated linked list. Each kstat has a common header section and a type-specific data section. The header section is defined by the kstat_t structure.
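Because the kstats form a null-terminated linked list, locating one amounts to walking the chain until the module, instance, and name all match. The following sketch shows that walk in portable C. The demo_kstat structure is a simplified stand-in that contains only the fields the sketch needs, and DEMO_STRLEN and demo_lookup are hypothetical names. It is not the kstat_t definition or a libkstat(3LIB) function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STRLEN 31   /* hypothetical; stands in for KSTAT_STRLEN */

/* Simplified stand-in for a kstat header. */
struct demo_kstat {
    struct demo_kstat *ks_next;               /* next kstat, or NULL */
    char               ks_module[DEMO_STRLEN];
    int                ks_instance;
    char               ks_name[DEMO_STRLEN];
};

/* Walk the NULL-terminated chain and return the first matching kstat. */
static const struct demo_kstat *
demo_lookup(const struct demo_kstat *head, const char *module,
    int instance, const char *name)
{
    const struct demo_kstat *ksp;

    assert(module != NULL && name != NULL);
    for (ksp = head; ksp != NULL; ksp = ksp->ks_next) {
        if (ksp->ks_instance == instance &&
            strcmp(ksp->ks_module, module) == 0 &&
            strcmp(ksp->ks_name, name) == 0)
            return (ksp);
    }
    return (NULL);   /* reached the end of the chain without a match */
}
```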
The article “Using kstat From Within a Program in the Solaris OS” on the Sun Developer Network at http://developers.sun.com/solaris/articles/kstat_api.html provides two practical examples on how to use the kstat(3KSTAT) and libkstat(3LIB) APIs to extract metrics from illumos. The examples include “Walking Through All the kstat” and “Getting NIC kstat Output Using the Java Platform.”
Kernel Statistics Structure Members
The members of a kstat structure are:
ks_class[KSTAT_STRLEN]
-
Categorizes the kstat type as bus, controller, device_error, disk, hat, kmem_cache, kstat, misc, net, nfs, pages, partition, rps, ufs, vm, or vmem.
ks_crtime
-
Time at which the kstat was created. ks_crtime is commonly used in calculating rates of various counters.
ks_data
-
Points to the data section for the kstat.
ks_data_size
-
Total size of the data section in bytes.
ks_instance
-
The instance of the kernel module that created this kstat. ks_instance is combined with ks_module and ks_name to give the kstat a unique, meaningful name.
ks_kid
-
Unique ID for the kstat.
ks_module[KSTAT_STRLEN]
-
Identifies the kernel module that created this kstat. ks_module is combined with ks_instance and ks_name to give the kstat a unique, meaningful name. KSTAT_STRLEN sets the maximum length of ks_module.
ks_name[KSTAT_STRLEN]
-
A name assigned to the kstat in combination with ks_module and ks_instance. KSTAT_STRLEN sets the maximum length of ks_name.
ks_ndata
-
Indicates the number of data records for those kstat types that support multiple records: KSTAT_TYPE_RAW, KSTAT_TYPE_NAMED, and KSTAT_TYPE_TIMER.
ks_next
-
Points to next kstat in the chain.
ks_resv
-
A reserved field.
ks_snaptime
-
The timestamp for the last data snapshot, useful in calculating rates.
ks_type
-
The data type, which can be KSTAT_TYPE_RAW for binary data, KSTAT_TYPE_NAMED for name/value pairs, KSTAT_TYPE_INTR for interrupt statistics, KSTAT_TYPE_IO for I/O statistics, or KSTAT_TYPE_TIMER for event timers.
Kernel Statistics Structures
The structures for the different kinds of kstats are:
- kstat(9S)
-
Each kernel statistic (kstat) that is exported by device drivers consists of a header section and a data section. The kstat(9S) structure is the header portion of the statistic.
- kstat_intr(9S)
-
Structure for interrupt kstats. The types of interrupts are:
-
Hard interrupt – Sourced from the hardware device itself
-
Soft interrupt – Induced by the system through the use of some system interrupt source
-
Watchdog interrupt – Induced by a periodic timer call
-
Spurious interrupt – An interrupt entry point was entered but there was no interrupt to service
-
Multiple service – An interrupt was detected and serviced just prior to returning from any of the other types
Drivers generally report only claimed hard interrupts and soft interrupts from their handlers, but measurement of the spurious class of interrupts is useful for auto-vectored devices to locate any interrupt latency problems in a particular system configuration. Devices that have more than one interrupt of the same type should use multiple structures.
- kstat_io(9S)
-
Structure for I/O kstats.
- kstat_named(9S)
-
Structure for named kstats. A named kstat is an array of name-value pairs. These pairs are kept in the kstat_named structure.
Kernel Statistics Functions
The functions for using kstats are:
- kstat_create(9F)
-
Allocate and initialize a kstat(9S) structure.
- kstat_delete(9F)
-
Remove a kstat from the system.
- kstat_install(9F)
-
Add a fully initialized kstat to the system.
- kstat_named_init(9F), kstat_named_setstr(9F)
-
Initialize a named kstat. kstat_named_setstr associates str, a string, with the named kstat pointer.
- kstat_queue(9F)
-
A large number of I/O subsystems have at least two basic queues of transactions to be managed. One queue is for transactions that have been accepted for processing but for which processing has yet to begin. The other queue is for transactions that are actively being processed but not yet done. For this reason, two cumulative time statistics are kept: wait time and run time. Wait time is prior to service. Run time is during the service. The kstat_queue family of functions manages these times based on the transitions between the driver wait queue and run queue: kstat_waitq_enter(9F), kstat_waitq_exit(9F), kstat_waitq_to_runq(9F), kstat_runq_enter(9F), kstat_runq_exit(9F), and kstat_runq_back_to_waitq(9F).
Kernel Statistics for illumos Ethernet Drivers
The kstat interface described in the following table is an effective way to obtain Ethernet physical layer statistics from the driver. Ethernet drivers should export these statistics to guide users in better diagnosis and repair of Ethernet physical layer problems. With the exception of link_up, all statistics have a default value of 0 when not present. The value of the link_up statistic should be assumed to be 1.
The following example displays all of the shared link setup statistics. In this case, mii is used to filter the statistics.
kstat ce:0:mii:link_*
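A consumer of these statistics has to apply the default values when a driver does not export a given statistic. The following sketch shows one way to express that rule over an array of name-value pairs. The demo_named structure and demo_stat_or_default function are hypothetical and do not use the libkstat(3LIB) API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name/value pair, standing in for one named kstat entry. */
struct demo_named {
    const char *name;
    uint32_t    value;
};

/*
 * Look up a statistic by name. When the statistic is absent, apply the
 * defaults stated in the text: 1 for link_up, 0 for everything else.
 */
static uint32_t
demo_stat_or_default(const struct demo_named *stats, size_t nstats,
    const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < nstats; i++) {
        if (strcmp(stats[i].name, name) == 0)
            return (stats[i].value);
    }
    return (strcmp(name, "link_up") == 0 ? 1 : 0);
}
```

With this rule, a driver that exports no link_up statistic is treated as having its link up, and any other statistic that is missing is treated as 0.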
| Kstat Variable | Type | Description |
|---|---|---|
| | | Provides the MII address of the transceiver that is currently in use. |
| | | Provides the specific vendor ID or device ID of the transceiver that is currently in use. |
| | | Indicates the type of transceiver that is currently in use. The IEEE This set is smaller than the set specified by |
| | | Indicates the device is 1 Gbits/s full duplex capable. |
| | | Indicates the device is 1 Gbits/s half duplex capable. |
| | | Indicates the device is 100 Mbits/s full duplex capable. |
| | | Indicates the device is 100 Mbits/s half duplex capable. |
| | | Indicates the device is 10 Mbits/s full duplex capable. |
| | | Indicates the device is 10 Mbits/s half duplex capable. |
| | | Indicates the device is capable of asymmetric pause Ethernet flow control. |
| | | Indicates the device is capable of symmetric pause Ethernet flow control when |
| | | Indicates the device is capable of remote fault indication. |
| | | Indicates the device is capable of auto-negotiation. |
| | | Indicates the device is advertising 1 Gbits/s full duplex capability. |
| | | Indicates the device is advertising 1 Gbits/s half duplex capability. |
| | | Indicates the device is advertising 100 Mbits/s full duplex capability. |
| | | Indicates the device is advertising 100 Mbits/s half duplex capability. |
| | | Indicates the device is advertising 10 Mbits/s full duplex capability. |
| | | Indicates the device is advertising 10 Mbits/s half duplex capability. |
| | | Indicates the device is advertising the capability of asymmetric pause Ethernet flow control. |
| | | Indicates the device is advertising the capability of symmetric pause Ethernet flow control when |
| | | Indicates the device is experiencing a fault that it is going to forward to the link partner. |
| | | Indicates the device is advertising the capability of auto-negotiation. |
| | | Indicates the link partner device is 1 Gbits/s full duplex capable. |
| | | Indicates the link partner device is 1 Gbits/s half duplex capable. |
| | | Indicates the link partner device is 100 Mbits/s full duplex capable. |
| | | Indicates the link partner device is 100 Mbits/s half duplex capable. |
| | | Indicates the link partner device is 10 Mbits/s full duplex capable. |
| | | Indicates the link partner device is 10 Mbits/s half duplex capable. |
| | | Indicates the link partner device is capable of asymmetric pause Ethernet flow control. |
| | | Indicates the link partner device is capable of symmetric pause Ethernet flow control when |
| | | Indicates the link partner is experiencing a fault with the link. |
| | | Indicates the link partner device is capable of auto-negotiation. |
| | | Indicates the link is operating with asymmetric pause Ethernet flow control. |
| | | Indicates the resolution of the pause capability. Indicates the link is operating with symmetric pause Ethernet flow control when |
| | | Indicates the link duplex. |
| | | Indicates whether the link is up or down. |
22.3.2. DTrace for Dynamic Instrumentation
DTrace is a comprehensive dynamic tracing facility for examining the behavior of both user programs and the operating system itself. With DTrace, you can collect data at strategic locations in your environment, referred to as probes. DTrace enables you to record such data as stack traces, timestamps, the arguments to a function, or simply counts of how often the probe fires. Because DTrace enables you to insert probes dynamically, you do not need to recompile your code. For more information on DTrace, see the Dynamic Tracing Guide and the DTrace User Guide.