audio - generic audio device interface


#include <sys/audio.h>


An audio device is used to play and/or record a stream of audio data.
Since a specific audio device may not support all functionality described
below, refer to the device-specific manual pages for a complete description
of each hardware device. An application can use the AUDIO_GETDEV ioctl(2)
to determine the current audio hardware associated with /dev/audio.

The audio framework provides a software mixing engine (audio mixer) for all
audio devices, allowing more than one process to play or record audio at
the same time.

Backward Compatibility

It is no longer possible to disable the mixing function. Applications must
not assume that they have exclusive access to the audio device.

Multi-Stream Codecs
The audio mixer supports multi-stream Codecs. These devices have DSP
engines that provide sample rate conversion, hardware mixing, and other
features. The use of such hardware features is opaque to applications.


Digital audio data represents a quantized approximation of an analog audio
signal waveform. In the simplest case, these quantized numbers represent
the amplitude of the input waveform at particular sampling intervals. To
achieve the best approximation of an input signal, the highest possible
sampling frequency and precision should be used. However, increased
accuracy comes at a cost of increased data storage requirements. For
instance, one minute of monaural audio recorded in <mu>-Law format
(pronounced mew-law) at 8 KHz requires nearly 0.5 megabytes of storage,
while the standard Compact Disc audio format (stereo 16-bit linear PCM data
sampled at 44.1 KHz) requires approximately 10 megabytes per minute.

Audio data may be represented in several different formats. An audio
device's current audio data format can be determined by using the
AUDIO_GETINFO ioctl(2) described below.

An audio data format is characterized in the audio driver by four
parameters: Sample Rate, Encoding, Precision, and Channels. Refer to the
device-specific manual pages for a list of the audio formats that each
device supports. In addition to the formats that the audio device supports
directly, other formats provide higher data compression. Applications may
convert audio data to and from these formats when playing or recording.

Sample Rate

Sample rate is a number that represents the sampling frequency (in samples
per second) of the audio data.

The audio mixer always configures the hardware for the highest possible
sample rate for both play and record. This ensures that none of the audio
streams require compute-intensive low pass filtering. The result is that
high sample rate audio streams are not degraded by filtering.

Sample rate conversion can be a compute-intensive operation, depending on
the number of channels and a device's sample rate. For example, an 8KHz
signal can be easily converted to 48KHz, requiring a low cost up sampling
by 6. However, converting from 44.1KHz to 48KHz is compute intensive
because it must be up sampled by 160 and then down sampled by 147. This is
only done using integer multipliers.

Applications can greatly reduce the impact of sample rate conversion by
carefully picking the sample rate. Applications should always use the
highest sample rate the device supports. An application can also do its
own sample rate conversion (to take advantage of floating point and
accelerated instruction or use small integers for up and down sampling.

All modern audio devices run at 48 kHz or a multiple thereof, hence just
using 48 kHz may be a reasonable compromise if the application is not
prepared to select higher sample rates.


An encoding parameter specifies the audio data representation. <mu>-Law
encoding corresponds to CCITT G.711, and is the standard for voice data
used by telephone companies in the United States, Canada, and Japan. A-Law
encoding is also part of CCITT G.711 and is the standard encoding for
telephony elsewhere in the world. A-Law and <mu>-Law audio data are
sampled at a rate of 8000 samples per second with 12-bit precision, with
the data compressed to 8-bit samples. The resulting audio data quality is
equivalent to that of standard analog telephone service.

Linear Pulse Code Modulation (PCM) is an uncompressed, signed audio format
in which sample values are directly proportional to audio signal voltages.
Each sample is a 2's complement number that represents a positive or
negative amplitude.


Precision indicates the number of bits used to store each audio sample.
For instance, <mu>-Law and A-Law data are stored with 8-bit precision. PCM
data may be stored at various precisions, though 16-bit is the most common.


Multiple channels of audio may be interleaved at sample boundaries. A
sample frame consists of a single sample from each active channel. For
example, a sample frame of stereo 16-bit PCM data consists of two 16-bit
samples, corresponding to the left and right channel data.

The audio mixer sets the hardware to the maximum number of channels
supported. If a mono signal is played or recorded, it is mixed on the
first two (usually the left and right) channels only. Silence is mixed on
the remaining channels

Supported Formats

The audio mixer supports the following audio formats:

Encoding Precision Channels
Signed Linear PCM 32-bit Mono or Stereo
Signed Linear PCM 16-bit Mono or Stereo
Signed Linear PCM 8-bit Mono or Stereo
<mu>-Law 8-bit Mono or Stereo
A-Law 8-bit Mono or Stereo

The audio mixer converts all audio streams to 24-bit Linear PCM before
mixing. After mixing, conversion is made to the best possible Codec
format. The conversion process is not compute intensive and audio
applications can choose the encoding format that best meets their needs.

Note that the mixer discards the low order 8 bits of 32-bit Signed Linear
PCM in order to perform mixing. (This is done to allow for possible
overflows to fit into 32-bits when mixing multiple streams together.)
Hence, the maximum effective precision is 24-bits.


The device /dev/audio is a device driver that dispatches audio requests to
the appropriate underlying audio hardware. The audio driver is implemented
as a STREAMS driver. In order to record audio input, applications open(2)
the /dev/audio device and read data from it using the read(2) system call.
Similarly, sound data is queued to the audio output port by using the
write(2) system call. Device configuration is performed using the ioctl(2)

Because some systems may contain more than one audio device, application
writers are encouraged to query the AUDIODEV environment variable. If this
variable is present in the environment, its value should identify the path
name of the default audio device.

Opening the Audio Device

The audio device is not treated as an exclusive resource. Each process may
open the audio device once.

Each open(2) completes as long as there are channels available to be
allocated. If no channels are available to be allocated:

+o if either the O_NDELAY or O_NONBLOCK flags are set in the open(2)
oflag argument, then -1 is immediately returned, with errno set

+o if neither the O_NDELAY nor the O_NONBLOCK flag are set, then
open(2) hangs until the device is available or a signal is
delivered to the process, in which case a -1 is returned with
errno set to EINTR.

Upon the initial open(2) of the audio channel, the audio mixer sets the
data format of the audio channel to the default state of 8-bit, 8Khz, mono
<mu>-Law data.

If the audio device does not support this configuration, it informs the
audio mixer of the initial configuration. Audio applications should
explicitly set the encoding characteristics to match the audio data
requirements, and not depend on the default configuration.

Recording Audio Data

The read(2) system call copies data from the system's buffers to the
application. Ordinarily, read(2) blocks until the user buffer is filled.
The I_NREAD ioctl (see streamio(4I)) may be used to determine the amount of
data that may be read without blocking. The device may alternatively be
set to a non-blocking mode, in which case read(2) completes immediately,
but may return fewer bytes than requested. Refer to the read(2) manual
page for a complete description of this behavior.

When the audio device is opened with read access, the device driver
immediately starts buffering audio input data. Since this consumes system
resources, processes that do not record audio data should open the device
write-only (O_WRONLY).

The transfer of input data to STREAMS buffers may be paused (or resumed) by
using the AUDIO_SETINFO ioctl to set (or clear) the record.pause flag in
the audio information structure (see below). All unread input data in the
STREAMS queue may be discarded by using the I_FLUSH STREAMS ioctl. See
streamio(4I). When changing record parameters, the input stream should be
paused and flushed before the change, and resumed afterward. Otherwise,
subsequent reads may return samples in the old format followed by samples
in the new format. This is particularly important when new parameters
result in a changed sample size.

Input data can accumulate in STREAMS buffers very quickly. At a minimum,
it will accumulate at 8000 bytes per second for 8-bit, 8 KHz, mono,
<mu>-Law data. If the device is configured for 16-bit linear or higher
sample rates, it will accumulate even faster. If the application that
consumes the data cannot keep up with this data rate, the STREAMS queue may
become full. When this occurs, the record.error flag is set in the audio
information structure and input sampling ceases until there is room in the
input queue for additional data. In such cases, the input data stream
contains a discontinuity. For this reason, audio recording applications
should open the audio device when they are prepared to begin reading data,
rather than at the start of extensive initialization.

Playing Audio Data

The write(2) system call copies data from an application's buffer to the
STREAMS output queue. Ordinarily, write(2) blocks until the entire user
buffer is transferred. The device may alternatively be set to a non-
blocking mode, in which case write(2) completes immediately, but may have
transferred fewer bytes than requested. See write(2).

Although write(2) returns when the data is successfully queued, the actual
completion of audio output may take considerably longer. The AUDIO_DRAIN
ioctl may be issued to allow an application to block until all of the
queued output data has been played. Alternatively, a process may request
asynchronous notification of output completion by writing a zero-length
buffer (end-of-file record) to the output stream. When such a buffer has
been processed, the play.eof flag in the audio information structure is

The final close(2) of the file descriptor hangs until all of the audio
output has drained. If a signal interrupts the close(2), or if the process
exits without closing the device, any remaining data queued for audio
output is flushed and the device is closed immediately.

The consumption of output data may be paused (or resumed) by using the
AUDIO_SETINFO ioctl to set (or clear) the play.pause flag in the audio
information structure. Queued output data may be discarded by using the
I_FLUSH STREAMS ioctl. (See streamio(4I)).

Output data is played from the STREAMS buffers at a default rate of at
least 8000 bytes per second for <mu>-Law, A-Law or 8-bit PCM data (faster
for 16-bit linear data or higher sampling rates). If the output queue
becomes empty, the play.error flag is set in the audio information
structure and output is stopped until additional data is written. If an
application attempts to write a number of bytes that is not a multiple of
the current sample frame size, an error is generated and the bad data is
thrown away. Additional writes are allowed.

Asynchronous I/O
The I_SETSIG STREAMS ioctl enables asynchronous notification, through the
SIGPOLL signal, of input and output ready condition changes. The
O_NONBLOCK flag may be set using the F_SETFL fcntl(2) to enable non-
blocking read(2) and write(2) requests. This is normally sufficient for
applications to maintain an audio stream in the background.

Audio Control Pseudo-Device
It is sometimes convenient to have an application, such as a volume control
panel, modify certain characteristics of the audio device while it is being
used by an unrelated process.

The /dev/audioctl pseudo-device is provided for this purpose. Any number
of processes may open /dev/audioctl simultaneously. However, read(2) and
write(2) system calls are ignored by /dev/audioctl. The AUDIO_GETINFO and
AUDIO_SETINFO ioctl commands may be issued to /dev/audioctl to determine
the status or alter the behavior of /dev/audio. Note: In general, the
audio control device name is constructed by appending the letters "ctl" to
the path name of the audio device.

Audio Status Change Notification

Applications that open the audio control pseudo-device may request
asynchronous notification of changes in the state of the audio device by
setting the S_MSG flag in an I_SETSIG STREAMS ioctl. Such processes
receive a SIGPOLL signal when any of the following events occur:

+o An AUDIO_SETINFO ioctl has altered the device state.

+o An input overflow or output underflow has occurred.

+o An end-of-file record (zero-length buffer) has been processed on

+o An open(2) or close(2) of /dev/audio has altered the device

+o An external event (such as speakerbox's volume control) has
altered the device state.


Audio Information Structure

The state of the audio device may be polled or modified using the
AUDIO_GETINFO and AUDIO_SETINFO ioctl commands. These commands operate on
the audio_info structure as defined, in <sys/audio.h>, as follows:

* This structure contains state information for audio device
* IO streams

struct audio_prinfo {
* The following values describe the
* audio data encoding
uint_t sample_rate; /* samples per second */
uint_t channels; /* number of interleaved channels */
uint_t precision; /* number of bits per sample */
uint_t encoding; /* data encoding method */

* The following values control audio device
* configuration
uint_t gain; /* volume level */
uint_t port; /* selected I/O port */
uint_t buffer_size; /* I/O buffer size */

* The following values describe the current device
* state
uint_t samples; /* number of samples converted */
uint_t eof; /* End Of File counter (play only) */
uchar_t pause; /* non-zero if paused, zero to resume */
uchar_t error; /* non-zero if overflow/underflow */
uchar_t waiting; /* non-zero if a process wants access */
uchar_t balance; /* stereo channel balance */

* The following values are read-only device state
* information
uchar_t open; /* non-zero if open access granted */
uchar_t active; /* non-zero if I/O active */
uint_t avail_ports; /* available I/O ports */
uint_t mod_ports; /* modifiable I/O ports */
typedef struct audio_prinfo audio_prinfo_t;

* This structure is used in AUDIO_GETINFO and AUDIO_SETINFO ioctl
* commands
struct audio_info {
audio_prinfo_t record;/* input status info */
audio_prinfo_t play; /* output status info */
uint_t monitor_gain; /* input to output mix */
uchar_toutput_muted; /* non-zero if output muted */
uint_t hw_features; /* supported H/W features */
uint_t sw_features; /* supported S/W features */
uint_t sw_features_enabled;
/* supported S/W features enabled */
typedef struct audio_info audio_info_t;

/* Audio encoding types */
#define AUDIO_ENCODING_ULAW (1) /* u-Law encoding */
#define AUDIO_ENCODING_ALAW (2) /* A-Law encoding */
#define AUDIO_ENCODING_LINEAR (3) /* Signed Linear PCM encoding */

* These ranges apply to record, play, and
* monitor gain values
#define AUDIO_MIN_GAIN (0)/* minimum gain value */
#define AUDIO_MAX_GAIN (255) /* maximum gain value */

* These values apply to the balance field to adjust channel
* gain values
#define AUDIO_LEFT_BALANCE (0) /* left channel only */
#define AUDIO_MID_BALANCE (32) /* equal left/right balance */
#define AUDIO_RIGHT_BALANCE (64) /* right channel only */

* Define some convenient audio port names
* (for port, avail_ports and mod_ports)

/* output ports (several might be enabled at once) */
#define AUDIO_SPEAKER (0x01) /* built-in speaker */
#define AUDIO_HEADPHONE (0x02) /* headphone jack */
#define AUDIO_LINE_OUT (0x04) /* line out */
#define AUDIO_SPDIF_OUT (0x08) /* SPDIF port */
#define AUDIO_AUX1_OUT (0x10) /* aux1 out */
#define AUDIO_AUX2_OUT (0x20) /* aux2 out */

* input ports (usually only one may be enabled at a time)
#define AUDIO_MICROPHONE (0x01) /* microphone */
#define AUDIO_LINE_IN (0x02) /* line in */
#define AUDIO_CD (0x04) /* on-board CD inputs */
#define AUDIO_SPDIF_IN (0x08) /* SPDIF port */
#define AUDIO_AUX1_IN (0x10) /* aux1 in */
#define AUDIO_AUX2_IN (0x20) /* aux2 in */
#define AUDIO_CODEC_LOOPB_IN (0x40) /* Codec inter. loopback */

/* These defines are for hardware features */
#define AUDIO_HWFEATURE_DUPLEX (0x00000001u)
/* simult. play & cap. supported */

#define AUDIO_HWFEATURE_MSCODEC (0x00000002u)
/* multi-stream Codec */

/* These defines are for software features *
#define AUDIO_SWFEATURE_MIXER (0x00000001u)
/* audio mixer audio pers. mod. */

* Parameter for the AUDIO_GETDEV ioctl
* to determine current audio devices
#define MAX_AUDIO_DEV_LEN (16)
struct audio_device {
char name[MAX_AUDIO_DEV_LEN];
char version[MAX_AUDIO_DEV_LEN];
char config[MAX_AUDIO_DEV_LEN];
typedef struct audio_device audio_device_t;

The play.gain and record.gain fields specify the output and input volume
levels. A value of AUDIO_MAX_GAIN indicates maximum volume. Audio output
may also be temporarily muted by setting a non-zero value in the
output_muted field. Clearing this field restores audio output to the
normal state.

The monitor_gain field is present for compatibility, and is no longer
supported. See dsp(4I) for more detail.

Likewise, the play.port, play.ports, play.mod_ports, record.port,
record.ports, and record.mod_ports are no longer supported. See dsp(4I)
for more detail.

The play.balance and record.balance fields are fixed to AUDIO_MID_BALANCE.
Changes to volume levels for different channels can be made using the
interfaces in dsp(4I).

The play.pause and record.pause flags may be used to pause and resume the
transfer of data between the audio device and the STREAMS buffers. The
play.error and record.error flags indicate that data underflow or overflow
has occurred. The and flags indicate that data
transfer is currently active in the corresponding direction.

The and flags indicate that the device is currently
open with the corresponding access permission. The play.waiting and
record.waiting flags provide an indication that a process may be waiting to
access the device. These flags are set automatically when a process blocks
on open(2), though they may also be set using the AUDIO_SETINFO ioctl
command. They are cleared only when a process relinquishes access by
closing the device.

The play.samples and record.samples fields are zeroed at open(2) and are
incremented each time a data sample is copied to or from the associated
STREAMS queue. Some audio drivers may be limited to counting buffers of
samples, instead of single samples for their samples accounting. For this
reason, applications should not assume that the samples fields contain a
perfectly accurate count. The play.eof field increments whenever a zero-
length output buffer is synchronously processed. Applications may use this
field to detect the completion of particular segments of audio output.

The record.buffer_size field controls the amount of input data that is
buffered in the device driver during record operations. Applications that
have particular requirements for low latency should set the value
appropriately. Note however that smaller input buffer sizes may result in
higher system overhead. The value of this field is specified in bytes and
drivers will constrain it to be a multiple of the current sample frame
size. Some drivers may place other requirements on the value of this
field. Refer to the audio device-specific manual page for more details.
If an application changes the format of the audio device and does not
modify the record.buffer_size field, the device driver may use a default
value to compensate for the new data rate. Therefore, if an application is
going to modify this field, it should modify it during or after the format
change itself, not before. When changing the record.buffer_size
parameters, the input stream should be paused and flushed before the
change, and resumed afterward. Otherwise, subsequent reads may return
samples in the old format followed by samples in the new format. This is
particularly important when new parameters result in a changed sample size.
If you change the record.buffer_size for the first packet, this protocol
must be followed or the first buffer will be the default buffer size for
the device, followed by packets of the requested change size.

The record.buffer_size field may be modified only on the /dev/audio device
by processes that have it opened for reading.

The play.buffer_size field is currently not supported.

The audio data format is indicated by the sample_rate, channels, precision,
and encoding fields. The values of these fields correspond to the
descriptions in the AUDIO FORMATS section of this man page. Refer to the
audio device-specific manual pages for a list of supported data format

The data format fields can be modified only on the /dev/audio device.

If the parameter changes requested by an AUDIO_SETINFO ioctl cannot all be
accommodated, ioctl(2) returns with errno set to EINVAL and no changes are
made to the device state.

Streamio IOCTLS

All of the streamio(4I) ioctl(2) commands may be issued for the /dev/audio
device. Because the /dev/audioctl device has its own STREAMS queues, most
of these commands neither modify nor report the state of /dev/audio if
issued for the /dev/audioctl device. The I_SETSIG ioctl may be issued for
/dev/audioctl to enable the notification of audio status changes, as
described above.


The audio device additionally supports the following ioctl(2) commands:

AUDIO_DRAIN The argument is ignored. This command suspends the calling
process until the output STREAMS queue is empty and all
queued samples have been played, or until a signal is
delivered to the calling process. It may not be issued for
the /dev/audioctl device. An implicit AUDIO_DRAIN is
performed on the final close(2) of /dev/audio.

AUDIO_GETDEV The argument is a pointer to an audio_device_t structure.
This command may be issued for either /dev/audio or
/dev/audioctl. The returned value in the name field will be
a string that will identify the current /dev/audio hardware
device, the value in version will be a string indicating the
current version of the hardware, and config will be a
device-specific string identifying the properties of the
audio stream associated with that file descriptor. Refer to
the audio device-specific manual pages to determine the
actual strings returned by the device driver.

AUDIO_GETINFO The argument is a pointer to an audio_info_t structure.
This command may be issued for either /dev/audio or
/dev/audioctl. The current state of the /dev/audio device
is returned in the structure. Values return pertain to a
logical view of the device as seen by and private to the
process, and do not necessarily reflect the actual hardware
device itself.

AUDIO_SETINFO The argument is a pointer to an audio_info_t structure.
This command may be issued for either the /dev/audio or the
/dev/audioctl device with some restrictions. This command
configures the audio device according to the supplied
structure and overwrites the existing structure with the new
state of the device. Note: The play.samples,
record.samples, play.error, record.error, and play.eof
fields are modified to reflect the state of the device when
the AUDIO_SETINFO is issued. This allows programs to
automatically modify these fields while retrieving the
previous value.

As with AUDIO_SETINFO, the settings managed by this ioctl
deal with a logical view of the device which is private to
the process, and don't necessarily have any impact on the
hardware device itself.

Certain fields in the audio information structure, such as the pause flags,
are treated as read-only when /dev/audio is not open with the corresponding
access permission. Other fields, such as the gain levels and encoding
information, may have a restricted set of acceptable values. Applications
that attempt to modify such fields should check the returned values to be
sure that the corresponding change took effect. The sample_rate, channels,
precision, and encoding fields treated as read-only for /dev/audioctl, so
that applications can be guaranteed that the existing audio format will
stay in place until they relinquish the audio device. AUDIO_SETINFO will
return EINVAL when the desired configuration is not possible, or EBUSY when
another process has control of the audio device.

All of the logical device state is reset when the corresponding I/O stream
of /dev/audio is closed.

The audio_info_t structure may be initialized through the use of the
AUDIO_INITINFO macro. This macro sets all fields in the structure to
values that are ignored by the AUDIO_SETINFO command. For instance, the
following code switches the output port from the built-in speaker to the
headphone jack without modifying any other audio parameters:

audio_info_t info;
err = ioctl(audio_fd, AUDIO_SETINFO, );

This technique eliminates problems associated with using a sequence of


The physical audio device names are system dependent and are rarely used by
programmers. Programmers should use the following generic device names:

/dev/audio Symbolic link to the system's primary audio

/dev/audioctl Symbolic link to the control device for

/dev/sound/0 First audio device in the system

/dev/sound/0ctl Audio control device for /dev/sound/0

/usr/share/audio/samples Audio files


An open(2) call will fail if:

EBUSY The requested play or record access is busy and either the O_NDELAY
or O_NONBLOCK flag was set in the open(2) request.

EINTR The requested play or record access is busy and a signal interrupted
the open(2) request.

An ioctl(2) call will fail if:

EINVAL The parameter changes requested in the AUDIO_SETINFO ioctl are
invalid or are not supported by the device.




Obsolete Uncommitted


close(2), fcntl(2), ioctl(2), open(2), poll(2), read(2), write(2), dsp(4I),
streamio(4I), attributes(7)


Due to a feature of the STREAMS implementation, programs that are
terminated or exit without closing the audio device may hang for a short
period while audio output drains. In general, programs that produce audio
output should catch the SIGINT signal and flush the output stream before

OmniOS July 8, 2018 OmniOS