MHD(4I) 4I MHD(4I)

NAME


mhd - multihost disk control operations

SYNOPSIS


#include <sys/mhd.h>

DESCRIPTION


The mhd ioctl(2) requests control access rights of a multihost disk, using
disk reservations on the disk device.

The stability level of this interface (see attributes(7)) is evolving. As
a result, the interface is subject to change and you should limit your use
of it.

The mhd ioctls fall into two major categories: (1) ioctls for non-shared
multihost disks and (2) ioctls for shared multihost disks.

One ioctl, MHIOCENFAILFAST, is applicable to both non-shared and shared
multihost disks. It is described after the first two categories.

All the ioctls require root privilege.

For all of the ioctls, the caller should obtain the file descriptor for the
device by calling open(2) with the O_NDELAY flag; without the O_NDELAY
flag, the open may fail due to another host already having a conflicting
reservation on the device. Some of the ioctls below permit the caller to
forcibly clear a conflicting reservation held by another host; however, in
order to call the ioctl, the caller must first obtain the open file
descriptor.
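
The open step described above might look like the following minimal sketch.
The helper name open_mhd_device and the O_RDWR access mode are assumptions
made for illustration; the text only requires that O_NDELAY be passed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: open a multihost disk device for use with the mhd
 * ioctls. O_NDELAY lets the open succeed even when another host holds a
 * conflicting reservation. O_RDWR is an assumption; the text does not say
 * which access mode is needed.
 */
int
open_mhd_device(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_NDELAY);
	if (fd == -1)
		perror(path);	/* report why the open failed */
	return (fd);
}
```

The returned descriptor is the fd argument taken by every ioctl request
described on this page.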

Non-shared multihost disks
Non-shared multihost disk ioctls consist of MHIOCTKOWN, MHIOCRELEASE,
MHIOCSTATUS, and MHIOCQRESERVE. These ioctl requests control the access
rights of non-shared multihost disks. A non-shared multihost disk is one
that supports serialized, mutually exclusive I/O mastery by the connected
hosts. This is in contrast to the shared-disk model, in which concurrent
access is allowed from more than one host (see below).

A non-shared multihost disk can be in one of two states:

o Exclusive access state, where only one connected host has I/O
access.

o Non-exclusive access state, where all connected hosts have I/O
access. An external hardware reset can cause the disk to enter the
non-exclusive access state.

Each multihost disk driver views the machine on which it's running as the
"local host"; each views all other machines as "remote hosts". For each
I/O or ioctl request, the requesting host is the local host.

Note that the non-shared ioctls are designed to work with SCSI-2 disks.
The SCSI-2 RESERVE/RELEASE command set is the underlying hardware facility
in the device that supports the non-shared ioctls.

The function prototypes for the non-shared ioctls are:

ioctl(fd, MHIOCTKOWN);
ioctl(fd, MHIOCRELEASE);
ioctl(fd, MHIOCSTATUS);
ioctl(fd, MHIOCQRESERVE);

MHIOCTKOWN Forcefully acquires exclusive access rights to the multihost
disk for the local host. Revokes all access rights to the
multihost disk from remote hosts. Causes the disk to enter
the exclusive access state.

Implementation Note: Reservations (exclusive access rights)
broken via random resets should be reinstated by the driver
upon their detection, for example, in the automatic probe
function described below.

MHIOCRELEASE Relinquishes exclusive access rights to the multihost disk
for the local host. On success, causes the disk to enter
the non-exclusive access state.

MHIOCSTATUS Probes a multihost disk to determine whether the local host
has access rights to the disk. Returns 0 if the local host
has access to the disk, 1 if it doesn't, and -1 with errno
set to EIO if the probe failed for some other reason.

MHIOCQRESERVE Issues, simply and only, a SCSI-2 Reserve command. If the
attempt to reserve fails due to the SCSI error Reservation
Conflict (which implies that some other host has the device
reserved), then the ioctl will return -1 with errno set to
EACCES. The MHIOCQRESERVE ioctl does NOT issue a bus device
reset or bus reset prior to attempting the SCSI-2 reserve
command. It also does not take care of re-instating
reservations that disappear due to bus resets or bus device
resets; if that behavior is desired, then the caller can
call MHIOCTKOWN after the MHIOCQRESERVE has returned
success. If the device does not support the SCSI-2 Reserve
command, then the ioctl returns -1 with errno set to
ENOTSUP. The MHIOCQRESERVE ioctl is intended to be used by
high-availability or clustering software for a "quorum"
disk; hence the "Q" in the name of the ioctl.
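
The return conventions of MHIOCSTATUS and MHIOCQRESERVE described above can
be sketched as two small classifiers. The enum and function names are
hypothetical and are not part of <sys/mhd.h>; rc stands for the value
returned by ioctl(2) and err for the value of errno observed right after
the call.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for the three outcomes of ioctl(fd, MHIOCSTATUS). */
enum mhd_access { MHD_ACCESS_OK, MHD_ACCESS_LOST, MHD_ACCESS_UNKNOWN };

/* Map the MHIOCSTATUS return value to an outcome. */
enum mhd_access
mhd_classify_status(int rc)
{
	if (rc == 0)
		return (MHD_ACCESS_OK);		/* local host has access */
	if (rc == 1)
		return (MHD_ACCESS_LOST);	/* local host lacks access */
	return (MHD_ACCESS_UNKNOWN);		/* -1: the probe itself failed */
}

/* Hypothetical names for the outcomes of ioctl(fd, MHIOCQRESERVE). */
enum mhd_qreserve {
	MHD_QRES_HELD,		/* reserve succeeded */
	MHD_QRES_CONFLICT,	/* EACCES: another host holds the reservation */
	MHD_QRES_UNSUPPORTED,	/* ENOTSUP: no SCSI-2 Reserve support */
	MHD_QRES_ERROR		/* any other failure */
};

/* Map the MHIOCQRESERVE return value and errno to an outcome. */
enum mhd_qreserve
mhd_classify_qreserve(int rc, int err)
{
	assert(rc == 0 || rc == -1);	/* the documented return values */
	if (rc == 0)
		return (MHD_QRES_HELD);
	if (err == EACCES)
		return (MHD_QRES_CONFLICT);
	if (err == ENOTSUP)
		return (MHD_QRES_UNSUPPORTED);
	return (MHD_QRES_ERROR);
}
```

A caller that gets MHD_QRES_HELD and also wants the reservation reinstated
after resets would follow up with MHIOCTKOWN, as the MHIOCQRESERVE
description notes.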

Shared Multihost Disks


Shared multihost disk ioctls control access to shared multihost disks.
The ioctls are merely a veneer on the SCSI-3 Persistent Reservation
facility. Therefore, the underlying semantic model is not described in
detail here; see the SCSI-3 standard instead. SCSI-3 Persistent
Reservations support the concept of a group of hosts all sharing access to
a disk.

The function prototypes and descriptions for the shared multihost ioctls
are as follows:

ioctl(fd, MHIOCGRP_INKEYS, (mhioc_inkeys_t *)k)

Issues the SCSI-3 command Persistent Reserve In Read Keys to the device.
On input, the field k->li should be initialized by the caller with
k->li.listsize reflecting the size of the array the caller has allocated
for the k->li.list field and with k->li.listlen set to 0. On return, the
field k->li.listlen is updated to indicate the number of reservation
keys the device currently has; if this value is larger than
k->li.listsize, the caller should have passed a bigger k->li.list array
with a bigger k->li.listsize. The number of
array elements actually written by the callee into k->li.list is the
minimum of k->li.listlen and k->li.listsize. The field k->generation is
updated with the generation information returned by the SCSI-3 Read Keys
query. If the device does not support SCSI-3 Persistent Reservations,
then this ioctl returns -1 with errno set to ENOTSUP.
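
The array bookkeeping described above can be sketched as follows. The
struct is a stand-in that carries only the two counters named in the text;
it is not the definition from <sys/mhd.h>, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the two counters named in the text. This is NOT the
 * structure from <sys/mhd.h>; it exists only to illustrate the rules.
 */
struct key_list_counts {
	uint32_t listsize;	/* elements the caller allocated */
	uint32_t listlen;	/* keys the device reported on return */
};

/* Number of array elements actually filled in: min(listlen, listsize). */
uint32_t
keys_written(const struct key_list_counts *c)
{
	assert(c != NULL);
	return (c->listlen < c->listsize ? c->listlen : c->listsize);
}

/* Nonzero if the caller should retry with at least listlen elements. */
int
keys_truncated(const struct key_list_counts *c)
{
	assert(c != NULL);
	return (c->listlen > c->listsize);
}
```

A caller would typically check keys_truncated() after the ioctl returns
and, if it is nonzero, allocate a larger array and issue the request again.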

ioctl(fd, MHIOCGRP_INRESV, (mhioc_inresvs_t *)r)

Issues the SCSI-3 command Persistent Reserve In Read Reservations to the
device. Remarks similar to MHIOCGRP_INKEYS apply to the array
manipulation. If the device does not support SCSI-3 Persistent
Reservations, then this ioctl returns -1 with errno set to ENOTSUP.

ioctl(fd, MHIOCGRP_REGISTER, (mhioc_register_t *)r)

Issues the SCSI-3 command Persistent Reserve Out Register. The fields
of structure r are all inputs; none of the fields are modified by the
ioctl. The field r->aptpl should be set to true to specify that
registrations and reservations should persist across device power
failures, or to false to specify that registrations and reservations
should be cleared upon device power failure; true is the recommended
setting. The field r->oldkey is the key that the caller believes the
device may already have for this host initiator; if the caller believes
that this host initiator is not already registered with this
device, it should pass the special key of all zeros. To achieve the
effect of unregistering with the device, the caller should pass its
current key for the r->oldkey field and an r->newkey field containing
the special key of all zeros. If the device returns the SCSI error code
Reservation Conflict, this ioctl returns -1 with errno set to EACCES.
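
The all-zeros key conventions above can be sketched as a small classifier.
KEY_SIZE, the enum, and the function names are hypothetical; the keys are
modeled as plain 8-byte arrays (the size of a SCSI-3 reservation key)
rather than the key type from <sys/mhd.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	KEY_SIZE	8	/* a SCSI-3 reservation key is 8 bytes */

/* Hypothetical names for what a given oldkey/newkey pair requests. */
enum reg_intent { REG_NEW, REG_UNREGISTER, REG_REPLACE };

/* Nonzero if key is the special key of all zeros. */
static int
key_is_zero(const unsigned char *key)
{
	static const unsigned char zero[KEY_SIZE];

	return (memcmp(key, zero, KEY_SIZE) == 0);
}

/*
 * Classify a register request by its keys. A zero newkey is treated as
 * an unregister request even when oldkey is also zero.
 */
enum reg_intent
register_intent(const unsigned char *oldkey, const unsigned char *newkey)
{
	assert(oldkey != NULL && newkey != NULL);
	if (key_is_zero(newkey))
		return (REG_UNREGISTER);	/* give up the current key */
	if (key_is_zero(oldkey))
		return (REG_NEW);		/* not registered before */
	return (REG_REPLACE);			/* swap oldkey for newkey */
}
```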

ioctl(fd, MHIOCGRP_RESERVE, (mhioc_resv_desc_t *)r)

Issues the SCSI-3 command Persistent Reserve Out Reserve. The fields of
structure r are all inputs; none of the fields are modified by the
ioctl. If the device returns the SCSI error code Reservation Conflict,
this ioctl returns -1 with errno set to EACCES.

ioctl(fd, MHIOCGRP_PREEMPTANDABORT, (mhioc_preemptandabort_t *)r)

Issues the SCSI-3 command Persistent Reserve Out Preempt-And-Abort. The
fields of structure r are all inputs; none of the fields are modified by
the ioctl. The key of the victim host is specified by the field
r->victim_key. The field r->resvdesc supplies the preempter's key and
the reservation that it is requesting as part of the SCSI-3 Preempt-And-
Abort command. If the device returns the SCSI error code Reservation
Conflict, this ioctl returns -1 with errno set to EACCES.

ioctl(fd, MHIOCGRP_PREEMPT, (mhioc_preemptandabort_t *)r)

Similar to MHIOCGRP_PREEMPTANDABORT, but instead issues the SCSI-3
command Persistent Reserve Out Preempt. (Note: This command is not
implemented).

ioctl(fd, MHIOCGRP_CLEAR, (mhioc_resv_key_t *)r)

Issues the SCSI-3 command Persistent Reserve Out Clear. The input
parameter r is the reservation key of the caller, which should have been
already registered with the device, by an earlier call to
MHIOCGRP_REGISTER.

For each device, the non-shared ioctls should not be mixed with the
Persistent Reserve Out shared ioctls, and vice versa; otherwise, the
underlying device is likely to return errors, because SCSI does not permit
SCSI-2 reservations to be mixed with SCSI-3 reservations on a single
device. It is, however, legitimate to call the Persistent Reserve In
ioctls, because these are query only. Issuing the MHIOCGRP_INKEYS ioctl is
the recommended way for a caller to determine if the device supports SCSI-3
Persistent Reservations (the ioctl will return -1 with errno set to ENOTSUP
if the device does not).

MHIOCENFAILFAST Ioctl


The MHIOCENFAILFAST ioctl is applicable for both non-shared and shared
disks, and may be used with either the non-shared or shared ioctls.

ioctl(fd, MHIOCENFAILFAST, (unsigned int *)millisecs)

Enables or disables the failfast option in the multihost disk driver and
enables or disables automatic probing of a multihost disk, described
below. The argument is an unsigned integer specifying the number of
milliseconds to wait between executions of the automatic probe function.
An argument of zero disables the failfast option and disables automatic
probing. If the MHIOCENFAILFAST ioctl is never called, the effect is
defined to be that both the failfast option and automatic probing are
disabled.

Automatic Probing


The MHIOCENFAILFAST ioctl sets up a timeout in the driver to periodically
schedule automatic probes of the disk. The automatic probe function works
in this manner: The driver is scheduled to probe the multihost disk every n
milliseconds, rounded up to the next integral multiple of the system
clock's resolution. If

1. the local host no longer has access rights to the multihost
disk, and

2. access rights were expected to be held by the local host,

the driver immediately panics the machine to comply with the failfast
model.

If the driver makes this discovery outside the timeout function, especially
during a read or write operation, it is imperative that it panic the system
then as well.
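
The interval rounding and the panic condition described above can be
sketched as follows. This is an illustration only, not the driver's code:
the function names are hypothetical, the clock resolution is passed in as
a parameter, and a request that is already an exact multiple of the
resolution is assumed to be left unchanged.

```c
#include <assert.h>

/*
 * Round the requested probe interval up to a multiple of the clock
 * resolution (both in milliseconds). Zero stays zero, which the text
 * defines as "failfast and probing disabled". Overflow for requests near
 * UINT_MAX is not handled in this sketch.
 */
unsigned int
probe_interval_ms(unsigned int requested_ms, unsigned int tick_ms)
{
	unsigned int rem;

	assert(tick_ms > 0);
	rem = requested_ms % tick_ms;
	return (rem == 0 ? requested_ms : requested_ms + (tick_ms - rem));
}

/*
 * Failfast decision: panic only when access has been lost AND the local
 * host expected to hold access rights.
 */
int
failfast_should_panic(int has_access, int access_expected)
{
	return (!has_access && access_expected);
}
```

For example, with a 10 ms clock resolution a request of 25 ms would be
probed every 30 ms under these assumptions.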

RETURN VALUES


Each request returns -1 on failure and sets errno to indicate the error.

EPERM Caller is not root.

EACCES Access rights were denied.

EIO The multihost disk or controller was unable to
successfully complete the requested operation.

ENOTSUP The multihost disk does not support the operation. For
example, it does not support the SCSI-2 Reserve/Release
command set, or the SCSI-3 Persistent Reservation
command set.

STABILITY


Uncommitted

SEE ALSO


ioctl(2), open(2), attributes(7)

OmniOS October 23, 2017 OmniOS