Disconnect NVMe submission and completion queues

Review Request #1979 — Created June 7, 2019 and submitted — Latest diff uploaded

pwinder
illumos-gate
general
hans, rm

Separate the submission and completion queues. Each queue pair has a dedicated submission queue, but may share its completion queue with other queue pairs. This allows the number of submission queues to be increased beyond the current limit set by the number of interrupt vectors.
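As a rough illustration of the idea (not code from the diff), the sketch below maps a larger set of submission queues onto a smaller set of completion queues. The helper name sq_to_cq() and the round-robin (modulo) policy are assumptions made for the example; the driver may choose the mapping differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the diff: pick the completion queue
 * that serves a given I/O submission queue. Indices are zero-based and
 * count I/O queues only (the admin queue is not included). Assumes a
 * simple round-robin policy.
 */
static uint16_t
sq_to_cq(uint16_t sq_idx, uint16_t ncq)
{
	assert(ncq > 0);	/* at least one completion queue must exist */
	return (sq_idx % ncq);
}
```

With 4 completion queues (for example, one per available interrupt vector) and 8 submission queues, submission queues 1 and 5 would both complete on completion queue 1 under this assumed policy.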

The change includes a few bug fixes we have come across. They are highlighted in the diff.

See: https://www.illumos.org/issues/11202

Issues also covered here are:
nvme may queue more submissions than allowed
nvme_get_logpage() can allocate a buffer that is too small to receive logpage data
Panic in nvme_fill_prp() because of miscalculation of the number of PRPs per page
nvme in polled mode ignores the command callback
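For context on the PRP item, here is a minimal sketch of the per-page arithmetic, assuming the usual NVMe layout: PRP entries are 64 bits wide, and when a PRP list spans several pages the last entry of every page except the final one points to the next list page. The function name prp_list_pages() is hypothetical and the code is not taken from nvme_fill_prp(); it only shows where an off-by-one can creep in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the driver: number of PRP list
 * pages needed to hold nprp data-page entries. Every list page except
 * the last gives up its final slot to a pointer to the next list page.
 */
static size_t
prp_list_pages(size_t nprp, size_t pagesize)
{
	size_t per_page = pagesize / sizeof (uint64_t);

	assert(per_page > 1);
	if (nprp == 0)
		return (0);
	if (nprp <= per_page)
		return (1);	/* a single page needs no chain pointer */
	/* k pages hold k * (per_page - 1) + 1 entries; solve for k. */
	return ((nprp - 1 + (per_page - 2)) / (per_page - 1));
}
```

With 4096-byte pages a list page has 512 slots, so 512 entries fit in one page, while 513 entries need two pages because the first page can then carry only 511 data entries.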

The base of this code has been in extensive use in WDC's Intelliflash O/S.

Within illumos (using SmartOS), it has been tested by running the full ZFS test suite against a set of NVMe drives, and stress tested using vdbench.

It has been run on raw hardware with Dual Intel(R) Xeon(R) Gold 6130, with NVMe drives in a PCI fanout behind a Switchtec switch.

It was also tested in a Fusion VM with an NVMe root disk.
