12668: ZFS support for vectorized algorithms on x86 (initial support)

Review Request #2560 — Created May 20, 2020 and submitted

jjelinek
illumos-gate
12668
general

This is a partial port of the OpenZFS code for hardware assisted (FPU) raidz handling. The current code does not provide any of the FPU support yet. Instead, this is just the refactoring to enable the plugin code so that a future change can add the actual FPU algorithms.

ZFS test suite run, including the new tests for verifying raidz generation and reconstruction still matches the existing behavior.

  • 0
  • 0
  • 1
  • 0
  • 1
Description From Last Updated
jbk
  1. Just a minor nit. I do question the soundness of 'pick the fastest thing at boot time', however since this is all private interfaces, I think it's also something that could be hashed out later if we don't have a good answer right now.

    It might be worth brainstorming on ways to override the implementation choice (once we have more than one) -- e.g. is it something that could be done via /etc/system, /kernel/drv/zfs.conf, etc. but not this change (i.e. I'm not suggesting holding this up for that)

    1. Overall this is just how it is done in OpenZFS, so I'd like to keep it this
      way. We can override this at runtime using mdb to set 'zfs_vdev_raidz_impl'.
      Setting this to 0 will using the 'original' algorithm, setting it to 1 will
      use the 'scalar' algorithm. Neither of those uses the FPU. Setting this to
      -1 is the default behavior and the uses the 'fastest' algorithm. If we wanted
      to allow this to be overridden in /etc/system, we would have to modify the
      OpenZFS code to avoid the initial benchmark run which picks the fastest
      algorithm based on the current HW.

    2. One other possibility here for overriding the algorithm in /etc/system
      is with the current 'user_sel_impl' variable. That defaults to 'fastest' (-1)
      but it's what is actually used to set 'zfs_vdev_raidz_impl'.
      We could rename that variable with a 'zfs' prefix and treat it
      as the overridable setting.

  2. usr/src/cmd/raidz_test/Makefile (Diff revision 1)
     
     

    This could be removed.

  3. 
      
jjelinek
jbk
  1. Yeah -- I didn't mean to imply they needed to be addressed now -- more of a lament. It seems like the 'choose the fastest' technique could (for example) cause AVX512 implementations to be chosen on systems where using AVX512 will slow down everything else on the chip (but make the AVX512 stuff go fast).

    1. Thanks for the review. Yes, I am very leery of of the AVX512 algorithms since we
      know about the CPU downclocking that can result. I'm not planning on porting
      those to illumos any time soon. The next phase of this work, which does add
      the FPU-based algorithms, only will include SSE and AVX2 algorithms.

  2. 
      
jjelinek
Review request changed

Status: Closed (submitted)

Loading...