12794 ZFS support for vectorized algorithms on x86 (HW support)

Review Request #2570 — Created June 5, 2020 and submitted


This is a port from OpenZFS and is a continuation of the work I started with 12668. This adds support for sse2, ssse3 and avx2 raidz parity algorithms. This change depends on the kfpu code from 12793.
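For readers unfamiliar with what a raidz parity routine computes, the following is a minimal scalar sketch of single (P) parity, assuming only the standard definition that P is the byte-wise XOR of the data columns. It is not the OpenZFS or illumos code: the names `sketch_gen_p` and `sketch_rec_p` and their signatures are hypothetical. As I understand it, the SSE2, SSSE3 and AVX2 variants in this change produce the same results while processing many bytes per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar reference, not taken from the change under review.
 * The P parity column is the byte-wise XOR of all data columns.
 */
static void
sketch_gen_p(uint8_t *parity, const uint8_t *const *data, size_t ncols,
    size_t len)
{
	assert(ncols > 0);

	for (size_t i = 0; i < len; i++) {
		uint8_t p = 0;

		/* Fold byte i of every data column into the parity byte. */
		for (size_t c = 0; c < ncols; c++)
			p ^= data[c][i];
		parity[i] = p;
	}
}

/*
 * Rebuild one missing data column: XOR the parity with every surviving
 * data column. The column at index `missing` is never read.
 */
static void
sketch_rec_p(uint8_t *out, const uint8_t *parity,
    const uint8_t *const *data, size_t ncols, size_t missing, size_t len)
{
	assert(missing < ncols);

	for (size_t i = 0; i < len; i++) {
		uint8_t v = parity[i];

		for (size_t c = 0; c < ncols; c++) {
			if (c != missing)
				v ^= data[c][i];
		}
		out[i] = v;
	}
}
```

Because XOR is its own inverse, the same loop structure serves both generation and single-column reconstruction; the Q and R parities used by raidz2 and raidz3 additionally need Galois-field multiplication, which this sketch leaves out.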

  2. Can we get rid of the red ugliness here and below? :)

    1. Normally I would be happy to fix these, but for all three of these, this is the exact code as in OpenZFS. Due to the complexity of all three of these files, I think it is more valuable to be able to diff the illumos file against the OpenZFS file and see that they are the same. Thus, I don't want to introduce a lot of spurious diffs here for no real benefit.

  3. can we get rid of red ugliness here and below?

  4. and red ugliness here and below...

  2. I think all of these definitions can be moved under one __x86; there is no need for a separate one on every line.

    1. I just looked at the original OpenZFS implementation and understand why it was split into several definitions. No need to fix it.

  1. I get a panic on a DEBUG build under VMware ESXi 6.7 with:

    SMBIOS v2.7 loaded (10240 bytes)
    initialized model-specific module 'cpu_ms.GenuineIntel' on chip 0 core 0 strand 0
    root nexus = i86pc
    pseudo0 at root
    pseudo0 is /pseudo
    scsi_vhci0 at root
    scsi_vhci0 is /scsi_vhci
    Reading Intel IOMMU boot options
    npe0 at root: space 0 offset 0
    npe0 is /pci@0,0
    PCI Express-device: isa@7, isa0

    panic[cpu0]/thread=fffffffffbc580c0: assertion failed: (pf->fpu_flags & FPU_EN) == 0, file: ../../intel/ia32/os/fpu.c, line: 1375

    Warning - stack not written to the dump buffer
    fffffffffbc97d40 genunix:process_type+19b26d ()
    fffffffffbc97d90 unix:kernel_fpu_begin+2a0 ()
    fffffffffbc97e00 zfs:sse2_gen_p+19c ()
    fffffffffbc97e20 zfs:vdev_raidz_math_generate+54 ()
    fffffffffbc97e50 zfs:vdev_raidz_generate_parity+16 ()
    fffffffffbc97e70 zfs:benchmark_gen_impl+b ()
    fffffffffbc97f10 zfs:benchmark_raidz_impl+99 ()
    fffffffffbc97f50 zfs:benchmark_raidz+e9 ()
    fffffffffbc97f60 zfs:vdev_raidz_math_init+9 ()
    fffffffffbc97f90 zfs:spa_init+ff ()
    fffffffffbc97fb0 zfs:_init+13 ()
    fffffffffbc97ff0 genunix:modinstall+12d ()
    fffffffffbc98050 genunix:mod_hold_installed_mod+77 ()
    fffffffffbc980d0 genunix:modrload+1ab ()
    fffffffffbc980f0 genunix:modload+d ()
    fffffffffbc98130 genunix:rootconf+6d ()
    fffffffffbc98170 genunix:vfs_mountroot+6e ()
    fffffffffbc981b0 genunix:main+194 ()
    fffffffffbc981c0 unix:_locore_start+90 ()

    panic: entering debugger (no dump device, continue to reboot)

    1.
      [0]> $C
      fffffffffbca0420 kmdb_enter+0xb()
      fffffffffbca0450 debug_enter+0x53(fffffffffb9a85a8)
      fffffffffbca0530 panicsys+0x5ed(fffffffffbf98500, fffffffffbc97c98, fffffffffbca0540, 1)
      fffffffffbc97c80 vpanic+0x15c()
      fffffffffbc97cf0 0xfffffffffb8ba291()
      fffffffffbc97d40 0xfffffffffbe438f5()
      fffffffffbc97d90 kernel_fpu_begin+0x2a0(0, 2)
      fffffffffbc97e00 zfs`sse2_gen_p+0x19c(fffffe0be6286980)
      fffffffffbc97e20 zfs`vdev_raidz_math_generate+0x54(fffffe0be6286980)
      fffffffffbc97e50 zfs`vdev_raidz_generate_parity+0x16(fffffe0be6286980)
      fffffffffbc97e70 zfs`benchmark_gen_impl+0xb(fffffe0be6286980, 0)
      fffffffffbc97f10 zfs`benchmark_raidz_impl+0x99(fffffe0be6286980, 0, fffffffff7abc860)
      fffffffffbc97f50 zfs`benchmark_raidz+0xe9()
      fffffffffbc97f60 zfs`vdev_raidz_math_init+9()
      fffffffffbc97f90 zfs`spa_init+0xff(3)
      fffffffffbc97fb0 zfs`_init+0x13()
      fffffffffbc97ff0 modinstall+0x12d(fffffe0be65b8a20)
      fffffffffbc98050 mod_hold_installed_mod+0x77(fffffe0be624aba0, 1, 0, fffffffffbc98074)
      fffffffffbc980d0 modrload+0x1ab(fffffffffbf72ccd, fffffffffbc02000, 0)
      fffffffffbc980f0 modload+0xd(fffffffffbf72ccd, fffffffffbc02000)
      fffffffffbc98130 rootconf+0x6d()
      fffffffffbc98170 vfs_mountroot+0x6e()
      fffffffffbc981b0 main+0x194()

      As we can see, flags=2 in the call kernel_fpu_begin+0x2a0(0, 2).

    2. Yes, sorry about that. I had an incorrect ASSERT in the DEBUG build; I have fixed it, and the fix will be in the next rev of the kfpu code review (https://www.illumos.org/rb/r/2569/).

  1. Ship It!
  1. Ship It!
  1. Ship It!
Review request changed

Status: Closed (submitted)