12794 ZFS support for vectorized algorithms on x86 (HW support)

Review Request #2570 — Created June 5, 2020 and submitted

jjelinek
illumos-gate
12794
general

This is a port from OpenZFS and a continuation of the work I started with 12668. It adds support for the SSE2, SSSE3 and AVX2 raidz parity algorithms. This change depends on the kfpu code from 12793.
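For context on what these routines compute: the raidz "P" parity is the bytewise XOR of the data columns, and the sse2, ssse3 and avx2 variants do the same arithmetic on wider registers. Below is a minimal scalar sketch of that idea. The function name `scalar_gen_p` and its signature are hypothetical and are not taken from the illumos or OpenZFS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the illumos or OpenZFS sources:
 * compute "P" parity as the bytewise XOR of all data columns.
 * cols[i] points to data column i; every column is len bytes long.
 */
static void
scalar_gen_p(uint8_t *p, const uint8_t *const *cols, size_t ncols, size_t len)
{
	assert(ncols > 0);

	/* Start from a copy of the first column ... */
	for (size_t i = 0; i < len; i++)
		p[i] = cols[0][i];

	/* ... then fold in each remaining column with XOR. */
	for (size_t c = 1; c < ncols; c++) {
		for (size_t i = 0; i < len; i++)
			p[i] ^= cols[c][i];
	}
}
```

Because XOR is its own inverse, any single missing column can be rebuilt by XORing P with the surviving columns. A vectorized version would replace the inner byte loop with loads, XORs and stores on 128-bit or 256-bit registers, which is why it needs the kernel FPU save/restore support.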



tsoome
  1. can we get rid of red ugliness here and below? :)

    1. Normally I would be happy to fix these, but in all three cases this is exactly the code as it appears in OpenZFS. Given the complexity of these three files, I think it is more valuable to be able to diff the illumos file against the OpenZFS file and see that they are the same. Thus, I don't want to introduce a lot of spurious diffs here for no real benefit.

  2. can we get rid of red ugliness here and below?

  3. and red ugliness here and below...

igork
  1. I think all of the definitions can be moved under a single __x86 check; there is no need to repeat it on every line.

    1. I just looked at the original OpenZFS implementation and now understand why it was split into several definitions, so there is no need to fix it.

igork
  1. I get a panic on a DEBUG build under VMware ESXi 6.7 with:

    <pre>
    SMBIOS v2.7 loaded (10240 bytes)
    initialized model-specific module 'cpu_ms.GenuineIntel' on chip 0 core 0 strand 0
    root nexus = i86pc
    pseudo0 at root
    pseudo0 is /pseudo
    scsi_vhci0 at root
    scsi_vhci0 is /scsi_vhci
    Reading Intel IOMMU boot options
    npe0 at root: space 0 offset 0
    npe0 is /pci@0,0
    PCI Express-device: isa@7, isa0

    panic[cpu0]/thread=fffffffffbc580c0: assertion failed: (pf->fpu_flags & FPU_EN) == 0, file: ../../intel/ia32/os/fpu.c, line: 1375

    Warning - stack not written to the dump buffer
    fffffffffbc97d40 genunix:process_type+19b26d ()
    fffffffffbc97d90 unix:kernel_fpu_begin+2a0 ()
    fffffffffbc97e00 zfs:sse2_gen_p+19c ()
    fffffffffbc97e20 zfs:vdev_raidz_math_generate+54 ()
    fffffffffbc97e50 zfs:vdev_raidz_generate_parity+16 ()
    fffffffffbc97e70 zfs:benchmark_gen_impl+b ()
    fffffffffbc97f10 zfs:benchmark_raidz_impl+99 ()
    fffffffffbc97f50 zfs:benchmark_raidz+e9 ()
    fffffffffbc97f60 zfs:vdev_raidz_math_init+9 ()
    fffffffffbc97f90 zfs:spa_init+ff ()
    fffffffffbc97fb0 zfs:_init+13 ()
    fffffffffbc97ff0 genunix:modinstall+12d ()
    fffffffffbc98050 genunix:mod_hold_installed_mod+77 ()
    fffffffffbc980d0 genunix:modrload+1ab ()
    fffffffffbc980f0 genunix:modload+d ()
    fffffffffbc98130 genunix:rootconf+6d ()
    fffffffffbc98170 genunix:vfs_mountroot+6e ()
    fffffffffbc981b0 genunix:main+194 ()
    fffffffffbc981c0 unix:_locore_start+90 ()

    panic: entering debugger (no dump device, continue to reboot)
    </pre>

    1. <pre>
      [0]> $C
      fffffffffbca0420 kmdb_enter+0xb()
      fffffffffbca0450 debug_enter+0x53(fffffffffb9a85a8)
      fffffffffbca0530 panicsys+0x5ed(fffffffffbf98500, fffffffffbc97c98, fffffffffbca0540, 1)
      fffffffffbc97c80 vpanic+0x15c()
      fffffffffbc97cf0 0xfffffffffb8ba291()
      fffffffffbc97d40 0xfffffffffbe438f5()
      fffffffffbc97d90 kernel_fpu_begin+0x2a0(0, 2)
      fffffffffbc97e00 zfs`sse2_gen_p+0x19c(fffffe0be6286980)
      fffffffffbc97e20 zfs`vdev_raidz_math_generate+0x54(fffffe0be6286980)
      fffffffffbc97e50 zfs`vdev_raidz_generate_parity+0x16(fffffe0be6286980)
      fffffffffbc97e70 zfs`benchmark_gen_impl+0xb(fffffe0be6286980, 0)
      fffffffffbc97f10 zfs`benchmark_raidz_impl+0x99(fffffe0be6286980, 0, fffffffff7abc860)
      fffffffffbc97f50 zfs`benchmark_raidz+0xe9()
      fffffffffbc97f60 zfs`vdev_raidz_math_init+9()
      fffffffffbc97f90 zfs`spa_init+0xff(3)
      fffffffffbc97fb0 zfs`_init+0x13()
      fffffffffbc97ff0 modinstall+0x12d(fffffe0be65b8a20)
      fffffffffbc98050 mod_hold_installed_mod+0x77(fffffe0be624aba0, 1, 0, fffffffffbc98074)
      fffffffffbc980d0 modrload+0x1ab(fffffffffbf72ccd, fffffffffbc02000, 0)
      fffffffffbc980f0 modload+0xd(fffffffffbf72ccd, fffffffffbc02000)
      fffffffffbc98130 rootconf+0x6d()
      fffffffffbc98170 vfs_mountroot+0x6e()
      fffffffffbc981b0 main+0x194()
      [0]>
      </pre>

      We can see that flags=2 in the call kernel_fpu_begin+0x2a0(0, 2).

    2. Yes, sorry about that. There was an incorrect ASSERT in the DEBUG build; I have fixed it, and the fix will be in the next rev of the kfpu code review (https://www.illumos.org/rb/r/2569/).

tsoome
  1. Ship It!
jjelinek
tsoome
  1. Ship It!
jjelinek
tsoome
  1. Ship It!
jjelinek
Review request changed

Status: Closed (submitted)
