7863 AIO read/write requests return 0 bytes read/written

Review Request #357 — Created Feb. 14, 2017 and updated

Submitter: dstepanov
Repository: illumos-gate
Groups: general
Summary: 7863 AIO read/write requests return 0 bytes read/written


igork
  1. Ship It!
marcel
  1. Ship It!
danmcd
  1. I appreciate the analysis in the bug report. Basically, before the fix, there was a possibility of use-after-reassign, because aio_copyout_result_port() must be called before the reqp parameters are marked for reuse by aio_req_free_port().
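
    To illustrate the ordering, here is a minimal userland sketch with simplified, hypothetical types (the real kernel code in usr/src/uts/common/os/aio.c differs): the results are handed to the caller while the request is still valid, and only then is the request marked free for reuse, all under one lock.

    #include <sys/types.h>
    #include <pthread.h>
    #include <stdio.h>

    typedef struct req {
            ssize_t r_retval;       /* bytes read/written */
            int     r_errno;        /* error, if any */
            int     r_free;         /* on the free list; may be reassigned */
    } req_t;

    static pthread_mutex_t req_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Analogue of aio_copyout_result_port(): hand the results to the caller. */
    static void
    copyout_result(const req_t *rp, ssize_t *retvalp, int *errnop)
    {
            *retvalp = rp->r_retval;
            *errnop = rp->r_errno;
    }

    /* Analogue of aio_req_free_port(): after this, rp may be reassigned. */
    static void
    req_free(req_t *rp)
    {
            rp->r_free = 1;
    }

    static void
    complete_request(req_t *rp, ssize_t *retvalp, int *errnop)
    {
            pthread_mutex_lock(&req_lock);
            /*
             * The pre-fix ordering freed first and copied out afterwards,
             * so a concurrent submitter could reassign rp in between and
             * the caller would read 0 bytes from a recycled request.
             * Copying out first closes that window.
             */
            copyout_result(rp, retvalp, errnop);
            req_free(rp);
            pthread_mutex_unlock(&req_lock);
    }

    int
    main(void)
    {
            req_t r = { .r_retval = 512, .r_errno = 0, .r_free = 0 };
            ssize_t retval;
            int err;

            complete_request(&r, &retval, &err);
            (void) printf("retval=%zd errno=%d\n", retval, err);
            return (0);
    }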

    Did you do any performance testing? We're now holding the mutex for the additional time of aio_copyout_result_port(). It doesn't appear to be an inordinately long time, but I'm wondering if you noticed any performance changes after this fix?

    1. To be honest, we did performance testing only for our case: AIO against real hardware. In that case we saw no difference, I believe because the read/write operations themselves dominate the time. A dedicated test case would be needed to measure overall performance, something like:
      - Increase the number of threads to 50 or even 100
      - Use files on a ramdisk, or the ramdisk itself, to minimize the I/O impact
      - Possibly other variations
      Maybe in that setup we would be able to measure some difference in performance, but I'm not sure. Also, even if there were a performance hit, we would still have a race condition to fix. Another approach is to call mutex_exit() before invoking the aio_copyout_result_port() routine and mutex_enter() after it, but this would need careful investigation, because the current code follows the pattern "gather all the result values, then mark the structure as free"; see the sketch below.
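
      A rough sketch of that alternative, reusing the simplified types from the sketch above (illustrative only): the results must be snapshotted into locals while the lock is still held, because in the real code the request state is protected by that lock and reading it unlocked would reintroduce the race.

      static void
      complete_request_droplock(req_t *rp, ssize_t *retvalp, int *errnop)
      {
              ssize_t retval;
              int err;

              pthread_mutex_lock(&req_lock);
              /* Snapshot the results while rp is guaranteed valid. */
              retval = rp->r_retval;
              err = rp->r_errno;
              pthread_mutex_unlock(&req_lock);

              /* The potentially slow copyout happens without the lock held. */
              *retvalp = retval;
              *errnop = err;

              pthread_mutex_lock(&req_lock);
              req_free(rp);
              pthread_mutex_unlock(&req_lock);
      }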

    2. A simple dtrace script to measure time spent in aio_copyout_result_port() should be enough to tell how long we spend in there. And yes, correctness comes before performance. Also, dropping and reacquiring a mutex is usually more expensive than just holding it (especially if the time spent turns out to be minimal).
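
      For example, something like this (an untested sketch; it assumes aio_copyout_result_port() is visible to fbt, i.e. not inlined):

      #!/usr/sbin/dtrace -s
      /* Distribution of time spent in aio_copyout_result_port(), in ns. */
      fbt::aio_copyout_result_port:entry
      {
              self->ts = timestamp;
      }

      fbt::aio_copyout_result_port:return
      /self->ts/
      {
              @t["aio_copyout_result_port"] = quantize(timestamp - self->ts);
              self->ts = 0;
      }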

      That you didn't notice anything bad w.r.t. performance suggests I'm making a big deal out of nothing. Measurements never hurt, though.

    3. Dan, which do you prefer: performance or stability?
      In our case we need stable results with many simultaneous threads.
      Also, we have a stability degradation at the moment: we can't kill an application whose threads are stuck in the zombie queue. There are no zombie processes, but we can see unkilled threads and the application left in an unusable state.

    4. Dan, sorry for the confusion; my previous comment was a mistake. We are working on another problem with zombie threads that is still in progress and not related to this topic, so please ignore it.
