12363 add O_DIRECT support

Review Request #2530 — Created March 6, 2020 and updated

jjelinek
illumos-gate
12363
general

Full details are in the bug report.



  • 0
  • 0
  • 12
  • 4
  • 16
Description From Last Updated
seeemef@mac.com
  1. 
      
  2. usr/src/man/man2/open.2 (Diff revision 1)
     
     

    SP "resued"?

  3. usr/src/man/man2/open.2 (Diff revision 1)
     
     

    Should directio be added to SEE ALSO and should the section be capitalized: (3C)?

    1. Yes, it probably should added to the SEE ALSO. And yes, it 3C needs to be capitalized.
  4. 
      
rm
  1. Would it be worth adding a unit test to the os-tests that verifies that some things like opening a UDS or other file systems like a named pipe, etc. will fail with O_DIRECT set?
    1. I did this as part of the new test I added in place of the ZoL O_DIRECT tests.

  2. usr/src/man/man2/open.2 (Diff revision 1)
     
     
    Please update the date.
  3. usr/src/man/man2/open.2 (Diff revision 1)
     
     
    We should add a sentence here about O_DIRECT.
  4. It's not clear why all of these ZFS related tests are part of a change that's adding a flag to open.
    1. I tried to explain that in this CR description, but since you're objecting, I've removed them and wrote a new test for out os-test suite instead.

    2. Sorry, I honestly didn't see that. I looked at the ticket and everything else, but had missed that entirely. To be clear, I think we should still add those tests in addition to what you have, but I'd rather it was in a follow up. When I started looking at them, it was hard to see how they exercised it, which is probably because they're other ones. The non-ZFS test helps, because most folks won't touch the ZFS ones because of the complexity in having an environment.

  5. usr/src/uts/common/fs/vnode.c (Diff revision 1)
     
     
     
     
     
     
     
    It appears other systems ignore the failure here (FreeBSD at least from a quick glance). I guess following the Linux error makes the most sense? Did you consider how others swallow the error at all?
    1. I did not consider any other OS. We obviously can only do one behavior or the other. Do you feel like the FreeBSD behavior is what most open source SW expects instead of the Linux behavior?

    2. I honestly have no idea what software expects. I think it'll be important to go through pkgsrc and other software to see what changes with this present. I suspect that most stuff will expect the Linux behavior and it makes sense. I haven't really done any research here beyond that. My hope is that folks pushing the change will.
    3. I'm not sure how to respond to this issue and until I can resolve it, I'm not going to try to address any of the other issues.

      There are over 1.5 million C files in github which use O_DIRECT. I can't survey all of these, but I can guess some of them might depend on the Linux failure/errno behavior, some of them probably expect/depend on running only on a specific file system, and some of them don't care or even check. I don't know how any of this helps me though.

      Because there is no standard for O_DIRECT and the behavior varies from OS to OS, I am not really sure what we should do. Because O_DIRECT is mostly a "hint" to the OS about what behavior to use, I can see the argument for following the FreeBSD approach, accepting the flag, but never returning any error for this hint on the open(2) syscall. I'm happy to change to that behavior, in which case I can delete the test code altogether. Otherwise, I don't know how to resolve this issue.

    4. So I think the path forward here might be a combination of a few things:

      • Enumerate in the ticket what Linux and FreeBSD do here, where they differ and where they're the same
      • Probably make choices that bias towards the Linux behaviour, where their choice is safe, because on balance that's the widest software target
      • When we're making arbitrary choices, bias towards ones we can tighten or relax later; e.g., if Linux returns EINVAL in some cases that FreeBSD swallows, we should probably start with EINVAL too -- if it's an issue we can start swallowing the condition later, I imagine.
      • Not survey all of pkgsrc, but let's at least look at what, say, PostgreSQL is going to do on illumos once we suddenly grow O_DIRECT. Is it going to handle the error, or not, or whatever other behaviour. It would be a shame if people only handle the error when #ifdef Linux specifically, rather than just all the time when #ifdef O_DIRECT, say.

      I agree that it's all pretty wishy-washy, and that it is unlikely a masked man from the Austin Group is going to swoop in and clean it all up. Let's do a little written diligence, and then get to shipping.

      Thanks, as ever, for the work Jerry!

    5. Hi Josh, thanks a lot for the useful suggestions. I'll work on answering these questions and put that info into the bug report.

    6. I have added Joshua Clulow's questions, and my answers, to the bug report. I'll wait for feedback before doing anything else on this issue.

    7. Based on an email exchange with Robert, my comments in the bug report and this approach, seem ok.

  6. 
      
jjelinek
seeemef@mac.com
  1. 
      
  2. Is a boolean needed, or should this method just inspect f_basetype to control its mode? Or alternatively should this method strictly assert the caller's intent?

    1. There are many possible ways to implement the test case. I've changed it to another approach to see if you like this way better.

  3. 
      
rm
  1. 
      
  2. If we're going to create a new subdir and system call logic, I'd suggest we make sure that this always builds both 32-bit and 64-bit binaries by default like I've done with others.
  3. usr/src/test/os-tests/tests/syscall/open.c (Diff revision 2)
     
     
     
    There are tests for O_DIRECTORY already, fwiw, so this isn't the only open(2) tests. May be worth calling that out or we should move what I already did in a follow up.
    1. I'm not sure where to call this out, other than by filing a new bug report after this code integrates. I'm happy to open a new bug at that time.

    2. I just meant that you would mention this in the comment. The test makes it sound like it's the only tests for open, when we have others that I wrote. Something like 'See also os-tests/tests/odirectory.c'.
    3. I've added a comment to make a note of this.

  4. usr/src/test/os-tests/tests/syscall/open.c (Diff revision 2)
     
     
     
     
    If you make a file, you should unlike a file. But you should consider making a name that's less likely to collide with something that already exists on the file system. It'd probably be safer to add O_EXCL to avoid that so it fails if it already exists.
  5. Is it worth also doing something like binding a pipe or door?
    1. As I noted on another response, I don't think this is necessary since the ioctl is implemented at the vnode layer.

    2. Hmm, I guess? I don't know. Even though O_DIRECTORY was implemented at the directory level, it seems like making sure we do the right thing on other types of files is useful. At least, that was my view. I'm not sure why an OpenZFS test would want to cover a door or socket. But if you've already done it there and that makes more sense, then great.
    3. Since we've had directio support in the system for a very long time and this code is not changing the directio behavior for any filesystem, it doesn't really seem necessary. Also, since this is basically just a performance hint, if a filesystem chooses to ignore the ioctl and we never return an error to the application, no harm is done. This is essentially the zfs behavior.

    4. The reason I brought it up was for writing an actually useful test suite and improving overall quality. I just would have liked to see something that actually tested the flag with different types of files. It's true that you could have called directio() with any of them, but today we don't have tests for that and just seemed like a good part of doing work in the system. But I get it, you're not changing anything here, which is a bit disappointing, but I can understand.

  6. 
      
jjelinek
seeemef@mac.com
  1. LGTM

  2. 
      
rm
  1. Thank for the other bits here, Jerry. Just a few notes on the test program.

  2. usr/src/test/os-tests/tests/syscall/open.c (Diff revisions 2 - 3)
     
     
    Unchecked return value.
  3. usr/src/test/os-tests/tests/syscall/open.c (Diff revisions 2 - 3)
     
     
    Missing an unlink in the error case.
    1. Since the open failed and is using O_CREAT | O_EXCL, it doesn't seem proper to potentially delete a file that the code did not create.

  4. usr/src/test/os-tests/tests/syscall/open.c (Diff revisions 2 - 3)
     
     
    Returning without unlinking the file.
    1. What file are you expecting to be unlinked?

    2. I confused myself between mktemp() and other bits. mktemp() I guess only creates the file name, but doesn't create the underlying file and I had confused myself into thinking it did. That confusion is why I was asking about the unlink() stuff.
  5. usr/src/test/os-tests/tests/syscall/open.c (Diff revisions 2 - 3)
     
     
     
     
     
     
     
     
     
     
     
     
    Missing cleanup of the temporary file in this error path.
    1. See my previous comment about why unlinking the file here seems inappropriate.

  6. 
      
jjelinek
seeemef@mac.com
  1. Ship It!
  2. 
      
rm
  1. 
      
  2. usr/src/test/os-tests/tests/syscall/open.c (Diff revisions 3 - 4)
     
     
     
     
     
    I think if you exit zero like this, the test running will treat it as a passed test.
    1. Any reason ont to fail the test in this case? I think that's common for other tests.

  3. 
      
jjelinek
jjelinek
rm
  1. Ship It!
  2. 
      
jjelinek
Review request changed
seeemef@mac.com
  1. Ship It!
  2. 
      
rm
  1. Ship It!
  2. 
      
Loading...