10225 wsdiff broken after 9979

Review Request #1362 — Created Jan. 12, 2019 and submitted

citrus
illumos-gate
master
10225
17f47b8...
general
10225 wsdiff broken after 9979

See https://www.illumos.org/issues/10225

  • 0
  • 0
  • 3
  • 3
  • 6
Description From Last Updated
citrus
rm
  1. 
      
  2. usr/src/tools/scripts/wsdiff.py (Diff revision 1)
     
     
    Did we set the locale to a utf-8 aware locale? This is true everywhere we're using utf-8 in this file.
    1. I'm using decode() to convert from a bytes object to a string object in a way that's portable across python2 and python3. The thing that is being read is actually stdout from commands like diff so is expected to be plain ASCII (however we do have some extended characters in a few header file comments that can cause issues if one of them was to change in the workspace)

    2. To be clearer, I'm saying that the output of the backend command should be treated as UTF-8, while expecting it to be almost always plain ASCII. decode() will convert that into the local locale but if a character cannot be represented in it then it will be replaced with '?'. This is good enough for the purposes of wsdiff.

  3. 
      
citrus
rm
  1. Ship It!
  2. 
      
olafbohlen
  1. Ship It!
  2. 
      
jclulow
  1. 
      
  2. usr/src/tools/scripts/wsdiff.py (Diff revision 2)
     
     

    If I am reading the documentation correctly, subprocess.check_output() would raise a CalledProcessError exception in the event that the child process exited with a non-zero code. This new function, getoutput(), appears to return the value (which is fine) but we then don't seem to check it here. Does that mean we won't hit the except: block below where a diff failure would otherwise be reported?

    1. This is one of the problems with having to review a fix for a botched previous commit.

      Diff is expected to return non-zero status codes (1 if differences are found for example).

      The original code (i.e. before 9979) was:

      data = commands.getoutput(diff_cmd + " " + tmpf1 + " " + tmpf2)
      

      which is documented as:

      ...the exit status is ignored and the return value is a string containing the command’s output.

      The except cause will still be triggered if any errors are thrown from the Popen stuff in getoutput()

  3. usr/src/tools/scripts/wsdiff.py (Diff revision 2)
     
     

    So I think this should probably be a binary string, i.e., elfmagic = b'\177ELF', and then we should intead open the file in the binary mode:

    with io.open(f, mode='rb') as fd:
    

    Otherwise it seems like we could get confused by possible locale and encoding issues by trying to treat the file as text instead of binary.

    1. Agreed, and fixed (and tested with python2 & python3).

  4. usr/src/tools/scripts/wsdiff.py (Diff revision 2)
     
     

    Should this call to open() actually be io.open() as on L458? I'm a bit confused about the difference between them.

    1. In python3, io.open() is an alias for open()
      However, in python2, open() does not handle encoded files. That is, if the file contains any non-

    2. Trying again.

      In python3, io.open() is an alias for open()
      However, in python2, open() does not handle encoding. That is, unless the data is pure ASCII, io.open() should be used. However, when opening a file in binary mode ('rb') it is fine to use open().
      For consistency with isELF(), I'll change this to use io.open().

  5. usr/src/tools/scripts/wsdiff.py (Diff revision 2)
     
     

    Same question as above -- subprocess.check_output() used to throw on non-zero exit, but we now seem to be silently ignoring the rc value here.

    1. As above, the original code used commands.getoutput() - The length of the returned data is checked below.

  6. usr/src/tools/scripts/wsdiff.py (Diff revision 2)
     
     

    Same question as above -- subprocess.check_output() used to throw on non-zero exit, but we now seem to be silently ignoring the rc value here.

    1. Again, this was originally commands.getoutput().
      Whilst I could add a check for the return status, if anything goes wrong with retrieving the output of uname -p then none of the required utilities will be found and wsdiff will abort early. (uname -p output is used to build the /opt/onbld/bin/i386/ path)

  7. 
      
citrus
citrus
jclulow
  1. Thanks for the explaining the exit status handling stuff! That all makes sense to me.

  2. 
      
citrus
Review request changed

Status: Closed (submitted)

Loading...