12057 Writing part of the string to stderr makes zlogin exit

Review Request #2473 — Created Dec. 11, 2019 and submitted

citrus
illumos-gate
master
12057
f3259a8...
general
12057 Writing part of the string to stderr makes zlogin exit

bloody:illumos:master% make

bloody:illumos:master% mdb -e '__xpg4/x' zlogin
zlogin`__xpg4:
zlogin`__xpg4:  1

bloody:illumos:master% pfexec ./zlogin test
[Connected to zone 'test' pts/3]
Last login: Wed Dec 11 15:16:25 2019 on pts/3
OmniOS 5.11     omnios-master-f3e2549426        December 2019
root@test:~# strconf
ttcompat
ldterm
ptem
pts
root@test:~#
root@test:~# ./zcons
.root@test:~#
root@test:~# logout

[Connection to zone 'test' pts/3 closed]

bloody:illumos:master% PRIMARY_CC=gcc4,/opt/gcc-4.4.4/bin/gcc,gnu make

bloody:illumos:master% mdb -e '__xpg4/x' zlogin
mdb: failed to read data from target: no mapping for address
0x8168800:

bloody:illumos:master% pfexec ./zlogin test
[Connected to zone 'test' pts/3]
Last login: Wed Dec 11 15:18:55 2019 on pts/3
OmniOS 5.11     omnios-master-f3e2549426        December 2019
root@test:~# strconf
ttcompat
ldterm
ptem
pts
root@test:~# ./zcons
.root@test:~#
root@test:~# logout

[Connection to zone 'test' pts/3 closed]
  • 0
  • 0
  • 1
  • 1
  • 2
Description From Last Updated
jlevon
  1. I'd prefer the comment about the module push to clarify that it's not really required to avoid pushing. Otherwise this looks fine.
    I can't see a better way to test this behaviour than using __xpgv4

    1. I've dropped the changes to module push. The kernel will take care of ignoring them in XPG4.

  2. 
      
tsoome
  1. 
      
  2. usr/src/cmd/zlogin/zlogin.c (Diff revision 1)
     
     

    you could actually use I_LOOK to check if we have module in stream.

    1. If I do that, I might as well drop this part of the change since the kernel checks for duplicates anyway.
      That's fine with me too.

  3. 
      
citrus
jlevon
  1. Thanks for fixing this.

  2. 
      
domag02
  1. It would be better to "anchor" to one standard, and code according to that, instead of checking a private variable.

  2. usr/src/cmd/zlogin/zlogin.c (Diff revision 2)
     
     

    This is unnecessary (because it is not a shared library).
    Instead, set the code standard (see STANDARDS(5)) + link with values-xpg4.c (or values-xpg6.c).
    More or less a good example:
    /usr/src/cmd/env/Makefile

    1. The problem here is that the approved/official illumos compilers automatically link with values-xpg6.o in some circumstances. For example, gcc-7.4.0-il-1 does this if -std=gnu99 is passed as an argument. I don't want to have to assume too much about the environment, for example that using gnu89 or gnu99 has particular side-effects and not all distributions are using gcc7.

      Linking explicitly with values-xpg4/6 and using gcc 7.4.0-il-1 and -std=gnu99 causes a duplicate symbol error, as you'd expect.

      ld: fatal: symbol '__xpg4' is multiply-defined:
              (file /usr/lib/values-xpg6.o type=OBJT; file values-xpg.o type=OBJT);
      ld: fatal: file processing errors. No output written to zlogin
      

      It would be better to "anchor" to one standard, and code according to that, instead of checking a private variable.

      FWIW, __xpg4 is documented as public in libc(3lib). I'm not saying that this is a great solution but it seems to be the best approach.

    2. and building with gcc8:

          env: symbol '__xpg4' is multiply-defined
      
          ld: fatal: symbol '__xpg4' is multiply-defined:
                  (file /usr/lib/values-xpg4.o type=OBJT; file values-xpg4.o type=OBJT);
          ld: fatal: file processing errors. No output written to env.xpg4
          collect2: error: ld returned 1 exit status
      
    3. I think this duplicate symbol error can be solved with a modified values-xpg[46].c file.
      For example, like this:
      values-gcc4-xpg6.c:

      .
      .
      .
      
      /*
       * This setting enables strictly conforming C99/SUSv3 behavior.
       */
      
      #include "xpg6.h"
      
      /*
       * GCC version 4.y.z-iln never links with values-xpg6.o automatically,
       * but newer versions do in some circumstances.
       * Fix this behavior to avoid "symbol multiply-defined" errors when linking
       * with GCC-s newer than 4.x.y.
       */
      #if defined(__GNUC__) && __GNUC__ > 4
      
      /* Dummy typedef to avoid "empty translation unit" warnings. */
      typedef int make_iso_compilers_happy;
      
      #else
      
      unsigned int __xpg6 = _C99SUSv3_mode_ON;
      
      /*
       * Also turn on XPG4 mode.
       */
      int __xpg4 = 1;
      
      #endif  /* defined(__GNUC__) && __GNUC__ > 4 */
      

      Of course, this can be reused with other programs, but only in this source code form.
      You can use this as you wish.

    4. The problem is that gcc > 4 does not always link in xpg4-values.o.

      There was quite a lot of discussion about this on IRC
      at https://echelog.com/logs/browse/illumos/1576018800 including:

      [16:48:28] <jlevon> gcc7 -std=c99 defines _XOPEN_VERSION 3 only via feature_tests.h
      [16:49:40] <jlevon> andyf: https://gist.github.com/jlevon/035917677395ff4158d4d8280191db45
      [16:49:50] <jlevon> the first part defines how the values files are used.
      [16:50:04] <jlevon> I guess from this, that __xpg4 is the only way it's going to work.
      

      It is a shame that we don't have a compile-time way to detect if the compiler is going to pull in xpg[46]-values.o - for now, I think a run-time check is a suitable solution.

    5. It is a shame that we don't have a compile-time way to detect if the compiler is...

      __STDC_VERSION__ is a good candidate:

      GCC __STDC__ __STDC_VERSION__
      4.4.4 -m32 -std=c89 1 Undefined
      4.4.4 -m32 -std=c99 1 199901L
      4.4.4 -m32 -std=gnu89 Undefined Undefined
      4.4.4 -m32 -std=gnu99 Undefined 199901L
      4.4.4 -m64 -std=c89 1 Undefined
      4.4.4 -m64 -std=c99 1 199901L
      4.4.4 -m64 -std=gnu89 Undefined Undefined
      4.4.4 -m64 -std=gnu99 Undefined 199901L
      7.3.0 -m32 -std=c89 1 Undefined
      7.3.0 -m32 -std=c99 1 199901L
      7.3.0 -m32 -std=gnu89 Undefined Undefined
      7.3.0 -m32 -std=gnu99 Undefined 199901L
      7.3.0 -m64 -std=c89 1 Undefined
      7.3.0 -m64 -std=c99 1 199901L
      7.3.0 -m64 -std=gnu89 Undefined Undefined
      7.3.0 -m64 -std=gnu99 Undefined 199901L

      Source of my dump script:
      gcc_defines_dump.sh:

      #!/bin/bash
      
      gcc_basedirs_arr=("/opt/gcc/" "/usr/gcc/")
      std_arr=("c89" "c90" "c99" "c11" "c17" "gnu89" "gnu90" "gnu99" "gnu11" "gnu17")
      targ_arr=("32" "64")
      
      for gcc_basedir in "${gcc_basedirs_arr[@]}"; do
          for gcc_dir in "$gcc_basedir"*; do
              if [[ -d "$gcc_dir" && ! -L "$gcc_dir" && -f ${gcc_dir}/bin/gcc ]]; then
                  gcc_ver=$(basename "$gcc_dir")
                  for std in "${std_arr[@]}"; do
                      for targ in "${targ_arr[@]}"; do
                          dumpfile="dump_${gcc_ver}_m${targ}_std_${std}.h"
                          echo Dumpfile is: "$dumpfile"
                          touch $dumpfile
                          ${gcc_dir}/bin/gcc -dM -E -m${targ} -std="$std" - < /dev/null > "$dumpfile"
                          if [ $? -ne 0 ]; then
                              echo Failure, dumpfile "$dumpfile" will be deleted.
                              rm -f "$dumpfile"
                          fi
                      done
                  done
              fi
          done
      done
      
    6. That list is not useful unless it also covers whether xpg4 actually gets linked in, AND it covers gcc's 4, 7, 8, and 9, across at least SmartOS, OmniOS, and OpenIndiana.
      Andy's fix is clearly the simplest approach here.

    7. Yeah, unfortunately STDC_VERSION is a weak indicator of whether xpg values were linked in and doesn't work in all cases. We are going to have to check this at runtime for now.

  3. 
      
tsoome
  1. Ship It!
  2. 
      
alp
  1. Ship It!
  2. 
      
alp
  1. It's still surprising for me that firstly we break behavior of read(), and then should fix it in dozen places. And what about applications? How many patches should we apply to make, for example, tmux reilably work when it's compiled in c99 mode.

    1. XPG4v2 says:

      How read() handles zero-byte STREAMS messages is determined by the current read mode setting.
      In message-nondiscard mode or messagediscard mode, a zero-byte message returns 0 and the message is removed from the STREAM.

      Sun will have changed the STREAMS implementation to match the standard and I imagine they preserved existing behaviour for binary compatibility.
      I think it has always been wrong for anything to assume zero coming back from a read() on a STREAMS device means EOF.

  2. 
      
citrus
Review request changed

Status: Closed (submitted)

Loading...