Add demangle(1) command

Review Request #2518 — Created Feb. 15, 2020 and submitted

jbk
illumos-gate
12310
general

Add demangle(1) command

Run with various inputs both cmdline and via stdin and examined the output.

  • 3
  • 0
  • 20
  • 1
  • 24
Description From Last Updated
So if we fail for some other reason, should we still output what the original symbol was for consistency in ... rm rm
Should we warn at all to stderr about failures to demangle a name? rm rm
I don't think launching into a discussion of what a mangled symbol is right at the start is necessarily the ... rm rm
citrus
  1. Ship It!
  2. 
      
jbk
rm
  1. 
      
  2. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    We often distinguish between a fatal error and an error for usage. Presumably we should here too?
    1. Sure -- it seems more prevalent for things defined by standards, but easy enough to do. There's no real defined value to use ala EXIT_FAILURE.. though I guess 2 is often used. Any preference on that?

    2. Conventionally in illumos 2 is used for that. It probably is worth us at some point defining something.

  3. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    Rather than cast zero to a pointer type, why not just use NULL, and then you shouldn't need the pointer either.
    
    Also, this returns a pointer type, see head/xlocale.h, so don't do a comparison with zero.
    1. newlocale(3C) explicitly uses (locale_t)0 -- would it be worth filing a follow up ticket to correct the manpage? Regardless I'll use NULL -- that was my initial inclination until I saw the man page and figured there must be something to the explicitness.

    2. I suspect the reason for this is that the fact that a locale_t is a pointer is not actually defined by the standard (which is where this comes up from). I'm not sure what we should do with respect to the manual page and portable programs. It's just that using a non-pointer type when we consider an LP64 transition definitely leads to less defined behavior about extension.

  4. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    Please give the user a more actionable error message. If I saw an error message that was something like:
    
    'demangle: newlocale: no such file or directory' I would have no idea what to do. At least something like, 'failed to construct the C locale: <error string>' would be a bit more useful and would speed up the debugging process/confusion.
  5. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    Should we use warnx here for consistency of having the program name at the start?
    1. Actually errx(EXIT_FAILURE, ...) is probably better -- it'll do the demangle: msg bit (but not append strerror(errno)) and exit for us (since we're exiting immediately afterwards).

  6. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    Presumably i should match the type of argc, which is an int?
    1. I'm not sure either the kernel or crt0 permits a negative argc value (an interesting experiment perhaps for another time), but no harm in changing it (I just tend to use size_t for loops out of habit).

  7. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
     
    Should we warn at all to stderr about failures to demangle a name?
    1. c++filt and the like do not -- they just print the original symbol. What I have done is for any failure that isn't 'not a mangled symbol' (EINVAL) -- most likely ENOMEM, but I think there could be others -- is print out those errors.

    2. If everyone else is used to software not printing those out then I'm not sure if users will expect it or not. We should make sure that the fact that errors are now printed out is clear in the manual page so someone knows that they may need to redirect stderr.

  8. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    It's probably worth a comment or explanation of how this heuristic was determined. Without context, it seems arbitrary and it's not clear why certain characters are there and not others.
  9. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
    Please give the user a better error message here and all the other uses of err in this function. I'm only commenting on it here.
  10. usr/src/cmd/demangle/demangle.c (Diff revision 2)
     
     
     
     
     
     
    Shouldn't we be just checking that fprintf returned -1? Also, since you aren't flushing here and are relying on something else to the -1, I'm not sure why you'd check ferror, given that most of the time it probably will be buffered and if not, then the fprintf should propagate the error, no?
  11. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    Shouldn't symbol be in a .Ar macro?
  12. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    I don't think launching into a discussion of what a mangled symbol is right at the start is necessarily the best way to get going. I probably would start with a summary of what the program actually does, like you have at line 46-49. I think that's a bit more user friendly, otherwise what this does is buried behind an explanation of symbol mangling. I'm assuming you did it this way because that's how the GNU c++filt manual is written.
    1. Yeah -- I wasn't sure where to start, so I used that a guide. Instead I'll move lines 46-49 up first and we can see how that sounds.

  13. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
     
    You should refer to the actual operand in this description to refer back to what's in the synopsis.
  14. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    Most manuals spell out stdin to be 'standard in' rather than use the the C symbol name.
  15. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    Same thing with stdout and 'standard output'.
  16. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    Conventionally this uses .Sh, not .Ss.
  17. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
     
    I don't think it quite actually causes us to only demangle symbols from the language. Instead, it always tries to interpret everything as being from the language. If we had names that are valid in both, then we would interpret it in the specified one, which doesn't quite match what that sentence is saying to me.
  18. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    The xxxx for this looks a little weird at first glance as it makes it seem like it was a fill me in and not trying to be a four character value.
  19. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
     
     
     
    I would give a brief explanation of these. For example, for rust I would describe which of its mangling formats are supported and I would explain that auto attempts to figure out what i tis automatically.
  20. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    We normally don't put examples in these kinds of lists. Take a look at sleep(1) for an example.
  21. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    With each example you should have a textual description of what it is demonstrating.
  22. usr/src/man/man1/demangle.1 (Diff revision 2)
     
     
    Should readers be referred to any other utilities in a SEE ALSO or pages?
    
    We should probably explicitly call out the stability of this, in particular that the options are probably not stable and the output format isn't committed.
    1. I'm not sure what things would be appropriate -- the Itanium ABI document might be relevant. There is no documentation (other than source code) for the legacy rust format, so there's not really anything I can think of to point to for that. I'm not sure what other utilities in illumos would be relevant in a SEE ALSO section -- though if you have thoughts on that, I'm happy to add them.

    2. If you don't believe there's anything in particular, that's fine.

  23. 
      
jbk
rm
  1. Sorry for the delay in getting back to this. I saw that there were a number of unresolved things related to non-questions which is why I hadn't looked at the new version. It wasn't clear to me if you had wanted additional feedback or not. I've left follow up on the original feedback and a few more additional things I've noticed below.
  2. usr/src/cmd/demangle/demangle.c (Diff revisions 1 - 3)
     
     
     
     
     
    So if we fail for some other reason, should we still output what the original symbol was for consistency in addition to the warning or is that not necessary?
    1. Hmm.. that should be warn() not warnx() to get strerror(errno) appended (I'll fix that). We are printing the original symbol (argv[i]) in both cases, are you asking "do we need to print the original symbol for a non 'not a mangled symbol' error?" (pardon the double-ish negative)

      We don't have to, but it could be useful if there's some bug in the demangling code. The only other error should really be ENOMEM w/ the current code, but it doesn't mean there can't be a bug that proves that false.

    2. So the reason I raised this is for consistency with the other path, where you always print out things. I guess it may not make as much sense in this case, but it seems like we'd want the behavior to be that there's always one line of output for each word and if there was an error or not.
      
      Looking at this again, this also suggests that the way that we're handling the error status is wrong. Right now if we hit ENOMEM and basically have an internal error that means we can't demangle a symbol, the program will drive on (which is fine), but it will exit zero, saying that everything was fine, when in fact the opposite is true! I think it's worth spending some time overhauling how the exit status is propagated out so someone can actually know something went wrong without having to necessarily see if anything showed up on stderr.
  3. usr/src/man/man1/demangle.1 (Diff revisions 1 - 3)
     
     
    I think you probably want to phrase this more concretely as:
    
    .Sh INTERFACE STABILITY
    The command line options are
    .Sy Uncommitted .
    The output format is
    .Sy Not-an-Interface .
    
    We should use the standard names here from attributes(5).
  4. usr/src/cmd/demangle/demangle.c (Diff revision 3)
     
     
     
     
     
     
     
    Given that you've changed how the symbol version works and warns, should the custr version match?
    1. Yeah -- I'll make them match.

  5. 
      
jbk
jbk
rm
  1. Thanks for working through this stuff. Seems reasonable.

  2. usr/src/man/man1/demangle.1 (Diff revision 5)
     
     
    Is it worth mentioning here that symbol boundaries are only determined by the C locale or is that too much?
    1. I'm not sure -- it feels more like an implementation detail that just happens to make life simpler (based on the choices the languages made).

  3. 
      
citrus
  1. Ship It!
  2. 
      
jbk
Review request changed

Status: Closed (submitted)

Loading...