Support DHCP Client FQDN. Allow IAID/DUID for all v4.

Review Request #268 — Created Nov. 12, 2016 and submitted

seeemef@mac.com
illumos-gate
7388, 8517, 8518, 8519
general
- add options -h,--reqhost and -1,--primary flag to ipadm create-addr for
  DHCP, and add reqhost and primary properties for
  (show|set|reset)-addrprop.
- add dhcpagent options REQUEST_FQDN, DNS_DOMAINNAME, and ADOPT_DOMAINNAME.
- add ncu ip-reqhost and ip-primary properties for NWAM DHCP.
- send for Client Fully Qualified Domain Name (DHCP client option 81) the
  nodename if the interface is primary or the value of -h,--reqhost or
  ip-reqhost in order to request that a compatible DHCP server (such as
  ISC-DHCP or Microsoft Windows DHCP with Windows DNS Dynamic Update) add A
  and PTR resource records for the life of a lease.
- leave existing /etc/hostname.* handling for non-ipadm use to send DHCP
  client option 12 (Hostname), but also recognize -1,--primary (nodename);
  -h,--reqhost; or ip-reqhost if defined. (Hostname is sent if an FQDN
  cannot be determined from configuration).
- add dhcpagent option to use RFC 3315-style client ID (IAID/DUID) for all
  DHCPv4.
- update to allow to GET_TAG of ClientID from dhcpagent state machine to
  let dhcpinfo print a readable version of the system-managed IAID/DUID.
- document defunct client option 89.
- update man pages.
IPADM tests
===========

Initial:

- /etc/nodename is "nod"
- no defined DNS_DOMAINNAME in /etc/default/dhcpagent
- ADOPT_DOMAINNAME is unset (default is no) in /etc/default/dhcpagent
- 2 interfaces: e1000g0, e1000g1

1. create-addr: -T dhcp e1000g1/v4
    - sends DHCP request without Hostname or ClientFQDN
    - show-addr: ok with assigned IP address
    - netstat -D shows no flags for e1000g1
    - no DNS entry

2. create-addr: -T dhcp -1 e1000g0/v4 (use the -1,--primary flag)
    - sends DHCP request with Hostname ("nod")
    - show-addr: ok with assigned IP address
    - netstat -D shows [PRIMARY] for e1000g0

3. edit /etc/default/dhcpagent DNS_DOMAINNAME to foo.example.com, and reboot
    - sends DHCP request with ClientFQDN ("nod" nodename + foo.example.com DNS_DOMAINNAME)
    - DNS registers the new FQDN (in a "foo" child)

    3.b. edit /etc/default/dhcpagent DNS_DOMAINNAME back to example.com, and
        ipadm-refresh-addr e1000g0/v4
    - sends DHCP extension request with new ClientFQDN (nodename + DNS_DOMAINNAME)
    - DNS registers the new FQDN and un-registers the former value

4. set-addrprop -t -p reqhost=nod-m e1000g0/v4
    - sends DHCP extension request message with new ClientFQDN (reqhost + DNS_DOMAINNAME)
    - show-addrprop -o ALL -p reqhost #-- shows correct CURRENT and no PERSISTENT
    - DNS registers the requested FQDN and un-registers the former value

    4.b. (again but without -t flag) set-addrprop -p reqhost=nod-m2 e1000g0/v4
    - sends DHCP extension request message with new ClientFQDN (reqhost + DNS_DOMAINNAME)
    - show-addrprop -o ALL -p reqhost #-- shows correct CURRENT and PERSISTENT
    - DNS registers the requested FQDN and un-registers the former value

5. set-addrprop -p reqhost=nod-m.foo.example.com e1000g0/v4
    - sends DHCP extension request message with new ClientFQDN (reqhost)
    - DNS registers the requested FQDN (in a "foo" child) and un-registers the former value

6. reset-addrprop -p reqhost e1000g0/v4
    - sends DHCP extension request with new ClientFQDN (nodename + DNS_DOMAINNAME)
    - show-addrprop: -o ALL -p reqhost -- shows empty CURRENT and PERSISTENT
    - DNS registers the nodename FQDN and un-registers the former value

7. (non-primary) set-addrprop -p reqhost=nod-r e1000g1/v4
    - sends DHCP extension request message with new ClientFQDN (reqhost + DNS_DOMAINNAME)
    - show-addrprop -o ALL -p reqhost #-- shows correct CURRENT and PERSISTENT
    - DNS registers the requested FQDN

    7.b set-addrprop -p reqhost=nod-r2 e1000g1/v4
    - sends DHCP extension request message with new ClientFQDN (reqhost + DNS_DOMAINNAME)
    - show-addrprop -o ALL -p reqhost #-- shows correct CURRENT and PERSISTENT
    - DNS registers the requested FQDN and un-registers the former value

8. reset-addrprop -p reqhost e1000g1/v4
    - send DHCP extension request without ClientFQDN or Hostname
    - show-addrprop -o ALL -p reqhost #-- shows empty CURRENT and PERSISTENT
    - DNS un-registers the former value

9. edit /etc/default/dhcpagent set ADOPT_DOMAINNAME, unset DNS_DOMAINNAME, and
        refresh-addr e1000g0/v4
    - sends DHCP request with ClientFQDN ("nod" nodename + example.com DNSdmain)
    - DNS registers the requested FQDN and un-registers the former value

10. reset DNSdmain on DHCP server to return foo.example.com, and refresh-addr e1000g0/v4
    - sends DHCP request with ClientFQDN ("nod" nodename + example.com previous DNSdmain)

    10.b (again) refresh-addr e1000g0/v4
    - sends DHCP request with ClientFQDN ("nod" nodename + foo.example.com latest DNSdmain)
    - DNS registers the requested FQDN and un-registers the former value

11. unset DNSdmain on DHCP server, and refresh-addr e1000g0/v4
    - sends DHCP request with ClientFQDN ("nod" nodename + foo.example.com previous DNSdmain)

    11.b refresh-addr e1000g0/v4 again
    - sends DHCP request with ClientFQDN ("nod" nodename + example.com from resolv.conf)

12. reboot
    - sends DHCP request with ClientFQDN ("nod" nodename + example.com from resolv.conf)

13. install new host, activate V4_DEFAULT_IAID_DUID, and create primary dhcp and addrconf addresses
    - sends DHCP Discover message with FQDN and IAID/DUID ClientID
    - IDs show as expected, with common DUID for v6 and v4
    - Windows DNS does NOT register the AAAA. ISC-DHCP 4.3 is documented, however, to
      support this (https://www.reddit.com/r/sysadmin/comments/3ky7yc/howto_with_iscdhcp_43_a_client_configured_with/).

14. create two addresses with active 'primary' flag (ipadm -T dhcp -1 ...)
    - show-addrprop -o ALL -p primary #-- shows both addresses with PERSISTENT "on" but
      only one with CURRENT "on" because dhcpagent enforces that only one can be actively
      primary

NWAM tests
==========

Initial:

- /etc/nodename is "nod"
- no defined DNS_DOMAINNAME in /etc/default/dhcpagent
- 2 interfaces: e1000g0, e1000g1; and 2 DHCP addresses managed already by NWAM

1. set ip-primary=on for e1000g0
    - NWAM bounces the address
    - sends DHCP request with Hostname ("nod" PQDN)
    - DNS server does not register the name
    - netstat -D shows [PRIMARY] for e1000g0

2. edit DNS_DOMAINNAME to example.com, and set ip-primary again to on for e100g0
    - NWAM bounces the address
    - sends DHCP request with ClientFQDN ("nod" hostname + DNS domain name)
    - DNS server registers the FQDN

3. set ip-reqhost for e1000g0 to nod-m
    - NWAM bounces the address
    - sends DHCP request with ClientFQDN ("nod-m" reqhost + DNS domain name)
    - DNS server registers the FQDN and un-registers the former value

4. clear ip-reqhost for e1000g0
    - NWAM bounces the address
    - sends DHCP request with ClientFQDN ("nod" hostname + DNS domain name)
    - DNS server registers the FQDN and un-registers the former value

5. set ip-reqhost for e1000g0 to nod-m.foo.example.com
    - NWAM bounces the address
    - sends DHCP request with ClientFQDN (reqhost)
    - DNS server registers the FQDN (in a sub-folder, "foo") and un-registers the former value
  • 0
  • 0
  • 17
  • 1
  • 18
Description From Last Updated
seeemef@mac.com
seeemef@mac.com
danmcd
  1. First some overall impressions:

    1.) Please make sure what you have mentioned here in the RB ends up in the bug report for #7388.

    2.) The changes here are large enough where a bit of a design overview (e.g. "I now cache more in ipmgmt_aobjmap_t because <X>"). That overview should also go in the bug report.

    1. OK, I've updated the bug report.
  2. 
      
danmcd
  1. These are pass-0/1 comments. For my benefit (allowing further passes), and possibly others', I've put a webrev of this at http://kebe.com/~danmcd/webrevs/7388/ as well. I'll update the webrev every diff revision.

  2. Your new union-fields are prefixes ipmgmt_am. Could/should you do the same for the other fields? Or were you just worried about extra code change?

    1. The #define accessors for the new union-fields are prefixed "ipmgmt_(am|ir)", but the fields for the structs themselves are named "am_atype_cache" and "ir_atype_cache" in ipmgmt_aobjmap_t and ipmgmt_aobjop_rval_t, resp. I.e., the #define accessors get the additional prefix since they aren't protected by the struct namespaces and to name them similarly to their targets.
      
      As you indicate though, as a courtesy to reviewers, I would not want to refactor naming in the same patch with large functional changes.
    2. Thank you for the clarification.

  3. usr/src/common/net/dhcp/dhcp_impl.h (Diff revision 3)
     
     

    I couldn't find this on page one, but this is a sensible place to put this #define. Thank you.

  4. 
      
seeemef@mac.com
lorban
  1. You probably should also edit the IPADM(1M) man page to reflect the newly added flags.

    1. That's already done.
    2. You're right, I missed the man pages. Sorry for the moise!

  2. 
      
seeemef@mac.com
seeemef@mac.com
tsoome
  1. 
      
  2. tab issue? align with previous ones..

  3. tab/space mixup here. would be nice to clear neighbouring ones too.

  4. usr/src/lib/libipadm/common/libipadm.h (Diff revision 6)
     
     

    can you remove this red ugliness (space-tab mix)

  5. I think this mapfile change would be good to get its own issue and description what are the consumers for this symbol. You can still post it all as one patch I guess.

    1. I merely added a note in the issue for this case, 7388.
    2. Is this one officially okay now?

  6. usr/src/tools/scripts/onu.1onbld (Diff revision 6)
     
     

    It seems this man update (while nice), would be better to be posted in separate issue. This will also help to reduce the overall size for this update.

    1. Reverted the man change, as the underlying issue was fixed in a Dec 2016 commit for https://illumos.org/issues/6464.
  7. usr/src/uts/common/netinet/dhcp.h (Diff revision 6)
     
     

    missing tab before the value - is not aligned with others

  8. 
      
seeemef@mac.com
danmcd
  1. 
      
  2. Dumb question, since this is ANSI C, could you not replace these memcpy()s with a direct assignment? E.g.

    head->am_atype_cache = nodep->am_atype_cache;

    ?

    1. Yep you're right. I should have used the straight-forward syntax.

  3. Nice block comment.

  4. 
      
danmcd
  1. One last thing. I see new free() calls for non-changed-code allocators. Have you tested these with LD_PRELOAD=libumem.so and UMEM_DEBUG=default ? You should, and then "gcore" the running process after some time running to see if ::findleaks in mdb finds anything. Ask me if you need help with that.

    1. Can you clarify "non-changed-code allocators"? I think you're meaning the changes in the dhcp_smach_s struct, but I'm not sure. I only confirmed that dhcp_smach_t usage is always begun with a full bzero() and then managed using reset_smach and free_smach.
      
      I'm not familiar with those steps for testing, so I will need help.
    2. Pardon my latency on this. I see new free() calls without any new malloc(). I'm guessing these free()s are supposed to be there and you're doing the right thing.

      Set these environment variables: "LD_PRELOAD=libumem.so" and "UMEM_DEBUG=default" in your running daemons. Then, after some testing, do "gcore $DAEMON_PID". Run mdb on the corefile, and utter "::findleaks" in mdb to see what all is there. It's POSSIBLE there were pre-existing leaks (but there shouldn't be), but it's also possible there may be some new ones.

    3. The new free() calls are for strings allocated either by strdup or by inittab_decode (which mallocs).
      
      Thank you--I'll leak-test using that advice.
    4. It seems I must use LD_PRELOAD=libumem.so.1 for it to work.
      
      After going through this issue's test steps after having started a UMEM_DEBUG dhcpagent, I found no new leaks. I did, however, find a pre-existing leak in save_server_id, and I wrote up issue 8469.
  2. 
      
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
danmcd
  1. 
      
  2. usr/src/cmd/cmd-inet/lib/ipmgmtd/ipmgmt_door.c (Diff revisions 11 - 12)
     
     

    If a function is of type boolean_t, it is acceptable to say if (!boolean_function()) or if (boolean_function()). What is not acceptable is a function of an integer/numerical type, and people use if (int_function()) as a synonym for if (int_function() != 0) or if (!int_function()) as a synonym for if (int_function == 0).

    If anyone gave you crap about this, they're probably wrong. If you saw code like this, whoever wrote it likely didn't understand the problem that was trying to be solved.

    1. The == syntax is all over though. E.g., http://src.illumos.org/source/search?q=%22%3D%3D+B_FALSE%22+OR+%22%3D%3D+B_TRUE%22&defs=&refs=&path=&hist=&project=illumos-gate
    2. How much of that is mistaken/broken/bad-assumptions is a far more interesting question, however. The part of revision 12 where you do this was not necessary (unlike the cstyle fixes for your returns, which was). I'll let it go for now and mark this as "fixed", but it's bad, and someone at RTI may ding you on it anyway, just warning you.

    3. I'll quickly revert it. I have the pre-rebase commit easily from my reflog.

  3. 
      
seeemef@mac.com
danmcd
  1. Several of your functions appear to be "0 on success, -1 on failure, and no other non-zero failure returns". IF that's the case (esp. no other non-zero failure), you may wish to switch them to boolean_t for slightly better reading. If you are planning on having them return other non-zeros to express more diagnostics, please state that in comments, or by including some other non-zero return values a caller will check.

    Again - I would very much like, after this review exchange is done, to have a "git format-patch" so I can webrev it (to better review the man pages) AND to run it on a nontrivial SmartOS test harness.

    And finally, don't forget to update your copyrights to 2017.

    Thank you for your work on this. This is a big first-submission.

    1. I revised those signatures.
      
      I've updated the copyrights for code I've touched again this year, but I'm leaving the copyrights from my original submission in 2016, as I have not touched the bulk of the code and have been patiently awaiting review.
      
      Thanks, Dan!
  2. This is only used internally? If so, why ARGSUSED, why not just fix the function signature? Or lose the ARGSUSED if it's vestigal.

  3. Another case where perhaps this should be a boolean_t? Or did you mean to return the length of bytes read?

  4. You could make this a macro:

    define dhcp_get_nodename(buf, buflen) dhcp_get_oneline(ETCNODENAME, (buf), (buflen))

    1. Actually, I continued it as a function with a new function documentation block.
  5. 1.) None of its callers check its return value.

    2.) It looks like you should've made this boolean_t, no?

    1. For 1), actually, yes, callers were checking the return value.
  6. This is the worst possible way to do this. Which is strange given how your EPIC block-comment sets up everything SO NICELY, too. :)

    Since fqdnopt is an array, just access element 0:

    fqdnopt[0] = [expression].

    Speaking of [expression], you're overcomplicating the setting of the bits you want. You explained everything above, just set it to 5:

    fqdnopt[0] = 0x5; / See above, bits S and E. /

    If you want people to see more clearly, use S_BIT = 0x1 and E_BIT = 0x4 and then have:

    fqdnopt[0] = S_BIT | E_BIT;

    1. For array element [0], I always prefer simple *-dereferencing versus [0]--it saves two characters from code review!

      To explain the arcane syntax I used, please see the curious definition of the Client FQDN bitfield in the RFC 4702 excerpt in that block comment:

       //  0 1 2 3 4 5 6 7
       // +-+-+-+-+-+-+-+-+
       // |  MBZ  |N|E|O|S|
       // +-+-+-+-+-+-+-+-+
      

      The RFC oddly numbers the bits in ascending numerical order (0–7) but left-to-right from most significant to least significant bit. So rather than just having pre-computed hex values (0x1, 0x4) which might be suspicious given the bit numbering per the RFC, I defined the constants using the true bitwise operations of those documented offsets.

      For clarity, I have revised the constants to:

      const uint8_t   S_BIT_POS = 7;
      const uint8_t   E_BIT_POS = 5;
      const uint8_t   S_BIT = 1 << (7 - S_BIT_POS);
      const uint8_t   E_BIT = 1 << (7 - E_BIT_POS);
      

      and then set the assignment as per your suggestion:

      *fqdnopt = S_BIT | E_BIT;
      

      Please let me know if this suffices.

    2. Still a bit baroque, but good enough w.r.t. my prior readability problem. Be warned an RTI advocate may not be so kind.

  7. Same thing here... you probably want this to be a boolean_t.

  8. Same thing here... you probably want this to be a boolean_t.

    Unless you're trying to preserve -1-implies-errno-set semantics?

  9. No strlcpy() check here, with a return of -1 upon failure?

  10. Is -1/--primary required for the -h/--reqhost functionality? If not, please file a separate bug for -1/--primary. IMHO you don't have to seperate them as two commits, but you DO have to track the features separately.

    If so, never mind and ignore this comment.

    1. No, -1/--primary is not required, but it is convenient for allowing the designation of a primary interface so that /etc/nodename can be used as DHCP Hostname or Client P/FQDN for that interface. I.e., it allows using Hostname or Client FQDN without having to duplicate /etc/nodename as ipadm --reqhost or nwadm ip-reqhost.

      ifconfig allows designating an interface as such, but neither ipadm nor nwamd had this ability.

      I created issue 8517.

  11. usr/src/lib/libipadm/common/ipadm_addr.c (Diff revision 13)
     
     

    Same here... check strlcpy() and possibly return IPADM_FAILURE?

  12. 
      
seeemef@mac.com
seeemef@mac.com
seeemef@mac.com
danmcd
  1. Thank you for addressing my concerns. I hope you also address Toomas Soome's concerns. I'd like to take this wad and run in on one of our test platforms, to see how having multiple DHCP-fed zones copes with all of this. If it survives there, I'll come back here and check off Ship-it.

  2. 
      
seeemef@mac.com
danmcd
  1. Assuming I'm correct that dhcp_get_oneline is a new function (and according to src.illumos.org, it is):

    [root@00-0c-29-77-9d-fe ~]# nm /sbin/dhcpagent | grep dhcp_get_oneline
    [159] | 134625289| 260|FUNC |LOCL |0 |14 |dhcp_get_oneline
    [root@00-0c-29-77-9d-fe ~]# [root@00-0c-29-77-9d-fe ~]# vms=vmadm list -H | awk '{print $1}'
    [root@00-0c-29-77-9d-fe ~]# for i in $vms ; do zlogin $i ifconfig net0 ; done
    net0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
    inet 172.24.4.66 netmask ffffff00 broadcast 172.24.4.255
    ether 12:f8:3e:80:fa:cf
    net0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
    inet 172.24.4.65 netmask ffffff00 broadcast 172.24.4.255
    ether 52:c1:68:38:78:f9
    net0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
    inet 172.24.4.64 netmask ffffff00 broadcast 172.24.4.255
    ether 12:cb:ee:8d:5e:96
    net0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
    inet 172.24.4.67 netmask ffffff00 broadcast 172.24.4.255
    ether a2:31:af:33:cb:5
    net0: flags=40001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,L3PROTECT> mtu 1500 index 2
    inet 172.24.4.4 netmask ffffff00 broadcast 172.24.4.255
    ether 92:a8:95:a7:dc:94
    [root@00-0c-29-77-9d-fe ~]# pgrep dhcpagent
    4515
    4736
    4692
    4707
    [root@00-0c-29-77-9d-fe ~]# for i in $vms ; do zlogin $i ifconfig net0 dhcp status ; done
    Interface State Sent Recv Declined Flags
    net0 BOUND 1 1 0 [PRIMARY]
    (Began, Expires, Renew) = (07/24/2017 15:13, 07/24/2017 17:13, 07/24/2017 16:13)
    Interface State Sent Recv Declined Flags
    net0 BOUND 1 1 0 [PRIMARY]
    (Began, Expires, Renew) = (07/24/2017 15:13, 07/24/2017 17:13, 07/24/2017 16:13)
    Interface State Sent Recv Declined Flags
    net0 BOUND 1 1 0 [PRIMARY]
    (Began, Expires, Renew) = (07/24/2017 15:13, 07/24/2017 17:09, 07/24/2017 16:11)
    Interface State Sent Recv Declined Flags
    net0 BOUND 1 1 0 [PRIMARY]
    (Began, Expires, Renew) = (07/24/2017 15:13, 07/24/2017 17:13, 07/24/2017 16:15)
    ifconfig: net0: interface is not under DHCP control
    [root@00-0c-29-77-9d-fe ~]#

    Looks like SmartOS + your-bits managed to come up okay. Ship it!

    1. p.s. Mention this as a smoke-test/reality-check on SmartOS using four OS zones with DHCP for their addresses.

  2. 
      
seeemef@mac.com
danmcd
  1. Your pbchk changes are good. My ship-it stands.

  2. 
      
tsoome
  1. Ship It!
  2. 
      
seeemef@mac.com
Review request changed

Status: Closed (submitted)

Loading...