7392 remove event channel support from lofi and implement lofi_devlink_cache.

Review Request #218 — Created Sept. 17, 2016 and submitted

tsoome
illumos-gate
7392
general

lofi: remove event channel support from lofi and implement lofi_devlink_cache.

I did run following sequence myself: modunload -i <lofi>; zoneadm -z lz halt; modunload -i <lofi>; zoneadm -z lz boot
did not got any issues.

richlowe: I have booted/halted/modunloaded a zone 200 times, and not crashed yet.

tsoome
tsoome
tsoome
rm
  1. From an initial glance this looks like a reasonable way to manage things.

  2. 
      
jjelinek
  1. Although the general approach seems fine to me, it is unclear to my why you are putting so much of this knowledge into modctl, when modctl doesn't seem to be relevant. All of the loadable modules are now being polluted by the new external definition for the mod_devlink_cache, but that is not relevant to any loadable module except lofi. Am I missing something here?

    It seems like you could modify this implementation to be much more constrained between the relevant components.

    Since log_sysevent is already lofi-aware via the notify_lofi function, is there some reason you don't keep the relationship constrained to the the lofi module and the log_sysevent code? It seems like it would be much less invasive to have all of the devlink caching implementation inside the lofi module where it is being used. You could have an external hook in log_sysevent which gets initialized when the lofi module is loaded and lets the notify_lofi function know that it needs to start adding the events to the lofi nvlist.

    I'd like to understand why the current implementation was chosen before this gets approved.

    1. Well, that was the approach I took in first place - but the problem is, the event facility itself is zone aware and when lofi module load is triggered by zone boot (or halt), the lofi binding to even channel will be flagged with that local zone - which did create the race condition in first place when zone was halting. So, it felt reasonable to avoid having to use event mechanism from lofi, and move the event processing off of it.

      The modctl felt the most appropriate place for three reasons - first, its dealing with events coming from userland, so we are right at the source.

      Second reason is, the sysevents from userland do not get propagated to kernel event channels, in previous implementation I did forward them specifically for this purpose, to make them visible for lofi; so as we need to extract the data for caching, it does make sense to do it in modctl (since log_sysevent.c has code to do the copyin() I did not want to duplicate that code and kept the notify_lofi() call for that purpose).

      And third, since lofi does actually need just the controller ID (to compose cXtYdZp0 name), I still have hope that at some day someone will implement better way to enumerate and keep track of controllers and not depend on devfsadm doing it in user space, and I strongly suspect such interface might be hosted by modctl, also if my dream of such interface will become true, it also means the controller related events must reach devfsadm and that again will happen via modctl. So modctl is kind of nexus or crossroard there...

      Yea, I actually do not like at all being dependant on devfsadm in such way, but there is none better (by my knowledge), and as already mentioned, I do consider it temporary till something better is available - when it happens, there will be a bit of cleanup to do. Which is also the reason I did try to keep the current solution as private as possible.

    2. Toomas, I understand the original problem but I do not understand your solution. Perhaps the evaluation is incomplete or there is some sublety here which I am not understanding but I do not understand why modctl has to be the place that the global state is defined and why all loadable modules have to be exposed to that now. Is there some reason the global state cannot be defined in log_sysevent and why the lofi module cannot cooperate with log_sysevent as necessary?

    3. Ah right, in that sense, sure, we could just keep the cache in log_sysevent, I did pick modctl because its initialized early enough and therefore the nvlist allocation and lock/cv setup is safely in place; mod_setup() happens a bit before log_event_init() (called by setup_ddi()), and if we have cache in log_sysevent, we should set the cache up in log_event_init(). In that sense, we can have cache moved out of modctl. The only question would be, is it possible to have scenario that somehow we could end up having lofi loaded up before log_sysevent is set up... probably no? I am just learning those interfaces on the go really, so I am quite open for better ideas:)

    4. Toomas, so yes, I would agree with your description where you constrain things to only the log_sysevent code and the lofi code. Looking at the flow for log_event_init, that happens in setup_ddi so you should not have to worry about a race with lofi getting initialized. I don't think this changes your approach too much, but it seems cleaner than exposing the new cache to all of the loadable modules.

  2. 
      
tsoome
jjelinek
  1. Toomas, thanks, this looks like you are on the right path. I think you need to cleanup the names now since using the "mod*" naming doesn't make sense with this tighter implementation. I am also confused about the code in notify_lofi. It looks like you are freeing the event nvlist when this is not a lofi-specific event, but the caller (log_usr_sysevent) still wants to use the event after notify_lofi has been called. Have you tested this with a sysevent consumer so you can see that all of the events are being delivered upstream even though some events are also being used for lofi?

    1. I read the nvlist from event with sysevent_get_attr_list() and its manual states:

      "The interface manages the allocation of the attribute list, but it is up to the caller to free the list when it is no longer needed with a call to nvlist_free()."

      So, I get my private copy from sysevent_get_attr_list(), and making sure it's not leaking, both when driver is not set "lofi" and after nvlist is inserted to cache.

      I was not really sure how to approach the naming, hence I left the old names in for this iteration. Hm, perhaps I should just use "lofi" as prefix there instead of "mod", for header, type/struct and variable.

  2. 
      
jjelinek
  1. Toomas, sorry, yes, I see that sysevent_get_attr_list will duplicate the nvlist, so ignore that part of my comment. I think your suggestion of using a "lofi*" style name is good.

  2. 
      
tsoome
jjelinek
  1. Toomas, thanks for fixing the naming. I think that will be much cleaner going forward. I only have a couple of small nits on your latest set of changes.
    usr/src/uts/common/io/lofi.c
    2989 fix the comment to match the new naming

    usr/src/uts/common/os/modctl.c
    remove this file from your changeset

  2. 
      
tsoome
jjelinek
  1. Hi Toomas. This looks good. There was one thing I noticed while doing a final review. I'm sorry I didn't see this earlier. Other than this nit, the rest looks good.

    You can remove the redundant code in log_sysevent.c at lines 1799-1810 by rewriting it like this. You also don't need the redundant strcmp at line 1804 since you already checked that at lines 1779-1780.

    mutex_enter(&lofi_devlink_cache.ln_lock);
    if (strcmp(class, EC_DEV_ADD) == 0) {
    fnvlist_add_nvlist(lofi_devlink_cache.ln_data, name, nvlist);
    } else {
    / Can not use fnvlist_remove() as we can get ENOENT. /
    (void) nvlist_remove_all(lofi_devlink_cache.ln_data, name);
    }
    cv_broadcast(&lofi_devlink_cache.ln_cv);
    mutex_exit(&lofi_devlink_cache.ln_lock);

    1. yea, I should have noticed this duplication myself as well, thanks:)

  2. 
      
tsoome
jjelinek
  1. Ship It!
  2. 
      
tsoome
rm
  1. Ship It!
  2. 
      
tsoome
Review request changed

Status: Closed (submitted)

Loading...