-
-
usr/src/lib/fm/topo/libtopo/common/topo_builtin.c (Diff revision 1) Is there a reason that this can't use topo_mod_enumerate()? One difference between the two right now is that there's a topo_node_hold() done on the node before we enumerate it and a rele afterwards. Do we need to do a hold/rele on tdg->tdg_rootnode around this in addition to the mod enter/exit? A comment here might be helpful if it should be different.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.h (Diff revision 1) cstyle should probably point out that the '{' should be joined to the prior line.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) ... API is can ... I think there may be an extra word in that bit of the sentence.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Was 'adjecently' maybe adjacency?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) So, in this bit, I think that we're defining a path as a series of nodes that one is traversing is that right? It might help to add a sentence in here that says, each entry between the '/' is the next node on the path. I guess the idea is that each node can be used to define the route, which is why you've called out cables here and that if it's important to call out a specific route between two entries, then all of the nodes are listed. Is that right? Maybe we can add a little bit more here about when it's good to use this or not. For example, would it make sense to use this to represent two logical services that can't talk to one another over a given TCP port, say? Maybe a better way to phrase it is is this OK to use if we don't have every step of the way between two entities?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Can you describe the locking details in a bit more detail here? I can see that we're locking the top-level digraph when inserting edges and verticies for doing various operations, but it's not clear to me what it is supposed to protect as the actual verticies and edges themselves don't do any locking. My memory of the topo API is that while a given module will only be invoked serially for a given bit of topo, it wasn't clear to me that internally you couldn't have two modules iterating and enumerating something both of which might try to add an edge or a vertex at the same time or remove a vertex that's being iterated by someone else.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Just to make sure I have the right understanding, a given scheme should only ever have a single digraph right? The same way that a scheme only ever has a single tree and root?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Do we need to take a topo_mod_hold() for this? Or would that screw with how the reference counting works for modules?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) I would duplicate the comment from +232 here.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) It may be worth considering an error checking mutex here. Otherwise the pthreads APIs won't error about unlocking an unlocked mutex or one you don't own. Though it may be better to do that in one larger pass over libtopo and to add the a pthread version of mutex_enter()/mutex_exit().
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Would it be possible to see how the consumer in the sas topo change would use all this? I would have expected that this function would go through and clean up any vertices and edges that had been associated with a digraph. Doesn't the digraph basically own the vertices and edges once it's been added? Or is the idea that this should always be separate. Maybe you could add a bit about the ownership and destruction path to the theory statement?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) I was a bit surprised to not see a topo_mod_rele() here and just a straight up topo_mod_free() of the root node. Do we not need to call topo_node_destroy() to clean up things here? If this is the way to do things because the root is special (though I'm not sure how things like the children hash list are expected to be cleaned up work), then a comment here would be useful.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) This error path doesn't set a topo error. It also doesn't log an error via topo_mod_dprintf(). Should we send this to the err path for that?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Should we check somewhere that we're not going to add a vertex that causes us to overflow nverticies?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) I'm not sur ethat topo_mod_free() correclty handles freeing a NULL pointer. Do we need to check that these are non-NULL?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) I've noticed that all the iteration functions here have a boolean_t for this being the last entry. That's not something I'm used to seeing in other iteration functions. Does this help consumers out in some way?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) cstyle: I don't believe case statements should be in parens.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) FALLTHRU shouldn't be needed if there's no code here.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Shouldn't this set a topo error?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Should we worry about these counts overflowing?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Should this name reflect that it only does outgoing edges? Is it possible we'll want to later introduce one that has incoming support?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Please remove the parens from these statements.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) This shouldn't be needed if there's no code between the case statements.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) cstyle: This should be joined to the previous line.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) It might be useful if this said semantically what it was trying to do. The fact that it's a helper function is fairly obvious. The set of paths it's trying to accumulate less so.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) It's not clear to me how this stops executin in the face of graph with a loop or if you go down paths that'll never reach to. Maybe the latter is fine because we'll just recursively walk everything, not find it and eventually hit an end. But in the face of loops, it seems like this'll maybe infinite loop?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Since curr_path is only used in a constant form can we make it const?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Don't we need to check whether or not this succeeded?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Isn't this missing a free of pathstr?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) So there's a signed/unsigned integer mismatch here. This function returns an int which is the number of paths. However, we're storing this along the way as an uint_t. This means that we'll end up doing implicit conversion and if we have 2^31 paths, we'll start thinking this failed. I think it'll be clearer to just have this function return 0/-1 and have the number of paths be a uint_t that it takes out.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Shouldn't this be moved up to +551? If you're going to directly return here, don't we need to free pathcomp? It'll probably be easier if everything goes out the err path?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Missing free of curr_path in this error path.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Why does this function return an ssize_t if it never returns a negative number and the consumers of it always pass it back to something that takes a size_t?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Do we need to worry about snprintf returning -1?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Surely cstyle.pl doesn't like the '0' and the || with no spaces between them?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Do we need to worry about snprintf returning -1?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) What does this +1 represent, a null terminator? Maybe it's better to start with bufsz equal to 1 in case someone else changes things around later?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Do we need to worry about snprintf returning -1?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Do we need to worry about snprintf returning -1?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Why is this EMOND_NOMEM here but EMOD_FMRI_NVL when adding things at +719. Shouldn't they both be the same?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) Any reason not to include the NUL character in this?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) So, this construct is a little worrisome. This does copy all of the bytes, but explicitly will not copy over the NUL terminator because n is set to the string length which doesn't include it. This means that by default strncpy() won't insert one and we're relying on the fact that we did a topo_mod_zalloc. Since we're doing the logical equivalent of the strdup, I'd just do a topo_mod_alloc() and then explicitly do a bcopy() of fmrilen+1 bytes to make sure that everything is exactly the same. Otherwise, someone looking at this in the future or someone who incorrectly changes the topo_mod_zalloc() to something else might not realize that it was load bearing for ensuring the NUL terminator.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) If npairs is zero should we still continue?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph.c (Diff revision 1) I presume something was meant to go here?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Should there be consumers of this in libtopo or should this be wired up to something? Or is this coming in a subsequent change?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) If we're going to ignore every single fprintf error code should we at least check ferror() somewhere in the whole thing to make sure that we can note that writing this out might have failed?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Why does the uint64 use hex where as all the other unsigned types use base 10? Should they be consistent so it's easier to tell or does it not matter?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Should this really be being written to the file? Is the goal to make sure that the file is invalid xml?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Here and elsewhere I assume we don't care about propagating the topo error that caused us to fail?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Maybe it's better to have a common function for this since this looks the same as the DATA_TYPE_NVLIST handling.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) For this and all the other array entries, can we leverage the existing nvlist array handling to try and reduce the duplication of all of the ways to write this out with a shared common function?
-
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) It feels like things could change that cause the sizeof (tstamp) to not be large enough. It's probably worth checking that this is correct and/or also giving tstamp more slack.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Should we check ferror() here before returning for the user? Or is it up to the caller to do that?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Since these are coming from the xml document and are just pointers into it, shouldn't we make these const?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Why are we doing this here? We need to do this after all the items are added which is you also have a copy of this at +999. Seems like the only reason to do this is to catch an early duplicate, but I think that might be more confusing than not.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) I guess the idea here is to allow others to add children that topo doesn't understand to the xml file and still deal with it?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) For this and all the other integer properties why do we just truncate the integer? If it's out of range, shouldn't we actually care about that and error on that rather than adding a truncated version?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) If there are more children than the property indicated we'll happily go off of the end of nvlarr. Shouldn't we check here that 'i < nelems' and fail if we hit a case where we have more than specified?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Do we want to check that i actually is equal to nelem to make sure that we found everything that was asked for?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) All of the array entries have the same issues with trusting the number of entries and walking off of the end of the array that I describe in the TDG_XML_NVLIST_ARR case. We should fix them all.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Maybe it's worth adding a topo_hdl_calloc and a free version of the array format to make all of these easier?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Using uint32_t here and int32_t in the free path. Probably should be uint32_t.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Here you're using uint64_t, but everywhere else in this function you're freeing with int64_t. I presume this should have been an int64_t?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) In the failure path, vtx is not freed.
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) Where does nobase.xml come from? Is that a well known URL in libxml?
-
usr/src/lib/fm/topo/libtopo/common/topo_digraph_xml.c (Diff revision 1) I assume that root doesn't need to be freed unlike other libxml2 artifacts like the properties?
-
usr/src/lib/fm/topo/modules/common/ses/ses.c (Diff revision 1) The fact that these three are no inconsistent confuses me. It appears that we use these as the same thing as the topo instance, so why would the maximum be a type that's smaller than the others? I get that we are using sec_maxinstance and trying to use -1 as a sentinel that it's not set, but we're also using it as the largest value that we'll pass to topo functions, so having it be a different type seems wrong. Maybe we should use a different sentinal like UINT64_MAX or some topo max instance value?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Conventionally a new line between this and the copyright?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Why do we need a timestamp in all this?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Worth cleaning up the whitespace? Did 'git pbchk' not mention this or the bit on the next line?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Shouldn't we be checking if what we've serialized we can read back in and is the same as what we serialized out?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Why does the test require root access? Shouldn't all of the topo bits here work without root access?
-
usr/src/test/os-tests/tests/libtopo/digraph-test.c (Diff revision 1) Feels like we should add some tests that cover doing things like parsing invalid xml files, etc. where we have things that aren't checked for in the dtd like the number of elements in an array mismatch the number found for the different types, etc.
libtopo: add support for directed graph based topologies
Review Request #2529 — Created March 13, 2020 and submitted
Information | |
---|---|
rejohnst | |
illumos-gate | |
12230 | |
Reviewers | |
general | |
Libtopo currently only supports building topologies based on a tree structure. The problem is that not all topologies map naturally to a tree structure.
These changes extend libtopo to support building topologies that are based on directed graph structures.
Specifically, this ticket will cover adding new APIs to libtopo for the following:
- building digraphs
- iterating over graph vertices
- iterating over vertice edges
- finding all paths between an arbitrary pair of vertices
- serializing a graph-based topology to XML
- deserialzing a graph-based topology from XML
See the testing notes in the ticket:
https://www.illumos.org/issues/12330
Change Summary:
This addresses 26 of rm's 74 comments. More changes coming.
Change Summary:
This addresses the rest of rm's remaining comments. There is one open question for rm regarding locking.
Diff: |
Revision 3 (+3924 -86)
|
---|
-
In general, this looks great. Thanks for putting this together.
-
usr/src/lib/fm/topo/libtopo/common/topo_builtin.c (Diff revisions 1 - 3) Should this really not have the topo_mod_enter() and topo_mod_exit() around this any more? Don't we need that for serialization purposes?
Change Summary:
Updates to address rm's most recent comments.
Diff: |
Revision 4 (+3948 -86)
|
---|
Change Summary:
Forgot to add the new test suite to the default runfile. Also a couple minor tweaks to digraph-test.c.
Diff: |
Revision 5 (+3962 -79)
|
---|