-
-
usr/src/boot/sys/boot/zfs/zfs.c (Diff revision 1) Rather than treating this as an array of integers, could we cast the appropriate location to a zio_eck_t and use ZEC_MAGIC and ZIO_SET_CHECKSUM? Or create a struct for this padding space that has the appropriate length char[] followed by a zio_eck_t, and then cast the whole thing. We do that for the vdev_label_phys_t.
-
usr/src/boot/sys/boot/zfs/zfs.c (Diff revision 1) Not sure I understand this conditional. My reading of this loop is "iterate over all of the vdev's children, and at the end, result will be the contents of pad2 from the first one, unless any of them had the empty string as the contents of their pad. If they did, we return the empty string." Is the intent here to protect ourself against a case where only some vdevs get zeroed? A comment explaining that conditional might be helpful.
I think specifically, what confuses me is that we trust earlier vdevs more in one way (we use their result over others if they're all non-empty), but not in others (empty strings have priority?).
-
usr/src/boot/sys/boot/zfs/zfs.c (Diff revision 1) If you have {} on the if statement, you should have them on the else as well.
-
usr/src/uts/common/fs/zfs/vdev_label.c (Diff revision 1) Why do we pass cbp in here? We pass it to our children, but they don't do anything with it.
-
usr/src/uts/common/fs/zfs/vdev_label.c (Diff revision 1) Why do we return the best error? If only one disk managed to successfully write the nextboot information, and the others left the padding area zeroed out, then we're going to have a pretty bad day when it comes time to boot and we're expecting it to work, but we ignore the first disk's nextbook because the others are zeroed out.
7537 want nextboot (one time boot) support
Review Request #249 — Created Nov. 6, 2016 and submitted
Information | |
---|---|
tsoome | |
illumos-gate | |
9af5e6b9b0 | |
7537 | |
68aed11... | |
Reviewers | |
general | |
7537 want nextboot (one time boot) support
bootadm activate -t/-T and boot time/post reboot verification.
root@openindiana:/home/tsoome# beadm list BE Active Mountpoint Space Policy Created openindiana-2020:04:15 - - 19,23M static 2020-04-15 09:28 openindiana-2020:07:09 NR / 11,56G static 2020-07-09 11:16 root@openindiana:/home/tsoome# beadm activate -t openindiana-2020:04:15 Activated successfully root@openindiana:/home/tsoome# beadm list BE Active Mountpoint Space Policy Created openindiana-2020:04:15 T - 10,01G static 2020-04-15 09:28 openindiana-2020:07:09 NR / 1,57G static 2020-07-09 11:16 root@openindiana:/home/tsoome# beadm activate -T openindiana-2020:04:15 Temporary activation removed root@openindiana:/home/tsoome# beadm list BE Active Mountpoint Space Policy Created openindiana-2020:04:15 - - 10,01G static 2020-04-15 09:28 openindiana-2020:07:09 NR / 1,57G static 2020-07-09 11:16 root@openindiana:/home/tsoome# After activate -t, we do have on disk (first label): 00002000 01 01 00 00 00 00 00 00 00 00 00 01 00 00 00 48 |...............H| 00002010 00 00 00 40 00 00 00 07 63 6f 6d 6d 61 6e 64 00 |...@....command.| 00002020 00 00 00 09 00 00 00 01 00 00 00 26 7a 66 73 3a |...........&zfs:| 00002030 72 70 6f 6f 6c 2f 52 4f 4f 54 2f 6f 70 65 6e 69 |rpool/ROOT/openi| 00002040 6e 64 69 61 6e 61 2d 32 30 32 30 3a 30 34 3a 31 |ndiana-2020:04:1| 00002050 35 3a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |5:..............| 00002060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00003fd0 00 00 00 00 00 00 00 00 11 7a 0c b1 7a da 10 02 |.........z..z...| 00003fe0 77 08 7c 75 55 e9 df e3 27 f9 25 4a da 52 65 fc |w.|uU...'.%J.Re.| 00003ff0 43 29 e7 6c 00 4f 1f db 28 36 85 29 be 35 92 73 |C).l.O..(6.).5.s| on reboot, indeed, the loader is using currdev "zfs:rpool/ROOT/openindiana-2020:04:15:", then I did reboot the loader, got currdev "zfs:rpool/ROOT/openindiana-2020:07:09:" mounted for rootfs: tsoome@openindiana:~$ df -h Filesystem Size Used Available Capacity Mounted on rpool/ROOT/openindiana-2020:07:09 28,70G 5,41G 8,19G 40% / And now we can confirm the nvlist on label is there, but empty: 00002000 01 01 00 00 00 00 00 00 00 00 00 01 00 00 00 00 |................| 00002010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00003fd0 00 00 00 00 00 00 00 00 11 7a 0c b1 7a da 10 02 |.........z..z...| 00003fe0 b3 46 76 2e 24 fd fa 01 b7 cd 13 e5 42 1f d8 64 |.Fv.$.......B..d| 00003ff0 ad ae 2f 35 db 6b c2 29 db 4d 99 28 85 3c 13 15 |../5.k.).M.(.<..|
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 2 (+650 -39) |
-
-
usr/src/boot/sys/boot/zfs/zfs.c (Diff revision 2) Why are we using zap_scratch here? I guess it doesn't really matter, but it seems wrong to use the zap_scratch space for non-zap stuff.
-
usr/src/boot/sys/boot/zfs/zfs.c (Diff revision 2) Why do we use a precalculated checksum here? It looks like the sha256 source code is present in usr/src/boot/sys/cddl/boot/zfs/sha256.c. Could we use that instead?
-
-
-
-
usr/src/lib/libzfs/common/libzfs_pool.c (Diff revision 2) It's a little confusing that we are defining the "nextboot" to mean the "dataset" string (for zpool_get_nextboot) and the "zfs:dataset:" string (for zfs_ioc_get_nextboot). Would it be possible to use the same string all the way, or to use different names for these different strings?
-
usr/src/uts/common/fs/zfs/vdev_label.c (Diff revision 2) The semantic here is that we return a random pad2, unless one of them is a zero-length string, in which case we return the zero-length string.
There's no checksum of the pad2, right? So if one of them is damaged such that it becomes a zero-length string, the overall property is lost.
What's the motivation behind this behavior?
-
usr/src/uts/common/fs/zfs/vdev_label.c (Diff revision 2) This seems very specific to treating pad2 as a string. I.e. we aren't really checking if it's "zeroed", only that it's a zero-length string (subsequent bytes may be nonzero). Can we document that somewhere?
Also, I think this is equivalent to:
if (pad2[0] == '\0') *cbp[0] = '\0';
-
-
usr/src/uts/common/fs/zfs/vdev_label.c (Diff revision 2) I guess there's no performance concern with writing each vdev one at a time? If there are a lot of disks, this could be slow, but presumably this is typically used only on the root pool which has a small number of disks?
-
usr/src/uts/common/fs/zfs/zfs_ioctl.c (Diff revision 2) Why is this needed? It looks like we are modifying the pad2 area of the labels, and not modifying anything referenced by the uberblock.
-
usr/src/uts/common/fs/zfs/zfs_ioctl.c (Diff revision 2) Are we intentionally allowing this on suspended / readonly pools? That seems dangerous.
We may also want to pass allow_log=B_TRUE. Even if not used from the CLI, this will cause the ioctl to log itself.
-
usr/src/uts/common/fs/zfs/zfs_ioctl.c (Diff revision 2) What would be the downside of using zfs_secpolicy_read here?
-
usr/src/uts/common/fs/zfs/zfs_ioctl.c (Diff revision 2) What would be the downside of using zfs_secpolicy_read here?
Change Summary:
The initial work was based on FreeBSD zfs nextboot feature, This update is based on https://reviews.freebsd.org/D25512 and https://github.com/openzfs/zfs/commit/108a454a4604df6ea3be817f3cf076726df2c67a
The change from Delphix patch is, I did drop the version word (8 bytes) and switched to only use nvlist. We do not wipe bootenv area, we only update nvlist as needed - nextboot reset will remove the pair, etc. For that, we obviously need the toolkit to update nvlist and write it down, this patch does add exactly that. We also do update all 4 labels. The missing part here is common definitions for pair keys. From OS userland, the data translation to/from nvlist is done in libzfsbootenv.
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 3 (+2301 -396) |
Change Summary:
signed versus unsigned
Branch: |
|
||||
---|---|---|---|---|---|
Commit: |
|
||||
Diff: |
Revision 4 (+2377 -429) |
Change Summary:
replace sprintf by snprintf
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 5 (+2349 -405) |
Testing Done: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Commit: |
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Diff: |
Revision 6 (+2529 -499) |
Change Summary:
more interfaces for libzfsbootenv
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 7 (+2659 -499) |
Change Summary:
rebase on gate
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 8 (+2659 -499) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 9 (+2689 -499) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 10 (+2902 -524) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 11 (+2902 -524) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 12 (+2931 -524) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 13 (+2980 -524) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 14 (+2979 -524) |
Commit: |
|
||||
---|---|---|---|---|---|
Diff: |
Revision 15 (+2984 -524) |