5786 Fast reboot broken on EFI formatted drives

Review Request #20 — Created April 3, 2015 and submitted

tsoome
illumos-gate
5786
cae583f...
general

5786 Fast reboot broken on EFI formatted drives

the problem is correctly diagnosed by submitter, current code expects findroot device specification in form (id, partition, slice) where partition is MBR partition number and slice is VTOC slice. However, in case of GPT label, the device specification for findroot is (id, partition), where partition is EFI partition and there is no slice, for illumos, in this case slice is same as partition. The fix is using the fact that default slice number is set to whole device (slice value "q"), if findroot device has no slice defined, the default value is preserved.

now while searching for bootsign, slice_match() is allowed to match any slice for slice value "q" and get_sol_prtnum() will get EFI partition number to implement partition number check.

for test purposes, I used reboot command and simple test program:

#include <libgrubmgmt.h>

int main(int argc, char **argv)
{
  grub_boot_args_t fbarg;
  char *fbstr;
  int rc;

  rc = grub_get_boot_args(&fbarg, NULL, atoi(argv[1]));
  printf("%d: fbarg %s\n", rc, fbarg.gba_bootargs);
  grub_cleanup_boot_args(&fbarg);
  return rc;
}

sample menu.lst file (the first 2 entries are actually incorrect for booting, but syntax is correct):

title openindiana-1
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/openindiana-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============
title openindiana-2
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/openindiana-2
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============
title openindiana-3
findroot (pool_rpool,0)
bootfs rpool/ROOT/openindiana-3
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============
title openindiana-4
findroot (pool_rpool,0)
bootfs rpool/ROOT/openindiana-4
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
#============ End of LIBBE entry =============

output from test program:

root@gpt:/home/tsoome#  ./fbargs 0
0: fbarg /tmp/.libgrubmgmt.openindiana-1.r6a4Yb /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/openindiana-1,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:a"
root@gpt:/home/tsoome#  ./fbargs 1
0: fbarg /tmp/.libgrubmgmt.openindiana-2.ynaqZb /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/openindiana-2,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:a"
root@gpt:/home/tsoome#  ./fbargs 2
0: fbarg / /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/openindiana-3,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:a"
root@gpt:/home/tsoome#  ./fbargs 3
0: fbarg /tmp/.libgrubmgmt.openindiana-4.4Haa0b /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/openindiana-4,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:a"

So the fastboot bootargs strings are correct for both types of entries.

Also reboot -f command was correctly booting and using indicated menu entry, confirmed by:

tsoome@gpt:~$ eeprom bootcmd
bootcmd=/platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/openindiana-4,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:a"

and from booting via grub:

tsoome@gpt:~$ eeprom bootcmd
bootcmd=/platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/140,bootpath="/pci@0,0/pci15ad,1976@10/sd@0,0:a",diskdevid="id1,sd@n6000c2969ea1ab7d25ed261fa77f3007/a"
  • 0
  • 0
  • 3
  • 0
  • 3
Description From Last Updated
tsoome
jeffpc
  1. 
      
  2. this needs to be indented under the inner if

    1. yes, my fault, updated diff, and running full rebuild right now to make sure there are no more surprises.

  3. 
      
tsoome
risto3
  1. Just a quick question before my "Ship It!"
    I tried the slightly modified test program..

    include <libgrubmgmt.h>

    int main(int argc, char *argv)
    {
    grub_boot_args_t fbarg;
    char
    fbstr;
    int rc;

    rc = grub_get_boot_args(&fbarg, NULL, atoi(argv[1]));
    if (rc == 0)
    printf("%d: fbarg %s\n", rc, fbarg.gba_bootargs);
    else
    printf("%s\n", grub_strerror(rc));

    grub_cleanup_boot_args(&fbarg);
    return rc;
    }

    I have the following currently for menu.lst
    and am currently booting 11 or b151013-20150320

    seems ok:
    richard@omnis:/home/richard/src$ pfexec ./fbargs 11
    0: fbarg / /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/b151013-20150320,bootpath="/pci@0,0/pci1002,4391@11/disk@5,0:a"

    but any of the non-illumos ones (including old missing ones):
    richard@omnis:/home/richard/src$ for i in 0 1 2 3 4 5 6 7 8 9 10 11 12; do pfexec ./fbargs $i; done
    No such file or directory
    Invalid GRUB entry
    No such file or directory
    No such file or directory
    Invalid argument
    Bootsign not found
    Unknown bootfs filesystem
    Unknown bootfs filesystem
    0: fbarg /tmp/.libgrubmgmt.hipster-4.C1Wu8o /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/hipster-4,bootpath="/pci@0,0/pci1002,4391@11/disk@5,0:a"
    0: fbarg /tmp/.libgrubmgmt.omnios-b151013.t3av8o /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/omnios-b151013,bootpath="/pci@0,0/pci1002,4391@11/disk@5,0:a"
    0: fbarg /tmp/.libgrubmgmt.b151013-20150208.s5qv8o /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/b151013-20150208,bootpath="/pci@0,0/pci1002,4391@11/disk@5,0:a"
    0: fbarg / /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=rpool/ROOT/b151013-20150320,bootpath="/pci@0,0/pci1002,4391@11/disk@5,0:a"
    No such entry found

    Is it correct to presume that that only valid illumos boot entries are okay?

    ===========================================
    splashimage /boot/grub/splash.xpm.gz
    foreground 343434
    background F7FbFF
    default 11
    timeout 30

    ---------- ADDED BY BOOTADM - DO NOT EDIT ----------

    ---------------------END BOOTADM--------------------

    Unknown partition of type 0 found on /dev/rdsk/c1d0p0 partition: 2

    It maps to the GRUB device: (hd0,1) .

    Unknown partition of type 0 found on /dev/rdsk/c1d0p0 partition: 3

    It maps to the GRUB device: (hd0,2) .

    Unknown partition of type 0 found on /dev/rdsk/c1d0p0 partition: 4

    It maps to the GRUB device: (hd0,3) .

    Unknown partition of type 0 found on /dev/rdsk/c2t0d0p0 partition: 1

    It maps to the GRUB device: (hd0,0) .

    title SmartOS smartos
    findroot (pool_rpool,0,a)
    kernel /boot/memdisk
    module /boot/smartos.img

    title NetBSD
    rootnoverify (hd0,1)
    chainloader +1

    title System Rescue CD
    findroot (pool_rpool,0,a)
    kernel /boot/memdisk iso
    module /boot/systemrescuecd.iso

    title Universal Boot CD
    findroot (pool_rpool,0,a)
    kernel /boot/memdisk iso
    module /boot/ubcd52b1.iso

    title memtest86+-5.01
    findroot (pool_rpool,0,a)
    kernel /boot/memtest86+-5.01.bin

    title Omnios
    findroot (pool_dpool,0,a)
    bootfs dpool/ROOT/omnios-2
    kernel$ /platform/i86pc/kernel/amd64/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/amd64/boot_archive

    =====================================================

    title oi_151a7jds-backup-1
    findroot (pool_rpool,0,a)
    bootfs rpool/ROOT/oi_151a7jds-backup-1
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    title oi_151a8
    findroot (pool_rpool,0,a)
    bootfs rpool/ROOT/oi_151a8
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    title hipster-4
    findroot (pool_rpool,0,a)
    bootfs rpool/ROOT/hipster-4
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    title omnios-b151013
    bootfs rpool/ROOT/omnios-b151013
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    title b151013-20150208
    bootfs rpool/ROOT/b151013-20150208
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    title b151013-20150320
    bootfs rpool/ROOT/b151013-20150320
    kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
    module$ /platform/i86pc/$ISADIR/boot_archive

    ============ End of LIBBE entry =============

    1. yes, this is because the grub_get_boot_args() is not just looking for findroot, but it will mount BE and check for kernel etc - its totally about building boot args for loading illumos kernel. In case of grub menu file or config, my suggestion is to use "configfile custom.lst" or like and store custom config entries in separate file. And this is because the menu.lst is meant to be managed by bootadm/beadm (altho yes, you can edit it yourself as well).

  2. 
      
risto3
  1. Ship It!
  2. 
      
tsoome
tsoome
jeffpc
  1. 
      
  2. Is the cast because of lint?

    1. actually readability, as SLCNUM_WHOLE_DISK is defined 'q' and slice is int. gcc and lint are happy without type cast (just double checked). in findroot there is also an cast, but thats because of unsigned int in struct.

    2. Ok. Personally, I think unnecessary casts are evil. Up to you.

    3. I see the point. yea, removed it.

  3. What happens if I make an EFI partitioned device and use a partition other than the first?

    1. the physpath is composed from device name in pool config; translated to physical path:
      ``` pool: test
      state: ONLINE
      scan: none requested
      config:

          NAME        STATE     READ WRITE CKSUM
          test        ONLINE       0     0     0
            c3t1d0s1  ONLINE       0     0     0
      

      root@gpt:/home/tsoome# df -h /
      Filesystem Size Used Avail Use% Mounted on
      test/ROOT/test 15G 2,8G 12G 20% /
      root@gpt:/home/tsoome# ./fbargs 0
      0: fbarg / /platform/i86pc/kernel/amd64/unix -B zfs-bootfs=test/ROOT/test,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:b"
      root@gpt:/home/tsoome# reboot -f
      Connection to 192.168.225.151 closed by remote host.
      Connection to 192.168.225.151 closed.

      so, the partition number test gets partition 1 in this example, and as I have findroot (pool_test,1) in menu.lst, the partition number is matching:
      

      tsoome@gpt:~$ eeprom bootcmd
      bootcmd=/platform/i86pc/kernel/amd64/unix -B zfs-bootfs=test/ROOT/test,bootpath="/pci@0,0/pci15ad,1976@10/sd@1,0:b"
      ```

    2. I guess I don't know what "efi_parts[0]" is. I was reading that as "check the first partition for being V_USR.

    3. yes you do, its bug:D well, the bug related to tag type check, and my test did not expose it as i did just create another partition for pool and did not change partition tag for partition 0:) fixed and thanks:)

  4. So, if we fail to open the device the first time, we just assume that it's not EFI? That seems wrong.

    1. if we can not open device file for read, there is not much we can do, isn't it so? besides, same happens with MBR - the code opens whole disk file, reads MBR and thats gets out if read failed. the only thing im thinking about now is, perhaps should just get out on failed open(), as illumos does not make distinction between partition and slice in dev. so, if the first open() with slice name failed, im not sure if there is much point to try again with whole disk device -- which is exactly the reason why I didnt write reaction to such failure.

    2. Right. That's what I was thinking - fail the whole thing if we fail to open the device the first time around. Sorry for not being clearer.

    3. done.

  5. 
      
tsoome
jeffpc
  1. 
      
  2. So, is the idea here: efi_alloc_and_read reads the (gpt) partition table from the device fd points to (even if fd is for a partition on that device) and returns the partition index? For example, if rdev is essentially "c0t0d0s3", efi_alloc_and_read fills in the partition table from c0t0d0 and returns 3? If so, LGTM.

    1. Yes, efi_alloc_and_read () returns partition number or negative number on error. There is special case with it however - if you use whole disk device, you get partition #7 - thats corresponding number for ":wd" GPT specific whole disk device. Since no zpool does have such device, the code will never get this partition number (vdev is always built from real partition). Also the tag check is a bit redundant, but I decided to include it to be sure user did set it to have correct value.

  3. 
      
jeffpc
  1. Ship It!
  2. 
      
hans
  1. Ship It!
  2. 
      
tsoome
Review request changed

Status: Closed (submitted)

Loading...