openzfs/zfs

vdev open fail because of the label is missing or invalid

Open

#5102 opened on Sep 14, 2016

Labels: Status: Inactive, Status: Understood, Type: Feature, good first issue

Description

Dear All, while testing spare device replacement together with pool export and import, I ran into a problem where a vdev fails to open because "the label is missing or invalid". The detailed test case is described below.

Conditions: OS: Linux A22770782_00 2.6.33.20. Simulation files: file4 (1G), file5 (1G), file6 (1G), file7 (100M).

Test: The first step: Create a raidz pool named raid5 from 3 simulation files (file4 (1G), file5 (1G), file7 (100M)), then add file6 to the pool as a spare device, replace pool member file7 with the spare file6, and set the pool's autoexpand property to on. The pool status now shows the spare device as currently in use, and the SIZE property is 272M. Details follow:

    [root@A22770782_00 ~]# zpool status raid5
      pool: raid5
     state: ONLINE
      scan: resilvered 39.5K in 0h0m with 0 errors on Fri Sep  2 11:25:05 2016
    config:
            NAME                      STATE     READ WRITE CKSUM
            raid5                     ONLINE       0     0     0
              raidz1-0                ONLINE       0     0     0
                /home/wugang/file4    ONLINE       0     0     0
                /home/wugang/file5    ONLINE       0     0     0
                spare-2               ONLINE       0     0     0
                  /home/wugang/file7  ONLINE       0     0     0
                  /home/wugang/file6  ONLINE       0     0     0
            spares
              /home/wugang/file6      INUSE     currently in use
    errors: No known data errors

    [root@A22770782_00 ~]# zpool list
    NAME              SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
    raid5             272M   690K   271M         -     0%     0%  1.00x  ONLINE  -

The second step: Move simulation file7 away into the /home/ directory, then export and import the pool again. The pool state is now DEGRADED because member device file7 cannot be opened, and the EXPANDSZ property is 2.64G. Details follow:

    [root@A22770782_00 ~]# zpool status raid5
      pool: raid5
     state: DEGRADED
    status: One or more devices could not be opened.  Sufficient replicas exist for
            the pool to continue functioning in a degraded state.
    action: Attach the missing device and online it using 'zpool online'.
       see: http://zfsonlinux.org/msg/ZFS-8000-2Q
      scan: resilvered 39.5K in 0h0m with 0 errors on Fri Sep  2 11:25:05 2016
    config:
            NAME                        STATE     READ WRITE CKSUM
            raid5                       DEGRADED     0     0     0
              raidz1-0                  DEGRADED     0     0     0
                /home/wugang/file4      ONLINE       0     0     0
                /home/wugang/file5      ONLINE       0     0     0
                spare-2                 DEGRADED     0     0     0
                  11832754904235861952  UNAVAIL      0     0     0  was /home/wugang/file7
                  /home/wugang/file6    ONLINE       0     0     0
            spares
              /home/wugang/file6        INUSE     currently in use
    errors: No known data errors

    [root@A22770782_00 ~]# zpool list
    NAME              SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
    raid5             272M   528K   271M     2.64G     2%     0%  1.00x  DEGRADED  -

The third step: Restore simulation file7 from the /home/ directory, then run zpool clear, which reopens all vdevs and clears errors. As a result, pool member file7 cannot be opened because the label is missing or invalid, and the SIZE property changes to 2.91G. Details follow:

    [root@A22770782_00 ~]# zpool clear raid5

    [root@A22770782_00 ~]# zpool status raid5
      pool: raid5
     state: DEGRADED
    status: One or more devices could not be used because the label is missing or
            invalid.  Sufficient replicas exist for the pool to continue
            functioning in a degraded state.
    action: Replace the device using 'zpool replace'.
       see: http://zfsonlinux.org/msg/ZFS-8000-4J
      scan: resilvered 11.5K in 0h0m with 0 errors on Fri Sep  2 11:27:32 2016
    config:
            NAME                    STATE     READ WRITE CKSUM
            raid5                   DEGRADED     0     0     0
              raidz1-0              DEGRADED     0     0     0
                /home/wugang/file4  ONLINE       0     0     0
                /home/wugang/file5  ONLINE       0     0     0
                /home/wugang/file7  UNAVAIL      0     0     0  corrupted data
            spares
              /home/wugang/file6    AVAIL   
    errors: No known data errors

    [root@A22770782_00 ~]# zpool list
    NAME              SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
    raid5            2.91G   381K  2.91G         -     0%     0%  1.00x  DEGRADED  -

Solution: In the test case above, the vdev open fails because its allocatable size has shrunk. The shrink originates in the pool import of the second step: because the larger spare device file6 is in use, the top-level vdev's vdev_asize grows, which in turn raises the leaf vdevs' vdev_min_asize. When all vdevs are reopened in the third step, the vdev_asize of the leaf vdev for member device file7 (about 100M) is smaller than its vdev_min_asize (about 1G), so the vdev open fails. The failing code in the vdev_open function is as follows:

    /*
     * Make sure the allocatable size hasn't shrunk.
     */
    if (asize < vd->vdev_min_asize) {
        vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
            VDEV_AUX_BAD_LABEL);
        return (SET_ERROR(EINVAL));
    }

Thanks!
