openzfs/zfs

Pool incorrectly reported corrupt due to missing dev

Open

#10903 opened on September 9, 2020

Labels: Bot: Not Stale, Status: Understood, Type: Defect, good first issue

Description

System information

Type                  Version/Name
Distribution Name     CentOS
Distribution Version  7
Linux Kernel          4.20
Architecture          x86_64
ZFS Version           0.8.0-461_gddb4e69db
SPL Version           0.8.0-461_gddb4e69db

Describe the problem you're observing

The presence of a "special" vdev for small writes leads to the pool being incorrectly deemed irrecoverably corrupt after a crash.

Although I question the logic behind how the host came to be set up this way, it is somewhat perplexing why ZFS reacted the way it did.

Nothing that was tried worked, including removing the cachefile, rebuilding it, and carefully attempting a forced import. zdb, however, was able to read the pool and confirm that the txg and metadata were sane. No I/O errors were reported by the hardware or the OS.

Describe how to reproduce the problem

TBD. Not clear what caused it.

  1. Create pool:
     zpool create tank \
       mirror /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0000 \
              /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0001 \
              /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0002 \
       cache /cache/tank.ARC.zfs \
       log /cache/tank.ZIL.zfs

zpool import tank -d /dev/disk/by-id/ -d /cache/

A special vdev was added some time later:

zpool add tank special mirror /special/tank.special.00 /special/tank.special.01 -f

The special mirror was split some time later:

zpool detach tank /special/tank.special.01

Some time later, the system crashed:

  1. Wait for the NOC to somehow turn off the server while replacing UPS batteries (i.e. an unexpected crash).
  2. Observe the import error on reboot.
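The sequence above could be approximated on a scratch machine with file-backed vdevs. The following is an untested sketch, not a confirmed reproducer: the pool name `testtank`, the scratch directory, the `run` and `repro` helpers and the file names are all hypothetical, and a clean `zpool export` stands in for the crash, which may or may not matter. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Untested reproduction sketch. With DRY_RUN=1 (the default) the commands
# are only printed; set DRY_RUN=0 on a disposable test machine to run them.
DRY_RUN=${DRY_RUN:-1}
POOL=${POOL:-testtank}              # hypothetical pool name
WORK=${WORK:-/var/tmp/zfs-10903}    # hypothetical scratch directory

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

repro() {
    run mkdir -p "$WORK/data" "$WORK/special"
    # Sparse files stand in for the disks and the special devices.
    for f in data/d0 data/d1 data/d2 special/s0 special/s1; do
        run truncate -s 1G "$WORK/$f"
    done
    run zpool create "$POOL" mirror "$WORK/data/d0" "$WORK/data/d1" "$WORK/data/d2"
    run zpool add -f "$POOL" special mirror "$WORK/special/s0" "$WORK/special/s1"
    run zpool detach "$POOL" "$WORK/special/s1"
    # Assumption: a clean export replaces the unexpected power loss.
    run zpool export "$POOL"
    # Import while searching only the data directory, leaving out special/.
    run zpool import -d "$WORK/data" "$POOL"
}

repro
```

If the final import fails in the same way as in this report, adding `-d "$WORK/special"` to it would be the obvious next thing to check.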

Include any warning/errors/backtraces from the system logs

## Import error
# zpool import tank
cannot import 'tank': I/O error
        Destroy and re-create the pool from
        a backup source.
#
# zpool import -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0000-part1 \
               -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0001-part1 \
               -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0002-part1 \
               -d /cache/ \
               tank
cannot import 'tank': I/O error
        Destroy and re-create the pool from
        a backup source.
#
## Lots of zdb / attempts to recover the pool omitted
## 
## Ultimately determined special vdev not being imported
# zpool import -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0000-part1 \
               -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0001-part1 \
               -d /dev/disk/by-id/ata-HGST_HDN721010ALE604_7JFF0002-part1 \
               -d /cache/ \
               -d /special/ \
               tank
#
# zpool status tank
  pool: tank
 state: ONLINE
  scan: resilvered 33.7G in 0 days 00:07:24 with 0 errors on Sat Jun 13 15:54:52 2020
config:

        NAME                                                 STATE     READ WRITE CKSUM
        tank                                                 ONLINE       0     0     0
          mirror-0                                           ONLINE       0     0     0
            ata-HGST_HDN721010ALE604_7JFF0000                ONLINE       0     0     0
            ata-HGST_HDN721010ALE604_7JFF0001                ONLINE       0     0     0
            ata-HGST_HDN721010ALE604_7JFF0002                ONLINE       0     0     0
        special
          /special/special.00                                ONLINE       0     0     0
        logs
          nvme-eui.00253853ffe0004-part7                     ONLINE       0     0     0
        cache
          nvme-Samsung_SSD_960_PRO_512GB_000000000000-part8  ONLINE       0     0     0

errors: No known data errors
# 
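In the transcript above, the import only succeeded once every directory holding a vdev was passed with `-d`. As a small sketch of that workaround, the hypothetical helper `build_import_cmd` below assembles such a command from a pool name and a list of search directories. It only prints the command, and it assumes the directory paths contain no whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print a `zpool import` command that searches every
# directory given, so that file-backed vdevs (such as a special vdev kept
# outside /dev/disk/by-id) are not missed.
build_import_cmd() {
    pool=$1
    shift
    cmd="zpool import"
    for dir in "$@"; do
        cmd="$cmd -d $dir"
    done
    printf '%s %s\n' "$cmd" "$pool"
}

# Directories taken from this report; adjust them for the actual layout.
build_import_cmd tank /dev/disk/by-id /cache /special
```

Printing rather than executing keeps the helper safe to try, and the resulting command can be reviewed before it is run by hand.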
