borgbackup/borg

evaluate redundancy / error correction options

Open

#225 opened on Sep 30, 2015

help wanted, question

Description

There is some danger that bitrot and storage media defects could lead to backup data loss or repository integrity issues. Deduplicating backup systems are more vulnerable to this than non-deduplicating ones, because a single defective chunk affects every backup archive that uses that chunk.
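To make the deduplication risk concrete, here is a minimal sketch of a toy content-addressed store. It is not borg's repository format: `store`, `archives`, `backup`, and `damaged_archives` are hypothetical names used only for illustration, and SHA-256 stands in for whatever chunk ID function is actually in use.

```python
import hashlib

def chunk_id(data: bytes) -> str:
    """Content address of a chunk (SHA-256 is an assumption of this toy)."""
    return hashlib.sha256(data).hexdigest()

store = {}     # chunk id -> chunk bytes, each unique chunk stored once
archives = {}  # archive name -> list of chunk ids

def backup(name, chunks):
    """Record an archive, storing only chunks not seen before."""
    ids = []
    for data in chunks:
        cid = chunk_id(data)
        store.setdefault(cid, data)
        ids.append(cid)
    archives[name] = ids

def damaged_archives():
    """Names of all archives that reference at least one corrupt chunk."""
    bad = {cid for cid, data in store.items() if chunk_id(data) != cid}
    return sorted(name for name, ids in archives.items() if bad & set(ids))

backup("monday", [b"os files", b"report v1"])
backup("tuesday", [b"os files", b"report v2"])
backup("wednesday", [b"os files", b"report v3"])

# Simulate bitrot in the one chunk that all three archives share.
store[chunk_id(b"os files")] = b"os filez"

print(damaged_archives())  # ['monday', 'tuesday', 'wednesday']
```

The hash check detects the damage, but nothing in the store can repair it, which is the gap the approaches in this issue are meant to close.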

Currently, borgbackup does a lot of error detection (CRCs, hashes, HMACs), but it has no built-in support for error correction (see the FAQ for why). One of these approaches might help:

  • use borg to keep N (N>1) independent backup repos of your data on different targets (if N-1 targets get corrupted, you still have 1 working repo left. Note that there is no support for creating one non-corrupt repo from 2 corrupt repos, although that might be theoretically possible in some cases).
  • snapraid
  • par2
  • FECpp https://github.com/randombit/fecpp (BSD, C++ - make available via Cython?)
  • zfec (GPL/TGPPL, Python 2.x only, PR for Python 3.x exists)
  • RAID (with monitoring and regular scrubbing of the disks), ZFS mirror or RAIDZ* (preferably not RAID5 or RAIDZ1)
  • zfs copies=N option (N>1)
  • specific filesystems
  • ceph librados
  • https://github.com/Bulat-Ziganshin/FastECC
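
As a rough illustration of what the error-correcting options above add on top of detection, here is a minimal sketch of single-parity erasure recovery using XOR. This is a deliberately simpler stand-in, not the Reed-Solomon style coding that tools like par2 or zfec use, and it is not anything borg implements. `make_parity` and `recover` are hypothetical helper names, and all blocks are assumed to have equal length.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(blocks):
    """One parity block: the XOR of all data blocks."""
    return reduce(xor_bytes, blocks)

def recover(blocks, parity):
    """Rebuild the single block marked None from the survivors and parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [b for b in blocks if b is not None]
    return reduce(xor_bytes, survivors, parity)

data = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(data)

# A failed hash check tells us which block is bad; treat it as erased.
damaged = [data[0], None, data[2]]
print(recover(damaged, parity))  # b'bbbb'
```

One parity block tolerates the loss of one data block per group. Surviving several simultaneous losses takes more redundancy, which is what the Reed-Solomon based tools in the list provide.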

If we can find some working approaches, we could add them to the documentation. Help and feedback on this are welcome!
