If you are thinking about upgrading your NAS in 2018, I have some unsolicited advice for you: Do not go with RAID5. Do not even go with a traditional RAID setup.

RAID5 looks great for home setups, since you can get some level of redundancy cheaply, but over time and with much bigger disks on the market its disadvantages are becoming more obvious. It is starting to look like the RAID card from that antique shop may have to be paid for with your soul … data *maniacal laughter*.

You can only ever lose one disk at a time

If you just lost one of the disks in your RAID5, you now have to identify which disk you need to replace. This can be difficult, especially when a disk fails by reading and writing gibberish only some of the time. Once you have figured that out, you have to pray that all the other disks survive rebuilding the array.

Due to the nature of RAID5, you need to read all of the remaining disks fully in order to recover the data from the one that got away. This puts a lot of stress on those disks, bringing them closer to their possibly-imminent demise.
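To put rough numbers on this, consider a four-disk RAID5 of 4 TB disks (my own back-of-envelope, not from the original talk):

# My own back-of-envelope, not from the original post:
# consumer disks are often specced at ~1 unrecoverable read error (URE)
# per 10^14 bits, which works out to one error per ~12.5 TB read.
echo "A rebuild must read the $((3 * 4)) TB on the surviving disks without error"

Reading 12 TB when the spec promises one unrecoverable error per roughly 12.5 TB read is not a bet I would want my data riding on.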

Embrace the striped-mirror

My advice would be to go for a mirrored setup like RAID1 instead. This already saves you from having to read n disks to recover one disk. Still, I would not urge anybody to get a traditional RAID1 setup in 2018, neither in hardware nor in software.

I would instead advise you to start building a set of striped mirrors with ZFS. Let me illustrate what a striped mirror setup is by showing how you could create one and adapt it to your needs over time (a command sketch follows the list):

  1. Create your storage pool using one 4 TB disk for all your data
  2. Add another 4 TB disk to your pool for redundancy. Now you have a mirror
  3. Add another pair of same-sized disks to increase your storage with the same level of redundancy. This is a striped mirror. A stripe of n mirrors.
  4. If you have SATA ports and physical space left, go to step 3; otherwise go to step 5
  5. Replace the smallest pair of disks with bigger ones. In place. No need to copy stuff around
  6. Go to step 5
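In ZFS terms, these steps map onto a handful of zpool commands. Here is a minimal sketch using hypothetical device names (ada0 through ada4); the demo in the second half of this post runs the same commands against plain files instead of real disks:

zpool create tank /dev/ada0                 # step 1: pool from a single disk
zpool attach tank /dev/ada0 /dev/ada1       # step 2: turn it into a mirror
zpool add tank mirror /dev/ada2 /dev/ada3   # step 3: stripe in a second mirror
zpool replace tank /dev/ada0 /dev/ada4      # step 5: swap a disk for a bigger one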

I will also walk through this whole process using real ZFS commands and a few dummy files as disks in the second half of this post. It is a demo that you can follow along with on your own PC.

Embrace ZFS

ZFS can give you a bunch of benefits on top of that:

  • Checksums for data and metadata
  • Copy-On-Write - never overwrites data in-place
  • Very cheap snapshots using zfs snap
  • File-system level replication and backups using zfs send and zfs recv (see the sketch after this list)
  • Configurable compression and deduplication
  • No downtime when a disk fails
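To make the snapshot and replication points concrete, here is a minimal sketch. The pool name tank matches the demo below; backuphost and backup/tank are made up for illustration:

zfs snapshot tank@before-upgrade    # cheap, near-instant point-in-time snapshot
zfs rollback tank@before-upgrade    # roll the dataset back to that state
zfs send tank@before-upgrade | ssh backuphost zfs recv backup/tank  # replicate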

FH LUG lightning talk

This writeup is actually based on a lightning talk I did at the FH Linux User Group in Hagenberg in 2017. Here are the slides for that.

Demo to follow along

The second half of this post will be the demo I did for that 2017 talk. You can follow along on any system that has ZFS support, like FreeBSD or Ubuntu.

We will perform these ZFS shenanigans on some ordinary files inside the /tmp folder, which we will use as disks.

Setup a pool

First we will create a simple 100 MB file called disk0.

truncate -s 100M /tmp/disk0

We will use this file as the first ‘disk’ for ZFS to manage, so let’s create a pool from that one disk.

zpool create tank /tmp/disk0

Now let’s also make sure compression is enabled for that pool.

zfs set compression=lz4 tank

We can now use zpool status tank to query the status of our newly created pool.

root@freebsd-zfsdemo-test:/tmp # zpool status tank
  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          /tmp/disk0  ONLINE       0     0     0

errors: No known data errors

We can also query information about the datasets in our pool. ZFS divides storage up into datasets. This is where the actual data goes. You can think of them as partitions, but more flexible.
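For example (this is not part of the demo, and the names are made up), you could split a pool into datasets with their own properties:

zfs create tank/media           # a new dataset inside the pool
zfs set quota=50M tank/media    # properties like quotas apply per dataset

So far our demo pool only contains its root dataset, tank: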

root@freebsd-zfsdemo-test:/tmp # zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank  75.5K  39.9M    23K  /tank

We can see that due to some bookkeeping overhead not all of the 100 MB are available to us. We can also see that our tank dataset was automatically mounted to /tank. Let’s create a file full of zeroes there.

root@freebsd-zfsdemo-test:/tank # dd if=/dev/zero of=/tank/zeroes bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 1.304338 secs (80391454 bytes/sec)

As you can see, thanks to compression we were able to write a 100 MB file full of zeros to 40 MB of storage.
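If you are curious how effective compression is, ZFS tracks a ratio per dataset; the exact value on your system will differ:

zfs get compressratio tank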

Add a mirror

Now let’s create another disk and add it as a mirror to create some redundancy in our setup.

root@freebsd-zfsdemo-test:/tank # truncate -s 100M /tmp/disk1
root@freebsd-zfsdemo-test:/tank # zpool attach tank /tmp/disk0 /tmp/disk1
root@freebsd-zfsdemo-test:/tank # zpool status tank
  pool: tank
 state: ONLINE
  scan: resilvered 102K in 0h0m with 0 errors on Wed May  2 16:47:04 2018
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            /tmp/disk0  ONLINE       0     0     0
            /tmp/disk1  ONLINE       0     0     0

Stripe some mirrors

Now let’s add another two disks to increase the amount of available storage space.

root@freebsd-zfsdemo-test:/tank # truncate -s 100M /tmp/disk2
root@freebsd-zfsdemo-test:/tank # truncate -s 100M /tmp/disk3
root@freebsd-zfsdemo-test:/tank # zpool add tank mirror /tmp/disk2 /tmp/disk3
root@freebsd-zfsdemo-test:/tank # zpool status tank
  pool: tank
 state: ONLINE
  scan: resilvered 102K in 0h0m with 0 errors on Wed May  2 16:47:04 2018
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            /tmp/disk0  ONLINE       0     0     0
            /tmp/disk1  ONLINE       0     0     0
          mirror-1      ONLINE       0     0     0
            /tmp/disk2  ONLINE       0     0     0
            /tmp/disk3  ONLINE       0     0     0

errors: No known data errors
root@freebsd-zfsdemo-test:/tank # zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   106K  79.9M    23K  /tank

The output of zfs list tank confirms that the amount of available space increased.
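If you want a per-vdev breakdown of the pool’s capacity, zpool list has a verbose mode:

zpool list -v tank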

Try corrupting data

Next we will write some gibberish over one of our disks to simulate the kind of nasty failure where a disk goes bad without telling us. I already pointed out that traditional RAID setups have problems with exactly that. We will overwrite a quarter of the disk.

root@freebsd-zfsdemo-test:/tank # dd if=/dev/urandom of=/tmp/disk0 bs=1M count=25
25+0 records in
25+0 records out
26214400 bytes transferred in 17.696901 secs (1481299 bytes/sec)

Now zpool scrub will make ZFS read everything back and check it for errors, and as we can see, our pool is now listed as DEGRADED. ZFS identifies its member disks by labels written onto them, and we just shredded the ones on disk0, hence the ‘label is missing or invalid’ message below. DEGRADED means we can still read from and write to the pool: as long as one intact copy of all the data remains, we can continue normal operation, with reduced redundancy, until we are able to fix things.

root@freebsd-zfsdemo-test:/tank # zpool scrub tank
root@freebsd-zfsdemo-test:/tank # zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0h0m with 0 errors on Wed May  2 16:57:37 2018
config:

        NAME                      STATE     READ WRITE CKSUM
        tank                      DEGRADED     0     0     0
          mirror-0                DEGRADED     0     0     0
            16262305164111969708  UNAVAIL      0     0     0  was /tmp/disk0
            /tmp/disk1            ONLINE       0     0     0
          mirror-1                ONLINE       0     0     0
            /tmp/disk2            ONLINE       0     0     0
            /tmp/disk3            ONLINE       0     0     0

errors: No known data errors

Replace faulty disks

Of course we should fix things as early as possible, so let’s quickly get some larger disks to replace disk0 and disk1.

root@freebsd-zfsdemo-test:/tank # truncate -s 200M /tmp/disk5
root@freebsd-zfsdemo-test:/tank # truncate -s 200M /tmp/disk6

First we replace the faulty disk and wait until zpool status indicates that the replacement is complete.

root@freebsd-zfsdemo-test:/tank # sudo zpool replace tank /tmp/disk0 /tmp/disk5
root@freebsd-zfsdemo-test:/tank # zpool status tank
  pool: tank
 state: ONLINE
  scan: resilvered 96K in 0h0m with 0 errors on Wed May  2 17:04:53 2018
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            /tmp/disk5  ONLINE       0     0     0
            /tmp/disk1  ONLINE       0     0     0
          mirror-1      ONLINE       0     0     0
            /tmp/disk2  ONLINE       0     0     0
            /tmp/disk3  ONLINE       0     0     0

errors: No known data errors
root@freebsd-zfsdemo-test:/tank # zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   113K  79.9M    23K  /tank

Extend storage space to fill a larger disk

Now we can replace the second, non-faulty disk as well and extend our pool to utilize the additional storage the new pair of disks provides, using zpool online -e.

root@freebsd-zfsdemo-test:/tank # sudo zpool replace tank /tmp/disk1 /tmp/disk6
root@freebsd-zfsdemo-test:/tank # zpool status tank
  pool: tank
 state: ONLINE
  scan: resilvered 96K in 0h0m with 0 errors on Wed May  2 17:07:43 2018
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            /tmp/disk5  ONLINE       0     0     0
            /tmp/disk6  ONLINE       0     0     0
          mirror-1      ONLINE       0     0     0
            /tmp/disk2  ONLINE       0     0     0
            /tmp/disk3  ONLINE       0     0     0

errors: No known data errors
root@freebsd-zfsdemo-test:/tank # zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   113K  79.9M    23K  /tank
root@freebsd-zfsdemo-test:/tank # zpool online -e tank /tmp/disk5
root@freebsd-zfsdemo-test:/tank # zfs list tank
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   113K   144M    23K  /tank

As you can see, we were able to extend our storage in-place and on-the-fly without any downtime.
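As a side note: if you would rather not run zpool online -e by hand after every replacement, there is an autoexpand pool property (off by default, to the best of my knowledge) that grows the pool automatically when disks replaced later on turn out to be bigger:

zpool set autoexpand=on tank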

View history

Another nice feature of ZFS is that it logs all of the ZFS commands you have executed, so you can retrace your steps, or show somebody else what you did, when something goes wrong.

root@freebsd-zfsdemo-test:/tank # zpool history tank
History for 'tank':
2018-05-02.16:43:07 zpool create tank /tmp/disk0
2018-05-02.16:43:14 zfs set compression=lz4 tank
2018-05-02.16:47:09 zpool attach tank /tmp/disk0 /tmp/disk1
2018-05-02.16:50:26 zpool add tank mirror /tmp/disk2 /tmp/disk3
2018-05-02.16:57:42 zpool scrub tank
2018-05-02.17:04:55 zpool replace tank /tmp/disk0 /tmp/disk5
2018-05-02.17:07:48 zpool replace tank /tmp/disk1 /tmp/disk6
2018-05-02.17:19:40 zpool online -e tank /tmp/disk5
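If you need more detail, zpool history also takes flags: -l adds the user and hostname for each command, and -i includes internally-logged events as well.

zpool history -il tank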

Destroy your data

When you don’t like having your data anymore, for example because you reached the end of some tutorial, you can destroy the pool together with your data using the following command.

sudo zpool destroy -f tank

Thank you for reading.