ZFS Storage Pool Management on Linux: Complete Guide
ZFS is a combined file system and volume manager originally developed by Sun Microsystems. Unlike traditional Linux file systems like ext4 or XFS, ZFS manages disks, redundancy, compression, snapshots, and integrity checking as a unified system. Every block written to disk is checksummed, and ZFS automatically detects and repairs silent data corruption, something hardware RAID controllers, which lack end-to-end checksums, cannot do.
On Linux, ZFS is available through the OpenZFS project and ships in the default Ubuntu repositories. This guide covers everything you need to create, manage, and maintain ZFS storage pools in production.
Prerequisites
Before you begin, make sure you have:
- Ubuntu Server 22.04 LTS or 24.04 LTS (Debian and other distros work with minor adjustments).
- Root or sudo access to the server.
- Two or more unused disks or block devices. These can be physical drives, virtual disks, or partitions (though whole-disk use is preferred).
- RAM: ZFS uses memory aggressively for caching. A minimum of 2 GB is workable, but 8 GB or more is recommended for production pools, especially when using deduplication.
Check your kernel version:
uname -r
ZFS requires a compatible kernel. Ubuntu's stock kernels, both the GA and HWE series, work out of the box.
Installing ZFS on Ubuntu
Install the ZFS userspace tools and kernel module:
sudo apt update
sudo apt install zfsutils-linux -y
Verify the installation:
zfs version
You should see both the ZFS userland version and the kernel module version. Confirm the kernel module is loaded:
lsmod | grep zfs
If the module is not loaded, load it manually:
sudo modprobe zfs
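On Ubuntu, zfsutils-linux normally arranges for the module to load at boot. If it does not on your system, one common approach is a systemd modules-load.d entry (the file name here is arbitrary):
echo zfs | sudo tee /etc/modules-load.d/zfs.conf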
Identifying and Preparing Disks
List all block devices on the system:
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,MODEL
For production pools, always reference disks by their persistent identifiers to avoid device name changes after reboot:
ls -la /dev/disk/by-id/ | grep -v part
Before creating a pool, wipe any existing partition tables or filesystem signatures:
sudo sgdisk --zap-all /dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBxxxxxxxx
Repeat for each disk you plan to use.
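If sgdisk is not available, wipefs from util-linux does a similar job of clearing filesystem and RAID signatures (substitute your own by-id path):
sudo wipefs --all /dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBxxxxxxxx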
Creating a ZFS Storage Pool
Mirror Pool (RAID 1 Equivalent)
A mirror writes identical copies of every block to two or more disks. With a two-way mirror you give up 50% of raw capacity in exchange for full redundancy:
sudo zpool create datapool mirror \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2
RAIDZ1 Pool (RAID 5 Equivalent)
RAIDZ1 distributes data and one parity block across the disks, and requires a minimum of three disks:
sudo zpool create datapool raidz \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3
RAIDZ2 Pool (RAID 6 Equivalent)
RAIDZ2 uses double parity and requires at least four disks:
sudo zpool create datapool raidz2 \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3 \
/dev/disk/by-id/scsi-SATA_disk4
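If you want to sanity-check a layout before committing, zpool create accepts a -n flag that prints the resulting configuration without writing anything to the disks:
sudo zpool create -n datapool raidz2 \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3 \
/dev/disk/by-id/scsi-SATA_disk4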
Verifying the Pool
After creation, check the pool status:
zpool status datapool
The output shows each vdev, its component disks, and the health state. View pool capacity and usage:
zpool list
By default, ZFS mounts the pool at /<pool_name>. Confirm:
df -h | grep datapool
Configuring Pool and Dataset Properties
Setting Compression
LZ4 compression is fast and provides meaningful space savings with negligible CPU overhead. Enable it on the pool:
sudo zfs set compression=lz4 datapool
Check the compression ratio:
zfs get compressratio datapool
For workloads with highly compressible data (logs, text), you can use zstd for better ratios:
sudo zfs set compression=zstd datapool/logs
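OpenZFS 2.x also accepts explicit zstd levels (zstd-1 through zstd-19) if you want to trade CPU time for a better ratio, for example on a cold-archive dataset:
sudo zfs set compression=zstd-9 datapool/backups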
Mount Points
Change the mount point for a dataset (the pool name itself refers to the pool's root dataset):
sudo zfs set mountpoint=/srv/data datapool
Record Size
For database workloads, match the ZFS record size to the database page size (8 KiB for PostgreSQL, 16 KiB for MySQL's InnoDB). For PostgreSQL:
sudo zfs set recordsize=8K datapool/postgres
For general file storage, the default 128K record size works well.
Creating and Managing Datasets
Datasets are independent filesystems within a pool. They share the pool’s storage but each has its own properties, snapshots, and quotas.
Creating Datasets
sudo zfs create datapool/documents
sudo zfs create datapool/vms
sudo zfs create datapool/logs
sudo zfs create datapool/backups
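Properties can also be set at creation time with -o, which avoids a window where data lands with default settings. For example, a PostgreSQL dataset matching the record-size advice above:
sudo zfs create -o recordsize=8K -o compression=lz4 datapool/postgres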
Setting Quotas and Reservations
Limit a dataset to a maximum of 500 GB:
sudo zfs set quota=500G datapool/logs
Guarantee a minimum of 200 GB for a dataset:
sudo zfs set reservation=200G datapool/vms
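Verify both settings at once:
zfs get quota,reservation datapool/logs datapool/vms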
Listing All Datasets
zfs list -o name,used,avail,refer,mountpoint
Output example:
NAME                 USED  AVAIL  REFER  MOUNTPOINT
datapool             120G  1.80T   128K  /datapool
datapool/backups    45.2G  1.80T  45.2G  /datapool/backups
datapool/documents  12.8G  1.80T  12.8G  /datapool/documents
datapool/logs       8.50G   491G  8.50G  /datapool/logs
datapool/vms        53.5G  1.80T  53.5G  /datapool/vms
Snapshots and Rollbacks
Snapshots are one of ZFS’s most powerful features. They capture the exact state of a dataset at a point in time using copy-on-write semantics, meaning they are instantaneous and initially consume no additional space.
Creating Snapshots
sudo zfs snapshot datapool/documents@2026-01-03
Create recursive snapshots for all child datasets:
sudo zfs snapshot -r datapool@daily-2026-01-03
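To take snapshots on a schedule, purpose-built tools like sanoid or zfs-auto-snapshot are the usual choice, but a minimal root cron entry works too. A sketch, assuming a file in /etc/cron.d and the Ubuntu zfs binary path (note that % must be escaped in crontabs):
0 2 * * * root /usr/sbin/zfs snapshot -r datapool@daily-$(date +\%F)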
Listing Snapshots
zfs list -t snapshot -o name,used,creation
Accessing Snapshot Data
Each dataset has a hidden .zfs/snapshot directory:
ls /datapool/documents/.zfs/snapshot/2026-01-03/
You can copy individual files out of a snapshot without performing a full rollback.
Rolling Back to a Snapshot
sudo zfs rollback datapool/documents@2026-01-03
Rolling back discards all changes made after the snapshot. If snapshots more recent than the target exist, ZFS refuses the rollback unless you pass -r, which destroys those intervening snapshots:
sudo zfs rollback -r datapool/documents@2026-01-03
Destroying Old Snapshots
sudo zfs destroy datapool/documents@2026-01-03
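zfs destroy also understands a % range syntax for pruning several snapshots at once, and the -n and -v flags preview what would be removed without deleting anything. Assuming daily snapshots from the 1st through the 3rd exist:
sudo zfs destroy -nv datapool/documents@2026-01-01%2026-01-03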
ZFS Send and Receive (Replication)
ZFS can serialize a snapshot into a stream and send it to another pool, machine, or file. This is the foundation for offsite backups and disaster recovery.
Local Replication
sudo zfs send datapool/documents@2026-01-03 | sudo zfs receive backuppool/documents
Incremental Replication
After an initial full send, subsequent transfers only send the differences:
sudo zfs snapshot datapool/documents@2026-01-04
sudo zfs send -i datapool/documents@2026-01-03 datapool/documents@2026-01-04 \
| sudo zfs receive backuppool/documents
Remote Replication over SSH
sudo zfs send -i datapool/documents@2026-01-03 datapool/documents@2026-01-04 \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents
For large transfers, pipe through pv to monitor throughput:
sudo zfs send datapool/documents@2026-01-04 \
| pv \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents
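If the source dataset is compressed, adding -c to zfs send transmits blocks in their on-disk compressed form instead of decompressing them first, which can cut transfer size substantially:
sudo zfs send -c datapool/documents@2026-01-04 \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents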
Scrubbing and Data Integrity
Scrubbing reads every block in the pool and verifies it against its stored checksum. If corruption is found and a redundant copy exists (mirror or RAIDZ), ZFS automatically repairs the damaged block.
Running a Scrub
sudo zpool scrub datapool
Monitor scrub progress:
zpool status datapool
The output shows percentage complete, estimated time remaining, and any errors found.
Automating Scrubs with a Systemd Timer
Create the service unit:
sudo tee /etc/systemd/system/zfs-scrub@.service > /dev/null << 'EOF'
[Unit]
Description=ZFS scrub on %i
[Service]
Type=oneshot
ExecStart=/sbin/zpool scrub %i
EOF
Create the timer:
sudo tee /etc/systemd/system/zfs-scrub@.timer > /dev/null << 'EOF'
[Unit]
Description=Monthly ZFS scrub on %i
[Timer]
OnCalendar=monthly
Persistent=true
[Install]
WantedBy=timers.target
EOF
Enable and start the timer:
sudo systemctl daemon-reload
sudo systemctl enable --now zfs-scrub@datapool.timer
Verify the timer is active:
systemctl list-timers | grep zfs
Adding Devices to an Existing Pool
Adding a New Mirror Vdev
sudo zpool add datapool mirror \
/dev/disk/by-id/scsi-SATA_disk5 \
/dev/disk/by-id/scsi-SATA_disk6
Adding a Cache Device (L2ARC)
An SSD as a read cache accelerates random reads:
sudo zpool add datapool cache /dev/disk/by-id/nvme-SSD_cache1
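Cache devices hold no unique data, so they can be removed again at any time without risk:
sudo zpool remove datapool /dev/disk/by-id/nvme-SSD_cache1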
Adding a Log Device (SLOG)
A fast SSD as a separate intent log improves synchronous write performance:
sudo zpool add datapool log mirror \
/dev/disk/by-id/nvme-SSD_log1 \
/dev/disk/by-id/nvme-SSD_log2
Always mirror SLOG devices: losing an unmirrored SLOG during a crash or power failure can lose the most recent synchronous writes.
Replacing a Failed Disk
When a disk fails, zpool status reports the pool as DEGRADED and the failed device as FAULTED or UNAVAIL. Replace it:
sudo zpool replace datapool \
/dev/disk/by-id/scsi-SATA_failed_disk \
/dev/disk/by-id/scsi-SATA_new_disk
Monitor the resilver progress:
zpool status datapool
Resilvering only copies the actual data blocks, not the entire disk, making it significantly faster than traditional RAID rebuilds.
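Relatedly, if a disk is failing but not yet dead, take it offline before physically pulling it so ZFS stops issuing I/O to it, then run the zpool replace command above once the new drive is installed:
sudo zpool offline datapool /dev/disk/by-id/scsi-SATA_failed_disk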
Monitoring and Troubleshooting
Pool Health Check
zpool status -v datapool
Key states to watch for:
| State | Meaning |
|---|---|
| ONLINE | Healthy, fully operational |
| DEGRADED | Running with reduced redundancy |
| FAULTED | Pool cannot be used; too many disks failed |
| UNAVAIL | Device cannot be opened |
I/O Statistics
zpool iostat datapool 5
This displays read/write bandwidth and IOPS every 5 seconds.
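Add -v to break the statistics down per vdev and per disk, which helps spot a single slow device:
zpool iostat -v datapool 5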
ZFS Event Daemon
View the most recent ZFS events:
sudo zpool events -v | tail -20
Add the -f flag to zpool events to follow new events as they arrive.
Common Issues and Fixes
Pool not importing after reboot:
sudo zpool import
sudo zpool import datapool
Dataset not mounting:
sudo zfs mount -a
Clearing transient errors after fixing the underlying issue:
sudo zpool clear datapool
Performance Tuning
ARC Size Tuning
ZFS uses a portion of RAM as its Adaptive Replacement Cache (ARC). Check current ARC usage:
cat /proc/spl/kstat/zfs/arcstats | grep -E "^size|^c_max"
Limit ARC to 4 GB (useful on shared servers):
echo "options zfs zfs_arc_max=4294967296" | sudo tee /etc/modprobe.d/zfs.conf
sudo update-initramfs -u
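The modprobe.d setting takes effect at the next boot. To apply a new limit immediately (it does not persist across reboots), write to the module parameter directly:
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max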
Ashift (Sector Size Alignment)
Modern drives use 4K physical sectors. Set ashift at pool creation time:
sudo zpool create -o ashift=12 datapool mirror disk1 disk2
An ashift of 12 means 4096-byte sectors. Using the wrong ashift causes severe performance degradation and cannot be changed after pool creation.
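On OpenZFS 2.x, zpool get ashift reports the pool-level property (0 means auto-detect); to see what the vdevs actually use, query the cached pool configuration with zdb:
zpool get ashift datapool
sudo zdb -C datapool | grep ashift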
Summary
ZFS provides a robust, self-healing storage solution for Linux servers that combines volume management, redundancy, compression, and snapshotting into a single coherent system. The key practices for production use are:
- Always use disk-by-id paths when creating pools.
- Enable LZ4 compression by default for most workloads.
- Take regular snapshots and test restores.
- Schedule monthly scrubs to catch and repair silent corruption.
- Mirror your SLOG and avoid running RAIDZ1 on pools with large disks where rebuild times exceed 24 hours.
- Monitor pool health with zpool status and integrate it into your alerting system.
With proper planning and regular maintenance, a ZFS pool can reliably serve data for years while protecting against disk failures and bit rot that would go undetected on traditional file systems.