ZFS Storage Pool Management on Linux: Complete Guide
ZFS is a combined file system and volume manager originally developed by Sun Microsystems. Unlike traditional Linux file systems like ext4 or XFS, ZFS manages disks, redundancy, compression, snapshots, and integrity checking as a unified system. Every block written to disk is checksummed, and ZFS automatically detects and repairs silent data corruption, something hardware RAID controllers, which lack end-to-end checksums, cannot do.
On Linux, ZFS is available through the OpenZFS project and ships in the default Ubuntu repositories. This guide covers everything you need to create, manage, and maintain ZFS storage pools in production.
Prerequisites
Before you begin, make sure you have:
- Ubuntu Server 22.04 LTS or 24.04 LTS (Debian and other distros work with minor adjustments).
- Root or sudo access to the server.
- Two or more unused disks or block devices. These can be physical drives, virtual disks, or partitions (though whole-disk use is preferred).
- RAM: ZFS uses memory aggressively for caching. A minimum of 2 GB is workable, but 8 GB or more is recommended for production pools, especially when using deduplication.
Check your kernel version:
uname -r
ZFS requires a compatible kernel. Ubuntu's stock kernels, both the GA and HWE series, work out of the box.
Installing ZFS on Ubuntu
Install the ZFS userspace tools and kernel module:
sudo apt update
sudo apt install zfsutils-linux -y
Verify the installation:
zfs version
You should see both the ZFS userland version and the kernel module version. Confirm the kernel module is loaded:
lsmod | grep zfs
If the module is not loaded, load it manually:
sudo modprobe zfs
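On Ubuntu, zfsutils-linux normally arranges for the module to load at boot. If it does not on your system, one common approach is a systemd modules-load.d entry (the file name here is arbitrary):
echo zfs | sudo tee /etc/modules-load.d/zfs.conf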
Identifying and Preparing Disks
List all block devices on the system:
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,MODEL
For production pools, always reference disks by their persistent identifiers to avoid device name changes after reboot:
ls -la /dev/disk/by-id/ | grep -v part
Before creating a pool, wipe any existing partition tables or filesystem signatures:
sudo sgdisk --zap-all /dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBxxxxxxxx
Repeat for each disk you plan to use.
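If sgdisk is not available, wipefs from util-linux does a similar job of clearing filesystem and RAID signatures (substitute your own by-id path):
sudo wipefs --all /dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBxxxxxxxx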
Creating a ZFS Storage Pool
Mirror Pool (RAID 1 Equivalent)
A mirror writes identical copies of every block to two or more disks. With a two-way mirror you give up 50% of raw capacity in exchange for full redundancy:
sudo zpool create datapool mirror \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2
RAIDZ1 Pool (RAID 5 Equivalent)
RAIDZ1 distributes data and one parity block across the disks, and requires a minimum of three disks:
sudo zpool create datapool raidz \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3
RAIDZ2 Pool (RAID 6 Equivalent)
RAIDZ2 uses double parity and requires at least four disks:
sudo zpool create datapool raidz2 \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3 \
/dev/disk/by-id/scsi-SATA_disk4
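If you want to sanity-check a layout before committing, zpool create accepts a -n flag that prints the resulting configuration without writing anything to the disks:
sudo zpool create -n datapool raidz2 \
/dev/disk/by-id/scsi-SATA_disk1 \
/dev/disk/by-id/scsi-SATA_disk2 \
/dev/disk/by-id/scsi-SATA_disk3 \
/dev/disk/by-id/scsi-SATA_disk4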
Verifying the Pool
After creation, check the pool status:
zpool status datapool
The output shows each vdev, its component disks, and the health state. View pool capacity and usage:
zpool list
By default, ZFS mounts the pool at /<pool_name>. Confirm:
df -h | grep datapool
Configuring Pool and Dataset Properties
Setting Compression
LZ4 compression is fast and provides meaningful space savings with negligible CPU overhead. Enable it on the pool:
sudo zfs set compression=lz4 datapool
Check the compression ratio:
zfs get compressratio datapool
For workloads with highly compressible data (logs, text), you can use zstd for better ratios:
sudo zfs set compression=zstd datapool/logs
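OpenZFS 2.x also accepts explicit zstd levels (zstd-1 through zstd-19) if you want to trade CPU time for a better ratio, for example on a cold-archive dataset:
sudo zfs set compression=zstd-9 datapool/backups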
Mount Points
Change the mount point for a dataset (the pool name itself refers to the pool's root dataset):
sudo zfs set mountpoint=/srv/data datapool
Record Size
For database workloads, match the ZFS record size to the database page size (8 KiB for PostgreSQL, 16 KiB for MySQL's InnoDB). For PostgreSQL:
sudo zfs set recordsize=8K datapool/postgres
For general file storage, the default 128K record size works well.
Creating and Managing Datasets
Datasets are independent filesystems within a pool. They share the pool’s storage but each has its own properties, snapshots, and quotas.
Creating Datasets
sudo zfs create datapool/documents
sudo zfs create datapool/vms
sudo zfs create datapool/logs
sudo zfs create datapool/backups
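Properties can also be set at creation time with -o, which avoids a window where data lands with default settings. For example, a PostgreSQL dataset matching the record-size advice above:
sudo zfs create -o recordsize=8K -o compression=lz4 datapool/postgres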
Setting Quotas and Reservations
Limit a dataset to a maximum of 500 GB:
sudo zfs set quota=500G datapool/logs
Guarantee a minimum of 200 GB for a dataset:
sudo zfs set reservation=200G datapool/vms
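Verify both settings at once:
zfs get quota,reservation datapool/logs datapool/vms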
Listing All Datasets
zfs list -o name,used,avail,refer,mountpoint
Output example:
NAME                 USED  AVAIL  REFER  MOUNTPOINT
datapool             120G  1.80T   128K  /datapool
datapool/backups    45.2G  1.80T  45.2G  /datapool/backups
datapool/documents  12.8G  1.80T  12.8G  /datapool/documents
datapool/logs       8.50G   491G  8.50G  /datapool/logs
datapool/vms        53.5G  1.80T  53.5G  /datapool/vms
Snapshots and Rollbacks
Snapshots are one of ZFS’s most powerful features. They capture the exact state of a dataset at a point in time using copy-on-write semantics, meaning they are instantaneous and initially consume no additional space.
Creating Snapshots
sudo zfs snapshot datapool/documents@2026-01-03
Create recursive snapshots for all child datasets:
sudo zfs snapshot -r datapool@daily-2026-01-03
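To take snapshots on a schedule, purpose-built tools like sanoid or zfs-auto-snapshot are the usual choice, but a minimal root cron entry works too. A sketch, assuming a file in /etc/cron.d and the Ubuntu zfs binary path (note that % must be escaped in crontabs):
0 2 * * * root /usr/sbin/zfs snapshot -r datapool@daily-$(date +\%F)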
Listing Snapshots
zfs list -t snapshot -o name,used,creation
Accessing Snapshot Data
Each dataset has a hidden .zfs/snapshot directory:
ls /datapool/documents/.zfs/snapshot/2026-01-03/
You can copy individual files out of a snapshot without performing a full rollback.
Rolling Back to a Snapshot
sudo zfs rollback datapool/documents@2026-01-03
Rolling back discards all changes made after the snapshot. If snapshots more recent than the target exist, ZFS refuses the rollback unless you pass -r, which destroys those intervening snapshots:
sudo zfs rollback -r datapool/documents@2026-01-03
Destroying Old Snapshots
sudo zfs destroy datapool/documents@2026-01-03
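zfs destroy also understands a % range syntax for pruning several snapshots at once, and the -n and -v flags preview what would be removed without deleting anything. Assuming daily snapshots from the 1st through the 3rd exist:
sudo zfs destroy -nv datapool/documents@2026-01-01%2026-01-03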
ZFS Send and Receive (Replication)
ZFS can serialize a snapshot into a stream and send it to another pool, machine, or file. This is the foundation for offsite backups and disaster recovery.
Local Replication
sudo zfs send datapool/documents@2026-01-03 | sudo zfs receive backuppool/documents
Incremental Replication
After an initial full send, subsequent transfers only send the differences:
sudo zfs snapshot datapool/documents@2026-01-04
sudo zfs send -i datapool/documents@2026-01-03 datapool/documents@2026-01-04 \
| sudo zfs receive backuppool/documents
Remote Replication over SSH
sudo zfs send -i datapool/documents@2026-01-03 datapool/documents@2026-01-04 \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents
For large transfers, pipe through pv to monitor throughput:
sudo zfs send datapool/documents@2026-01-04 \
| pv \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents
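If the source dataset is compressed, adding -c to zfs send transmits blocks in their on-disk compressed form instead of decompressing them first, which can cut transfer size substantially:
sudo zfs send -c datapool/documents@2026-01-04 \
| ssh backupuser@remote-server sudo zfs receive backuppool/documents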
Scrubbing and Data Integrity
Scrubbing reads every block in the pool and verifies it against its stored checksum. If corruption is found and a redundant copy exists (mirror or RAIDZ), ZFS automatically repairs the damaged block.
Running a Scrub
sudo zpool scrub datapool
Monitor scrub progress:
zpool status datapool
The output shows percentage complete, estimated time remaining, and any errors found.
Automating Scrubs with a Systemd Timer
Create the service unit:
sudo tee /etc/systemd/system/zfs-scrub@.service > /dev/null << 'EOF'
[Unit]
Description=ZFS scrub on %i
[Service]
Type=oneshot
ExecStart=/sbin/zpool scrub %i
EOF
Create the timer:
sudo tee /etc/systemd/system/zfs-scrub@.timer > /dev/null << 'EOF'
[Unit]
Description=Monthly ZFS scrub on %i
[Timer]
OnCalendar=monthly
Persistent=true
[Install]
WantedBy=timers.target
EOF
Enable and start the timer:
sudo systemctl daemon-reload
sudo systemctl enable --now zfs-scrub@datapool.timer
Verify the timer is active:
systemctl list-timers | grep zfs
Adding Devices to an Existing Pool
Adding a New Mirror Vdev
sudo zpool add datapool mirror \
/dev/disk/by-id/scsi-SATA_disk5 \
/dev/disk/by-id/scsi-SATA_disk6
Adding a Cache Device (L2ARC)
An SSD as a read cache accelerates random reads:
sudo zpool add datapool cache /dev/disk/by-id/nvme-SSD_cache1
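Cache devices hold no unique data, so they can be removed again at any time without risk:
sudo zpool remove datapool /dev/disk/by-id/nvme-SSD_cache1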
Adding a Log Device (SLOG)
A fast SSD as a separate intent log improves synchronous write performance:
sudo zpool add datapool log mirror \
/dev/disk/by-id/nvme-SSD_log1 \
/dev/disk/by-id/nvme-SSD_log2
Always mirror SLOG devices: losing an unmirrored SLOG during a crash or power failure can lose the most recent synchronous writes.
Replacing a Failed Disk
When a disk fails, zpool status reports the pool as DEGRADED and the failed device as FAULTED or UNAVAIL. Replace it:
sudo zpool replace datapool \
/dev/disk/by-id/scsi-SATA_failed_disk \
/dev/disk/by-id/scsi-SATA_new_disk
Monitor the resilver progress:
zpool status datapool
Resilvering only copies the actual data blocks, not the entire disk, making it significantly faster than traditional RAID rebuilds.
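Relatedly, if a disk is failing but not yet dead, take it offline before physically pulling it so ZFS stops issuing I/O to it, then run the zpool replace command above once the new drive is installed:
sudo zpool offline datapool /dev/disk/by-id/scsi-SATA_failed_disk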
Monitoring and Troubleshooting
Pool Health Check
zpool status -v datapool
Key states to watch for:
| State | Meaning |
|---|---|
| ONLINE | Healthy, fully operational |
| DEGRADED | Running with reduced redundancy |
| FAULTED | Pool cannot be used; too many disks failed |
| UNAVAIL | Device cannot be opened |
I/O Statistics
zpool iostat datapool 5
This displays read/write bandwidth and IOPS every 5 seconds.
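Add -v to break the statistics down per vdev and per disk, which helps spot a single slow device:
zpool iostat -v datapool 5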
ZFS Event Daemon
View the most recent ZFS events:
sudo zpool events -v | tail -20
Add the -f flag to zpool events to follow new events as they arrive.
Common Issues and Fixes
Pool not importing after reboot:
sudo zpool import
sudo zpool import datapool
Dataset not mounting:
sudo zfs mount -a
Clearing transient errors after fixing the underlying issue:
sudo zpool clear datapool
Performance Tuning
ARC Size Tuning
ZFS uses a portion of RAM as its Adaptive Replacement Cache (ARC). Check current ARC usage:
cat /proc/spl/kstat/zfs/arcstats | grep -E "^size|^c_max"
Limit ARC to 4 GB (useful on shared servers):
echo "options zfs zfs_arc_max=4294967296" | sudo tee /etc/modprobe.d/zfs.conf
sudo update-initramfs -u
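The modprobe.d setting takes effect at the next boot. To apply a new limit immediately (it does not persist across reboots), write to the module parameter directly:
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max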
Ashift (Sector Size Alignment)
Modern drives use 4K physical sectors. Set ashift at pool creation time:
sudo zpool create -o ashift=12 datapool mirror disk1 disk2
An ashift of 12 means 4096-byte sectors. Using the wrong ashift causes severe performance degradation and cannot be changed after pool creation.
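On OpenZFS 2.x, zpool get ashift reports the pool-level property (0 means auto-detect); to see what the vdevs actually use, query the cached pool configuration with zdb:
zpool get ashift datapool
sudo zdb -C datapool | grep ashift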
Summary
ZFS provides a robust, self-healing storage solution for Linux servers that combines volume management, redundancy, compression, and snapshotting into a single coherent system. The key practices for production use are:
- Always use disk-by-id paths when creating pools.
- Enable LZ4 compression by default for most workloads.
- Take regular snapshots and test restores.
- Schedule monthly scrubs to catch and repair silent corruption.
- Mirror your SLOG and avoid running RAIDZ1 on pools with large disks where rebuild times exceed 24 hours.
- Monitor pool health with zpool status and integrate it into your alerting system.
With proper planning and regular maintenance, a ZFS pool can reliably serve data for years while protecting against disk failures and bit rot that would go undetected on traditional file systems.