TL;DR — Quick Summary
ZFS snapshots provide instant, space-efficient data protection. Learn pools, datasets, rollback, send/receive replication, and sanoid automation for production.
ZFS snapshots deliver near-instant, space-efficient data protection that no traditional backup tool can match at the filesystem level. This guide covers everything from ZFS fundamentals through automated retention policies with sanoid, giving you a production-ready backup strategy you can deploy today.
Prerequisites
- Linux server running Ubuntu 22.04/24.04, Debian 12, or RHEL 9/Fedora
- One or more block devices (physical disks, partitions, or virtual disks for testing)
- Root or sudo access
- Basic familiarity with Linux command line
ZFS Fundamentals
ZFS was designed at Sun Microsystems and combines a volume manager, RAID controller, and filesystem into one coherent layer. Three concepts underpin everything:
Copy-on-Write (CoW): ZFS never overwrites existing data. When a block changes, ZFS writes the new version to a new location and updates the pointer. The old block remains until the transaction commits. This is how snapshots are instant — a snapshot is just a pointer to the current block tree.
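The pointer-swap mechanics can be sketched with a toy model (pure illustration: the class names, block IDs, and dict layout here are invented and bear no relation to ZFS's on-disk format, which is a Merkle tree of block pointers):

```python
class CowFilesystem:
    """Toy copy-on-write store: a snapshot is a frozen copy of the pointer map."""

    def __init__(self):
        self.blocks = {}      # block_id -> data, immutable once written
        self.live = {}        # filename -> block_id (the current tree)
        self.snapshots = {}   # snapshot name -> frozen pointer map
        self._next = 0

    def write(self, name, data):
        # CoW: never overwrite; allocate a new block and repoint.
        self.blocks[self._next] = data
        self.live[name] = self._next
        self._next += 1

    def snapshot(self, snapname):
        # O(number of pointers), zero data copied: this is why it is instant.
        self.snapshots[snapname] = dict(self.live)

    def read(self, name, snapshot=None):
        tree = self.snapshots[snapshot] if snapshot else self.live
        return self.blocks[tree[name]]

fs = CowFilesystem()
fs.write("config", "v1")
fs.snapshot("before-upgrade")
fs.write("config", "v2")   # old block untouched; the snapshot still sees v1
print(fs.read("config"))                    # v2
print(fs.read("config", "before-upgrade"))  # v1
```

The snapshot cost no space at creation; only the divergent block (v2) consumed new storage.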
Pools, vdevs, and datasets:
- A pool (zpool) is the top-level storage container built from one or more vdevs.
- A vdev (virtual device) defines redundancy: single disk, mirror, raidz, raidz2, or raidz3.
- A dataset is a mountable filesystem inside the pool, inheriting and overriding pool properties.
Checksums and self-healing: Every block stores a checksum. On read, ZFS verifies the checksum and, if the pool has redundancy, automatically repairs corrupt blocks from the redundant copy — silently, without administrator intervention.
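The verify-then-repair loop can be sketched the same way (again a toy: real ZFS stores checksums in parent block pointers, not beside the data, and uses fletcher4 or sha256 per the dataset's checksum property):

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class MirrorBlock:
    """Toy two-way mirror: serve a verified copy and heal the bad one."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.cksum = checksum(data)

    def read(self) -> bytes:
        for copy in self.copies:
            if checksum(bytes(copy)) == self.cksum:
                # Self-heal: rewrite any sibling copy that fails verification.
                for j, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.cksum:
                        self.copies[j] = bytearray(copy)
                return bytes(copy)
        raise IOError("unrecoverable: all copies corrupt")

blk = MirrorBlock(b"important data")
blk.copies[0][0] ^= 0xFF                 # simulate bit rot on disk 0
assert blk.read() == b"important data"   # served from the good copy
assert bytes(blk.copies[0]) == b"important data"  # and disk 0 repaired
```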
Creating ZFS Pools
Install ZFS
# Ubuntu / Debian
sudo apt install zfsutils-linux -y
# RHEL 9 / Fedora (via DKMS)
sudo dnf install https://zfsonlinux.org/epel/zfs-release-2-3$(rpm --eval "%{dist}").noarch.rpm -y
sudo dnf install zfs -y
sudo modprobe zfs
Pool Topologies
# Single disk (no redundancy — test/dev only)
sudo zpool create tank /dev/sdb
# Mirror (RAID-1 equivalent — 2+ disks, 1 can fail)
sudo zpool create tank mirror /dev/sdb /dev/sdc
# RAIDZ1 (RAID-5 equivalent — 3+ disks, 1 parity)
sudo zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd
# RAIDZ2 (RAID-6 equivalent — 4+ disks, 2 parity) — recommended for production
sudo zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# RAIDZ3 — 3 parity disks, maximum resilience
sudo zpool create tank raidz3 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
# Mixed vdev: 2 mirrors + L2ARC SSD cache
sudo zpool create tank \
mirror /dev/sdb /dev/sdc \
mirror /dev/sdd /dev/sde \
cache /dev/nvme0n1
Check pool status:
zpool status tank
zpool list
Dataset Management
Datasets are where you tune ZFS behavior per workload:
# Create a dataset
zfs create tank/data
# Custom mountpoint
zfs create -o mountpoint=/srv/web tank/web
# Enable LZ4 compression (nearly always a win — fast and effective)
zfs set compression=lz4 tank/data
# Enable deduplication (RAM-intensive — roughly 5 GB RAM per TB of deduped data)
zfs set dedup=on tank/vms
# Quota — hard limit on dataset size
zfs set quota=500G tank/data
# Reservation — guarantee minimum space
zfs set reservation=100G tank/databases
# Tune recordsize for database workloads
zfs set recordsize=16K tank/databases
zfs set recordsize=1M tank/backup # Large sequential files
# Inspect key properties on a dataset
zfs get compression,quota,recordsize tank/data
Snapshots: Creation and Management
Taking Snapshots
# Single dataset snapshot (naming convention: @YYYY-MM-DD or @YYYY-MM-DDTHH:MM)
zfs snapshot tank/data@2026-03-23
# Recursive snapshot — captures dataset and all children at the same moment
zfs snapshot -r tank@2026-03-23T03:00
# Snapshot with descriptive label
zfs snapshot tank/data@before-nginx-upgrade
Listing Snapshots
# All snapshots
zfs list -t snapshot
# Snapshots for a specific dataset
zfs list -t snapshot -r tank/data
# Show space used by each snapshot
zfs list -t snapshot -o name,used,referenced,written tank/data
The used column shows how much space a snapshot consumes (the delta between when it was taken and now). A brand-new snapshot uses ~zero space.
Snapshot Space Usage
# Show how much would be freed by destroying a snapshot
zfs list -t snapshot -o name,used tank/data
# Overall space accounting
zfs list -o name,used,available,refer tank
Rollback
Rollback reverts a dataset to a prior snapshot state. It destroys all data written after the snapshot.
# Roll back to most recent snapshot only
zfs rollback tank/data@2026-03-23
# Roll back across intermediate snapshots (destroys everything in between)
zfs rollback -r tank/data@before-nginx-upgrade
Warning: zfs rollback -r destroys ALL intermediate snapshots between the target and the current state. If you need those snapshots, clone the current state first (see Clones below) before rolling back.
The -r flag is required when intermediate snapshots exist between the target and the current filesystem. Without it, ZFS refuses to roll back.
Clones: Safe Rollback Alternative
A clone creates a writable dataset from a snapshot without touching the source:
# Create a clone from a snapshot
zfs clone tank/data@2026-03-23 tank/data-clone
# Mount and verify the clone
ls /tank/data-clone
# If clone should become the primary, promote it
zfs promote tank/data-clone
After zfs promote, the clone becomes independent and the original dataset becomes a dependent. This is how you “apply” a snapshot without destroying current data.
Send and Receive: Replication
zfs send serializes a snapshot stream; zfs receive imports it. This is the backbone of ZFS replication.
Initial Full Send
# Local copy
zfs send tank/data@2026-03-23 | zfs receive backup/data
# Remote replication over SSH
zfs send tank/data@2026-03-23 | ssh user@backup-server zfs receive backup/data
Incremental Send
# -i: send delta between two specific snapshots
zfs send -i tank/data@2026-03-22 tank/data@2026-03-23 | ssh user@backup-server zfs receive backup/data
# -I: send all intermediate snapshots (capital I)
zfs send -I tank/data@2026-03-20 tank/data@2026-03-23 | ssh user@backup-server zfs receive backup/data
Compressed and Encrypted Send
# -c: transfer already-compressed blocks without CPU overhead on sender
zfs send -c -i tank/data@snap1 tank/data@snap2 | ssh user@remote zfs receive backup/data
# -w: raw encrypted stream (for natively encrypted datasets)
zfs send -w tank/data@snap1 | ssh user@remote zfs receive backup/data
Native Encryption
# Create an encrypted dataset
zfs create -o encryption=aes-256-gcm -o keylocation=prompt -o keyformat=passphrase tank/secrets
# Load key on pool import
zfs load-key tank/secrets
# Send raw encrypted stream (remote server never sees plaintext)
zfs send -w tank/secrets@snap1 | ssh user@remote zfs receive backup/secrets
Automated Snapshots with sanoid
sanoid manages snapshot creation and retention; syncoid handles send/receive replication. Both are part of the same package.
Install sanoid
# Ubuntu / Debian
sudo apt install sanoid -y
# From source (latest)
git clone https://github.com/jimsalterjrs/sanoid.git
cd sanoid && sudo make install
Configure Retention Policy
Edit /etc/sanoid/sanoid.conf:
[tank/data]
use_template = production
[tank/databases]
use_template = production
recursive = yes
[template_production]
frequently = 0
hourly = 24
daily = 30
weekly = 8
monthly = 12
yearly = 0
autosnap = yes
autoprune = yes
This keeps 24 hourly, 30 daily, 8 weekly, and 12 monthly snapshots automatically.
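The bucketing behind such a policy can be sketched in a few lines (an illustration of the retention idea only, not sanoid's actual implementation; the "keep the newest snapshot of each of the N most recent periods" rule is my simplification):

```python
from datetime import datetime, timedelta

POLICY = {"hourly": 24, "daily": 30, "weekly": 8, "monthly": 12}

def period_key(ts: datetime, period: str):
    """Collapse a timestamp into its retention bucket."""
    if period == "hourly":
        return (ts.year, ts.month, ts.day, ts.hour)
    if period == "daily":
        return (ts.year, ts.month, ts.day)
    if period == "weekly":
        return tuple(ts.isocalendar()[:2])   # (ISO year, ISO week)
    if period == "monthly":
        return (ts.year, ts.month)

def keep(snapshots):
    """Return the set of snapshot timestamps the policy retains."""
    retained = set()
    for period, count in POLICY.items():
        buckets = {}
        for ts in sorted(snapshots, reverse=True):  # newest first wins the bucket
            buckets.setdefault(period_key(ts, period), ts)
        retained.update(sorted(buckets.values(), reverse=True)[:count])
    return retained

now = datetime(2026, 3, 23, 12, 0)
snaps = [now - timedelta(hours=h) for h in range(24 * 90)]  # 90 days of hourlies
kept = keep(snaps)
print(len(kept), "of", len(snaps), "snapshots retained")
```

Note how a snapshot can satisfy several periods at once (today's newest hourly is also the newest daily, weekly, and monthly), so the retained set is smaller than the sum of the policy counts.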
Test and Enable
# Simulate a run without taking or pruning anything
sudo sanoid --cron --verbose --readonly
# Enable systemd timer
sudo systemctl enable --now sanoid.timer
Syncoid for Replication
# Replicate tank/data to remote server
sudo syncoid tank/data user@backup-server:backup/data
# Recursive replication
sudo syncoid -r tank user@backup-server:backup
Add syncoid to a daily cron or systemd timer for automated off-site replication.
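One way to wire that up with systemd (a sketch: the unit names, the syncoid path, and the target dataset are placeholders to adjust for your layout; the sanoid package does not ship these units):

```ini
# /etc/systemd/system/syncoid-data.service
[Unit]
Description=Replicate tank/data to backup server

[Service]
Type=oneshot
ExecStart=/usr/sbin/syncoid tank/data user@backup-server:backup/data
```

```ini
# /etc/systemd/system/syncoid-data.timer
[Unit]
Description=Daily syncoid replication

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with sudo systemctl enable --now syncoid-data.timer. Persistent=true makes systemd catch up on a missed run after downtime, which plain cron does not do.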
Custom Cron Alternative
#!/bin/bash
# /usr/local/bin/zfs-snapshot.sh
set -euo pipefail
DATASET="tank/data"
SNAP="${DATASET}@$(date +%Y-%m-%dT%H:%M)"
zfs snapshot "$SNAP"
# Prune: keep only the newest 48 snapshots (-H drops the header line,
# -n1 issues one zfs destroy per snapshot)
zfs list -H -t snapshot -o name -s creation "$DATASET" | head -n -48 | xargs -r -n1 zfs destroy
Then run it hourly from root's crontab:
0 * * * * /usr/local/bin/zfs-snapshot.sh
Scrub Scheduling
Scrubs verify every block’s checksum across the entire pool and repair any errors using redundant copies:
# Manual scrub
sudo zpool scrub tank
# Check scrub status
zpool status tank | grep scan
# Schedule monthly scrub via systemd (timer ships with recent OpenZFS packages)
sudo systemctl enable --now zfs-scrub-monthly@tank.timer
# Or via cron
0 2 1 * * zpool scrub tank
Review scrub output in zpool status — look for scan: scrub repaired 0B meaning no errors.
ZFS vs Alternative Snapshot Technologies
| Feature | ZFS | Btrfs | LVM Snapshots | ext4+LVM | XFS |
|---|---|---|---|---|---|
| Instant snapshots | Yes | Yes | Yes | Yes (via LVM) | No native |
| Recursive snapshots | Yes (-r) | Subvolumes only | No | No | No |
| Send/Receive replication | Yes (native) | Yes (btrfs send) | No | No | No |
| Self-healing checksums | Yes | Yes | No | No | No |
| Native encryption | Yes (OpenZFS 0.8+) | No (use dm-crypt/fscrypt) | Via dm-crypt | Via dm-crypt | Via dm-crypt |
| RAID built-in | Yes (raidz) | Yes (raid1/5/6) | No | No | No |
| Deduplication | Yes (RAM heavy) | Yes (limited) | No | No | No |
| Production maturity | High | Medium | High | High | High |
| Snapshot overhead | Very low | Low | Low–Medium | Low–Medium | N/A |
ZFS wins on feature depth and maturity. Btrfs is the Linux-native alternative with improving stability. LVM snapshots work on any filesystem but lack replication and checksums.
Production Backup Strategy with sanoid + syncoid
A practical three-tier strategy for a production server:
Tier 1 — Local snapshots (sanoid): Hourly for 24 hours, daily for 30 days, weekly for 8 weeks. Enables fast rollback from accidents and software updates.
Tier 2 — Remote server replication (syncoid): Daily full sync to an on-site backup server over LAN. Near-instant recovery from hardware failure.
Tier 3 — Off-site replication (syncoid over SSH): Weekly encrypted send to a cloud VM or colocation server. Protection against site-level disasters.
# /etc/sanoid/sanoid.conf — production template
[template_production]
hourly = 24
daily = 30
weekly = 8
monthly = 6
autosnap = yes
autoprune = yes
[tank/data]
use_template = production
[tank/databases]
use_template = production
recursive = yes
# /etc/cron.d/syncoid
# Tier 2: local NAS
30 2 * * * root syncoid -r tank nas-server:backup
# Tier 3: off-site
0 3 * * 0 root syncoid -r tank user@offsite.example.com:backup
Gotchas and Edge Cases
Deduplication RAM requirement: ZFS dedup requires a deduplication table (DDT) in RAM — roughly 5 GB per TB of deduplicated data. Enabling dedup without sufficient RAM causes swap thrashing. Use dedup only for VM image stores or backup repos with high redundancy.
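The rule of thumb follows directly from record counts. A quick back-of-envelope (the ~320-byte per-DDT-entry figure is a commonly cited estimate, not a guarantee, and actual overhead varies by pool layout):

```python
def ddt_ram_gib(data_tib: float, recordsize_kib: int = 128,
                ddt_entry_bytes: int = 320) -> float:
    """Estimate DDT core memory for a given amount of unique data."""
    blocks = data_tib * 2**40 / (recordsize_kib * 2**10)  # number of records
    return blocks * ddt_entry_bytes / 2**30               # bytes -> GiB

# 1 TiB of unique data at the default 128 KiB recordsize:
print(round(ddt_ram_gib(1), 2))       # 2.5 GiB
# Small records inflate the table fast — at 16 KiB records:
print(round(ddt_ram_gib(1, 16), 2))   # 20.0 GiB
```

This is why dedup on small-recordsize datasets (databases, VM images with recordsize tuned down) is far more expensive per TB than the headline figure suggests.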
Snapshot accumulation: Forgotten snapshots hold blocks that ZFS cannot reclaim. A pool that appears “full” but has little referenced data usually has large used values on old snapshots. Always prune with sanoid or explicit zfs destroy.
Rollback on mounted datasets: You cannot roll back a snapshot while a filesystem is busy. Unmount or use the -f flag to force, but be aware this forcibly unmounts any consumers.
Import on different hardware: ZFS pools are portable — zpool export tank then zpool import tank on any machine. Pool GUIDs prevent accidental double-imports.
ARC sizing on Linux: ZFS uses an Adaptive Replacement Cache (ARC) that can consume most available RAM. Set a cap in /etc/modprobe.d/zfs.conf: options zfs zfs_arc_max=4294967296 (4 GB). Tune per server based on workload.
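For example (the paths are the standard OpenZFS module parameter locations; zfs_arc_max is also a runtime tunable, so no reboot is required):

```shell
# Persist the cap across reboots (value is in bytes)
echo "options zfs zfs_arc_max=4294967296" | sudo tee /etc/modprobe.d/zfs.conf

# Apply immediately to the running module
echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max

# Verify current ARC size (size) against the limit (c_max)
awk '/^(size|c_max)/ {print $1, $3}' /proc/spl/kstat/zfs/arcstats
```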
Summary
- ZFS copy-on-write makes snapshots instant and space-efficient — a snapshot costs nothing until data diverges.
- Use raidz2 for production pools; mirror for two-disk setups.
- Tune recordsize and compression=lz4 per dataset workload.
- zfs snapshot -r takes atomic recursive snapshots; zfs rollback -r destroys intermediates, so prefer clones for safe testing.
- zfs send | zfs receive over SSH provides native, encrypted, incremental replication.
- sanoid + syncoid is the production standard for automated retention and replication.
- Schedule monthly scrubs to catch silent data corruption early.