TL;DR — Quick Summary

ZFS snapshots protect Linux data with zero-cost copy-on-write. Learn to create, roll back, clone, and replicate datasets using zfs send/receive and sanoid.

ZFS snapshots are one of the most powerful data protection tools available on Linux. Unlike traditional backup approaches that copy entire datasets, ZFS snapshots leverage copy-on-write (COW) semantics to capture the exact state of a filesystem at a point in time — instantaneously and with zero initial space cost. This guide covers everything from ZFS fundamentals to production-grade automated snapshot pipelines with sanoid and syncoid.

Prerequisites

  • Ubuntu 22.04/24.04 LTS (or any distro with OpenZFS 2.1+)
  • One or more block devices for testing (real disks or loopback files)
  • sudo access
  • Basic familiarity with Linux block device naming (/dev/sd*, /dev/vd*)

ZFS Core Concepts

Before touching snapshots, it helps to understand three ZFS fundamentals; with them, every command becomes intuitive.

Copy-on-Write (COW): ZFS never overwrites live data in place. When a block is modified, ZFS writes the new version to a free location and then atomically updates the pointer. The old block remains untouched — this is what snapshots exploit.

Pools and Datasets: A pool (zpool) is the top-level storage container formed from one or more vdevs (virtual devices: mirrors, RAIDZ stripes). A dataset (zfs create) is a mountable filesystem within a pool. Snapshots, clones, and volumes are all dataset variants.

Checksums: Every block is protected by a checksum (fletcher4 by default; SHA-256 optionally) stored in its parent block pointer. On read, ZFS verifies the checksum and can self-heal from a redundant copy if the primary block is corrupt. This eliminates silent data corruption, a leading cause of undetected backup failures.
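The COW behavior is easy to observe on a throwaway file-backed pool. A sketch (requires root, and assumes the zfs tools from the next section are already installed; the pool name cowdemo is arbitrary):

```shell
# Create a disposable pool backed by a sparse file
truncate -s 512M /tmp/cowdemo.img
sudo zpool create cowdemo /tmp/cowdemo.img

# Write some data, then snapshot it: the snapshot consumes ~0 bytes
sudo dd if=/dev/urandom of=/cowdemo/blob bs=1M count=100 status=none
sudo zfs snapshot cowdemo@before
zfs list -t snapshot -o name,used cowdemo@before

# Overwrite the data: new blocks land elsewhere, and the snapshot's
# "used" grows because it now pins the old blocks
sudo dd if=/dev/urandom of=/cowdemo/blob bs=1M count=100 conv=notrunc status=none
zfs list -t snapshot -o name,used cowdemo@before

# Clean up
sudo zpool destroy cowdemo
rm /tmp/cowdemo.img
```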


Installing ZFS on Ubuntu

sudo apt update
sudo apt install zfsutils-linux -y
zfs --version
# zfs-2.2.x / zfs-kmod-2.2.x

On Ubuntu, the ZFS kernel module ships prebuilt with the kernel, so no DKMS compilation is needed. If you instead build OpenZFS from source or via DKMS packages, ensure linux-headers-$(uname -r) is installed first.


Creating Pools

Mirror (RAID-1 equivalent)

sudo zpool create tank mirror /dev/sdb /dev/sdc

RAIDZ Variants

# RAIDZ1 — 1 parity disk, tolerates 1 failure
sudo zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd

# RAIDZ2 — 2 parity disks, tolerates 2 failures (recommended for >=6 drives)
sudo zpool create tank raidz2 /dev/sd{b,c,d,e,f,g}

# RAIDZ3 — 3 parity disks, tolerates 3 failures (large arrays only)
sudo zpool create tank raidz3 /dev/sd{b,c,d,e,f,g,h,i}

Check pool status at any time:

zpool status tank
zpool list

Dataset Management

Datasets are where snapshots actually live. Key properties to set at creation time:

# Create dataset with LZ4 compression (transparent, near-zero CPU overhead)
zfs create -o compression=lz4 tank/data

# Set a mountpoint explicitly
zfs set mountpoint=/srv/data tank/data

# Use zstd for better compression ratio (slightly higher CPU)
zfs set compression=zstd tank/data

# Tune recordsize for database workloads
zfs set recordsize=16K tank/postgres    # PostgreSQL (8K pages; 8K or 16K both work)
zfs set recordsize=16K tank/mysql       # MySQL InnoDB (matches 16K page size)

# Apply quotas and reservations
zfs set quota=500G tank/data            # Hard cap — no writes beyond this
zfs set reservation=100G tank/data      # Guarantee 100G always available

Show all properties for a dataset:

zfs get all tank/data

Key monitoring properties:

Property        Meaning
--------        -------
used            Space consumed by the dataset plus its snapshots
available       Free space the dataset can still use
referenced      Space used by the live dataset only (excludes snapshots)
compressratio   Achieved compression ratio (e.g., 2.14x)

Snapshot Fundamentals

Creating Snapshots

# Single snapshot
zfs snapshot tank/data@2026-03-22

# Recursive snapshot (dataset + all children)
zfs snapshot -r tank@2026-03-22

# Name with timestamp from shell
zfs snapshot tank/data@$(date +%Y-%m-%dT%H%M)

Listing Snapshots

zfs list -t snapshot
zfs list -t snapshot -o name,used,referenced,creation

Rolling Back

Rollback reverts the dataset to the snapshot state. Any data written after the snapshot is discarded.

zfs rollback tank/data@2026-03-22

If newer snapshots exist between the current state and the target, add -r to destroy them:

zfs rollback -r tank/data@2026-03-21

Accessing Snapshot Contents Without Rolling Back

Every snapshot is accessible as a hidden .zfs/snapshot/ directory on the dataset mountpoint:

ls /srv/data/.zfs/snapshot/
# 2026-03-22  2026-03-21  2026-03-20

# Restore a single file
cp /srv/data/.zfs/snapshot/2026-03-22/important.sql /srv/data/restored.sql

Destroying Snapshots

# Single snapshot
zfs destroy tank/data@2026-03-20

# Range of snapshots
zfs destroy tank/data@2026-03-01%2026-03-15

# All snapshots of a dataset
zfs destroy -r tank/data@%

Clones: Writable Snapshots

A clone is a writable dataset derived from a snapshot. It starts sharing all blocks with the source and only diverges as writes occur — ideal for staging environments or testing migrations.

# Create a clone
zfs clone tank/data@2026-03-22 tank/data-staging

# Mount it (inherits properties unless overridden)
zfs set mountpoint=/srv/staging tank/data-staging

# Promote: make the clone independent of the source
zfs promote tank/data-staging

After zfs promote, the dependency flips: the original dataset becomes dependent on the clone’s snapshot lineage. This enables migrating a clone into production without duplicating data.
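The dependency flip is visible in the origin property. A sketch using the example names above (the commented values are illustrative, not captured output):

```shell
# Before promote: the clone points at the source snapshot
zfs get -H -o value origin tank/data-staging    # e.g. tank/data@2026-03-22

sudo zfs promote tank/data-staging

# After promote: the original dataset now has the clone's snapshot
# as its origin
zfs get -H -o value origin tank/data

# Splitting an origin value into dataset and snapshot parts in shell
origin="tank/data-staging@2026-03-22"
echo "dataset: ${origin%@*}  snapshot: ${origin#*@}"
```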


Send/Receive: ZFS Replication

zfs send serializes a snapshot stream; zfs receive writes it to another pool — locally or over a network. This is the foundation of ZFS-native backup.

Full Initial Send

# Local copy to a second pool
zfs send tank/data@2026-03-22 | zfs receive backup/data

# Over SSH (compress the stream with -c for network efficiency)
zfs send -c tank/data@2026-03-22 | ssh backup-host zfs receive backup/data

Incremental Sends

Incremental sends transmit only blocks changed between two snapshots — the key to efficient daily backup:

# -i = from prev to now (single increment)
zfs send -ci tank/data@2026-03-21 tank/data@2026-03-22 | \
  ssh backup-host zfs receive -F backup/data

# -I = from prev to now, includes any intermediate snapshots
zfs send -cI tank/data@2026-03-20 tank/data@2026-03-22 | \
  ssh backup-host zfs receive -F backup/data

Resume Interrupted Transfers

Large sends that are interrupted can resume from where they stopped:

# On the receiving side, get the resume token
zfs get receive_resume_token backup/data

# Restart the send using the token
zfs send -t <token> | ssh backup-host zfs receive -s backup/data

Replication Flags Reference

Flag    Effect
----    ------
-c      Compressed send (transmits blocks as stored, using the dataset's compression)
-i      Incremental from one snapshot to another
-I      Incremental including all intermediate snapshots
-R      Replicate the dataset recursively (all children + snapshots)
-v      Verbose output (bytes sent, throughput)
-s      (receive) Save resume state for interrupted transfers (used with send -t)
-F      (receive) Roll the target back to its latest snapshot before receiving

Automated Snapshots with Sanoid

Sanoid is the de facto standard tool for policy-driven ZFS snapshot management. Syncoid (bundled with sanoid) handles replication.

Install Sanoid

sudo apt install sanoid -y
# or from GitHub for latest version:
sudo apt install libconfig-inifiles-perl libcapture-tiny-perl
git clone https://github.com/jimsalterjrs/sanoid.git
sudo cp sanoid/sanoid sanoid/syncoid /usr/local/sbin/
sudo mkdir /etc/sanoid
sudo cp sanoid/sanoid.defaults.conf /etc/sanoid/

Configure Policies

# /etc/sanoid/sanoid.conf
[tank/data]
  use_template = production
  recursive = yes

[template_production]
  frequently = 0
  hourly = 24
  daily = 30
  monthly = 3
  yearly = 0
  autosnap = yes
  autoprune = yes

Enable the Systemd Timer

sudo systemctl enable --now sanoid.timer
sudo systemctl status sanoid.timer

# Test manually
sudo sanoid --take-snapshots --verbose
sudo sanoid --prune-snapshots --verbose

Syncoid for Replication

# Push tank/data to backup/data on a remote host
syncoid tank/data backup-host:backup/data

# Recursive replication
syncoid --recursive tank backup-host:backup

Scrubbing and Self-Healing

ZFS scrub reads every allocated block and verifies its checksum. Corrupt blocks are repaired automatically if a good copy exists (mirror or RAIDZ).

# Start a scrub
sudo zpool scrub tank

# Check scrub status
zpool status tank | grep -A5 scrub

# Stop a running scrub
sudo zpool scrub -s tank

Schedule monthly scrubs with a systemd timer or cron:

# /etc/cron.d/zfs-scrub
0 2 1 * * root /sbin/zpool scrub tank

After a scrub completes, zpool status reports:

scan: scrub repaired 0B in 00:12:34 with 0 errors on Sun Mar  1 02:12:34 2026

Any non-zero error count warrants investigation — check zpool status -v for per-disk error counts.


ZFS vs. Alternatives

Feature                      ZFS              Btrfs             LVM + ext4                   XFS
-------                      ---              -----             ----------                   ---
Copy-on-write                Yes              Yes               No (LVM snapshots use COW)   No
Built-in RAID                Yes (RAIDZ)      Yes (limited)     Via LVM                      No
Native snapshots             Yes              Yes               Via LVM                      No
Send/receive replication     Yes (native)     Yes (btrfs send)  No                           No
Checksums / self-heal        Yes              Yes (limited)     No                           No
Compression                  Yes (lz4, zstd)  Yes               No                           No
ARC/L2ARC caching            Yes              No                No                           No
Production maturity (Linux)  High (OpenZFS)   Medium            Very high                    Very high
Best for                     NAS, databases,  Desktop, RAID-1   Legacy/compat                Large files, HPC
                             servers

Production Patterns

Boot Environments

OpenZFS supports boot environments on the root pool, enabling atomic OS upgrades:

# Create a boot environment before upgrading
sudo zfs snapshot -r rpool/ROOT/ubuntu@before-upgrade

# Roll back if the upgrade fails (from a recovery shell)
sudo zfs rollback -r rpool/ROOT/ubuntu@before-upgrade

# Dedicated boot-environment managers (bectl on FreeBSD, zectl on
# Linux) automate creating, listing, and activating environments

Docker on ZFS

Docker’s ZFS storage driver stores each image layer as a dataset plus snapshot and runs containers on writable clones. It requires /var/lib/docker to live on a ZFS dataset:

# Mount a dataset at /var/lib/docker, then select the driver
sudo zfs create -o mountpoint=/var/lib/docker tank/docker
echo '{"storage-driver": "zfs"}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
docker info | grep "Storage Driver"

Set recordsize=128K on the Docker dataset for large layer writes:

zfs set recordsize=128K tank/docker

ARC and L2ARC Tuning

ZFS uses the Adaptive Replacement Cache (ARC) in RAM. By default it grows to half of RAM.
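Current versus maximum ARC size can be read from the module's kstats (the path exists only while the zfs module is loaded):

```shell
# Print current ARC size and its ceiling in GiB.
# arcstats rows have the form "name type data"; size and c_max are bytes.
awk '$1 == "size" || $1 == "c_max" { printf "%s %.1f GiB\n", $1, $3 / 2^30 }' \
  /proc/spl/kstat/zfs/arcstats
```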

# Limit ARC to 8 GB on a shared host
echo "options zfs zfs_arc_max=8589934592" | \
  sudo tee /etc/modprobe.d/zfs.conf

# Add an L2ARC (SSD read cache) to a pool
sudo zpool add tank cache /dev/nvme0n1p1

Database-Friendly recordsize

Database       Recommended recordsize   Reason
--------       ----------------------   ------
PostgreSQL     8K or 16K                Matches 8K page size
MySQL InnoDB   16K                      Matches InnoDB page size
MySQL MyISAM   128K                     Sequential-scan friendly
MongoDB        16K                      WiredTiger default page
SQLite         4K                       Matches default page size

Gotchas and Edge Cases

  • Never import a pool that is still active on another host: mounting the same pool from two hosts simultaneously causes split-brain writes and pool corruption. Be especially careful with zpool import -f.
  • Snapshot hold vs. destroy: If zfs send is streaming a snapshot and you destroy it, the stream is interrupted. Use zfs hold to pin a snapshot during a send.
  • Delegation for non-root: zfs allow user1 send,snapshot,hold tank lets user1 send snapshots without sudo.
  • Dedup is rarely worth it: ZFS deduplication stores a hash table in RAM (typically 5 GB per TB of unique data). Unless you have enormous RAM, use compression instead.
  • Swap on ZFS: Avoid using a ZFS zvol for swap — potential deadlocks exist under extreme memory pressure. Use a raw partition or tmpfs for swap.

Summary

  • ZFS snapshots are instant, space-efficient point-in-time copies built on copy-on-write.
  • Use zfs snapshot for manual snaps and sanoid for automated policy-driven management.
  • zfs send | zfs receive replicates datasets efficiently — only changed blocks travel over the wire with incremental sends.
  • Scrubbing monthly catches silent corruption before it spreads.
  • Set compression=lz4 on all datasets and tune recordsize for your workload.
  • Snapshots on the same pool are not a substitute for off-pool backups — always replicate to a separate host.