[Diagram: rsync backup strategy. Production data (/var/www, /etc, /home) on the source server is synced over encrypted SSH to snapshot directories on the backup server (/backup/daily.0, daily.1, daily.2); --link-dest hard links let each snapshot present a full backup view while sharing unchanged inodes with the previous snapshot. 30-day retention, cron at 02:00, log rotation, email alerts, and monitoring round out the automation.]

Data loss is not a question of if but when. Hardware fails, ransomware encrypts, and human error deletes. The only reliable protection is a tested, automated backup strategy. rsync is one of the most powerful and flexible tools available on Linux for this purpose — it transfers only the differences between source and destination, works seamlessly over SSH, and supports snapshot-style backups that look like full copies while consuming a fraction of the disk space.

This guide covers everything you need to design, implement, and automate a production-grade rsync backup strategy for your Linux servers.


What Is Rsync?

Rsync (Remote Sync) is a fast, versatile file-copying utility for Linux and Unix systems. Originally written by Andrew Tridgell and Paul Mackerras, it uses a delta-transfer algorithm that sends only the differences between the source files and the existing files at the destination. This makes it significantly more efficient than tools like cp or scp for recurring backups.

Key characteristics:

  • Delta transfers — only changed bytes are sent, not entire files
  • Preservation of metadata — permissions, ownership, timestamps, symbolic links
  • SSH integration — encrypted transfers to remote servers out of the box
  • Hard link support — enables disk-efficient snapshot backups
  • Flexible filtering — include/exclude patterns for fine-grained control

# Check if rsync is installed and its version
rsync --version

If rsync is not installed:

# Debian/Ubuntu
sudo apt update && sudo apt install -y rsync

# RHEL/CentOS/Fedora
sudo dnf install -y rsync

Why Rsync for Backups?

There are many backup tools available — tar, cp, scp, Duplicity, Borg, Restic — so why choose rsync?

  1. Speed: After the initial full copy, subsequent runs transfer only changed data. A 500 GB backup where only 2 GB changed moves just those 2 GB, finishing in minutes rather than hours.
  2. Simplicity: rsync is a single command with well-documented options. No proprietary format, no deduplication database to corrupt.
  3. Transparency: Backups are plain files and directories. You can browse, grep, and restore with standard Unix tools — no special restore utility needed.
  4. Universality: rsync is available on every Linux distribution, macOS, and even Windows (via WSL or Cygwin). It is already installed on most servers.
  5. Composability: rsync integrates cleanly with cron, systemd timers, SSH, and shell scripts.

The trade-off is that rsync does not provide built-in encryption at rest, deduplication across backups, or a catalog database. For those features, consider Borg or Restic. But for straightforward, reliable, and transparent backups, rsync is hard to beat.


Prerequisites

Before implementing the strategies in this guide, ensure you have:

  • Two Linux systems — a source (production) server and a backup destination (local disk, NAS, or remote server)
  • rsync installed on both source and destination (version 3.1+ recommended)
  • SSH key-based authentication configured between source and destination for passwordless, automated transfers (see our SSH Hardening Guide)
  • Sufficient disk space on the destination — at minimum 1.5x the size of the data you are backing up
  • Root or sudo access on both machines if you need to preserve ownership and special file attributes

# Verify rsync version on both machines
rsync --version | head -1

# Verify SSH key authentication works without a password prompt
ssh -o BatchMode=yes user@backup-server echo "SSH key auth works"

Basic Rsync Syntax and Options

The general syntax of rsync is:

rsync [OPTIONS] SOURCE DESTINATION

Understanding the most important flags is essential before building a backup strategy:

# Archive mode: preserves permissions, ownership, timestamps, symlinks, devices
rsync -a /source/ /destination/

# Archive + verbose + human-readable sizes + progress
rsync -avhP /source/ /destination/

# Dry run -- shows what would be transferred without making changes
rsync -avhn /source/ /destination/

Critical detail: trailing slashes matter. /source/ (with trailing slash) copies the contents of the directory. /source (without trailing slash) copies the directory itself. This is one of the most common rsync mistakes.

# Copies contents of /var/www/ into /backup/www/
rsync -avh /var/www/ /backup/www/

# Copies the directory /var/www (as a subdirectory) into /backup/
# Result: /backup/www/
rsync -avh /var/www /backup/

Local Backups

The simplest rsync strategy is a local backup to a separate disk or partition. This protects against accidental deletion and filesystem corruption (but not hardware failure of the entire machine).

# Mirror /home to an external drive mounted at /mnt/backup
rsync -avh --delete /home/ /mnt/backup/home/

The --delete flag is important: it removes files from the destination that no longer exist on the source, keeping the backup an exact mirror. Without it, deleted files accumulate on the backup forever.

# Backup multiple directories in one command
rsync -avh --delete \
  /etc/ \
  /var/www/ \
  /home/ \
  /mnt/backup/

# Backup with a log file for auditing
rsync -avh --delete --log-file=/var/log/rsync-backup.log \
  /home/ /mnt/backup/home/

For system-level backups that include device files and special attributes:

# Full system backup (requires root)
sudo rsync -aAXvh --delete \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} \
  / /mnt/backup/system/

The flags -aAX preserve archive mode (-a), ACLs (-A), and extended attributes (-X), which are necessary for a complete system backup.


Remote Backups over SSH

Remote backups protect against local hardware failure, fire, theft, and other physical disasters. rsync uses SSH as its default transport for remote transfers, providing encryption without additional configuration.

# Push backup: send local data to a remote server
rsync -avhz -e ssh /var/www/ backupuser@192.168.1.100:/backup/www/

# Pull backup: pull data from a remote server to local
rsync -avhz -e ssh backupuser@192.168.1.100:/var/www/ /backup/www/

The -z flag enables compression during transfer, which is beneficial over slower network links. On a local gigabit network, compression may actually slow things down due to CPU overhead.

To use a non-standard SSH port or a specific key:

# Use a custom SSH port and identity file
rsync -avhz -e "ssh -p 2222 -i /home/backupuser/.ssh/backup_key" \
  /var/www/ backupuser@remote:/backup/www/

For automated backups, the SSH key must not have a passphrase (or use ssh-agent). Restrict the backup key on the remote server to limit its capabilities:

# On the remote server, in ~backupuser/.ssh/authorized_keys:
# Restrict the key to rsync only
command="rsync --server --sender -vlogDtprze.iLsfxCIvu . /backup/",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-ed25519 AAAAC3Nz... backup@source
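rsync also ships a purpose-built helper for this, rrsync, which confines a key to a single directory tree. The install path varies by distribution; /usr/bin/rrsync below is an assumption, and on some systems it lives under /usr/share/doc/rsync/scripts/ instead:

```
# Alternative forced command using the rrsync helper (path is an assumption;
# check where your distribution installs it). rrsync confines all rsync
# operations made with this key to the /backup/ tree.
command="/usr/bin/rrsync /backup/",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-ed25519 AAAAC3Nz... backup@source
```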

Snapshot Backups with --link-dest

This is the most powerful rsync backup strategy. Using --link-dest, you can create what appears to be a full backup every day, while only changed files consume additional disk space. Unchanged files are hard-linked to the previous backup.

How it works:

  1. rsync compares each source file with the corresponding file in the --link-dest directory
  2. If a file is unchanged, rsync creates a hard link (zero additional space)
  3. If a file is changed or new, rsync copies it normally
  4. The result is a directory that looks and behaves like a complete backup

# Create a snapshot-style backup
DATE=$(date +%Y-%m-%d_%H-%M-%S)
DEST="/backup/snapshots/$DATE"
LATEST="/backup/snapshots/latest"

# No trailing slash on /home, so each snapshot contains a home/ subdirectory.
# On the very first run "latest" does not exist yet; rsync prints a warning
# about the missing --link-dest directory and simply makes a full copy.
rsync -avh --delete \
  --link-dest="$LATEST" \
  /home "$DEST/"

# Update the "latest" symlink
rm -f "$LATEST"
ln -s "$DEST" "$LATEST"
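The hard-link sharing is easy to verify: after a second snapshot, an unchanged file's link count jumps to 2 because both snapshots point at the same inode. A local sketch with throwaway paths:

```shell
# Minimal --link-dest demo: two snapshots of an unchanged file share one inode
root=$(mktemp -d)
mkdir -p "$root/src"
echo "unchanged content" > "$root/src/a.txt"

rsync -a "$root/src/" "$root/snap1/"                            # first snapshot
rsync -a --link-dest="$root/snap1" "$root/src/" "$root/snap2/"  # second

# %h prints the hard-link count; %i the inode number (GNU stat)
stat -c '%h %i' "$root/snap1/a.txt" "$root/snap2/a.txt"
# Both lines show link count 2 and the same inode number
```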

The disk usage is remarkable. Consider a 100 GB dataset where 1% changes daily:

Day              Apparent Size       Actual Disk Usage
Day 1 (full)     100 GB              100 GB
Day 2            100 GB              ~1 GB (new data only)
Day 3            100 GB              ~1 GB
30 days total    3,000 GB apparent   ~129 GB actual

Each snapshot directory is fully browsable with standard tools. You can cd into any day’s backup and see the complete state of your files at that point in time.

# Browse the backup from 5 days ago
ls /backup/snapshots/2026-01-17_02-00-00/home/jc/documents/

# Restore a specific file from 3 days ago
cp /backup/snapshots/2026-01-19_02-00-00/home/jc/report.pdf /home/jc/report.pdf

# Check actual disk usage per snapshot (shows only unique data)
du -sh /backup/snapshots/*/

Exclude Patterns and Filters

Not everything needs to be backed up. Excluding temporary files, caches, and build artifacts saves time and disk space.

# Exclude patterns inline
rsync -avh --delete \
  --exclude='*.tmp' \
  --exclude='*.log' \
  --exclude='.cache/' \
  --exclude='node_modules/' \
  --exclude='__pycache__/' \
  /home/ /backup/home/

For complex exclusion rules, use an exclude file:

# Create /etc/rsync-excludes.txt
cat > /etc/rsync-excludes.txt << 'EOF'
# Temporary and cache files
*.tmp
*.swp
*.swo
*~
.cache/
.thumbnails/

# Build artifacts
node_modules/
vendor/
__pycache__/
*.pyc
target/
build/
dist/

# System files that should not be backed up
/proc/
/sys/
/dev/
/run/
/tmp/
/mnt/
/media/
/lost+found/

# Large media files (optional)
# *.iso
# *.mp4
EOF

# Use the exclude file
rsync -avh --delete --exclude-from=/etc/rsync-excludes.txt \
  /home/ /backup/home/

rsync also supports include/exclude combinations for more granular control:

# Only back up .conf and .sh files from /etc
rsync -avh \
  --include='*/' \
  --include='*.conf' \
  --include='*.sh' \
  --exclude='*' \
  /etc/ /backup/etc-configs/
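Rule order matters: rsync applies the first matching rule, so `--include='*/'` (needed so rsync can descend into subdirectories) and the specific includes must come before the catch-all `--exclude='*'`. A throwaway check of the same pattern:

```shell
# Verify the include/exclude combination with disposable directories
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/sub"
touch "$src/app.conf" "$src/run.sh" "$src/notes.txt" "$src/sub/deep.conf"

rsync -a \
  --include='*/' \
  --include='*.conf' \
  --include='*.sh' \
  --exclude='*' \
  "$src/" "$dst/"

find "$dst" -type f | sort
# Only app.conf, run.sh, and sub/deep.conf arrive; notes.txt is excluded
```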

Bandwidth Limiting and Compression

When backing up over a WAN or shared network link, you need to control how much bandwidth rsync consumes to avoid impacting production traffic.

# Limit bandwidth to 5 MB/s (value is in KB/s)
rsync -avhz --bwlimit=5000 -e ssh \
  /var/www/ backupuser@remote:/backup/www/

# Limit to 10 Mbps (~1250 KB/s)
rsync -avhz --bwlimit=1250 -e ssh \
  /var/www/ backupuser@remote:/backup/www/

Compression options:

# Enable transfer compression (useful over slow links)
rsync -avhz -e ssh /source/ user@remote:/backup/

# Skip compression for already-compressed files
rsync -avh --compress --skip-compress=gz/bz2/zip/7z/jpg/png/mp4/mkv \
  -e ssh /source/ user@remote:/backup/

For SSH-level compression (alternative to rsync’s -z):

# Use SSH compression instead of rsync compression
rsync -avh -e "ssh -C" /source/ user@remote:/backup/

Tip: Do not enable both rsync compression (-z) and SSH compression (-C) simultaneously. Double compression wastes CPU cycles and can actually increase transfer size. Use one or the other — rsync’s -z is generally more efficient because it compresses per-file.


Automated Backup Script with Cron

A complete backup script should handle logging, error reporting, snapshot rotation, and email notifications. Here is a production-ready example:

#!/usr/bin/env bash
# /usr/local/bin/rsync-backup.sh
# Snapshot-style rsync backup with rotation and logging

set -euo pipefail

# ─── Configuration ───────────────────────────────────────────────
SOURCE="/home"   # no trailing slash: each snapshot gets a home/ subdirectory
BACKUP_ROOT="/backup/snapshots"
REMOTE_USER="backupuser"
REMOTE_HOST="192.168.1.100"
REMOTE_DEST="${REMOTE_USER}@${REMOTE_HOST}:${BACKUP_ROOT}"
SSH_KEY="/root/.ssh/backup_ed25519"
EXCLUDE_FILE="/etc/rsync-excludes.txt"
LOG_DIR="/var/log/rsync-backup"
RETENTION_DAYS=30
BWLIMIT=0  # 0 = unlimited, or set KB/s

# ─── Derived Variables ───────────────────────────────────────────
DATE=$(date +%Y-%m-%d_%H-%M-%S)
DEST="${BACKUP_ROOT}/${DATE}"
LATEST="${BACKUP_ROOT}/latest"
LOG_FILE="${LOG_DIR}/backup-${DATE}.log"
LOCK_FILE="/tmp/rsync-backup.lock"

# ─── Functions ───────────────────────────────────────────────────
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}

cleanup() {
    rm -f "$LOCK_FILE"
    log "Lock file removed."
}

# ─── Pre-flight Checks ──────────────────────────────────────────
mkdir -p "$LOG_DIR" "$BACKUP_ROOT"

# Prevent concurrent runs
if [ -f "$LOCK_FILE" ]; then
    echo "Backup already running (lock file exists). Exiting."
    exit 1
fi
trap cleanup EXIT
touch "$LOCK_FILE"

log "=== Backup started ==="
log "Source: $SOURCE"
log "Destination: $DEST"

# ─── Run rsync ───────────────────────────────────────────────────
RSYNC_OPTS=(
    -avh
    --delete
    --numeric-ids
    --log-file="$LOG_FILE"
)

[ -f "$EXCLUDE_FILE" ] && RSYNC_OPTS+=(--exclude-from="$EXCLUDE_FILE")
[ "$BWLIMIT" -gt 0 ] 2>/dev/null && RSYNC_OPTS+=(--bwlimit="$BWLIMIT")
[ -L "$LATEST" ] && RSYNC_OPTS+=(--link-dest="$LATEST")

if rsync "${RSYNC_OPTS[@]}" "$SOURCE" "$DEST/"; then
    log "rsync completed successfully."
else
    RSYNC_EXIT=$?
    log "ERROR: rsync exited with code $RSYNC_EXIT"
    # Send alert (uncomment and configure)
    # echo "Backup failed with exit code $RSYNC_EXIT" | mail -s "Backup FAILED" admin@example.com
    exit $RSYNC_EXIT
fi

# ─── Update latest symlink ──────────────────────────────────────
rm -f "$LATEST"
ln -s "$DEST" "$LATEST"
log "Updated 'latest' symlink to $DEST"

# ─── Rotate old backups ─────────────────────────────────────────
log "Removing backups older than $RETENTION_DAYS days..."
find "$BACKUP_ROOT" -maxdepth 1 -type d -name "20*" -mtime +$RETENTION_DAYS -exec rm -rf {} \;
log "Rotation complete."

# ─── Disk usage report ──────────────────────────────────────────
USAGE=$(du -sh "$BACKUP_ROOT" | cut -f1)
log "Total backup storage used: $USAGE"

log "=== Backup finished ==="

Make the script executable and schedule it with cron:

# Make executable
sudo chmod +x /usr/local/bin/rsync-backup.sh

# Edit the root crontab
sudo crontab -e

Add one of the following cron entries:

# Run backup every day at 2:00 AM
0 2 * * * /usr/local/bin/rsync-backup.sh >> /var/log/rsync-backup/cron.log 2>&1

# Run backup every 6 hours
0 */6 * * * /usr/local/bin/rsync-backup.sh >> /var/log/rsync-backup/cron.log 2>&1

Alternatively, you can use a systemd timer for more robust scheduling:

# /etc/systemd/system/rsync-backup.service
[Unit]
Description=Rsync Backup Service
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/rsync-backup.sh
Nice=10
IOSchedulingClass=idle

# /etc/systemd/system/rsync-backup.timer
[Unit]
Description=Run Rsync Backup Daily

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target

# Enable and start the timer
sudo systemctl daemon-reload
sudo systemctl enable --now rsync-backup.timer

# Check timer status
systemctl list-timers rsync-backup.timer

Restoring from Backups

A backup is only useful if you can restore from it. Because rsync backups are plain files and directories, restoration is straightforward.

Restore a single file

# Find the file in a specific snapshot
ls -la /backup/snapshots/2026-01-20_02-00-00/home/jc/documents/report.pdf

# Restore it
cp /backup/snapshots/2026-01-20_02-00-00/home/jc/documents/report.pdf \
   /home/jc/documents/report.pdf

Restore an entire directory

# Restore a full directory from the latest backup
rsync -avh /backup/snapshots/latest/var/www/ /var/www/

Full system restore

# Restore the entire system (from a live USB or rescue environment)
sudo rsync -aAXvh --delete \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} \
  /mnt/backup/system/ /mnt/target/

Restore from a remote backup

# Pull files from the remote backup server
rsync -avhz -e "ssh -i /root/.ssh/backup_ed25519" \
  backupuser@192.168.1.100:/backup/snapshots/latest/var/www/ \
  /var/www/

Important: Always perform a dry run with --dry-run (or -n) before restoring, especially for full system restores. This shows you exactly what will change without modifying anything.

# Dry run -- preview what the restore would do
rsync -avhn /backup/snapshots/latest/home/ /home/

Verifying Backup Integrity

Backups you do not verify are backups you cannot trust. Schedule regular integrity checks.

# Compare source and backup using checksums (does not transfer data)
rsync -avnc /home/ /backup/snapshots/latest/home/

# The -c flag forces checksum comparison instead of size+timestamp
# -n ensures nothing is transferred (dry run)

If the output shows no differences, the backup is an exact match. Any discrepancies are listed with their paths.

For automated verification, add a check to your backup script:

# Post-backup verification
log "Running integrity verification..."
# -i (itemize-changes) is required here: without it a quiet dry run prints
# nothing at all, and the count below would always be zero
DIFF_COUNT=$(rsync -anci --delete "$SOURCE" "$DEST/" 2>/dev/null | wc -l)
if [ "$DIFF_COUNT" -eq 0 ]; then
    log "Verification PASSED: backup matches source."
else
    log "WARNING: $DIFF_COUNT differences found between source and backup."
fi
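A complementary check is a checksum manifest written at backup time, which catches silent corruption (bit rot) that a source comparison cannot once the source itself has changed. A sketch using sha256sum and throwaway paths:

```shell
# Write a checksum manifest into a snapshot, then verify it later
snap=$(mktemp -d)
mkdir -p "$snap/docs"
echo "payload" > "$snap/docs/file.txt"

# At backup time: hash every file except the manifest itself
(cd "$snap" && find . -type f ! -name MANIFEST.sha256 \
  -exec sha256sum {} + > MANIFEST.sha256)

# Later: re-verify the snapshot; any tampered or rotted file is reported
(cd "$snap" && sha256sum -c MANIFEST.sha256)
```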

Rsync Options Reference Table

Option                 Description
-a                     Archive mode: equivalent to -rlptgoD (recursive, links, permissions, times, group, owner, devices)
-v                     Verbose output
-h                     Human-readable file sizes
-z                     Compress data during transfer
-P                     Equivalent to --partial --progress (resume partial transfers + show progress)
-n                     Dry run — show what would be transferred without making changes
-c                     Use checksums instead of modification time + size to determine changes
--delete               Delete files on destination that no longer exist on source
--delete-after         Delete after transfer completes (safer than default --delete-during)
--link-dest=DIR        Hard-link unchanged files from DIR (for snapshot backups)
--exclude=PATTERN      Exclude files matching PATTERN
--exclude-from=FILE    Read exclude patterns from FILE
--include=PATTERN      Include files matching PATTERN (overrides exclude)
--bwlimit=KBPS         Limit bandwidth in kilobytes per second
-e "ssh ..."           Specify the remote shell and its options
--numeric-ids          Preserve numeric UID/GID (important for cross-machine backups)
--log-file=FILE        Write detailed transfer log to FILE
--partial              Keep partially transferred files for resumption
--stats                Print file transfer statistics at the end
--progress             Show per-file transfer progress
--backup               Make backups of overwritten/deleted files (with --backup-dir)
--timeout=SECONDS      Set I/O timeout in seconds

Troubleshooting

rsync: connection unexpectedly closed

This usually indicates an SSH problem. Debug with:

# Test SSH connectivity first
ssh -v backupuser@remote echo "Connection works"

# Run rsync with SSH debug output
rsync -avhz -e "ssh -vvv" /source/ user@remote:/backup/

Common causes: firewall blocking the SSH port, incorrect SSH key path, SSH key with a passphrase but no ssh-agent running.

Permission denied errors

# Run rsync as root for full system backups
sudo rsync -aAXvh /source/ /backup/

# Or use --no-owner --no-group if you do not need ownership preservation
rsync -avh --no-owner --no-group /source/ /backup/

Hard links not created by --link-dest

A common gotcha: a relative --link-dest path is resolved relative to the destination directory, not your current directory. An absolute path avoids the ambiguity:

# Ambiguous: resolved relative to the destination, not $PWD
rsync --link-dest=../latest ...

# Unambiguous: absolute path
rsync --link-dest=/backup/snapshots/latest ...

Backup is taking too long

# Identify what is being transferred
rsync -avh --progress --stats /source/ /backup/ 2>&1 | tail -20

# Exclude large unnecessary files
rsync -avh --exclude='*.iso' --exclude='*.vmdk' /source/ /backup/

Disk space filling up on the destination

# Check space usage per snapshot
du -sh /backup/snapshots/*/

# Manually remove old snapshots
rm -rf /backup/snapshots/2026-01-01_02-00-00

# Check hard link count (high count means good deduplication)
stat /backup/snapshots/latest/some-file

rsync hangs or stalls on large files

# Set a timeout to prevent indefinite hangs
rsync -avh --timeout=300 --partial /source/ /backup/

Summary

A well-designed rsync backup strategy provides reliable, space-efficient, and transparent data protection for your Linux servers. The key components are:

  • Local backups for quick recovery from accidental deletion
  • Remote backups over SSH for disaster recovery
  • Snapshot-style incremental backups with --link-dest for space-efficient versioning
  • Exclude files to skip caches, temporary files, and build artifacts
  • Automated scheduling with cron or systemd timers
  • Regular verification to ensure backup integrity
  • Tested restore procedures — a backup you have not tested is not a backup

Combine this rsync backup strategy with SSH hardening to secure the transport layer, and follow the Linux Server Security Checklist to protect the servers themselves. Together, these practices form a solid foundation for server administration and disaster recovery.