PostgreSQL backup and restore is the foundation of any database disaster recovery plan. Whether you are protecting a single application database or managing hundreds of instances, mastering pg_dump and pg_restore ensures you can recover from hardware failures, accidental deletions, and botched migrations. This guide covers every practical aspect — from basic dumps through automated scheduling, selective restores, and the critical step most teams skip: actually testing that backups can be restored.

Prerequisites

  • PostgreSQL 12 or later installed and running
  • Access to a PostgreSQL database with sufficient privileges (pg_dump requires at least read access to all objects)
  • sudo or postgres user access on the database server
  • Basic familiarity with SQL and PostgreSQL concepts (databases, schemas, tables)

Creating Backups with pg_dump

pg_dump produces logical backups — a representation of your database as SQL statements or a compressed archive. Unlike filesystem-level copies, these backups are portable across platforms and can be loaded into the same or newer PostgreSQL major versions.

Plain SQL Dump (Default)

The simplest backup creates a .sql file containing all the SQL commands needed to recreate the database:

# Dump a single database to a SQL file
pg_dump -U postgres -h localhost mydb > mydb_backup.sql

# Include CREATE DATABASE statement
pg_dump -U postgres -h localhost -C mydb > mydb_backup_with_create.sql

# Dump with timestamps in filename
pg_dump -U postgres mydb > "mydb_$(date +%Y%m%d_%H%M%S).sql"

The plain SQL format is human-readable and can be restored with psql. The downside: it does not support parallel restore or selective table restoration.

Custom Format (-Fc) — Recommended for Most Backups

The custom format creates a compressed, non-text archive that supports the most flexible restore options:

# Custom format backup (compressed by default)
pg_dump -U postgres -Fc mydb > mydb_backup.dump

# Custom format with maximum compression
pg_dump -U postgres -Fc -Z 9 mydb > mydb_backup.dump

# Note: parallel dump (-j) is NOT supported with the custom format;
# it requires the directory format (-Fd), covered below

Why custom format is preferred:

  • Compressed automatically (typically 5-10x smaller than plain SQL)
  • Supports parallel restore with pg_restore -j
  • Allows selective restore (specific tables, schemas, or data only)
  • Can reorder items during restore to optimize load speed

Directory Format (-Fd) — Best for Large Databases

The directory format creates a directory with one file per table, enabling true parallel dumps:

# Directory format with parallel dump (8 workers)
pg_dump -U postgres -Fd -j 8 -f /backup/mydb_dir mydb

This creates a directory containing a toc.dat file (table of contents) and one compressed file per table. The parallel dump significantly reduces backup time for large databases.

Dumping All Databases (pg_dumpall)

To back up every database in the PostgreSQL cluster, plus global objects (roles, tablespaces):

# Dump all databases and global objects
pg_dumpall -U postgres > all_databases.sql

# Dump only global objects (roles, tablespaces)
pg_dumpall -U postgres --globals-only > globals.sql

pg_dumpall always produces plain SQL format. For large clusters, dump globals separately and use pg_dump per database in custom format.
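That split pattern (globals once, then one custom-format dump per database) can be previewed before wiring it into automation. Here is a minimal sketch that builds the per-database commands without executing them; the database names app_db and analytics_db are placeholders:

```shell
# Build a review-able plan: one custom-format pg_dump command per database.
# Names are hard-coded for illustration; real automation should query
# pg_database for the list, as the backup script later in this guide does.
PLAN=""
for DB in app_db analytics_db; do
    PLAN="${PLAN}pg_dump -U postgres -Fc $DB -f ${DB}.dump
"
done

# Print the plan so it can be inspected before anything runs
printf '%s' "$PLAN"
```

Once the plan looks right, run the commands directly (or let a cron-driven script do it, as shown in the automation section below).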

Selective Dumps

# Dump a single table
pg_dump -U postgres -t users mydb > users_table.sql

# Dump multiple specific tables
pg_dump -U postgres -t users -t orders -t products mydb > selected_tables.sql

# Dump a specific schema
pg_dump -U postgres -n public mydb > public_schema.sql

# Dump schema only (no data)
pg_dump -U postgres -s mydb > schema_only.sql

# Dump data only (no schema)
pg_dump -U postgres -a mydb > data_only.sql

# Exclude a large table from the dump
pg_dump -U postgres -T audit_log mydb > mydb_no_audit.sql

Restoring Backups

Restore from Plain SQL

# Restore to an existing database
psql -U postgres -h localhost mydb < mydb_backup.sql

# Create the database and restore (if dump includes -C flag)
psql -U postgres -h localhost < mydb_backup_with_create.sql

# Stop at the first error instead of continuing
psql -U postgres -v ON_ERROR_STOP=1 mydb < mydb_backup.sql

Setting ON_ERROR_STOP=1 makes psql abort at the first error instead of silently continuing past it. Always set it when restoring production databases.

Restore from Custom Format

# Basic restore
pg_restore -U postgres -d mydb mydb_backup.dump

# Parallel restore (8 jobs — dramatically faster for large databases)
pg_restore -U postgres -d mydb -j 8 mydb_backup.dump

# Restore to a new database
createdb -U postgres mydb_restored
pg_restore -U postgres -d mydb_restored mydb_backup.dump

# Clean (drop) existing objects before restore
pg_restore -U postgres -d mydb --clean --if-exists mydb_backup.dump

Restore Specific Tables

# Restore a single table from a custom-format dump
# (note: -t does not restore objects the table depends on)
pg_restore -U postgres -d mydb -t users mydb_backup.dump

# Restore using a list file (fine-grained control)
pg_restore -l mydb_backup.dump > restore_list.txt
# Edit restore_list.txt — comment out items you don't want
pg_restore -U postgres -d mydb -L restore_list.txt mydb_backup.dump
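For repeatable selective restores, the list file can be filtered with standard text tools instead of being hand-edited each time. A sketch against a mock list whose three entries imitate the `pg_restore -l` output format (the real entries and IDs come from your own dump):

```shell
# Mock restore list; entry format imitates `pg_restore -l` output,
# but real IDs and entries come from your own dump
cat > restore_list.txt <<'EOF'
215; 1259 16390 TABLE DATA public users postgres
216; 1259 16395 TABLE DATA public orders postgres
217; 1259 16400 TABLE DATA public audit_log postgres
EOF

# Prefix matching entries with ';' (pg_restore treats such lines as comments)
sed '/audit_log/s/^/;/' restore_list.txt > restore_list_filtered.txt

cat restore_list_filtered.txt
```

Pass the filtered file to pg_restore with -L restore_list_filtered.txt; any entry beginning with a semicolon is skipped during the restore.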

Restore Specific Schema

# Restore only the 'public' schema
pg_restore -U postgres -d mydb -n public mydb_backup.dump

# Restore data only (schema already exists)
pg_restore -U postgres -d mydb -a mydb_backup.dump

Comparing PostgreSQL Backup Methods

| Feature                 | pg_dump (logical)                | pg_basebackup (physical)     | Filesystem snapshot   |
|-------------------------|----------------------------------|------------------------------|-----------------------|
| Backup scope            | Single database                  | Entire cluster               | Entire cluster        |
| Cross-version restore   | Yes                              | No (same major version only) | No                    |
| Selective table restore | Yes (-t flag)                    | No                           | No                    |
| Point-in-time recovery  | No                               | Yes (with WAL)               | Yes (with WAL)        |
| Backup while running    | Yes (consistent snapshot)        | Yes                          | Requires fsync/freeze |
| Backup size             | Smaller (compressed data)        | Larger (full data directory) | Largest (full disk)   |
| Backup speed            | Slower (reads via SQL)           | Faster (streaming)           | Fastest (block-level) |
| Best for                | Individual databases, migrations | Full cluster DR, PITR        | VM/cloud snapshots    |

Use pg_dump for daily backups of individual databases and for migrations between PostgreSQL versions. Use pg_basebackup when you need point-in-time recovery or are setting up streaming replication. Use filesystem snapshots as a complement, not a replacement — they are fast but require careful handling of WAL files.

Automating Backups with Cron

Backup Script

#!/bin/bash
# /usr/local/bin/pg_backup.sh
# Automated PostgreSQL backup script

set -euo pipefail

# Configuration
BACKUP_DIR="/var/backups/postgresql"
RETENTION_DAYS=30
PG_USER="postgres"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/pg_backup.log"

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Log function
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" >> "$LOG_FILE"
}

log "Starting PostgreSQL backup"

# Dump globals (roles, tablespaces)
pg_dumpall -U "$PG_USER" --globals-only > "$BACKUP_DIR/globals_${TIMESTAMP}.sql" 2>> "$LOG_FILE"
log "Globals dumped"

# Dump each database in custom format
for DB in $(psql -U "$PG_USER" -At -c "SELECT datname FROM pg_database WHERE datistemplate = false AND datname != 'postgres'"); do
    DUMP_FILE="$BACKUP_DIR/${DB}_${TIMESTAMP}.dump"
    pg_dump -U "$PG_USER" -Fc -Z 6 "$DB" > "$DUMP_FILE" 2>> "$LOG_FILE"
    SIZE=$(du -h "$DUMP_FILE" | cut -f1)
    log "Dumped $DB ($SIZE)"
done

# Remove backups older than retention period
find "$BACKUP_DIR" -name "*.dump" -mtime +${RETENTION_DAYS} -delete
find "$BACKUP_DIR" -name "globals_*.sql" -mtime +${RETENTION_DAYS} -delete
log "Cleaned up backups older than $RETENTION_DAYS days"

log "Backup complete"

Make the script executable and run it once by hand before scheduling it:

# Make executable
sudo chmod +x /usr/local/bin/pg_backup.sh

# Test manually first
sudo -u postgres /usr/local/bin/pg_backup.sh
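Before trusting the retention rule in the script with real backups, the find expression can be sanity-checked in a scratch directory. The temp directory and fake dump files below are purely illustrative:

```shell
# Sanity-check the retention rule in a scratch directory
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/fresh.dump"
touch -d "40 days ago" "$DEMO_DIR/stale.dump"

# Same expression the backup script uses, with RETENTION_DAYS=30:
# only files last modified more than 30 days ago are deleted
find "$DEMO_DIR" -name "*.dump" -mtime +30 -delete

ls "$DEMO_DIR"
```

Only stale.dump is removed; fresh.dump survives. This is a cheap way to confirm the -mtime threshold does what you expect before the rule runs against production backups.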

Cron Schedule

# Edit the postgres user's crontab
sudo crontab -u postgres -e

# Daily backup at 2:00 AM
0 2 * * * /usr/local/bin/pg_backup.sh

# Optional weekly run on Sunday at 1:00 AM (point this at a variant of the
# script with a longer retention period if weekly dumps should be kept longer)
0 1 * * 0 /usr/local/bin/pg_backup.sh

Passwordless Authentication with .pgpass

For cron jobs, you need passwordless authentication. Create a .pgpass file:

# Create .pgpass for the postgres user
sudo -u postgres bash -c 'cat > ~/.pgpass << EOF
localhost:5432:*:postgres:your_secure_password
EOF'

# Set required permissions (libpq ignores a .pgpass that is group- or
# world-readable). Run chmod inside the postgres user's shell so that
# ~ expands to that user's home directory, not yours
sudo -u postgres bash -c 'chmod 600 ~/.pgpass'

Format: hostname:port:database:username:password. Use * as a wildcard for database to cover all databases.
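To double-check the format and permission requirements without touching the real file, you can stage a scratch copy first (the password value is a placeholder):

```shell
# Stage a scratch .pgpass to verify format and permissions
SCRATCH=$(mktemp)
cat > "$SCRATCH" <<'EOF'
localhost:5432:*:postgres:your_secure_password
EOF

# libpq requires mode 0600 (or stricter) before it will use the file
chmod 600 "$SCRATCH"
stat -c '%a' "$SCRATCH"
```

The stat output should read 600, and each line must contain exactly five colon-separated fields with no stray whitespace.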

Testing Backup Integrity

A backup that cannot be restored is not a backup. Schedule regular restore tests:

#!/bin/bash
# /usr/local/bin/pg_restore_test.sh
# Verify backup integrity by restoring to a temporary database

set -euo pipefail

BACKUP_FILE="$1"
TEST_DB="restore_test_$(date +%s)"
PG_USER="postgres"

# Create temporary database
createdb -U "$PG_USER" "$TEST_DB"

# Attempt restore (keep stderr visible so failures are diagnosable)
if pg_restore -U "$PG_USER" -d "$TEST_DB" "$BACKUP_FILE"; then
    # Refresh statistics first: n_live_tup lags behind a fresh restore
    psql -U "$PG_USER" -d "$TEST_DB" -c "ANALYZE" > /dev/null
    # Verify data
    TABLES=$(psql -U "$PG_USER" -At -c "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'" "$TEST_DB")
    ROWS=$(psql -U "$PG_USER" -At -c "SELECT coalesce(sum(n_live_tup), 0) FROM pg_stat_user_tables" "$TEST_DB")
    echo "PASS: Restored $TABLES tables with $ROWS total rows"
else
    echo "FAIL: Restore encountered errors"
fi

# Clean up
dropdb -U "$PG_USER" "$TEST_DB"

Then point the script at the most recent backup:

# Test the most recent backup
sudo -u postgres /usr/local/bin/pg_restore_test.sh /var/backups/postgresql/mydb_latest.dump
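Note that the backup script writes timestamped filenames, so a fixed path like mydb_latest.dump only works if you maintain a symlink. One way to compute the newest dump instead, sketched against a scratch directory with illustrative filenames:

```shell
# Pick the newest dump for a given database by modification time.
# The scratch directory stands in for /var/backups/postgresql.
BACKUP_DIR=$(mktemp -d)
touch -d "2 days ago" "$BACKUP_DIR/mydb_20240101_020000.dump"
touch "$BACKUP_DIR/mydb_20240103_020000.dump"

# ls -t sorts newest first; take the top entry
LATEST=$(ls -t "$BACKUP_DIR"/mydb_*.dump | head -n 1)
echo "$LATEST"
```

In a real scheduled test, set BACKUP_DIR to the actual backup directory and pass "$LATEST" to pg_restore_test.sh.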

Troubleshooting Common Issues

"pg_dump: error: connection to server failed"

# Check PostgreSQL is running
sudo systemctl status postgresql

# Verify you can connect
psql -U postgres -h localhost -l

# Check pg_hba.conf for authentication rules (path varies by distribution)
sudo grep -Ev '^#|^$' /etc/postgresql/16/main/pg_hba.conf

"pg_restore: error: could not execute query: ERROR: relation already exists"

The target database already has objects. Use --clean --if-exists to drop existing objects before restore:

pg_restore -U postgres -d mydb --clean --if-exists mydb_backup.dump

Or restore to a fresh empty database:

dropdb -U postgres mydb
createdb -U postgres mydb
pg_restore -U postgres -d mydb mydb_backup.dump

Backup is too slow

# Use parallel dump (custom or directory format only)
pg_dump -U postgres -Fd -j $(nproc) -f /backup/mydb_dir mydb

# Exclude large, regenerable tables
pg_dump -U postgres -Fc -T large_cache_table -T session_data mydb > mydb.dump

# Compress less aggressively for speed
pg_dump -U postgres -Fc -Z 1 mydb > mydb_fast.dump

Restore is too slow

# Use parallel restore (custom or directory format)
pg_restore -U postgres -d mydb -j $(nproc) mydb_backup.dump

# Disable triggers during a data-only load (speeds it up significantly;
# --disable-triggers only applies to data-only restores, hence the -a)
pg_restore -U postgres -d mydb -a --disable-triggers mydb_backup.dump

# Increase maintenance_work_mem for the restore session via PGOPTIONS
# (a SET in a separate psql session would not affect pg_restore's connection)
PGOPTIONS="-c maintenance_work_mem=1GB" pg_restore -U postgres -d mydb mydb_backup.dump

.pgpass file ignored

# Check permissions (must be 600)
ls -la ~/.pgpass

# Fix permissions
chmod 600 ~/.pgpass

# Verify format (no extra spaces)
cat ~/.pgpass
# Should be: hostname:port:database:username:password

Gotchas and Edge Cases

Large objects (BLOBs): pg_dump includes large objects by default, but pg_restore -t does NOT restore them. Use pg_restore -L with a custom list to include large objects when doing selective restores.

Extensions: pg_dump includes CREATE EXTENSION statements, but restore fails if the extension is not installed on the target server. Install extensions before restoring.

Ownership and permissions: pg_dump records object ownership. If the owner role does not exist on the target server, restore fails. Use --no-owner to skip ownership assignments, or create the roles first using the globals dump.

Concurrent access during dump: pg_dump takes a snapshot at the start and reads consistently from it. Other transactions continue normally. However, DDL operations (ALTER TABLE, DROP) that hold AccessExclusiveLock will block the dump. Schedule dumps during low-activity periods.

Encoding mismatches: If the source and target databases use different encodings, data corruption can occur. Always verify encoding matches:

psql -U postgres -c "SHOW server_encoding;"

Summary

  • pg_dump with custom format (-Fc) is the recommended approach for production backups — it compresses data, supports parallel restore, and allows selective table recovery
  • pg_dumpall captures global objects (roles, tablespaces) that pg_dump misses — always dump globals separately alongside your database backups
  • Automate with cron and a backup script that includes rotation, logging, and error handling — never rely on manual backup procedures
  • Test your restores regularly by restoring to a temporary database and validating row counts — a backup you have never tested is a backup that might fail when you need it
  • Use .pgpass for passwordless authentication in cron jobs with chmod 600 permissions — pg_dump refuses to read world-readable credential files
  • Parallel dump and restore (-j flag) dramatically reduces time for large databases — use directory format (-Fd) for the best parallel performance