TL;DR — Quick Summary
MySQL master-slave replication for high availability: binary logs, slave config, lag monitoring, failover, and read/write splitting with ProxySQL.
MySQL replication lets one server (the master) stream every data change to one or more replica servers (slaves) in near real-time. The result is a high-availability architecture where read traffic is spread across multiple nodes, backups can be taken from a slave without touching the master, and a slave can be promoted to master in minutes if the primary fails. This guide covers the complete master-slave replication setup on MySQL 8.0 — from binary log configuration and initial data sync to lag monitoring, broken-replication recovery, failover procedures, and ProxySQL-based read/write splitting.
Replication Architecture
MySQL replication relies on three components working in concert:
- Binary log (binlog) — the master records every committed transaction to a sequential log file. Three formats exist: STATEMENT (logs the SQL text), ROW (logs the before/after values of changed rows), and MIXED (uses STATEMENT by default, falls back to ROW for non-deterministic functions). Always use ROW for production.
- IO thread — a thread on the slave connects to the master, reads new binary log events, and writes them to the local relay log.
- SQL thread — a second thread on the slave reads the relay log and replays the events against the local database, keeping data in sync.
Master Slave
------ -----
Writes → Binary Log ──────► IO Thread → Relay Log → SQL Thread → Database
(network)
Replication Modes
| Mode | How It Works | When to Use |
|---|---|---|
| Asynchronous (default) | Master does not wait for slave ACK | General HA, read scaling |
| Semi-synchronous | Master waits for at least one slave to write relay log | Financial data, reduced data loss risk |
| Group Replication | Multi-master with consensus protocol (Paxos) | Active-active, auto-failover |
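If you need semi-synchronous mode, the bundled plugins can be loaded at runtime. A minimal sketch for MySQL 8.0 (8.0.26+ also ships renamed rpl_semi_sync_source/rpl_semi_sync_replica variants; the timeout value is illustrative):

```sql
-- On the master: load and enable the semi-sync source plugin
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- fall back to async after 1 s without an ACK

-- On each slave: load and enable the replica-side plugin, then restart the IO thread
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;
```

Note the timeout fallback: if no slave acknowledges within the window, the master silently degrades to asynchronous replication rather than blocking writes.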
Prerequisites
- Two Linux servers running MySQL 8.0 or 8.4 (Ubuntu 22.04+ or RHEL 9+); note that this guide uses the classic MASTER/SLAVE statement syntax, which MySQL 8.4 removes in favor of CHANGE REPLICATION SOURCE TO, START REPLICA, and SHOW REPLICA STATUS
- Root or sudo access on both servers
- Network connectivity between master and slave (port 3306 open)
- Master server-id and slave server-id must be unique integers across the topology
Step 1: Configure the Master Server
Edit the MySQL configuration on the master:
sudo nano /etc/mysql/mysql.conf.d/mysqld.cnf
Add or modify these settings in the [mysqld] section:
[mysqld]
# Unique ID for this server — must be different on every node
server-id = 1
# Enable binary logging — required for replication
log-bin = /var/log/mysql/mysql-bin
binlog-format = ROW
# Keep 7 days of binary logs (binlog_expire_logs_seconds supersedes the deprecated expire_logs_days)
binlog_expire_logs_seconds = 604800
# Flush binlog to disk on every commit — prevents data loss on crash
sync_binlog = 1
# GTID-based replication (recommended for MySQL 8.0+)
gtid_mode = ON
enforce_gtid_consistency = ON
Apply the configuration:
sudo systemctl restart mysql
Verify binary logging is active:
SHOW VARIABLES LIKE 'log_bin';
-- log_bin | ON
SHOW MASTER STATUS\G
-- File: mysql-bin.000003
-- Position: 1573
Step 2: Create the Replication User
On the master, create a dedicated user with only the permissions needed for replication:
CREATE USER 'repl'@'192.168.1.102' IDENTIFIED BY 'ReplStr0ng!Pass';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.1.102';
FLUSH PRIVILEGES;
Replace 192.168.1.102 with the actual slave IP. Never use % (any host) for replication users — restrict by IP for security.
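Verify the grant took effect. Also note a common MySQL 8.0 pitfall: the default caching_sha2_password authentication plugin requires a secure connection for replication, so if the IO thread later fails to authenticate, either enable TLS (MASTER_SSL = 1) or add GET_MASTER_PUBLIC_KEY = 1 to the CHANGE MASTER TO statement.

```sql
SHOW GRANTS FOR 'repl'@'192.168.1.102';
-- GRANT REPLICATION SLAVE ON *.* TO `repl`@`192.168.1.102`
```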
Step 3: Take a Consistent Snapshot
You need a point-in-time consistent snapshot of the master to seed the slave. The --master-data flag automatically records the binary log position inside the dump file.
# Option A: mysqldump (suitable for databases under ~50 GB)
mysqldump -u root -p \
--all-databases \
--master-data=2 \
--single-transaction \
--flush-logs \
--routines \
--triggers \
> /tmp/master_dump.sql
# Option B: Percona XtraBackup (recommended for large databases, no table locks)
xtrabackup --backup --user=root --password=mypass \
--target-dir=/tmp/xtrabackup/
xtrabackup --prepare --target-dir=/tmp/xtrabackup/
Transfer the dump to the slave:
scp /tmp/master_dump.sql user@192.168.1.102:/tmp/
Step 4: Configure the Slave Server
On the slave, edit mysqld.cnf:
[mysqld]
# Must be different from master and all other slaves
server-id = 2
# Relay log location
relay-log = /var/log/mysql/mysql-relay-bin
# Prevent accidental writes to the slave
read_only = 1
super_read_only = 1
# Required if this slave will itself be a master (chained replication)
log_slave_updates = 1
# Match master GTID settings
gtid_mode = ON
enforce_gtid_consistency = ON
sudo systemctl restart mysql
Step 5: Import the Snapshot and Start Replication
Restore the master dump on the slave:
mysql -u root -p < /tmp/master_dump.sql
Configure the slave to connect to the master. With GTID mode (recommended):
CHANGE MASTER TO
MASTER_HOST = '192.168.1.101',
MASTER_USER = 'repl',
MASTER_PASSWORD = 'ReplStr0ng!Pass',
MASTER_AUTO_POSITION = 1;
Without GTID (traditional position-based), find the position from the dump file comment:
grep "MASTER_LOG_FILE\|MASTER_LOG_POS" /tmp/master_dump.sql | head -5
# -- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=1573;
CHANGE MASTER TO
MASTER_HOST = '192.168.1.101',
MASTER_USER = 'repl',
MASTER_PASSWORD = 'ReplStr0ng!Pass',
MASTER_LOG_FILE = 'mysql-bin.000003',
MASTER_LOG_POS = 1573;
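If you script the restore, the coordinates can be pulled out of the dump programmatically. A small sketch (parse_binlog_position is a hypothetical helper, not part of any MySQL tooling):

```python
import re

def parse_binlog_position(dump_line: str) -> tuple[str, int]:
    """Extract the binlog file name and position from the commented
    CHANGE MASTER line that --master-data=2 writes into the dump."""
    m = re.search(r"MASTER_LOG_FILE='([^']+)',\s*MASTER_LOG_POS=(\d+)", dump_line)
    if m is None:
        raise ValueError("no CHANGE MASTER coordinates found in line")
    return m.group(1), int(m.group(2))

line = "-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=1573;"
print(parse_binlog_position(line))  # ('mysql-bin.000003', 1573)
```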
Start replication:
START SLAVE;
Step 6: Monitor Replication Health
SHOW SLAVE STATUS\G
Key fields to check immediately:
Slave_IO_Running: Yes -- IO thread connected and running
Slave_SQL_Running: Yes -- SQL thread replaying events
Seconds_Behind_Master: 0 -- slave is fully caught up
Last_IO_Error: (empty) -- no connection errors
Last_SQL_Error: (empty) -- no replay errors
For ongoing monitoring via performance_schema (MySQL 8.0+):
SELECT
CHANNEL_NAME,
SERVICE_STATE,
LAST_ERROR_MESSAGE,
LAST_HEARTBEAT_TIMESTAMP
FROM performance_schema.replication_connection_status;
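For alerting, the raw output of SHOW SLAVE STATUS\G is easy to parse from a cron job. A minimal sketch (check_replication is a hypothetical helper; the lag threshold is illustrative):

```python
def check_replication(status_text: str, max_lag: int = 30) -> list[str]:
    """Parse `SHOW SLAVE STATUS\\G` output and return a list of problems."""
    fields = {}
    for line in status_text.splitlines():
        if ":" in line:
            key, _, value = line.strip().partition(":")
            fields[key.strip()] = value.strip()
    problems = []
    if fields.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread not running: " + fields.get("Last_IO_Error", ""))
    if fields.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread not running: " + fields.get("Last_SQL_Error", ""))
    lag = fields.get("Seconds_Behind_Master", "")
    if lag in ("", "NULL") or int(lag) > max_lag:
        problems.append(f"replication lag: {lag or 'unknown'}")
    return problems

sample = """
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
"""
print(check_replication(sample))  # []
```

An empty list means both threads are running and lag is within the threshold; anything else should page someone.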
Handling Replication Lag
Replication lag (Seconds_Behind_Master growing) is common under heavy write loads. Root causes and fixes:
| Cause | Fix |
|---|---|
| Single-threaded SQL thread | Enable parallel replication workers |
| Long-running queries on slave | Identify with SHOW PROCESSLIST; optimize |
| Disk I/O bottleneck on slave | Move relay logs to faster storage |
| Large bulk transactions | Break into smaller batches on master |
| Network latency | Co-locate master and slaves in same datacenter |
Enable parallel replication (MySQL 8.0):
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 4;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
START SLAVE SQL_THREAD;
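Confirm the new settings took effect:

```sql
SHOW VARIABLES LIKE 'slave_parallel%';
-- slave_parallel_type    | LOGICAL_CLOCK
-- slave_parallel_workers | 4
```

SET GLOBAL does not survive a restart, so add the same two settings to the [mysqld] section of mysqld.cnf as well.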
Fixing Broken Replication
When Slave_SQL_Running: No appears in SHOW SLAVE STATUS:
Skip a Single Errored Transaction
Note: sql_slave_skip_counter has no effect when gtid_mode=ON; with GTID replication, use the GTID-based skip method instead.
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;
SHOW SLAVE STATUS\G
Skip by GTID (Safer with GTID Mode)
STOP SLAVE;
-- Replace with the GTID from Last_SQL_Error message
SET GTID_NEXT = 'a1b2c3d4-1111-2222-3333-444444444444:1234';
BEGIN; COMMIT;
SET GTID_NEXT = 'AUTOMATIC';
START SLAVE;
Full Resync from Master
STOP SLAVE;
RESET SLAVE ALL;
-- Re-take the snapshot from Step 3 and reconfigure
Failover: Promoting a Slave to Master
When the master fails and you need to promote a slave:
-- On the slave to be promoted:
-- Confirm it has fully caught up
SHOW SLAVE STATUS\G
-- Seconds_Behind_Master: 0
-- Stop replication and clear slave configuration
STOP SLAVE;
RESET SLAVE ALL;
-- Disable read-only mode — this server is now the master
SET GLOBAL read_only = 0;
SET GLOBAL super_read_only = 0;
Then update the application connection strings to the new master IP, and reconfigure any remaining slaves:
-- On remaining slaves:
STOP SLAVE;
CHANGE MASTER TO
MASTER_HOST = '192.168.1.102',
MASTER_AUTO_POSITION = 1;
START SLAVE;
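When several slaves exist, promote the one whose Executed_Gtid_Set contains the most transactions. A sketch of that comparison (pick_promotion_candidate is a hypothetical helper; it assumes all slaves replicate from the same source):

```python
def pick_promotion_candidate(slaves: dict[str, str]) -> str:
    """Return the host whose Executed_Gtid_Set holds the most
    transactions, i.e. the most caught-up promotion candidate."""
    def count(gtid_set: str) -> int:
        total = 0
        for part in gtid_set.split(","):
            # Intervals follow the UUID, e.g. 'uuid:1-100:105'
            for iv in part.strip().split(":")[1:]:
                lo, _, hi = iv.partition("-")
                total += (int(hi) if hi else int(lo)) - int(lo) + 1
        return total
    return max(slaves, key=lambda host: count(slaves[host]))

slaves = {
    "192.168.1.102": "a1b2c3d4-1111-2222-3333-444444444444:1-1240",
    "192.168.1.103": "a1b2c3d4-1111-2222-3333-444444444444:1-1198",
}
print(pick_promotion_candidate(slaves))  # 192.168.1.102
```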
Step 7: ProxySQL for Read/Write Splitting
ProxySQL sits between the application and MySQL, routing writes to the master and reads to slaves transparently.
# Install ProxySQL (Ubuntu)
wget -O - 'https://repo.proxysql.com/ProxySQL/repo_pub_key' | sudo apt-key add -
echo "deb https://repo.proxysql.com/ProxySQL/proxysql-2.x/$(lsb_release -sc)/ ./" \
| sudo tee /etc/apt/sources.list.d/proxysql.list
sudo apt update && sudo apt install proxysql
sudo systemctl enable --now proxysql
Configure via the ProxySQL admin interface (port 6032):
-- Add servers: hostgroup 0 = write (master), hostgroup 1 = read (slave)
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
(0, '192.168.1.101', 3306),
(1, '192.168.1.102', 3306);
-- Add the application user
INSERT INTO mysql_users (username, password, default_hostgroup) VALUES
('appuser', 'AppPass123!', 0);
-- Locking reads (SELECT ... FOR UPDATE) must stay on the master
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES
(1, 1, '^SELECT.*FOR UPDATE', 0, 1);
-- Route all other SELECT queries to the read hostgroup
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply) VALUES
(2, 1, '^SELECT.*', 1, 1);
-- Apply and persist changes
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
Your application connects to ProxySQL on port 6033, and all routing is handled transparently.
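Once traffic flows, confirm that reads actually land on the slave. ProxySQL's admin interface exposes per-digest routing stats:

```sql
-- On the admin interface (port 6032): each digest shows which hostgroup served it
SELECT hostgroup, digest_text, count_star
FROM stats_mysql_query_digest
ORDER BY count_star DESC
LIMIT 5;
```

SELECT digests should report hostgroup 1; writes and locking reads should report hostgroup 0.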
Comparison: MySQL Replication vs Alternatives
| Solution | Architecture | Failover | Write Scaling | Complexity |
|---|---|---|---|---|
| MySQL Replication | Master + slaves | Manual or MHA | No (single master) | Low |
| MySQL Group Replication | Multi-master Paxos | Automatic | Limited | Medium |
| Galera Cluster (MariaDB) | Synchronous multi-master | Automatic | Yes | Medium |
| Vitess | Sharded MySQL | Automatic | Yes (sharding) | High |
| PlanetScale | Managed Vitess | Managed | Yes | Low (managed) |
| PostgreSQL Streaming | Primary + standbys | Patroni/manual | No | Low-Medium |
MySQL async replication is the lowest-complexity starting point. Migrate to Group Replication or Galera when you need automatic failover.
Gotchas and Edge Cases
- server-id must be globally unique across all nodes — two nodes with the same ID silently corrupt replication
- read_only=1 does not block SUPER users — use super_read_only=1 to fully lock the slave
- Large transactions stall the SQL thread on the slave — break them into smaller batches, and use pt-online-schema-change for large DDL operations
- GTID gaps after restore — if a backup has gtid_purged set, configure it on the slave before starting replication
- Timezone mismatches cause replication errors — set default-time-zone='+00:00' on all nodes
Summary
- MySQL master-slave replication uses binary logs, IO threads, and SQL threads to keep slaves in sync
- Use binlog-format=ROW, sync_binlog=1, and GTID mode (gtid_mode=ON) for production
- Monitor Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master continuously
- Enable parallel replication workers to reduce lag under heavy write loads
- Use RESET SLAVE ALL for a clean reconfiguration rather than trying to patch a broken state
- ProxySQL transparently routes reads to slaves and writes to the master
- Take backups from the slave to avoid impacting master performance