Skip to content

PostgreSQL 18 Streaming Replication Setup

This guide provides step-by-step instructions for setting up PostgreSQL 18.1 Streaming Replication on Debian/PGDG using a Primary-Standby architecture.

Architecture:

  • Primary Server: 10.0.0.90 (receives all writes)
  • Standby Servers: 10.0.0.91, 10.0.0.92 (read replicas)
  • Replication Type: Physical Streaming Replication (Standby can read / receives continuous WAL)

Before starting, ensure the following conditions are met:

  • PostgreSQL Major version must be identical (18.x) on all servers
  • Network connectivity: Standby (91/92) must be able to reach Primary (90) on port 5432
  • Running as Debian/PGDG cluster (has pg_lsclusters) or at least know the data_directory path
  • You have sudo/root access on all servers

Run these commands on all servers to verify paths:

Terminal window
sudo -u postgres psql -c "SHOW data_directory; SHOW config_file; SHOW hba_file;"
pg_lsclusters

Expected Output:

data_directory | config_file | hba_file
-------------------------------------------+---------------------------------------+--------------------------------------
/var/lib/postgresql/18/main | /etc/postgresql/18/main/postgresql.conf | /etc/postgresql/18/main/pg_hba.conf

Edit the PostgreSQL configuration file (typically /etc/postgresql/18/main/postgresql.conf):

Terminal window
sudo nano /etc/postgresql/18/main/postgresql.conf

Add or modify the following parameters:

# Network Configuration
listen_addresses = '*'
# Replication Configuration
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
# Prevent WAL from growing indefinitely when slot/standby has issues
# Adjust based on actual disk space (e.g., 5-50GB)
max_slot_wal_keep_size = '10GB'

Parameter Explanations:

ParameterValuePurpose
listen_addresses'*'Allow connections from all network interfaces
wal_levelreplicaEnable WAL logging for replication
max_wal_senders10Maximum concurrent WAL sender processes
max_replication_slots10Maximum replication slots allowed
max_slot_wal_keep_size'10GB'Prevent WAL files from filling disk

Create a dedicated PostgreSQL user with replication privileges:

Terminal window
sudo -u postgres psql -c "CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'ChangeMeStrongPassword';"

1.3 Configure Client Authentication (pg_hba.conf)

Section titled “1.3 Configure Client Authentication (pg_hba.conf)”

Edit the HBA configuration file (typically /etc/postgresql/18/main/pg_hba.conf):

Terminal window
sudo nano /etc/postgresql/18/main/pg_hba.conf

Add the following lines at the end of the file:

# Allow replication from standbys
host replication replicator 10.0.0.91/32 scram-sha-256
host replication replicator 10.0.0.92/32 scram-sha-256

HBA Rule Breakdown:

ColumnValueMeaning
hostTCP/IP connectionConnection type
replicationspecial databaseReplication connection
replicatorusernameUser allowed to connect
10.0.0.91/32IP addressStandby server IP
scram-sha-256auth methodSecure password authentication

Apply the configuration changes by restarting PostgreSQL:

Terminal window
# Using systemctl
sudo systemctl restart postgresql@18-main
# OR using pg_ctlcluster
sudo pg_ctlcluster 18 main restart

Verify Service Status:

Terminal window
sudo systemctl status postgresql@18-main

Step 2: Standby Server Setup (10.0.0.91 and 10.0.0.92)

Section titled “Step 2: Standby Server Setup (10.0.0.91 and 10.0.0.92)”

Stop the PostgreSQL service on the Standby server:

Terminal window
# Using systemctl
sudo systemctl stop postgresql@18-main
# OR using pg_ctlcluster
sudo pg_ctlcluster 18 main stop

Verify the data directory path:

Terminal window
pg_lsclusters

Example output:

Ver Cluster Port Status Owner Data directory Log file
18 main 5432 down postgres /var/lib/postgresql/18/main /var/log/postgresql/...

Remove all files from the data directory:

Terminal window
sudo -u postgres bash -c "rm -rf /var/lib/postgresql/18/main/*"

To avoid entering passwords during pg_basebackup, create a .pgpass file:

Terminal window
sudo -u postgres bash -c 'echo "10.0.0.90:5432:*:replicator:ChangeMeStrongPassword" >> ~/.pgpass && chmod 600 ~/.pgpass'

Verify .pgpass was created:

Terminal window
sudo -u postgres cat ~/.pgpass

.pgpass Format:

hostname:port:database:username:password

2.4 Perform Base Backup with Replication Slot

Section titled “2.4 Perform Base Backup with Replication Slot”

Run pg_basebackup to clone the Primary database and create a dedicated replication slot.

Terminal window
sudo -u postgres pg_basebackup \
-h 10.0.0.90 \
-p 5432 \
-U replicator \
-D /var/lib/postgresql/18/main \
-Fp -Xs -P -R \
-C -S replica_10_0_0_91
Terminal window
sudo -u postgres pg_basebackup \
-h 10.0.0.90 \
-p 5432 \
-U replicator \
-D /var/lib/postgresql/18/main \
-Fp -Xs -P -R \
-C -S replica_10_0_0_92

Option Explanations:

OptionPurpose
-h 10.0.0.90Primary server hostname/IP
-p 5432PostgreSQL port
-U replicatorReplication user
-D /var/lib/postgresql/18/mainTarget data directory
-FpPlain format (file-based copy)
-XsStream WAL during backup
-PShow progress
-RCreate standby.signal and primary_conninfo automatically
-CCreate replication slot
-S replica_10_0_0_91Replication slot name (unique per standby)

Expected Output:

pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "replica_10_0_0_91"
...
pg_basebackup: base backup completed

Start the PostgreSQL service on the Standby server:

Terminal window
# Using systemctl
sudo systemctl start postgresql@18-main
# OR using pg_ctlcluster
sudo pg_ctlcluster 18 main start

Verify Service Started:

Terminal window
sudo systemctl status postgresql@18-main

3.1 Verify Standby is in Recovery Mode (91/92)

Section titled “3.1 Verify Standby is in Recovery Mode (91/92)”

On each Standby server, check if it’s running in recovery/standby mode:

Terminal window
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

Expected Output:

pg_is_in_recovery
-------------------
t
(1 row)
  • t (true) = Server is in standby/recovery mode ✅
  • f (false) = Server is in normal mode (not a standby) ❌

3.2 Check Replication Status on Primary (90)

Section titled “3.2 Check Replication Status on Primary (90)”

On the Primary server, verify that Standby servers are connected:

Terminal window
sudo -u postgres psql -c "SELECT client_addr, state, sync_state, write_lag, flush_lag, replay_lag FROM pg_stat_replication;"

Expected Output:

client_addr | state | sync_state | write_lag | flush_lag | replay_lag
-------------+-----------+------------+-----------+-----------+------------
10.0.0.91 | streaming | async | 00:00:00 | 00:00:00 | 00:00:00
10.0.0.92 | streaming | async | 00:00:00 | 00:00:00 | 00:00:00
(2 rows)

Column Meanings:

ColumnValueMeaning
client_addr10.0.0.91IP of connected Standby
statestreamingReplication is active
sync_stateasyncAsynchronous replication
write_lag00:00:00Time lag for writes
flush_lag00:00:00Time lag for flushes
replay_lag00:00:00Time lag for replays

3.3 Verify Replication Slots on Primary (90)

Section titled “3.3 Verify Replication Slots on Primary (90)”

Check that replication slots are active:

Terminal window
sudo -u postgres psql -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"

Expected Output:

slot_name | active | restart_lsn
--------------------+--------+-------------
replica_10_0_0_91 | t | 0/3000000
replica_10_0_0_92 | t | 0/3000000
(2 rows)
  • active = t means the Standby is currently using the slot ✅
  • active = f means the Standby is disconnected (WAL may accumulate) ⚠️

Problem: If a Standby server goes offline, WAL files can accumulate indefinitely on the Primary, filling up disk space.

Solution 1: Set max_slot_wal_keep_size (already configured in Step 1.1)

max_slot_wal_keep_size = '10GB'

Solution 2: Drop inactive replication slots

If a Standby is permanently dead or decommissioned:

-- On Primary server
SELECT pg_drop_replication_slot('replica_10_0_0_91');

4.2 Troubleshooting: Replication Not Working

Section titled “4.2 Troubleshooting: Replication Not Working”

If replication stops or doesn’t start, check the following:

Terminal window
# From Standby, test connection to Primary
telnet 10.0.0.90 5432
# OR
nc -zv 10.0.0.90 5432
  • Ensure listen_addresses = '*' in postgresql.conf
  • Verify pg_hba.conf has replication rules for Standby IPs
  • Restart PostgreSQL after config changes

On Primary:

Terminal window
sudo journalctl -u postgresql@18-main -n 200 --no-pager

On Standby:

Terminal window
sudo journalctl -u postgresql@18-main -n 200 --no-pager

Common Error Messages:

ErrorCauseSolution
FATAL: no pg_hba.conf entryMissing HBA ruleAdd host replication replicator ... to pg_hba.conf
FATAL: password authentication failedWrong passwordCheck .pgpass or replication user password
could not connect to the primary serverNetwork issueCheck firewall, network connectivity
requested WAL segment has already been removedWAL gap too largeRebuild Standby with new pg_basebackup

On the Primary server, create a test table and insert data:

-- Connect to Primary
sudo -u postgres psql
-- Create test database (if it doesn't exist)
CREATE DATABASE test_replication;
-- Connect to test database
\c test_replication
-- Create test table
CREATE TABLE replication_test (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP DEFAULT NOW(),
data TEXT
);
-- Insert test data
INSERT INTO replication_test (data) VALUES ('Test record from Primary at ' || NOW());
-- Verify data
SELECT * FROM replication_test;

On each Standby server, verify the data replicated:

-- Connect to Standby
sudo -u postgres psql
-- Connect to test database
\c test_replication
-- Query replicated data (should match Primary)
SELECT * FROM replication_test;

Expected Result: The same data from the Primary should appear on the Standby.


Configuration Files:

  • PostgreSQL Config: /etc/postgresql/18/main/postgresql.conf
  • HBA Config: /etc/postgresql/18/main/pg_hba.conf

Monitoring Commands:

Terminal window
# Check connected Standbys
sudo -u postgres psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
# Check replication slots
sudo -u postgres psql -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"
# View logs
sudo journalctl -u postgresql@18-main -f

Setup Commands:

Terminal window
# 1. Stop PostgreSQL
sudo systemctl stop postgresql@18-main
# 2. Clear data directory
sudo -u postgres bash -c "rm -rf /var/lib/postgresql/18/main/*"
# 3. Setup .pgpass
sudo -u postgres bash -c 'echo "10.0.0.90:5432:*:replicator:PASSWORD" >> ~/.pgpass && chmod 600 ~/.pgpass'
# 4. Perform base backup (use unique slot name per standby)
sudo -u postgres pg_basebackup \
-h 10.0.0.90 -p 5432 -U replicator \
-D /var/lib/postgresql/18/main \
-Fp -Xs -P -R \
-C -S replica_10_0_0_91
# 5. Start PostgreSQL
sudo systemctl start postgresql@18-main

Monitoring Commands:

Terminal window
# Check if in recovery mode (should return 't')
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
# View logs
sudo journalctl -u postgresql@18-main -f

  • Set up monitoring alerts for replication lag
  • Monitor disk space on Primary (WAL files can accumulate)
  • Track pg_stat_replication metrics regularly
  • Replication is NOT a backup (it replicates corruption too)
  • Maintain regular pg_dump backups
  • Test restore procedures periodically
  • Use strong passwords for replication user
  • Restrict pg_hba.conf to specific Standby IPs
  • Consider SSL/TLS for replication connections in production:
# In pg_hba.conf
hostssl replication replicator 10.0.0.91/32 scram-sha-256
  • Size max_slot_wal_keep_size based on:
    • Network speed between Primary and Standby
    • Expected downtime tolerance
    • Available disk space
  • Example: If Standby can be down for 1 hour and WAL generates 2GB/hour, set at least 5GB