I am trying to set up two PostgreSQL servers on the same computer and perform thread replication. I succeeded once, but when I tried again to follow the exact same steps, it would not work. These are the steps: I have $ PGDATA = home / postgresql / 9.1 / data and $ STANDBY = home / postgresql / 9.1 / data2
initdb -D $PGDATA
initdb -D $STANDBY
In the main node, create a user for replication. I do this in pgAdmin (it has superuser privileges)
In the master node in pg_hba.conf add the part that allows you to connect to the network:
host replication repuser 127.0.0.1/0 md5
- In the main node in postgresql.conf install:
max_wal_senders = 1
archive_mode = on
archive_command = 'cp %p ~/postgresql/backup/archivedir/%f'
wal_level = archive
wal_keep_segments = 32
- Run the node wizard and do a basic backup:
psql -d dellstore2 -c "SELECT pg_start_backup('backup for replication', true)"
rsync -av ${PGDATA}/ $STANDBY --exclude postmaster.pid
psql -d dellstore2 -c "select pg_stop_backup()"
pg_stop_backup says that everything is in order, all WAL files have been archived
- In standby mode (data2) node I create recovery.conf using
standby_mode = 'on'
primary_conninfo = 'host=127.0.0.1 port=5432 user=repuser password=haslo'
trigger_file = '/home/michau/postgresql/replication.trigger'
restore_command = 'cp /home/michau/postgresql/backup/archivedir/%f "%p"'
- Run the node wizard, then run the node standby mode - replication should begin, and the standby mode should catch up with the master. This is what happened for the first time. Now, when I start the standby mode, I get: "Already accepted address . " Of course, both the standby mode and the wizard have the same port as in postgresql.conf (they have exactly the same postgresql.conf files). If I change the port in standby mode to say 5433, I get:
LOG: database system was shut down in recovery at 2012-06-12 19:48:01 CEST
LOG: entering standby mode
cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory
LOG: consistent recovery state reached at 0/7000070
LOG: record with zero length at 0/7000070
cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory
LOG: streaming replication successfully connected to primary
LOG: redo starts at 0/7000070
And he just hangs here. Running ps -ef | grep postgresql gives:
michau 2491 1898 0 19:46 pts/0 00:00:00 postgres -D /home/michau/postgresql/9.1/data
michau 2493 2491 0 19:46 ? 00:00:01 postgres: writer process
michau 2494 2491 0 19:46 ? 00:00:00 postgres: wal writer process
michau 2495 2491 0 19:46 ? 00:00:00 postgres: autovacuum launcher process
michau 2496 2491 0 19:46 ? 00:00:00 postgres: archiver process last was 000000010000000000000008
michau 2497 2491 0 19:46 ? 00:00:00 postgres: stats collector process
michau 2571 2214 0 19:49 pts/1 00:00:00 postgres -D /home/michau/postgresql/9.1/data2
michau 2572 2571 0 19:49 ? 00:00:01 postgres: startup process recovering 000000010000000000000009
michau 2575 2571 0 19:49 ? 00:00:01 postgres: writer process
michau 2578 2571 0 19:49 ? 00:00:02 postgres: wal receiver process streaming 0/99782DC
michau 2579 2491 0 19:49 ? 00:00:00 postgres: wal sender process repuser 127.0.0.1(42142) streaming 0/99782DC
michau 2586 2491 0 19:51 ? 00:00:00 postgres: michau postgres ::1(49941) idle
michau 2587 2491 0 19:51 ? 00:00:01 postgres: michau dellstore2 ::1(49942) idle
Recovery 0000000010000009, where it changes for a while, but within half an hour it is no more.
I am sure that I had to do something for the first time, and not to record or something else, but I completely lost to say what it was. I would appreciate any help.