empty pg_stat_replication when replication works fine?

Started by Andrej Vanekalmost 10 years ago3 messagesgeneral
Jump to latest
#1Andrej Vanek
andrej.vanek.sk@gmail.com

Streaming replication set-up,

one master, 3 slaves connecting to it.
I expected ps -ef gets all wal-sender processes and SAME information I'll
get via select * from pg_stat_replication.
Instead I observed:
- pg_stat_replication is empty
- 3 wal-sender processes up and running
- each slave has wal-receiver process running
- replication works (tried to create a table- it appears in all databases)
Question:
- why is pg_stat_replication empty?

Andrej
---------------details
[root@l2bmain ~]# tail /opt/pg_data/postgresql.conf
max_wal_senders = 5
hot_standby = on
wal_keep_segments = 128
archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f'
wal_receiver_status_interval = 2
max_standby_streaming_delay = -1
max_standby_archive_delay = -1
restart_after_crash = off
hot_standby_feedback = on
wal_sender_timeout = 1min
[root@l2bmain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 10797 1 0 15:53 ? S 0:20
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 10820 10797 0 15:53 ? Ss 0:00 \_ postgres: logger
process
postgres 10823 10797 0 15:53 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 10824 10797 0 15:53 ? Ss 0:00 \_ postgres: writer
process
postgres 10825 10797 0 15:53 ? Ss 0:00 \_ postgres: wal writer
process
postgres 10826 10797 0 15:53 ? Ss 0:01 \_ postgres: autovacuum
launcher process
postgres 10827 10797 0 15:53 ? Ss 0:00 \_ postgres: archiver
process last was 0000000100000000000000A3.00000028.backup
postgres 10828 10797 0 15:53 ? Ss 0:03 \_ postgres: stats
collector process
postgres 11286 10797 0 15:54 ? Ss 0:08 \_ postgres: wal sender
process pgreplic 192.168.204.12(55231) streaming 0/A401BED8
postgres 11287 10797 0 15:54 ? Ss 0:06 \_ postgres: wal sender
process pgreplic 192.168.204.11(42937) streaming 0/A401BED8
postgres 19322 10797 0 15:58 ? Ss 0:08 \_ postgres: wal sender
process pgreplic 192.168.101.11(52379) streaming 0/A401BED8
postgres 28704 10797 0 18:44 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58245) idle
postgres 7256 10797 0 18:52 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.23(55190) idle
postgres 8667 10797 0 18:53 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58287) idle
[root@l2bmain ~]# psql -U postgres -c "select * from pg_stat_replication;"
pid | usesysid | usename | application_name | client_addr |
client_hostname | client_port | backend_start | state | sent_location |
write_location | flush_location | r
eplay_location | sync_priority | sync_state
-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+-------+---------------+----------------+----------------+--
---------------+---------------+------------
(0 rows)

[root@l2bmain ~]# tail /opt/pg_data/pg_log/postgresql-Wed.log
2016-05-25 15:53:56 CEST:@:[8603] LOG: database system is shut down
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was shut down in
recovery at 2016-05-25 15:53:56 CEST
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was not properly
shut down; automatic recovery in progress
2016-05-25 15:53:58 CEST:@:[10821] LOG: consistent recovery state reached
at 0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: record with zero length at
0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: redo is not required
2016-05-25 15:53:58 CEST:@:[10821] LOG: MultiXact member wraparound
protections are now enabled
2016-05-25 15:53:58 CEST:@:[10797] LOG: database system is ready to accept
connections
2016-05-25 15:53:58 CEST:@:[10826] LOG: autovacuum launcher started
`
[root@l2bmain ~]# ssh 192.168.101.11
Last login: Wed May 25 22:48:18 2016 from 192.168.101.12
[root@l2amain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 5730 1 0 15:58 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 5754 5730 0 15:58 ? Ss 0:00 \_ postgres: logger
process
postgres 5755 5730 0 15:58 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 5773 5730 0 15:58 ? Ss 0:12 \_ postgres: wal
receiver process streaming 0/A401C030
postgres 5774 5730 0 15:58 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 5775 5730 0 15:58 ? Ss 0:00 \_ postgres: writer
process
postgres 5776 5730 0 15:58 ? Ss 0:00 \_ postgres: stats
collector process
[root@l2amain ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)
[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0

[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0
[root@l2abrnch ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)

[root@l2abrnch ~]# logout
Connection to 192.168.204.11 closed.
[root@l2bmain ~]# ssh 192.168.204.12
Warning: Permanently added '192.168.204.12' (RSA) to the list of known
hosts.
Last login: Wed May 25 14:58:03 2016 from 192.168.200.254
[root@l2bbrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 22885 1 0 15:48 ? S 0:00
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 22913 22885 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 22914 22885 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 22917 22885 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 22918 22885 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 22919 22885 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 26163 22885 0 15:54 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0

#2Andrej Vanek
andrej.vanek.sk@gmail.com
In reply to: Andrej Vanek (#1)
Re: empty pg_stat_replication when replication works fine?

The instance is still running, I tried to collect more information from it:

all databases are working as expected,
the only issue is that monitoring SQL commands (pg_stat_activity,
pg_stat_replication) are not working as expected (do not reflect postgres
processes list from command-line)

on Master:
- pg_stat_activity is empty as well (they can be seen just in ps f -fu
postgres output: CTSYSTEM lines)
- psql as postgres: select * from pg_stat_activity sees only its own session
- psql as unprivileged user (CTSYSTEM): select * from pg_stat_activity is
empty
- replication works fine (created a table, that was created also on all
replicas)
- added lines to postgresql.conf + reload:

Opening new lines to postgresql.conf + reload configuration:
client_min_messages = debug5
log_min_messages = debug5
log_min_error_statement = debug5
- activity seen in pg_log, also replication activity (pgreplic user) is
seen, still nothing in pg_stat_replication/pg_stat_activity

killed one slave postgres instance, restarted it
- "standby "l2abrnch" has now caught up with primary"
- replication works fine
- no entries on Master in pg_stat_replication
- ps -ef shows the new wal-sender process on master and wal-receiver
process streaming on this slave

Version is:
PostgreSQL 9.3.10 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7-16), 64-bit

I suspect something happened within master server
(pg_stat_activity+pg_stat_replication not working as described, do not
reflect ps -ef list of postgres processes and running SQL
client/replication information)

What may be additionally useful information before restarting the master?

Regards, Andrej

2016-05-25 23:22 GMT+02:00 Andrej Vanek <andrej.vanek.sk@gmail.com>:

Show quoted text

Streaming replication set-up,

one master, 3 slaves connecting to it.
I expected ps -ef gets all wal-sender processes and SAME information I'll
get via select * from pg_stat_replication.
Instead I observed:
- pg_stat_replication is empty
- 3 wal-sender processes up and running
- each slave has wal-receiver process running
- replication works (tried to create a table- it appears in all databases)
Question:
- why is pg_stat_replication empty?

Andrej
---------------details
[root@l2bmain ~]# tail /opt/pg_data/postgresql.conf
max_wal_senders = 5
hot_standby = on
wal_keep_segments = 128
archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f'
wal_receiver_status_interval = 2
max_standby_streaming_delay = -1
max_standby_archive_delay = -1
restart_after_crash = off
hot_standby_feedback = on
wal_sender_timeout = 1min
[root@l2bmain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 10797 1 0 15:53 ? S 0:20
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 10820 10797 0 15:53 ? Ss 0:00 \_ postgres: logger
process
postgres 10823 10797 0 15:53 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 10824 10797 0 15:53 ? Ss 0:00 \_ postgres: writer
process
postgres 10825 10797 0 15:53 ? Ss 0:00 \_ postgres: wal
writer process
postgres 10826 10797 0 15:53 ? Ss 0:01 \_ postgres:
autovacuum launcher process
postgres 10827 10797 0 15:53 ? Ss 0:00 \_ postgres: archiver
process last was 0000000100000000000000A3.00000028.backup
postgres 10828 10797 0 15:53 ? Ss 0:03 \_ postgres: stats
collector process
postgres 11286 10797 0 15:54 ? Ss 0:08 \_ postgres: wal
sender process pgreplic 192.168.204.12(55231) streaming 0/A401BED8
postgres 11287 10797 0 15:54 ? Ss 0:06 \_ postgres: wal
sender process pgreplic 192.168.204.11(42937) streaming 0/A401BED8
postgres 19322 10797 0 15:58 ? Ss 0:08 \_ postgres: wal
sender process pgreplic 192.168.101.11(52379) streaming 0/A401BED8
postgres 28704 10797 0 18:44 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58245) idle
postgres 7256 10797 0 18:52 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.23(55190) idle
postgres 8667 10797 0 18:53 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58287) idle
[root@l2bmain ~]# psql -U postgres -c "select * from pg_stat_replication;"
pid | usesysid | usename | application_name | client_addr |
client_hostname | client_port | backend_start | state | sent_location |
write_location | flush_location | r
eplay_location | sync_priority | sync_state

-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+-------+---------------+----------------+----------------+--
---------------+---------------+------------
(0 rows)

[root@l2bmain ~]# tail /opt/pg_data/pg_log/postgresql-Wed.log
2016-05-25 15:53:56 CEST:@:[8603] LOG: database system is shut down
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was shut down in
recovery at 2016-05-25 15:53:56 CEST
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was not properly
shut down; automatic recovery in progress
2016-05-25 15:53:58 CEST:@:[10821] LOG: consistent recovery state
reached at 0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: record with zero length at
0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: redo is not required
2016-05-25 15:53:58 CEST:@:[10821] LOG: MultiXact member wraparound
protections are now enabled
2016-05-25 15:53:58 CEST:@:[10797] LOG: database system is ready to
accept connections
2016-05-25 15:53:58 CEST:@:[10826] LOG: autovacuum launcher started
`
[root@l2bmain ~]# ssh 192.168.101.11
Last login: Wed May 25 22:48:18 2016 from 192.168.101.12
[root@l2amain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 5730 1 0 15:58 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 5754 5730 0 15:58 ? Ss 0:00 \_ postgres: logger
process
postgres 5755 5730 0 15:58 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 5773 5730 0 15:58 ? Ss 0:12 \_ postgres: wal
receiver process streaming 0/A401C030
postgres 5774 5730 0 15:58 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 5775 5730 0 15:58 ? Ss 0:00 \_ postgres: writer
process
postgres 5776 5730 0 15:58 ? Ss 0:00 \_ postgres: stats
collector process
[root@l2amain ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)
[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0

[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0
[root@l2abrnch ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)

[root@l2abrnch ~]# logout
Connection to 192.168.204.11 closed.
[root@l2bmain ~]# ssh 192.168.204.12
Warning: Permanently added '192.168.204.12' (RSA) to the list of known
hosts.
Last login: Wed May 25 14:58:03 2016 from 192.168.200.254
[root@l2bbrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 22885 1 0 15:48 ? S 0:00
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 22913 22885 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 22914 22885 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 22917 22885 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 22918 22885 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 22919 22885 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 26163 22885 0 15:54 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0

#3Andrej Vanek
andrej.vanek.sk@gmail.com
In reply to: Andrej Vanek (#1)
Re: empty pg_stat_replication when replication works fine?

I've found the reason: inconsistent catalog data.
This state did not disappear after restart of all postgres instances.

pg_authid.oid does not match the one in pg_stat_get_activity(NULL::integer).

I wonder how this inconsistency could happen.. Maybe some error during
cloning database (binary database copy, or some older WAL logs left over
either in archive or pg_xlog, or some corrupted index in pg_catalog?)
during several test-cycles..

Any other idea how can oid of users could be different from the one
appearing in pg_stat_get_activity()? (see details below)

Andrej
--------------details:
postgres=# \d+ pg_stat_replication;
View "pg_catalog.pg_stat_replication"
Column | Type | Modifiers | Storage |
Description
------------------+--------------------------+-----------+----------+-------------
pid | integer | | plain |
usesysid | oid | | plain |
usename | name | | plain |
application_name | text | | extended |
client_addr | inet | | main |
client_hostname | text | | extended |
client_port | integer | | plain |
backend_start | timestamp with time zone | | plain |
state | text | | extended |
sent_location | text | | extended |
write_location | text | | extended |
flush_location | text | | extended |
replay_location | text | | extended |
sync_priority | integer | | plain |
sync_state | text | | extended |
View definition:
SELECT s.pid,
s.usesysid,
u.rolname AS usename,
s.application_name,
s.client_addr,
s.client_hostname,
s.client_port,
s.backend_start,
w.state,
w.sent_location,
w.write_location,
w.flush_location,
w.replay_location,
w.sync_priority,
w.sync_state
FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid,
application_name, state, query, waiting, xact_start, query_start,
backend_start, state_change, client_addr, client_hostname, client_port),
pg_authid u,
pg_stat_get_wal_senders() w(pid, state, sent_location, write_location,
flush_location, replay_location, sync_priority, sync_state)
WHERE s.usesysid = u.oid AND s.pid = w.pid;

postgres=# \dv+ pg_stat_replication;
List of relations
Schema | Name | Type | Owner | Size | Description
------------+---------------------+------+----------+---------+-------------
pg_catalog | pg_stat_replication | view | postgres | 0 bytes |
(1 row)

postgres=# select * from pg_stat_get_wal_senders();
pid | state | sent_location | write_location | flush_location |
replay_location | sync_priority | sync_state
-----+-----------+---------------+----------------+----------------+-----------------+---------------+------------
707 | streaming | 0/B5000090 | 0/B5000090 | 0/B5000090 |
0/B5000090 | 0 | async
(1 row)

postgres=# select * from pg_stat_get_activity(NULL::integer);
datid | pid | usesysid | application_name | state |
query | waiting | xact_start |
query_star
t | backend_start | state_change
| client_addr | client_hostname | client_port
-------+-----+----------+------------------+--------+----------------------------------------------------+---------+-------------------------------+--------------------
-----------+-------------------------------+-------------------------------+----------------+-----------------+-------------
0 | 707 | 34456 | l2amain | idle |
| f | |
| 2016-05-31 13:19:04.875468+02 | 2016-05-31 13:19:04.877894+02
| 192.168.101.11 | | 33329
12896 | 709 | 10 | psql | active | select * from
pg_stat_get_activity(NULL::integer); | f | 2016-05-31
13:22:23.090373+02 | 2016-05-31 13:22:23
.090373+02 | 2016-05-31 13:19:08.719215+02 | 2016-05-31 13:22:23.090382+02
| | | -1
(2 rows)

postgres=# select * from pg_authid where oid = 34456;
rolname | rolsuper | rolinherit | rolcreaterole | rolcreatedb |
rolcatupdate | rolcanlogin | rolreplication | rolconnlimit | rolpassword |
rolvaliduntil
---------+----------+------------+---------------+-------------+--------------+-------------+----------------+--------------+-------------+---------------
(0 rows)

postgres=# select oid, * from pg_authid;
oid | rolname | rolsuper | rolinherit | rolcreaterole | rolcreatedb |
rolcatupdate | rolcanlogin | rolreplication | rolconnlimit | rolpassword |
rolvaliduntil
-------+-----------+----------+------------+---------------+-------------+--------------+-------------+----------------+--------------+-------------+---------------
10 | postgres | t | t | t | t |
t | t | t | -1 | |
29732 | xxusrrole | f | t | f | f |
f | f | f | -1 | |
29733 | xxadmrole | f | t | f | f |
f | f | f | -1 | |
29734 | xxschema | f | t | f | f |
f | t | f | -1 | xxxxxxxx |
29735 | xxadm | f | t | f | f |
f | t | f | -1 | xxxxx |
29737 | pgbackup | f | t | f | f |
f | t | f | -1 | xxxxxxxx |
29738 | pgmonitor | f | t | f | f |
f | t | f | -1 | xxxxxxxxx |
29739 | pgreplic | f | t | f | f |
f | t | t | -1 | xxxxxxxx |
29736 | CTSYSTEM | f | t | f | f |
f | t | f | -1 | xxxxxx |
(9 rows)

2016-05-25 23:22 GMT+02:00 Andrej Vanek <andrej.vanek.sk@gmail.com>:

Show quoted text

Streaming replication set-up,

one master, 3 slaves connecting to it.
I expected ps -ef gets all wal-sender processes and SAME information I'll
get via select * from pg_stat_replication.
Instead I observed:
- pg_stat_replication is empty
- 3 wal-sender processes up and running
- each slave has wal-receiver process running
- replication works (tried to create a table- it appears in all databases)
Question:
- why is pg_stat_replication empty?

Andrej
---------------details
[root@l2bmain ~]# tail /opt/pg_data/postgresql.conf
max_wal_senders = 5
hot_standby = on
wal_keep_segments = 128
archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f'
wal_receiver_status_interval = 2
max_standby_streaming_delay = -1
max_standby_archive_delay = -1
restart_after_crash = off
hot_standby_feedback = on
wal_sender_timeout = 1min
[root@l2bmain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 10797 1 0 15:53 ? S 0:20
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 10820 10797 0 15:53 ? Ss 0:00 \_ postgres: logger
process
postgres 10823 10797 0 15:53 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 10824 10797 0 15:53 ? Ss 0:00 \_ postgres: writer
process
postgres 10825 10797 0 15:53 ? Ss 0:00 \_ postgres: wal
writer process
postgres 10826 10797 0 15:53 ? Ss 0:01 \_ postgres:
autovacuum launcher process
postgres 10827 10797 0 15:53 ? Ss 0:00 \_ postgres: archiver
process last was 0000000100000000000000A3.00000028.backup
postgres 10828 10797 0 15:53 ? Ss 0:03 \_ postgres: stats
collector process
postgres 11286 10797 0 15:54 ? Ss 0:08 \_ postgres: wal
sender process pgreplic 192.168.204.12(55231) streaming 0/A401BED8
postgres 11287 10797 0 15:54 ? Ss 0:06 \_ postgres: wal
sender process pgreplic 192.168.204.11(42937) streaming 0/A401BED8
postgres 19322 10797 0 15:58 ? Ss 0:08 \_ postgres: wal
sender process pgreplic 192.168.101.11(52379) streaming 0/A401BED8
postgres 28704 10797 0 18:44 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58245) idle
postgres 7256 10797 0 18:52 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.23(55190) idle
postgres 8667 10797 0 18:53 ? Ss 0:00 \_ postgres: CTSYSTEM
lidb 192.168.102.13(58287) idle
[root@l2bmain ~]# psql -U postgres -c "select * from pg_stat_replication;"
pid | usesysid | usename | application_name | client_addr |
client_hostname | client_port | backend_start | state | sent_location |
write_location | flush_location | r
eplay_location | sync_priority | sync_state

-----+----------+---------+------------------+-------------+-----------------+-------------+---------------+-------+---------------+----------------+----------------+--
---------------+---------------+------------
(0 rows)

[root@l2bmain ~]# tail /opt/pg_data/pg_log/postgresql-Wed.log
2016-05-25 15:53:56 CEST:@:[8603] LOG: database system is shut down
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was shut down in
recovery at 2016-05-25 15:53:56 CEST
2016-05-25 15:53:58 CEST:@:[10821] LOG: database system was not properly
shut down; automatic recovery in progress
2016-05-25 15:53:58 CEST:@:[10821] LOG: consistent recovery state
reached at 0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: record with zero length at
0/A2000090
2016-05-25 15:53:58 CEST:@:[10821] LOG: redo is not required
2016-05-25 15:53:58 CEST:@:[10821] LOG: MultiXact member wraparound
protections are now enabled
2016-05-25 15:53:58 CEST:@:[10797] LOG: database system is ready to
accept connections
2016-05-25 15:53:58 CEST:@:[10826] LOG: autovacuum launcher started
`
[root@l2bmain ~]# ssh 192.168.101.11
Last login: Wed May 25 22:48:18 2016 from 192.168.101.12
[root@l2amain ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 5730 1 0 15:58 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/pg_data -c
config_file=/opt/pg_data//postgresql.conf
postgres 5754 5730 0 15:58 ? Ss 0:00 \_ postgres: logger
process
postgres 5755 5730 0 15:58 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 5773 5730 0 15:58 ? Ss 0:12 \_ postgres: wal
receiver process streaming 0/A401C030
postgres 5774 5730 0 15:58 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 5775 5730 0 15:58 ? Ss 0:00 \_ postgres: writer
process
postgres 5776 5730 0 15:58 ? Ss 0:00 \_ postgres: stats
collector process
[root@l2amain ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)
[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0

[root@l2bmain ~]# ssh 192.168.204.11
Warning: Permanently added '192.168.204.11' (RSA) to the list of known
hosts.
Last login: Wed May 25 16:28:49 2016 from 192.168.200.254
[root@l2abrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 8174 1 0 15:48 ? S 0:04
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 8195 8174 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 8197 8174 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 8206 8174 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 8207 8174 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 8208 8174 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 11414 8174 0 15:53 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0
[root@l2abrnch ~]# psql -U postgres -c "select pg_is_in_recovery();"
pg_is_in_recovery
-------------------
t
(1 row)

[root@l2abrnch ~]# logout
Connection to 192.168.204.11 closed.
[root@l2bmain ~]# ssh 192.168.204.12
Warning: Permanently added '192.168.204.12' (RSA) to the list of known
hosts.
Last login: Wed May 25 14:58:03 2016 from 192.168.200.254
[root@l2bbrnch ~]# ps f -fu postgres
UID PID PPID C STIME TTY STAT TIME CMD
postgres 22885 1 0 15:48 ? S 0:00
/usr/pgsql-9.3/bin/postgres -D /opt/geo_stdby_data -c
config_file=/opt/geo_stdby_data//postgresql.conf
postgres 22913 22885 0 15:48 ? Ss 0:00 \_ postgres: logger
process
postgres 22914 22885 0 15:48 ? Ss 0:00 \_ postgres: startup
process recovering 0000000100000000000000A4
postgres 22917 22885 0 15:48 ? Ss 0:00 \_ postgres:
checkpointer process
postgres 22918 22885 0 15:48 ? Ss 0:00 \_ postgres: writer
process
postgres 22919 22885 0 15:48 ? Ss 0:00 \_ postgres: stats
collector process
postgres 26163 22885 0 15:54 ? Ss 0:13 \_ postgres: wal
receiver process streaming 0/A401C0D0