Hi,
I am doing some stress tests to a pair of PG servers to monitor the pg_stat_replication, during the test, the pg_stat_replication suddenly became empty.
PG version: 9.2.4
O/S: Ubuntu: 12.04
Since I need to monitor the replication lag from time to time, if the pg_stat_replication becomes empty, the lag calculation in the slave will be wrong.
Please advise if this is a bug.
regards
How to reproduce:
session 1: Master server - try to insert a large number of records into a test table
postgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY); INSERT INTO test VALUES (generate_series(1,100000000)); EXPLAIN ANALYZE SELECT COUNT(*) FROM test;
2) session 2: Master server - check the byte_lag from time to time
postgres=# SELECT
sent_offset - (
replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
FROM (
SELECT
client_addr,
('x' || lpad(split_part(sent_location, '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,
('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,
('x' || lpad(split_part(sent_location, '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,
('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offset
FROM pg_stat_replication
) AS s;
byte_lag
----------
2097216
(1 row)
postgres=# SELECT
sent_offset - (
replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
FROM (
SELECT
client_addr,
('x' || lpad(split_part(sent_location, '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,
('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,
('x' || lpad(split_part(sent_location, '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,
('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offset
FROM pg_stat_replication
) AS s;
byte_lag
----------
(0 rows)
3) session 3: Slave server -
postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
log_delay
-----------
0
(1 row)
postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
log_delay
-----------
4.873282
(1 row)
.
.
.
postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
log_delay
-------------
4070.325329
(1 row)