Re: pg_stat_replication became empty suddenly - Mailing list pgsql-general

From ascot.moss@gmail.com
Subject Re: pg_stat_replication became empty suddenly
Date
Msg-id 11FDD81A-072D-49DC-9B31-6023E5593023@gmail.com
Whole thread Raw
In response to Re: pg_stat_replication became empty suddenly  ("ascot.moss@gmail.com" <ascot.moss@gmail.com>)
Responses Re: pg_stat_replication became empty suddenly
List pgsql-general
Hi,

I just tried another round of tests, without running  "sync; echo 3 > /proc/sys/vm/drop_caches',
still got the same error,  following FATAL errors are found in pg_log (slave), can anyone please advise how to resolve this error?

regards


LOG:  entering standby mode
LOG:  consistent recovery state reached at 11/42000318
LOG:  redo starts at 11/42000280
LOG:  invalid record length at 11/42000318
LOG:  database system is ready to accept read only connections
LOG:  streaming replication successfully connected to primary
FATAL:  could not send data to WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
LOG:  unexpected pageaddr 10/D2EC0000 in log file 18, segment 5, offset 15466496
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000001200000005 has already been removed





On 6 Aug 2013, at 12:39 PM, ascot.moss@gmail.com wrote:

Hi,

I found the problem should be because I tried to clean RAM cache in the slave by running "sync; echo 3 > /proc/sys/vm/drop_caches' 
that caused the "receiver" of slave gone away.

ps -ef | grep receiver
postgres  6182  6178  0 12:11 ?        00:00:06 postgres: wal receiver process   streaming D/FB8DA000    
sync; echo 3 > /proc/sys/vm/drop_caches
ps -ef | grep receiver
root      8804 30447  0 12:29 pts/2    00:00:00 grep --color=auto receiver

regards

On 6 Aug 2013, at 10:44 AM, ascot.moss@gmail.com wrote:

Hi,

I am doing some stress tests to a pair of PG servers to monitor the pg_stat_replication, during the test, the pg_stat_replication suddenly became empty. 

PG version: 9.2.4
O/S: Ubuntu: 12.04

Since I need to monitor the replication lag from time to time, if the pg_stat_replication becomes empty, the lag calculation in the slave will be wrong.
Please advise if this is a bug.

regards




How to reproduce:

session 1: Master server - try to insert a large number of records into a test table 
postgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY); INSERT INTO test VALUES (generate_series(1,100000000));  EXPLAIN ANALYZE SELECT COUNT(*) FROM test;

2) session 2: Master server - check the byte_lag from time to time
postgres=# SELECT
    sent_offset - (
        replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
FROM (
    SELECT
        client_addr,
        ('x' || lpad(split_part(sent_location,   '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,
        ('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,
        ('x' || lpad(split_part(sent_location,   '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,
        ('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offset
    FROM pg_stat_replication
) AS s;
byte_lag 
----------
2097216
(1 row)

postgres=# SELECT

    sent_offset - (
        replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
FROM (
    SELECT
        client_addr,
        ('x' || lpad(split_part(sent_location,   '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,
        ('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,
        ('x' || lpad(split_part(sent_location,   '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,
        ('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offset
    FROM pg_stat_replication
) AS s;
byte_lag 
----------
(0 rows)


3) session 3: Slave server - 
postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
 log_delay 
-----------
         0
(1 row)

postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
 log_delay 
-----------
  4.873282
(1 row)

.
.
.
postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()  THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;
  log_delay  
-------------
 4070.325329
(1 row)




pgsql-general by date:

Previous
From: "immersive.excel@gmail.com"
Date:
Subject: Re: Seamless replacement to MySQL's GROUP_CONCAT function...
Next
From: "immersive.excel@gmail.com"
Date:
Subject: Re: Seamless replacement to MySQL's GROUP_CONCAT function...