Re: pg_stat_replication became empty suddenly - Mailing list pgsql-general
From | ascot.moss@gmail.com |
---|---|
Subject | Re: pg_stat_replication became empty suddenly |
Date | |
Msg-id | 11FDD81A-072D-49DC-9B31-6023E5593023@gmail.com Whole thread Raw |
In response to | Re: pg_stat_replication became empty suddenly ("ascot.moss@gmail.com" <ascot.moss@gmail.com>) |
Responses |
Re: pg_stat_replication became empty suddenly
|
List | pgsql-general |
Hi,
I just tried another round of tests, without running "sync; echo 3 > /proc/sys/vm/drop_caches',
still got the same error, following FATAL errors are found in pg_log (slave), can anyone please advise how to resolve this error?
regards
LOG: entering standby mode
LOG: consistent recovery state reached at 11/42000318
LOG: redo starts at 11/42000280
LOG: invalid record length at 11/42000318
LOG: database system is ready to accept read only connections
LOG: streaming replication successfully connected to primary
FATAL: could not send data to WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
LOG: unexpected pageaddr 10/D2EC0000 in log file 18, segment 5, offset 15466496
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
LOG: streaming replication successfully connected to primary
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed
On 6 Aug 2013, at 12:39 PM, ascot.moss@gmail.com wrote:
Hi,I found the problem should be because I tried to clean RAM cache in the slave by running "sync; echo 3 > /proc/sys/vm/drop_caches'that caused the "receiver" of slave gone away.ps -ef | grep receiverpostgres 6182 6178 0 12:11 ? 00:00:06 postgres: wal receiver process streaming D/FB8DA000sync; echo 3 > /proc/sys/vm/drop_cachesps -ef | grep receiverroot 8804 30447 0 12:29 pts/2 00:00:00 grep --color=auto receiverregardsOn 6 Aug 2013, at 10:44 AM, ascot.moss@gmail.com wrote:Hi,I am doing some stress tests to a pair of PG servers to monitor the pg_stat_replication, during the test, the pg_stat_replication suddenly became empty.PG version: 9.2.4O/S: Ubuntu: 12.04Since I need to monitor the replication lag from time to time, if the pg_stat_replication becomes empty, the lag calculation in the slave will be wrong.Please advise if this is a bug.regardsHow to reproduce:session 1: Master server - try to insert a large number of records into a test tablepostgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY); INSERT INTO test VALUES (generate_series(1,100000000)); EXPLAIN ANALYZE SELECT COUNT(*) FROM test;2) session 2: Master server - check the byte_lag from time to timepostgres=# SELECTsent_offset - (replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lagFROM (SELECTclient_addr,('x' || lpad(split_part(sent_location, '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,('x' || lpad(split_part(sent_location, '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offsetFROM pg_stat_replication) AS s;byte_lag----------2097216(1 row)postgres=# SELECTsent_offset - (replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lagFROM (SELECTclient_addr,('x' || lpad(split_part(sent_location, '/', 1), 8, '0'))::bit(32)::bigint AS sent_xlog,('x' || lpad(split_part(replay_location, '/', 1), 8, '0'))::bit(32)::bigint AS replay_xlog,('x' || lpad(split_part(sent_location, '/', 2), 8, '0'))::bit(32)::bigint AS sent_offset,('x' || lpad(split_part(replay_location, '/', 2), 8, '0'))::bit(32)::bigint AS replay_offsetFROM pg_stat_replication) AS s;byte_lag----------(0 rows)3) session 3: Slave server -postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;log_delay-----------0(1 row)postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;log_delay-----------4.873282(1 row)...postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;log_delay-------------4070.325329(1 row)
pgsql-general by date: