[BUGS] Standby corruption after master is restarted - Mailing list pgsql-bugs
From | Emre Hasegeli |
---|---|
Subject | [BUGS] Standby corruption after master is restarted |
Date | |
Msg-id | CAE2gYzzVZNsGn=-E6grO4sVQs04J02zNKQofQEO8gu8=qCFR+Q@mail.gmail.com Whole thread Raw |
Responses |
Re: [BUGS] Standby corruption after master is restarted
Re: Standby corruption after master is restarted |
List | pgsql-bugs |
One of our standby servers on a testing environment at InnoGames broke today. The master and the other standby server is working fine. I noticed that it happened right after the master had restarted. It looks to me like the WAL file is corrupted. I wanted to check, if it can be a bug. These were on the log file: > 2017-08-24 12:19:51 UTC [20351-2] LOG: replication terminated by primary server > 2017-08-24 12:19:51 UTC [20351-3] DETAIL: End of WAL reached on timeline 1 at 0/9B9F5FF0. > 2017-08-24 12:19:51 UTC [20351-4] FATAL: could not send end-of-streaming message to primary: no COPY in progress > 2017-08-24 12:19:51 UTC [26673-7] FATAL: invalid memory alloc request size 1298694144 > 2017-08-24 12:19:51 UTC [26672-3] LOG: startup process (PID 26673) exited with exit code 1 > 2017-08-24 12:19:51 UTC [26672-4] LOG: terminating any other active server processes I tried to start it again to get more logs: > 2017-08-24 14:07:46 UTC [28861-12698] LOCATION: KnownAssignedXidsRemovePreceding, procarray.c:3738 > 2017-08-24 14:07:46 UTC [28861-12699] DEBUG: 00000: transaction ID wrap limit is 2147484807, limited by database withOID 1 > 2017-08-24 14:07:46 UTC [28861-12700] CONTEXT: xlog redo at 0/9B9F5F80 for XLOG/CHECKPOINT_SHUTDOWN: redo 0/9B9F5F80;tli 1; prev tli 1; fpw true; xid 0:5869245; oid 20756; multi 1; offset 0; oldest xid 1160 in DB 1; oldest multi1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown > 2017-08-24 14:07:46 UTC [28861-12701] LOCATION: SetTransactionIdLimit, varsup.c:347 > 2017-08-24 14:07:46 UTC [28861-12702] DEBUG: 00000: prune KnownAssignedXids to 5869245 > 2017-08-24 14:07:46 UTC [28861-12703] CONTEXT: xlog redo at 0/9B9F5F80 for XLOG/CHECKPOINT_SHUTDOWN: redo 0/9B9F5F80;tli 1; prev tli 1; fpw true; xid 0:5869245; oid 20756; multi 1; offset 0; oldest xid 1160 in DB 1; oldest multi1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown > 2017-08-24 14:07:46 UTC [28861-12704] LOCATION: KnownAssignedXidsRemovePreceding, procarray.c:3738 > 2017-08-24 14:07:46 UTC [28861-12705] FATAL: XX000: invalid memory alloc request size 1298694144 > 2017-08-24 14:07:46 UTC [28861-12706] LOCATION: palloc_extended, mcxt.c:961 These from pg_controldata: > pg_control version number: 960 > Catalog version number: 201608131 > Database system identifier: 6423674571818176107 > Database cluster state: in archive recovery > pg_control last modified: Thu 24 Aug 2017 02:01:42 PM UTC > Latest checkpoint location: 0/9B9C1188 > Prior checkpoint location: 0/9B9C1188 > Latest checkpoint's REDO location: 0/9B9BEC60 > Latest checkpoint's REDO WAL file: 00000001000000000000009B > Latest checkpoint's TimeLineID: 1 > Latest checkpoint's PrevTimeLineID: 1 > Latest checkpoint's full_page_writes: on > Latest checkpoint's NextXID: 0:5868410 > Latest checkpoint's NextOID: 28526 > Latest checkpoint's NextMultiXactId: 1 > Latest checkpoint's NextMultiOffset: 0 > Latest checkpoint's oldestXID: 1160 > Latest checkpoint's oldestXID's DB: 1 > Latest checkpoint's oldestActiveXID: 5868410 > Latest checkpoint's oldestMultiXid: 1 > Latest checkpoint's oldestMulti's DB: 1 > Latest checkpoint's oldestCommitTsXid:0 > Latest checkpoint's newestCommitTsXid:0 > Time of latest checkpoint: Thu 24 Aug 2017 12:02:56 PM UTC > Fake LSN counter for unlogged rels: 0/1 > Minimum recovery ending location: 0/9B9E13C0 > Min recovery ending loc's timeline: 1 > Backup start location: 0/0 > Backup end location: 0/0 > End-of-backup record required: no > wal_level setting: replica > wal_log_hints setting: on > max_connections setting: 400 > max_worker_processes setting: 8 > max_prepared_xacts setting: 0 > max_locks_per_xact setting: 64 > track_commit_timestamp setting: off > Maximum data alignment: 8 > Database block size: 8192 > Blocks per segment of large relation: 131072 > WAL block size: 8192 > Bytes per WAL segment: 16777216 > Maximum length of identifiers: 64 > Maximum columns in an index: 32 > Maximum size of a TOAST chunk: 1996 > Size of a large-object chunk: 2048 > Date/time type storage: 64-bit integers > Float4 argument passing: by value > Float8 argument passing: by value > Data page checksum version: 1 I tried pg_xlogdump and it failed on the spot: > rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 0/9B9F5F80, prev 0/9B9F5F48, desc: CHECKPOINT_SHUTDOWNredo 0/9B9F5F80; tli 1; prev tli 1; fpw true; xid 0:5869245; oid 20756; multi 1; offset 0; oldest xid1160 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown > pg_xlogdump: FATAL: error in WAL record at 0/9B9F5F80: unexpected pageaddr 0/979F6000 in log segment 00000000000000000000009B,offset 10444800 This is from the master at the same place: > rmgr: XLOG len (rec/tot): 106/ 106, tx: 0, lsn: 0/9B9F5F80, prev 0/9B9F5F48, desc: CHECKPOINT_SHUTDOWNredo 0/9B9F5F80; tli 1; prev tli 1; fpw true; xid 0:5869245; oid 20756; multi 1; offset 0; oldest xid1160 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown > rmgr: XLOG len (rec/tot): 49/ 2805, tx: 0, lsn: 0/9B9F5FF0, prev 0/9B9F5F80, desc: FPI_FOR_HINT ,blkref #0: rel 1663/16396/16398 blk 0 FPW > rmgr: Heap len (rec/tot): 54/ 5046, tx: 5869245, lsn: 0/9B9F6B00, prev 0/9B9F5FF0, desc: INSERT off 54,blkref #0: rel 1663/16396/16427 blk 64203 FPW > rmgr: Btree len (rec/tot): 53/ 7961, tx: 5869245, lsn: 0/9B9F7EB8, prev 0/9B9F6B00, desc: INSERT_LEAF off281, blkref #0: rel 1663/16396/16446 blk 21754 FPW > rmgr: Transaction len (rec/tot): 34/ 34, tx: 5869245, lsn: 0/9B9F9DF0, prev 0/9B9F7EB8, desc: COMMIT 2017-08-2412:19:59.560138 UTC I still have the database available. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
pgsql-bugs by date: