BUG #15744: Replication slot peak query throwing error for wrong sequence entry for toast chunk - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #15744: Replication slot peak query throwing error for wrong sequence entry for toast chunk
Msg-id 15744-ca657d03603b8220@postgresql.org
Responses Re: BUG #15744: Replication slot peak query throwing error for wrong sequence entry for toast chunk  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15744
Logged by:          Nitesh Yadav
Email address:      nitesh@datacoral.co
PostgreSQL version: 9.6.3
Operating system:   x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 2
Description:

Hi, 

Postgres Server setup:
  The Postgres server is running as an AWS RDS instance.
  The server version is PostgreSQL 9.6.3 on x86_64-pc-linux-gnu, compiled by gcc
(GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit.
  In the parameter group, rds.logical_replication is set to 1, which
internally sets the following parameters: wal_level, max_wal_senders,
max_replication_slots, max_connections.
  We use the test_decoding module to read the WAL data through the logical
decoding mechanism.

Application setup:
  1. Periodically we run the peek query to retrieve data from the slot, e.g.:
     SELECT * FROM pg_logical_slot_peek_changes('pgldpublic_cdc_slot', NULL,
     NULL, 'include-timestamp', 'on') LIMIT 200000 OFFSET 0;
  2. From the result of the above query, we use the location of the last
     transaction to remove the data from the slot, e.g.:
     SELECT location, xid FROM
     pg_logical_slot_get_changes('pgldpublic_cdc_slot', 'B92/C7394678', NULL,
     'include-timestamp', 'on') LIMIT 1;
  We run steps 1 and 2 in a loop within a given process, reading the data in
chunks of 200K records at a time.
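
For readers who want to reproduce the access pattern, the peek-then-consume loop described above can be sketched roughly as follows. This is only an illustration, not the reporter's code (their application is Node.js with the pg driver; Python is used here for brevity). The names run_query, drain_once and last_commit_index are hypothetical, the %s placeholder style depends on the driver, and the sketch assumes that "location of last transaction" means the location of the last COMMIT row emitted by test_decoding.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (location, xid, data) as returned by the slot functions on 9.6.
Row = Tuple[str, int, str]
# Hypothetical driver hook: takes SQL text and parameters, returns rows.
RunQuery = Callable[[str, Sequence[object]], List[tuple]]

# Same calls as in the report; the placeholder style is an assumption.
PEEK_SQL = (
    "SELECT location, xid, data FROM pg_logical_slot_peek_changes("
    "%s, NULL, NULL, 'include-timestamp', 'on') LIMIT %s OFFSET 0"
)
GET_SQL = (
    "SELECT location, xid FROM pg_logical_slot_get_changes("
    "%s, %s, NULL, 'include-timestamp', 'on') LIMIT 1"
)


def last_commit_index(rows: Sequence[Row]) -> Optional[int]:
    """Index of the last COMMIT row in the batch, or None if there is none."""
    for i in range(len(rows) - 1, -1, -1):
        if rows[i][2].startswith("COMMIT"):
            return i
    return None


def drain_once(run_query: RunQuery, slot: str,
               batch_size: int = 200_000) -> Tuple[List[Row], Optional[str]]:
    """One iteration: peek a batch, then consume up to the last commit seen."""
    rows = run_query(PEEK_SQL, (slot, batch_size))          # step 1: peek
    idx = last_commit_index(rows)
    if idx is None:
        return [], None                                      # nothing complete yet
    upto = rows[idx][0]
    run_query(GET_SQL, (slot, upto))                         # step 2: consume
    # Rows after the last COMMIT stay in the slot and are peeked again later.
    return list(rows[: idx + 1]), upto
```

Passing the query function in as an argument keeps the sketch independent of any particular database driver.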

Behavior reported (Bug)
  We have had a replication slot running successfully, but recently we
encountered the following error:

error: got sequence entry 2 for toast chunk 30954054 instead of seq 0
at Connection.parseE
(/var/task/node_modules/datacoral-utils/node_modules/pg/lib/connection.js:555:11)
at Connection.parseMessage
(/var/task/node_modules/datacoral-utils/node_modules/pg/lib/connection.js:380:19)
at TLSSocket.<anonymous>
(/var/task/node_modules/datacoral-utils/node_modules/pg/lib/connection.js:120:22)
at emitOne (events.js:116:13)
at TLSSocket.emit (events.js:211:7)
at addChunk (_stream_readable.js:263:12)
at readableAddChunk (_stream_readable.js:250:11)
at TLSSocket.Readable.push (_stream_readable.js:208:10)
at TLSWrap.onread (net.js:607:20)

Temporary resolution
    Re-running the query 2-3 times made the error go away, but each failure
causes the whole process to shut down.
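
Until a permanent fix is known, one way to keep the process alive might be to retry the failing query a few times, since re-running it has cleared the error so far. The sketch below is a stopgap under that assumption and does not address the underlying cause. The helper name retry_on_toast_error and the idea of matching on the error message text are hypothetical choices made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Substring of the error message quoted in this report.
TOAST_ERROR_MARKER = "for toast chunk"


def retry_on_toast_error(action: Callable[[], T],
                         attempts: int = 3,
                         delay_seconds: float = 1.0,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call action(); retry only when the toast-chunk error is raised."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[BaseException] = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            if TOAST_ERROR_MARKER not in str(exc):
                raise                      # unrelated errors propagate at once
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)       # brief pause before the next try
    assert last_error is not None
    raise last_error                       # still failing after all attempts
```

The peek query would be wrapped in a zero-argument callable and passed as action; any error other than the toast-chunk one is raised immediately.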

Is there a permanent fix for this issue, or has it been resolved in a later
version of PostgreSQL?

Regards,
Nitesh

