Re: logical decoding bug: segfault in ReorderBufferToastReplace() - Mailing list pgsql-bugs

From Jeremy Schneider
Subject Re: logical decoding bug: segfault in ReorderBufferToastReplace()
Msg-id 81182626-e836-061f-8f19-204edac18922@amazon.com
In response to Re: logical decoding bug: segfault in ReorderBufferToastReplace()  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
On 12/11/19 08:35, Andres Freund wrote:
> I think we need to see pg_waldump output for the preceding records. That
> might allow us to see why there's a toast record that's being associated
> with this table, despite there not being a toast table.

Unfortunately the WAL logs are no longer available at this time.  :(

I did a little poking around in the core file and searched the source code, but haven't found anything yet.  Is there any memory structure that would have the preceding/following records cached in memory?  If so, I might be able to extract them from the core dumps.

> Seems like we clearly should add an elog(ERROR) here, so we error out,
> rather than crash.

Done - in the commit that I replied to when I started this thread :)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=69f883fef14a3fc5849126799278abcc43f40f56

> Has there been DDL to this table?

I'm not sure that we will be able to find out at this point.

> Could you print out *change?

This was also in the original email - here it is:

(gdb) print *change
$1 = {lsn = 9430473343416, action = REORDER_BUFFER_CHANGE_INSERT, origin_id = 0, data = {tp = {relnode = {spcNode = 1663, dbNode = 16401,
        relNode = 16428}, clear_toast_afterwards = true, oldtuple = 0x0, newtuple = 0x2b79313f9c68}, truncate = {
      nrelids = 70441758623359, cascade = 44, restart_seqs = 64, relids = 0x0}, msg = {
      prefix = 0x40110000067f <Address 0x40110000067f out of bounds>, message_size = 4294983724, message = 0x0},
    snapshot = 0x40110000067f, command_id = 1663, tuplecid = {node = {spcNode = 1663, dbNode = 16401, relNode = 16428}, tid = {
        ip_blkid = {bi_hi = 1, bi_lo = 0}, ip_posid = 0}, cmin = 0, cmax = 826252392, combocid = 11129}}, node = {prev = 0x30ac918,
    next = 0x30ac9b8}}
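
A note for reading the dump above. gdb prints every member of the `data` union, so only `tp` is meaningful for an INSERT change; the odd-looking values under `truncate`, `msg`, `snapshot` and `command_id` appear to be the same bytes reinterpreted. The sketch below is only an illustration, not PostgreSQL code: the helper names `format_lsn` and `overlay_tp_bytes` are hypothetical, and the layout (little-endian x86_64, three 4-byte OIDs, then a 1-byte bool followed by zeroed padding) is an assumption. It also converts the decimal `lsn` into the hex `X/X` form used by pg_waldump.

```python
import struct


def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN as high/low 32-bit halves in hex (PostgreSQL's X/X form)."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def overlay_tp_bytes(spc_node: int, db_node: int, rel_node: int,
                     clear_toast: bool) -> dict:
    """Reinterpret the first 16 bytes of data.tp as the other union members see them.

    Assumed layout (not taken from the source tree): little-endian,
    spcNode/dbNode/relNode as 4-byte unsigned ints, then a 1-byte bool
    and 3 padding bytes, which are taken to be zero here.
    """
    raw = struct.pack("<IIIB3x", spc_node, db_node, rel_node, int(clear_toast))
    first_word, second_word = struct.unpack("<QQ", raw)
    return {
        # bytes 0..7: shared by truncate.nrelids, msg.prefix and snapshot
        "first_8_bytes": first_word,
        # bytes 8..15: msg.message_size
        "message_size": second_word,
        # bytes 8 and 9 read as single-byte bools
        "cascade": raw[8],
        "restart_seqs": raw[9],
        # bytes 0..3: command_id
        "command_id": struct.unpack_from("<I", raw, 0)[0],
    }


if __name__ == "__main__":
    print(format_lsn(9430473343416))
    for name, value in overlay_tp_bytes(1663, 16401, 16428, True).items():
        print(name, value, hex(value))
```

With the values from the dump, the LSN comes out as 893/B40381B8, and the overlay reproduces 70441758623359 (0x40110000067f), 4294983724, 44, 64 and 1663, matching what gdb printed for the non-`tp` members. That suggests those fields are not a sign of corruption by themselves.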

> Is this version of postgres effectively unmodified in any potentially
> relevant region (snapshot computations, generation of WAL records, ...)?

It's not changed from community code in any relevant regions.  (Also, FYI, this is not Aurora.)

-Jeremy

-- 
Jeremy Schneider
Database Engineer
Amazon Web Services
