Hi hackers,
While working on [1], I noticed that on master the reproduction for that issue (a toast rewrite not resetting the toast_hash) triggers a failed assertion:
#2 0x0000000000b29fab in ExceptionalCondition (conditionName=0xce6850
"(rb->size >= sz) && (txn->size >= sz)", errorType=0xce5f84
"FailedAssertion", fileName=0xce5fd0 "reorderbuffer.c", lineNumber=3141)
at assert.c:69
#3 0x00000000008ff1fb in ReorderBufferChangeMemoryUpdate (rb=0x11a7a40,
change=0x11c94b8, addition=false) at reorderbuffer.c:3141
#4 0x00000000008fab27 in ReorderBufferReturnChange (rb=0x11a7a40,
change=0x11c94b8, upd_mem=true) at reorderbuffer.c:477
#5 0x0000000000902ec1 in ReorderBufferToastReset (rb=0x11a7a40,
txn=0x11b1998) at reorderbuffer.c:4799
#6 0x00000000008faaa2 in ReorderBufferReturnTXN (rb=0x11a7a40,
txn=0x11b1998) at reorderbuffer.c:448
#7 0x00000000008fc95b in ReorderBufferCleanupTXN (rb=0x11a7a40,
txn=0x11b1998) at reorderbuffer.c:1540
whereas on 12.5, for example, the exact same repro gives:
ERROR: could not open relation with OID 0
The failed assertion happens in the PG_CATCH() section of ReorderBufferProcessTXN().
We enter PG_CATCH() because elog(ERROR, "could not open relation with OID %u",...) is raised in ReorderBufferToastReplace().
But this elog(ERROR) is raised after ReorderBufferChangeMemoryUpdate() has already been called with "addition" set to false.
Because of the elog(ERROR), the matching ReorderBufferChangeMemoryUpdate() call with "addition" set to true at the end of ReorderBufferToastReplace() is never reached, so the memory accounting is left too low.
During cleanup, ReorderBufferChangeMemoryUpdate() is then called again with "addition" set to false (through the ReorderBufferToastReset() call that 4daa140a2f added to ReorderBufferReturnTXN()), and since the accounted size is already too low, that call trips the assertion.
Please find attached a patch proposal that avoids the failed assertion by making sure that, in ReorderBufferToastReplace(), ReorderBufferChangeMemoryUpdate() with "addition" set to false is only called after the point where the elog(ERROR) can be raised.
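To illustrate the ordering problem, here is a minimal standalone sketch. It is not PostgreSQL code and not the attached patch: ToyAccounting and the toy_* functions are hypothetical names, a single counter stands in for rb->size and txn->size, a single change stands in for everything that cleanup later returns, and the elog(ERROR) is modeled as an early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the rb->size / txn->size counters (hypothetical). */
typedef struct ToyAccounting
{
    size_t size;
} ToyAccounting;

/*
 * Toy version of the accounting update. Returns false where the real
 * code would fail Assert((rb->size >= sz) && (txn->size >= sz)).
 */
static bool
toy_memory_update(ToyAccounting *acc, bool addition, size_t sz)
{
    assert(acc != NULL);

    if (addition)
    {
        acc->size += sz;
        return true;
    }
    if (acc->size < sz)
        return false;           /* the real code asserts here */
    acc->size -= sz;
    return true;
}

/*
 * Toy version of the toast replace step. subtract_first selects the
 * current ordering (subtract, then possibly error out); otherwise the
 * subtraction happens only after the error check. relation_missing
 * models the elog(ERROR). Returns false when the "error" is raised.
 */
static bool
toy_toast_replace(ToyAccounting *acc, size_t sz,
                  bool subtract_first, bool relation_missing)
{
    if (subtract_first)
        (void) toy_memory_update(acc, false, sz);

    if (relation_missing)
        return false;           /* models elog(ERROR): nothing below runs */

    if (!subtract_first)
        (void) toy_memory_update(acc, false, sz);

    /* ... the tuple would be rewritten here; same size kept for simplicity ... */
    (void) toy_memory_update(acc, true, sz);
    return true;
}

/*
 * Runs the error path: queue a change, hit the error in the replace step,
 * then let cleanup subtract the change's size. Returns true if the final
 * subtraction is still covered by the accounted size.
 */
static bool
toy_error_path_is_consistent(bool subtract_first)
{
    ToyAccounting acc = {0};
    const size_t sz = 128;

    (void) toy_memory_update(&acc, true, sz);               /* change queued */
    (void) toy_toast_replace(&acc, sz, subtract_first, true);
    return toy_memory_update(&acc, false, sz);              /* cleanup */
}
```

With the current ordering, toy_error_path_is_consistent(true) returns false, which corresponds to the failed assertion; with the subtraction moved after the error check, toy_error_path_is_consistent(false) returns true.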
Adding Amit and Dilip as they are also aware of [1] and have worked on 4daa140a2f.
I am also adding this patch to the commitfest.
Thanks
Bertrand
[1]: https://www.postgresql.org/message-id/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com