Bug in logical decoding with DDL and subtransactions - Mailing list pgsql-hackers

From: Mark Dilger
Subject: Bug in logical decoding with DDL and subtransactions
Msg-id: CAHgHdKu5e3XY5e90Tuaxq_R4WrKxSV734Q+Lwo5y39Omp2A-Gg@mail.gmail.com
List: pgsql-hackers

There is a bug in logical decoding involving CREATE and subtransactions.  The sequence is: a CREATE statement inserts a row into a catalog during a subtransaction; that subtransaction is rolled back to its savepoint; other activity triggers page pruning on the catalog page; and the original transaction (perhaps in a new subtransaction) then performs another CREATE operation.  The new row can then be inserted into the same catalog at the same TID as the rolled-back row.  During logical decoding, this can trip an assertion; in non-assert builds, it could silently corrupt the decoder's catalog visibility, which could cause it to produce incorrect output (wrong column mappings, etc.).
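
To make the sequence concrete, here is a toy model in Python.  It is not PostgreSQL code and does not reproduce the bug; it only assumes that the decoder keeps one cmin/cmax entry per (relation, TID) and expects a repeated entry for the same key to carry the same cmin.  All names and numbers in it (TupleCidMap, replay, the relation id, the TID, the command ids) are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical, simplified types; not PostgreSQL's actual structures.
Tid = Tuple[int, int]   # (block number, offset number)
Key = Tuple[int, Tid]   # (relation identifier, TID)


@dataclass
class CidEntry:
    cmin: int
    cmax: Optional[int]  # None models an unset command id


class TupleCidMap:
    """Toy per-transaction map from (relation, TID) to command ids."""

    def __init__(self, strict: bool) -> None:
        self.strict = strict  # True models an assert-enabled build
        self.entries: Dict[Key, CidEntry] = {}

    def record(self, key: Key, cmin: int, cmax: Optional[int] = None) -> None:
        ent = self.entries.get(key)
        if ent is None:
            self.entries[key] = CidEntry(cmin, cmax)
            return
        # Assumption under test: a second record for the same key describes
        # the same tuple, so its cmin must match the stored one.
        if ent.cmin != cmin and self.strict:
            raise AssertionError(
                f"cmin mismatch at {key}: {ent.cmin} != {cmin}")
        # Without the check, the stale cmin is silently kept.
        if cmax is not None:
            ent.cmax = cmax


def replay(strict: bool) -> int:
    """Replay the scenario and return the cmin the map ends up with."""
    cids = TupleCidMap(strict)
    catalog_row: Key = (16384, (0, 7))  # made-up relation id and TID
    # First CREATE, inside a subtransaction that is later rolled back.
    cids.record(catalog_row, cmin=1)
    # Rollback to savepoint and page pruning free the slot (not modeled).
    # Second CREATE in the same top-level transaction reuses the TID.
    cids.record(catalog_row, cmin=3)
    return cids.entries[catalog_row].cmin
```

In this model, replay(strict=True) raises AssertionError, and replay(strict=False) returns the stale cmin of the rolled-back row rather than the cmin of the live one.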

This bug appears to go all the way back to 9.4, where logical decoding was introduced.

Arseny Sher hit the cmax variant of this exact bug, and Alvaro fixed that variant (commit 350cdcd5e6d, 2019), but appears not to have seen that the same danger exists for cmin, instead writing the comment, "if so it must have the same cmin."

Creating a short deterministic reproducer is difficult, because the catalog table must be set up such that page pruning will happen.  (I have a 24K-line reproducer, which seems too big to attach to the list.)  A fuzz tester is attached instead.

The attached patch fixes the problem without addressing the underlying architectural shortcut the code is taking.  The comment in xl_heap_new_cid ("store toplevel xid so we don't have to merge cids from different transactions") indicates that the shortcut is an intentional design choice.  A more complete fix could also be considered, but is not included here.

--

Mark Dilger
