Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15 - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15
Date
Msg-id CAA4eK1+=Ls2b19Nd6g2p_iM41r1UohaN+c8OQQccGhsjX5vnSg@mail.gmail.com
Whole thread Raw
In response to pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15  (Nikhil Benesch <nikhil.benesch@gmail.com>)
Responses Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15
List pgsql-hackers
On Thu, Nov 23, 2023 at 1:10 PM Nikhil Benesch <nikhil.benesch@gmail.com> wrote:
>
> While working on Materialize's streaming logical replication from Postgres [0],
> my colleagues Sean Loiselle and Petros Angelatos (CC'd) discovered today what
> appears to be a correctness bug in pgoutput, introduced in v15.
>
> The problem goes like this. A table with REPLICA IDENTITY FULL and some
> data in it...
>
>     CREATE TABLE t (a int);
>     ALTER TABLE t REPLICA IDENTITY FULL;
>     INSERT INTO t VALUES (1), (2), (3), ...;
>
> ...undergoes a schema change to add a new column with a default:
>
>     ALTER TABLE t ADD COLUMN b bool DEFAULT false NOT NULL;
>
> PostgreSQL is smart and does not rewrite the entire table during the schema
> change. Instead it updates the tuple description to indicate to future readers
> of the table that if `b` is missing, it should be filled in with the default
> value, `false`.
>
> Unfortunately, since v15, pgoutput mishandles missing attributes. If a
> downstream server is subscribed to changes from t via the pgoutput plugin, when
> a row with a missing attribute is updated, e.g.:
>
>     UPDATE t SET a = 2 WHERE a = 1
>
> pgoutput will incorrectly report b's value as NULL in the old tuple, rather than
> false.
>

Thanks, I could reproduce this behavior. I'll look into your patch.

> Using the same example:
>
>     old: a=1, b=NULL
>     new: a=2, b=true
>
> The subscriber will ignore the update (as it has no row with values
> a=1, b=NULL), and thus the subscriber's copy of `t` will become out of sync with
> the publisher's.
>
> I bisected the problem to 52e4f0cd4 [1], which introduced row filtering for
> publications. The problem appears to be the use of CreateTupleDescCopy where
> CreateTupleDescCopyConstr is required, as the former drops the constraints
> in the tuple description (specifically, the default value constraint) on the
> floor.
>
> I've attached a patch which both fixes the issue and includes a test. I've
> verified that the test fails against the current master and passes against
> the patched version.
>
> I'm relatively unfamiliar with the project norms here, but assuming the patch is
> acceptable, this strikes me as important enough to warrant a backport to both
> v15 and v16.
>

Right.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Peter Eisentraut
Date:
Subject: Re: SQL:2011 application time