Thread: Incorrect messages emitted from pgoutput when using column lists

Incorrect messages emitted from pgoutput when using column lists

From

Gunnar Morling

Date:

24 November 2022, 20:13:59

Hi,

I'd like to test the column list feature for logical replication added
in Postgres 15 [1], and it seems I am getting unexpected events for
UPDATEs. Here are steps for reproducing:

CREATE TABLE test (
id SERIAL NOT NULL PRIMARY KEY,
a VARCHAR(255) NOT NULL,
b VARCHAR(255) NOT NULL
);

CREATE PUBLICATION test_publication FOR TABLE test (id, a);
CREATE UNIQUE INDEX test_publication_idx ON test (id, a);
ALTER TABLE test REPLICA IDENTITY USING INDEX test_publication_idx;

SELECT * FROM pg_create_logical_replication_slot('test_slot', 'pgoutput');

INSERT INTO test values (default, 'aaa', 'bbb');
UPDATE test set b='bbbbbb' where id = 1;
UPDATE test set a='aaaaaa' where id = 1;

Then consume the events using any client, for instance Debezium, or a
bespoke consumer application. What I observed is this:

- The 'R' event sent for the table looks as expected, i.e. without the
"b" column
- The event sent for the INSERT looks as expected
- The event for the first UPDATE (column not part of the column list)
looks as expected
- The event for the second UPDATE (column is part of the column list)
looks wrong; specifically, its TupleData part: it has a column count
of 3 (rather than 2), with the 'n' byte identifying the value of the
third column (b) as a null value

I would expect that I always get events which reflect the structure of
the previously sent `R` event. If that's not the case, I'm not sure
how I would interpret the subsequent events. So I suppose this is a
bug?

Thanks for any advice and help,

--Gunnar

[1] http://amitkapila16.blogspot.com/2022/11/logical-replication-improvements-in.html

RE: Incorrect messages emitted from pgoutput when using column lists

From

"houzj.fnst@fujitsu.com"

Date:

25 November 2022, 05:46:46

On Friday, November 25, 2022 1:14 AM Gunnar Morling <gunnar.morling@googlemail.com> wrote:

Hi,

> I'd like to test the column list feature for logical replication added
> in Postgres 15 [1], and it seems I am getting unexpected events for
> UPDATEs. Here are steps for reproducing:
> 
> CREATE TABLE test (
> id SERIAL NOT NULL PRIMARY KEY,
> a VARCHAR(255) NOT NULL,
> b VARCHAR(255) NOT NULL
> );
> 
> CREATE PUBLICATION test_publication FOR TABLE test (id, a);
> CREATE UNIQUE INDEX test_publication_idx ON test (id, a);
> ALTER TABLE test REPLICA IDENTITY USING INDEX test_publication_idx;
> 
> SELECT * FROM pg_create_logical_replication_slot('test_slot', 'pgoutput');
> 
> INSERT INTO test values (default, 'aaa', 'bbb');
> UPDATE test set b='bbbbbb' where id = 1;
> UPDATE test set a='aaaaaa' where id = 1;
> 
> Then consume the events using any client, for instance Debezium, or a
> bespoke consumer application. What I observed is this:
> 
> - The 'R' event sent for the table looks as expected, i.e. without the
> "b" column
> - The event sent for the INSERT looks as expected
> - The event for the first UPDATE (column not part of the column list)
> looks as expected
> - The event for the second UPDATE (column is part of the column list)
> looks wrong; specifically, its TupleData part: it has a column count
> of 3 (rather than 2), with the 'n' byte identifying the value of the
> third column (b) as a null value
> 
> I would expect that I always get events which reflect the structure of
> the previously sent `R` event. If that's not the case, I'm not sure
> how I would interpret the subsequent events. So I suppose this is a
> bug?
> 
> Thanks for any advice and help,
> 

Thanks for reporting.

I think the reason is that we didn't filter the column when sending the old
tuple in pgoutput. We thought that the old tuple won't include columns that not
in RI, but it seems it will still be null values for such columns in the old
tuple. So, I think we'd better filter the column for old tuple as well.

Attach a small patch which fixes this.

Best regards,
Hou zj

Attachment

0001-fix-column-list-of-old-tuple.patch

Re: Incorrect messages emitted from pgoutput when using column lists

From

Michael Paquier

Date:

25 November 2022, 05:53:19

On Fri, Nov 25, 2022 at 02:46:46AM +0000, houzj.fnst@fujitsu.com wrote:
> I think the reason is that we didn't filter the column when sending the old
> tuple in pgoutput. We thought that the old tuple won't include columns that not
> in RI, but it seems it will still be null values for such columns in the old
> tuple. So, I think we'd better filter the column for old tuple as well.

Isn't that something where you should have at least some coverage with
one or more regression tests?
--
Michael

Attachment

signature.asc

Re: Incorrect messages emitted from pgoutput when using column lists

From

Amit Kapila

Date:

25 November 2022, 13:08:42

On Fri, Nov 25, 2022 at 8:16 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> I think the reason is that we didn't filter the column when sending the old
> tuple in pgoutput. We thought that the old tuple won't include columns that not
> in RI, but it seems it will still be null values for such columns in the old
> tuple.
>

Yes, that is correct. We do fill null values for non-replica identity
columns in the old tuple. See ExtractReplicaIdentity.

> So, I think we'd better filter the column for old tuple as well.
>

Your fix looks correct to me though I haven't tested it yet.

Can we think of writing a test case using
pg_logical_slot_peek_binary_changes() similar to what we have in
020_messages.pl?

-- 
With Regards,
Amit Kapila.

RE: Incorrect messages emitted from pgoutput when using column lists

From

"houzj.fnst@fujitsu.com"

Date:

27 November 2022, 15:37:07

On Friday, November 25, 2022 6:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Fri, Nov 25, 2022 at 8:16 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > I think the reason is that we didn't filter the column when sending
> > the old tuple in pgoutput. We thought that the old tuple won't include
> > columns that not in RI, but it seems it will still be null values for
> > such columns in the old tuple.
> >
> 
> Yes, that is correct. We do fill null values for non-replica identity columns in the
> old tuple. See ExtractReplicaIdentity.
> 
> > So, I think we'd better filter the column for old tuple as well.
> >
> 
> Your fix looks correct to me though I haven't tested it yet.
> 
> Can we think of writing a test case using
> pg_logical_slot_peek_binary_changes() similar to what we have in
> 020_messages.pl?

Yeah, I think it works.

Besides, I find that we don't filter the column for DELETE as well because
DELETE change only have old tuple. This looks like a similar problem as UPDATE,
and I suppose we need to fix them all. Attach the new version patch which added
the testcase as suggested and fix both DELETE and UPDATE cases.

Best regards,
Hou zj

Attachment

v2-0001-fix-column-list-for-old-tuple.patch

RE: Incorrect messages emitted from pgoutput when using column lists

From

"houzj.fnst@fujitsu.com"

Date:

27 November 2022, 15:42:51

On Friday, November 25, 2022 10:53 AM Michael Paquier <michael@paquier.xyz>
>
> On Fri, Nov 25, 2022 at 02:46:46AM +0000, houzj.fnst@fujitsu.com wrote:
> > I think the reason is that we didn't filter the column when sending
> > the old tuple in pgoutput. We thought that the old tuple won't include
> > columns that not in RI, but it seems it will still be null values for
> > such columns in the old tuple. So, I think we'd better filter the column for old
> tuple as well.
>
> Isn't that something where you should have at least some coverage with one or
> more regression tests?

Yes, I agree. Since the UPDATE will succeed even if old tuple includes columns
not in the column list, it took me a while to think of another test approach.
In the new version patch[1], I try to use pg_logical_slot_peek_binary_changes() to
get the binary data of UPDATE and test the number of columns in the old tuple.

[1]
https://www.postgresql.org/message-id/OS0PR01MB571658F178388836D116217A94109%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best regards,
Hou zj

Re: Incorrect messages emitted from pgoutput when using column lists

From

Amit Kapila

Date:

02 December 2022, 12:20:44

On Sun, Nov 27, 2022 at 6:07 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Friday, November 25, 2022 6:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Nov 25, 2022 at 8:16 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > I think the reason is that we didn't filter the column when sending
> > > the old tuple in pgoutput. We thought that the old tuple won't include
> > > columns that not in RI, but it seems it will still be null values for
> > > such columns in the old tuple.
> > >
> >
> > Yes, that is correct. We do fill null values for non-replica identity columns in the
> > old tuple. See ExtractReplicaIdentity.
> >
> > > So, I think we'd better filter the column for old tuple as well.
> > >
> >
> > Your fix looks correct to me though I haven't tested it yet.
> >
> > Can we think of writing a test case using
> > pg_logical_slot_peek_binary_changes() similar to what we have in
> > 020_messages.pl?
>
> Yeah, I think it works.
>
> Besides, I find that we don't filter the column for DELETE as well because
> DELETE change only have old tuple. This looks like a similar problem as UPDATE,
> and I suppose we need to fix them all. Attach the new version patch which added
> the testcase as suggested and fix both DELETE and UPDATE cases.
>

LGTM, so pushed!

-- 
With Regards,
Amit Kapila.