Thread: "Missing" column in Postgres logical replication update message

"Missing" column in Postgres logical replication update message

From
Kevin Martin
Date:

We have a replication slot set up on a database in Postgres 12.8. For one of the tables, when a row is created, we see a series of records come through - an INSERT followed by a handful of UPDATEs. All of these messages in the WAL files show all columns, except for the last UPDATE message, which is missing one of the columns. (The column in question is a JSONB field, and the data is not overly large - less than 1000 chars.)  We think this is causing the data to come into our data lake (via Stitch) with that column as NULL.

My understanding is that all INSERT and UPDATE messages written to the replication logs are supposed to include all columns. But I can't find a definitive answer on that.

So, my first question is: Is it normal / expected for UPDATE messages in the replication logs to exclude any columns in the table?

And, of course, if that is unexpected behavior, I'd love to hear any thoughts on what may cause it.

Thanks.

-Kevin


FYI.  I have this question posted also on StackOverflow: https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message

Re: "Missing" column in Postgres logical replication update message

From
Adrian Klaver
Date:
On 8/3/22 06:50, Kevin Martin wrote:
> We have a replication slot set up on a database in Postgres 12.8. For 
> one of the tables, when a row is created, we see a series of records 
> come through - an INSERT followed by a handful of UPDATEs. All of these 
> messages in the WAL files show all columns, except for the last UPDATE 
> message, which is missing one of the columns. (The column in question is 
> a JSONB field, and the data is not overly large - less than 1000 
> chars.)  We think this is causing the data to come into our data lake 
> (via Stitch) with that column as NULL.

See the messages where and/or how?

What is the UPDATE command that is being given on primary?

Is the data showing up in the replica table?

> 
> My understanding is that all INSERT and UPDATE messages written to the 
> replication logs are supposed to include all columns. But I can't find a 
> definitive answer on that.
> 
> So, my first question is: Is it normal / expected for UPDATE messages in 
> the replication logs to exclude any columns in the table?
> 
> And, of course, if that is unexpected behavior, I'd love to hear any 
> thoughts on what may cause it.
> 
> Thanks.
> 
> -Kevin
> 
> 
> FYI.  I have this question posted also on StackOverflow: 
> https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message 
> <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message>
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: "Missing" column in Postgres logical replication update message

From
Adrian Klaver
Date:
On 8/3/22 08:09, Kevin Martin wrote:

Please reply to list also
Ccing list

I don't have answers to below at the moment, just getting thread back to 
list so others who might have answers can see it.

> Thanks for the reply, Adrian.
> 
> We're looking at the messages in the replication slot using 
> pg_logical_slot_peek_changes in the source db.
> 
> In those messages, we see some UPDATEs that do not include one of the 
> columns in the table.
> 
> I'm not sure what statements are producing the updates to the table 
> from the application, if that is what you are asking.  Does the update 
> against the database have to include all columns in order for the 
> replication log to have them all?
> 
> I thought that any UPDATE message in the replication logs is designed to 
> include all values for all columns.  Is that correct?
> 
> The data is not showing up in the replica table.  In this case, though, 
> the replication slot is being queried by Stitch to produce a copy in 
> Snowflake.  That is probably somewhat irrelevant to the current 
> question, though, since we appear to be seeing missing data in the 
> replication slot messages on the source.
> 
> I'm on the receiving side of this issue and am working with my DBA on 
> trying to figure it out, so I'm not fully versed in how all of this 
> works.  I can try to get more information if it helps.  I have seen the 
> output from the peek function, and there are clearly some UPDATE 
> messages that have the column / values in question and some that do not.
> 
> On Wed, Aug 3, 2022 at 10:47 AM Adrian Klaver <adrian.klaver@aklaver.com 
> <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 8/3/22 06:50, Kevin Martin wrote:
>      > We have a replication slot set up on a database in Postgres 12.8.
>     For
>      > one of the tables, when a row is created, we see a series of records
>      > come through - an INSERT followed by a handful of UPDATEs. All of
>     these
>      > messages in the WAL files show all columns, except for the last
>     UPDATE
>      > message, which is missing one of the columns. (The column in
>     question is
>      > a JSONB field, and the data is not overly large - less than 1000
>      > chars.)  We think this is causing the data to come into our data
>     lake
>      > (via Stitch) with that column as NULL.
> 
>     See the messages where and/or how?
> 
>     What is the UPDATE command that is being given on primary?
> 
>     Is the data showing up in the replica table?
> 
>      >
>      > My understanding is that all INSERT and UPDATE messages written
>     to the
>      > replication logs are supposed to include all columns. But I can't
>     find a
>      > definitive answer on that.
>      >
>      > So, my first question is: Is it normal / expected for UPDATE
>     messages in
>      > the replication logs to exclude any columns in the table?
>      >
>      > And, of course, if that is unexpected behavior, I'd love to hear any
>      > thoughts on what may cause it.
>      >
>      > Thanks.
>      >
>      > -Kevin
>      >
>      >
>      > FYI.  I have this question posted also on StackOverflow:
>      >
>     https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message>
> 
>      >
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message>>
>      >
> 
> 
>     -- 
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: "Missing" column in Postgres logical replication update message

From
Kevin Martin
Date:
My apologies.  Read that in the mailing list instructions, and still forgot.  Thanks for the reminder.

On Wed, Aug 3, 2022 at 11:12 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 8/3/22 08:09, Kevin Martin wrote:

Please reply to list also
Ccing list

I don't have answers to below at the moment, just getting thread back to
list so others who might have answers can see it.

> Thanks for the reply, Adrian.
>
> We're looking at the messages in the replication slot using
> pg_logical_slot_peek_changes in the source db.
>
> In those messages, we see some UPDATEs that do not include one of the
> columns in the table.
>
> I'm not sure what statements are producing the updates to the table
> from the application, if that is what you are asking.  Does the update
> against the database have to include all columns in order for the
> replication log to have them all?
>
> I thought that any UPDATE message in the replication logs is designed to
> include all values for all columns.  Is that correct?
>
> The data is not showing up in the replica table.  In this case, though,
> the replication slot is being queried by Stitch to produce a copy in
> Snowflake.  That is probably somewhat irrelevant to the current
> question, though, since we appear to be seeing missing data in the
> replication slot messages on the source.
>
> I'm on the receiving side of this issue and am working with my DBA on
> trying to figure it out, so I'm not fully versed in how all of this
> works.  I can try to get more information if it helps.  I have seen the
> output from the peek function, and there are clearly some UPDATE
> messages that have the column / values in question and some that do not.
>
> On Wed, Aug 3, 2022 at 10:47 AM Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 8/3/22 06:50, Kevin Martin wrote:
>      > We have a replication slot set up on a database in Postgres 12.8.
>     For
>      > one of the tables, when a row is created, we see a series of records
>      > come through - an INSERT followed by a handful of UPDATEs. All of
>     these
>      > messages in the WAL files show all columns, except for the last
>     UPDATE
>      > message, which is missing one of the columns. (The column in
>     question is
>      > a JSONB field, and the data is not overly large - less than 1000
>      > chars.)  We think this is causing the data to come into our data
>     lake
>      > (via Stitch) with that column as NULL.
>
>     See the messages where and/or how?
>
>     What is the UPDATE command that is being given on primary?
>
>     Is the data showing up in the replica table?
>
>      >
>      > My understanding is that all INSERT and UPDATE messages written
>     to the
>      > replication logs are supposed to include all columns. But I can't
>     find a
>      > definitive answer on that.
>      >
>      > So, my first question is: Is it normal / expected for UPDATE
>     messages in
>      > the replication logs to exclude any columns in the table?
>      >
>      > And, of course, if that is unexpected behavior, I'd love to hear any
>      > thoughts on what may cause it.
>      >
>      > Thanks.
>      >
>      > -Kevin
>      >
>      >
>      > FYI.  I have this question posted also on StackOverflow:
>      >
>     https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message>
>
>      >
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message
>     <https://stackoverflow.com/questions/73222107/missing-column-in-postgres-logical-replication-update-message>>
>      >
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: "Missing" column in Postgres logical replication update message

From
Tom Lane
Date:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> On 8/3/22 08:09, Kevin Martin wrote:
>> I thought that any UPDATE message in the replication logs is designed to
>> include all values for all columns.  Is that correct?

Buried deep in the wire protocol specification is this bit:

----
The following message parts are shared by the above messages.

    TupleData

    Int16
        Number of columns.

    Next, one of the following submessages appears for each column (except generated columns):

    ...

    Byte1('u')
        Identifies unchanged TOASTed value (the actual value is not sent).
----

So that right there is two cases where we omit data for a column.
I suspect the "unchanged toasted data" case is what matters for
your purposes.  Maybe you've found some code that fails to implement
that correctly?

            regards, tom lane



Re: "Missing" column in Postgres logical replication update message

From
Kevin Martin
Date:
Thanks for the response, Tom.  I think that's exactly what's going on here.

On Wed, Aug 3, 2022 at 11:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> On 8/3/22 08:09, Kevin Martin wrote:
>> I thought that any UPDATE message in the replication logs is designed to
>> include all values for all columns.  Is that correct?

Buried deep in the wire protocol specification is this bit:

----
The following message parts are shared by the above messages.

    TupleData

    Int16
        Number of columns.

    Next, one of the following submessages appears for each column (except generated columns):

    ...

    Byte1('u')
        Identifies unchanged TOASTed value (the actual value is not sent).
----

So that right there is two cases where we omit data for a column.
I suspect the "unchanged toasted data" case is what matters for
your purposes.  Maybe you've found some code that fails to implement
that correctly?

                        regards, tom lane