Thread: confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.

Hi All,

The logically decoded data are sent to the logical subscriber at the time of transaction commit, assuming that the data is small. However, before the transaction commit is performed, the LSN representing the data that is yet to be received by the logical subscriber appears in the confirmed_flush_lsn column of pg_replication_slots catalog. Isn't the information seen in the confirmed_flush_lsn column while the transaction is in progress incorrect ? esp considering the description given in the pg doc for this column.

Actually, while the transaction is running, the publisher keeps on sending keepalive messages containing LSN of the last decoded data saved in reorder buffer and the subscriber responds with the same LSN as the last received LSN which is then updated as confirmed_flush_lsn by the publisher. I think the LSN that we are sending with the keepalive message should be the one representing the transaction begin message, not the LSN of the last decoded data which is yet to be sent. Please let me know if I am missing something here.

--
With Regards,
Ashutosh Sharma.
On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi All,
>
> The logically decoded data are sent to the logical subscriber at the time of transaction commit, assuming that the
datais small. However, before the transaction commit is performed, the LSN representing the data that is yet to be
receivedby the logical subscriber appears in the confirmed_flush_lsn column of pg_replication_slots catalog. Isn't the
informationseen in the confirmed_flush_lsn column while the transaction is in progress incorrect ? esp considering the
descriptiongiven in the pg doc for this column. 
>
> Actually, while the transaction is running, the publisher keeps on sending keepalive messages containing LSN of the
lastdecoded data saved in reorder buffer and the subscriber responds with the same LSN as the last received LSN which
isthen updated as confirmed_flush_lsn by the publisher. I think the LSN that we are sending with the keepalive message
shouldbe the one representing the transaction begin message, not the LSN of the last decoded data which is yet to be
sent.Please let me know if I am missing something here. 

The transactions with commit lsn < confirmed_flush_lsn are confirmed
to be received (and applied by the subscriber. Setting LSN
corresponding to a WAL record within a transaction in progress as
confirmed_flush should be ok. Since the transactions are interleaved
in WAL stream, it's quite possible that LSNs of some WAL records of an
inflight transaction are lesser than commit LSN of some another
transaction. So setting commit LSN of another effectively same as
setting it to any of the LSNs of any previous WAL record irrespective
of the transaction that it belongs to.

In case WAL sender restarts with confirmed_flush_lsn set to LSN of a
WAL record of an inflight transaction, the whole inflight transaction
will be sent again since its commit LSN is higher than
confirmed_flush_lsn.

I think logical replication has inherited this from physical
replication. A useful effect of this is to reduce WAL retention by
moving restart_lsn based on the latest confirmed_flush_lsn.

--
Best Wishes,
Ashutosh Bapat



On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Hi All,
> >
> > The logically decoded data are sent to the logical subscriber at the time of transaction commit, assuming that the
datais small. However, before the transaction commit is performed, the LSN representing the data that is yet to be
receivedby the logical subscriber appears in the confirmed_flush_lsn column of pg_replication_slots catalog. Isn't the
informationseen in the confirmed_flush_lsn column while the transaction is in progress incorrect ? esp considering the
descriptiongiven in the pg doc for this column. 
> >
> > Actually, while the transaction is running, the publisher keeps on sending keepalive messages containing LSN of the
lastdecoded data saved in reorder buffer and the subscriber responds with the same LSN as the last received LSN which
isthen updated as confirmed_flush_lsn by the publisher. I think the LSN that we are sending with the keepalive message
shouldbe the one representing the transaction begin message, not the LSN of the last decoded data which is yet to be
sent.Please let me know if I am missing something here. 
>
> The transactions with commit lsn < confirmed_flush_lsn are confirmed
> to be received (and applied by the subscriber. Setting LSN
> corresponding to a WAL record within a transaction in progress as
> confirmed_flush should be ok. Since the transactions are interleaved
> in WAL stream, it's quite possible that LSNs of some WAL records of an
> inflight transaction are lesser than commit LSN of some another
> transaction. So setting commit LSN of another effectively same as
> setting it to any of the LSNs of any previous WAL record irrespective
> of the transaction that it belongs to.

Thank you Ashutosh for the explanation. I still feel that the
documentation on confirmed_flush_lsn needs some improvement. It
actually claims that all the data before the confirmed_flush_lsn has
been received by the logical subscriber, but that's not the case. It
actually means that all the data belonging to the transactions with
commit lsn < confirmed_flush_lsn has been received and applied by the
subscriber. So setting confirmed_flush_lsn to the lsn of wal records
generated by running transaction might make people think that the wal
records belonging to previous data of the same running transaction has
already been received and applied by the subscriber node, but that's
not true.

--
With Regards,
Ashutosh Sharma.



On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > Hi All,
> > >
> > > The logically decoded data are sent to the logical subscriber at the time of transaction commit, assuming that
thedata is small. However, before the transaction commit is performed, the LSN representing the data that is yet to be
receivedby the logical subscriber appears in the confirmed_flush_lsn column of pg_replication_slots catalog. Isn't the
informationseen in the confirmed_flush_lsn column while the transaction is in progress incorrect ? esp considering the
descriptiongiven in the pg doc for this column. 
> > >
> > > Actually, while the transaction is running, the publisher keeps on sending keepalive messages containing LSN of
thelast decoded data saved in reorder buffer and the subscriber responds with the same LSN as the last received LSN
whichis then updated as confirmed_flush_lsn by the publisher. I think the LSN that we are sending with the keepalive
messageshould be the one representing the transaction begin message, not the LSN of the last decoded data which is yet
tobe sent. Please let me know if I am missing something here. 
> >
> > The transactions with commit lsn < confirmed_flush_lsn are confirmed
> > to be received (and applied by the subscriber. Setting LSN
> > corresponding to a WAL record within a transaction in progress as
> > confirmed_flush should be ok. Since the transactions are interleaved
> > in WAL stream, it's quite possible that LSNs of some WAL records of an
> > inflight transaction are lesser than commit LSN of some another
> > transaction. So setting commit LSN of another effectively same as
> > setting it to any of the LSNs of any previous WAL record irrespective
> > of the transaction that it belongs to.
>
> Thank you Ashutosh for the explanation. I still feel that the
> documentation on confirmed_flush_lsn needs some improvement. It
> actually claims that all the data before the confirmed_flush_lsn has
> been received by the logical subscriber, but that's not the case. It
> actually means that all the data belonging to the transactions with
> commit lsn < confirmed_flush_lsn has been received and applied by the
> subscriber. So setting confirmed_flush_lsn to the lsn of wal records
> generated by running transaction might make people think that the wal
> records belonging to previous data of the same running transaction has
> already been received and applied by the subscriber node, but that's
> not true.
>

Can you please point to the documentation.

It's true that it needs to be clarified. But what you are saying may
not be entirely true in case of streamed transaction. In that case we
might send logically decoded changes of an ongoing transaction as
well. They may even get applied but not necessarily committed. It's a
bit complicated. :)

--
Best Wishes,
Ashutosh Bapat



On Fri, Sep 9, 2022 at 5:36 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Can you please point to the documentation.
>

AFAIU there is just one documentation. Here is the link for it:

https://www.postgresql.org/docs/current/view-pg-replication-slots.html

> It's true that it needs to be clarified. But what you are saying may
> not be entirely true in case of streamed transaction. In that case we
> might send logically decoded changes of an ongoing transaction as
> well. They may even get applied but not necessarily committed. It's a
> bit complicated. :)
>

This can happen in case of big transactions. That's the reason I
mentioned that the transaction has a small set of data which is not
yet committed but the confirmed_flush_lsn says it has already reached
the logical subscriber.

And.. lastly sorry for the delayed response. I was not well and
couldn't access email for quite some days. The poor dengue had almost
killed me :(

--
With Regards,
Ashutosh Sharma.



On Mon, Sep 19, 2022 at 1:43 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Fri, Sep 9, 2022 at 5:36 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > Can you please point to the documentation.
> >
>
> AFAIU there is just one documentation. Here is the link for it:
>
> https://www.postgresql.org/docs/current/view-pg-replication-slots.html

Thanks. Description of confirmed_flush_lsn is "The address (LSN) up to
which the logical slot's consumer has confirmed receiving data. Data
older than this is not available anymore. NULL for physical slots."
The second sentence is misleading. AFAIU, it really should be "Data
corresponding to the transactions committed before this LSN is not
available anymore". WAL before restart_lsn is likely to be removed but
WAL with LSN higher than restart_lsn is preserved. This correction
makes more sense because of the third sentence.

>
> And.. lastly sorry for the delayed response. I was not well and
> couldn't access email for quite some days. The poor dengue had almost
> killed me :(

Dengue had almost killed me also. Take care.

-- 
Best Wishes,
Ashutosh Bapat



On Mon, Sep 19, 2022 at 5:24 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 1:43 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Fri, Sep 9, 2022 at 5:36 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> > > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > > >
> > > > > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > Can you please point to the documentation.
> > >
> >
> > AFAIU there is just one documentation. Here is the link for it:
> >
> > https://www.postgresql.org/docs/current/view-pg-replication-slots.html
>
> Thanks. Description of confirmed_flush_lsn is "The address (LSN) up to
> which the logical slot's consumer has confirmed receiving data. Data
> older than this is not available anymore. NULL for physical slots."
> The second sentence is misleading. AFAIU, it really should be "Data
> corresponding to the transactions committed before this LSN is not
> available anymore". WAL before restart_lsn is likely to be removed but
> WAL with LSN higher than restart_lsn is preserved. This correction
> makes more sense because of the third sentence.
>

Thanks for the clarification. Attached is the patch with the changes.
Please have a look.

--
With Regards,
Ashutosh Sharma.

Attachment
On Mon, Sep 19, 2022 at 8:09 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 5:24 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Mon, Sep 19, 2022 at 1:43 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > On Fri, Sep 9, 2022 at 5:36 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > > >
> > > > > On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> > > > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > > Can you please point to the documentation.
> > > >
> > >
> > > AFAIU there is just one documentation. Here is the link for it:
> > >
> > > https://www.postgresql.org/docs/current/view-pg-replication-slots.html
> >
> > Thanks. Description of confirmed_flush_lsn is "The address (LSN) up to
> > which the logical slot's consumer has confirmed receiving data. Data
> > older than this is not available anymore. NULL for physical slots."
> > The second sentence is misleading. AFAIU, it really should be "Data
> > corresponding to the transactions committed before this LSN is not
> > available anymore". WAL before restart_lsn is likely to be removed but
> > WAL with LSN higher than restart_lsn is preserved. This correction
> > makes more sense because of the third sentence.
> >
>
> Thanks for the clarification. Attached is the patch with the changes.
> Please have a look.
>
Looks good to me. Do you want to track this through commitfest app?



-- 
Best Wishes,
Ashutosh Bapat



On Wed, Sep 21, 2022 at 7:21 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 8:09 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Mon, Sep 19, 2022 at 5:24 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Mon, Sep 19, 2022 at 1:43 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > >
> > > > On Fri, Sep 9, 2022 at 5:36 PM Ashutosh Bapat
> > > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > > >
> > > > > On Thu, Sep 8, 2022 at 8:32 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
> > > > > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > > > > Can you please point to the documentation.
> > > > >
> > > >
> > > > AFAIU there is just one documentation. Here is the link for it:
> > > >
> > > > https://www.postgresql.org/docs/current/view-pg-replication-slots.html
> > >
> > > Thanks. Description of confirmed_flush_lsn is "The address (LSN) up to
> > > which the logical slot's consumer has confirmed receiving data. Data
> > > older than this is not available anymore. NULL for physical slots."
> > > The second sentence is misleading. AFAIU, it really should be "Data
> > > corresponding to the transactions committed before this LSN is not
> > > available anymore". WAL before restart_lsn is likely to be removed but
> > > WAL with LSN higher than restart_lsn is preserved. This correction
> > > makes more sense because of the third sentence.
> > >
> >
> > Thanks for the clarification. Attached is the patch with the changes.
> > Please have a look.
> >
> Looks good to me. Do you want to track this through commitfest app?
>

Yeah, I've added an entry for it in the commitfest app and marked it
as ready for committer. Thanks for the suggestion.

--
With Regards,
Ashutosh Sharma.



On Mon, Sep 19, 2022 at 8:09 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Thanks for the clarification. Attached is the patch with the changes.
> Please have a look.
>

LGTM. I'll push this tomorrow unless there are any other
comments/suggestions for this.

-- 
With Regards,
Amit Kapila.



On Thu, Nov 3, 2022 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 8:09 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > Thanks for the clarification. Attached is the patch with the changes.
> > Please have a look.
> >
>
> LGTM. I'll push this tomorrow unless there are any other
> comments/suggestions for this.
>

Pushed!

-- 
With Regards,
Amit Kapila.



On Fri, Nov 4, 2022 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 3, 2022 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Sep 19, 2022 at 8:09 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > Thanks for the clarification. Attached is the patch with the changes.
> > > Please have a look.
> > >
> >
> > LGTM. I'll push this tomorrow unless there are any other
> > comments/suggestions for this.
> >
>
> Pushed!
>

thanks Amit.!

--
With Regards,
Ashutosh Sharma.