Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id CAExHW5vHRgjWzi6zZbgCs97eW9U7xMtzXEQK+aepuzoGDsDNtg@mail.gmail.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: logical decoding and replication of sequences, take 2
Re: logical decoding and replication of sequences, take 2
List pgsql-hackers
0005, 0006 and 0007 are all related to the initial sequence sync. [3]
resulted in 0007 and I think we need it. That leaves 0005 and 0006 to
be reviewed in this response.

I followed the discussion starting [1] till [2]. The second one
mentions the interlock mechanism which has been implemented in 0005
and 0006. While I don't have an objection to allowing LOCKing a
sequence using the LOCK command, I am not sure whether it will
actually work or is even needed.

The problem described in [1] seems to be the same as the problem
described in [2]. In both cases we see the sequence moving backwards
during CATCHUP. At the end of catchup the sequence is in the right
state in both the cases. [2] actually deems this behaviour OK. I also
agree that the behaviour is ok. I am confused whether we have solved
anything using interlocking and it's really needed.

I see that the idea of using an LSN to decide whether or not to apply
a change to sequence started in [4]. In [5] Tomas proposed to use page
LSN. Looking at [6], it actually seems like a good idea. In [7] Tomas
agreed that LSN won't be sufficient. But I don't understand why. There
are three LSNs in the picture - restart LSN of sync slot,
confirmed_flush LSN of sync slot and page LSN of the sequence page
from where we read the initial state of the sequence. I think they can
be used with the following rules:
1. The publisher will not send any changes with LSN less than
confirmed_flush so we are good there.
2. Any non-transactional changes that happened between confirmed_flush
and page LSN should be discarded while syncing. They are already
visible to SELECT.
3. Any transactional changes with commit LSN between confirmed_flush
and page LSN should be discarded while syncing. They are already
visible to SELECT.
4. A DDL acquires a lock on sequence. Thus no other change to that
sequence can have an LSN between the LSN of the change made by DDL and
the commit LSN of that transaction. Only DDL changes to sequence are
transactional. Hence any transactional changes with commit LSN beyond
page LSN would not have been seen by the SELECT otherwise SELECT would
see the page LSN committed by that transaction. so they need to be
applied while syncing.
5. Any non-transactional changes beyond page LSN should be applied.
They are not seen by SELECT.

Am I missing something?

I don't have an idea how to get page LSN via a SQL query (while also
fetching data on that page). That may or may not be a challenge.

[1] https://www.postgresql.org/message-id/c2799362-9098-c7bf-c315-4d7975acafa3%40enterprisedb.com
[2] https://www.postgresql.org/message-id/2d4bee7b-31be-8b36-2847-a21a5d56e04f%40enterprisedb.com
[3] https://www.postgresql.org/message-id/f5a9d63d-a6fe-59a9-d1ed-38f6a5582c13%40enterprisedb.com
[4] https://www.postgresql.org/message-id/CAA4eK1KUYrXFq25xyjBKU1UDh7Dkzw74RXN1d3UAYhd4NzDcsg%40mail.gmail.com
[5] https://www.postgresql.org/message-id/CAA4eK1LiA8nV_ZT7gNHShgtFVpoiOvwoxNsmP_fryP%3DPsYPvmA%40mail.gmail.com
[6] https://www.postgresql.org/docs/current/storage-page-layout.html

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Nikita Malakhov
Date:
Subject: Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
Next
From: Ashutosh Bapat
Date:
Subject: Re: logical decoding and replication of sequences, take 2