Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm0Jj00ZKHh4zpSF47AutPeyEN4SfDg55EZoSpBvkD12DQ@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (Shlok Kyal <shlok.kyal.oss@gmail.com>)
Responses Re: Logical Replication of sequences
List pgsql-hackers
On Tue, 25 Jun 2024 at 17:53, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 20 Jun 2024 at 18:24, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Agreed and I am not sure which is better because there is a value in
> > > > keeping the state name the same for both sequences and tables. We
> > > > probably need more comments in code and doc updates to make the
> > > > behavior clear. We can start with the sequence state as 'init' for
> > > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > > > feel so during the review.
> > >
> > > Here is a patch which does the sequence synchronization in the
> > > following lines from the above discussion:
> > > This commit introduces sequence synchronization during 1) creation of
> > > subscription for initial sync of sequences 2) refresh publication to
> > > synchronize the sequences for the newly created sequences 3) refresh
> > > publication sequences for synchronizing all the sequences.
> > > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
> > >    - The subscriber retrieves sequences associated with publications.
> > >    - Sequences  are added in the 'init' state to the pg_subscription_rel table.
> > >    - Sequence synchronization worker will be started if there are any
> > > sequences to be synchronized
> > >    - A new sequence synchronization worker handles synchronization in
> > > batches of 100 sequences:
> > >      a) Retrieves sequence values using pg_sequence_state from the publisher.
> > >      b) Sets sequence values accordingly.
> > >      c) Updates sequence state to 'READY' in pg_susbcripion_rel
> > >      d) Commits batches of 100 synchronized sequences.
> > > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
> > > PUBLICATION (no syntax change):
> > >    - Stale sequences are removed from pg_subscription_rel.
> > >    - Newly added sequences in the publisher are added in 'init' state
> > > to pg_subscription_rel.
> > >    - Sequence synchronization will be done by sequence sync worker as
> > > listed in subscription creation process.
> > >    - Sequence synchronization occurs for newly added sequences only.
> > > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > > SEQUENCES  for refreshing all sequences:
> > >    - Removes stale sequences and adds newly added sequences from the
> > > publisher to pg_subscription_rel.
> > >    - Resets all sequences in pg_subscription_rel to 'init' state.
> > >    - Initiates sequence synchronization for all sequences by sequence
> > > sync worker as listed in subscription creation process.
> >
> > Here is an updated patch with a few fixes to remove an unused
> > function, changed a few references of table to sequence and added one
> > CHECK_FOR_INTERRUPTS in the sequence sync worker loop.
>
> Hi Vignesh,
>
> I have reviewed the patches and I have following comments:
>
> ===== tablesync.c ======
> 1. process_syncing_sequences_for_apply can crash with:
> 2024-06-21 15:25:17.208 IST [3681269] LOG:  logical replication apply
> worker for subscription "test1" has started
> 2024-06-21 15:28:10.127 IST [3682329] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has started
> 2024-06-21 15:28:10.146 IST [3682329] LOG:  logical replication
> synchronization for subscription "test1", sequence "s1" has finished
> 2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
> synchronization for subscription "test1", sequence "s2" has finished
> 2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has finished
> 2024-06-21 15:29:53.535 IST [3682767] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has started
> TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel ||
> (nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line:
> 2273, PID: 3682767
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (ExceptionalCondition+0xbb)[0x5b2a61861c99]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (postmaster_child_launch+0x236)[0x5b2a6102fb36]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1df0f)[0x5b2a6103bf0f]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1bf71)[0x5b2a61039f71]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f16f73)[0x5b2a61034f73]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (PostmasterMain+0x18fb)[0x5b2a61034445]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ab1ab8)[0x5b2a60bcfab8]
> /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7b76bc629d90]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7b76bc629e40]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (_start+0x25)[0x5b2a601491a5]
>
> Analysis:
> Suppose there are two sequences (s1, s2) on publisher.
> SO, during initial sync.
> in loop,
> + foreach(lc, table_states_not_ready)
>
> table_states_not_ready -> it contains both s1 and s2.
> So, for s1 a sequence sync will be started. It will sync all sequences
> and the sequence sync worker will exit.
> Now, for s2 again a sequence sync will start. It will give the above error.
>
> Is this loop required? Instead we can just use a bool like
> 'is_any_sequence_not_ready'. Thoughts?
>
> ===== sequencesync.c =====
> 2. function name should be 'LogicalRepSyncSequences' instead of
> 'LogicalRepSyncSeqeunces'
>
> 3. In function 'LogicalRepSyncSeqeunces'
>     sequencerel = table_open(seqinfo->relid, RowExclusiveLock);\
>     There is a extra '\' symbol
>
> 4. In function LogicalRepSyncSeqeunces:
> + ereport(LOG,
> + errmsg("logical replication synchronization for subscription \"%s\",
> sequence \"%s\" has finished",
> +   get_subscription_name(subid, false), RelationGetRelationName(sequencerel)));
> + table_close(sequencerel, NoLock);
> +
> + currseq++;
> +
> + if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq ==
> list_length(sequences))
> + CommitTransactionCommand();
>
>
> The above message gets logged even if the changes are not committed.
> Suppose the sequence worker exits before commit due to some reason.
> Thought the log will show that sequence is synced, the sequence will
> be in 'init' state. I think this is not desirable.
> Maybe we should log the synced sequences at commit time? Thoughts?
>
> ===== General ====
> 5. We can use other macros like 'foreach_ptr' instead of 'foreach'

Thanks for the comments, the attached patch has the fixes for the same.

Regards,
Vignesh

Attachment

pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Next
From: Jacob Champion
Date:
Subject: Re: Direct SSL connection and ALPN loose ends