Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Re: Synchronizing slots from primary to standby
Msg-id Zc4AsF9FJPDW0iDR@ip-10-97-1-34.eu-west-3.compute.internal
In response to Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
Hi,

On Thu, Feb 15, 2024 at 05:00:18PM +0530, Amit Kapila wrote:
> On Thu, Feb 15, 2024 at 4:29 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> > Attach the v2 patch here.
> >
> > Apart from the new log message. I think we can add one more debug message in
> > reserve_wal_for_local_slot, this could be useful to analyze the failure.
> 
> Yeah, that can also be helpful, but the added message looks naive to me.
> + elog(DEBUG1, "segno: %ld oldest_segno: %ld", oldest_segno, segno);
> 
> Instead of the above, how about something like: "segno: %ld of
> purposed restart_lsn for the synced slot, oldest_segno: %ld
> available"?

Looks good to me. I wonder whether it would make more sense to elog only if
segno < oldest_segno, that is, just before the XLogSegNoOffsetToRecPtr() call?

But I'm fine with the proposed location too.
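
To make the alternative placement concrete, here is a minimal standalone sketch,
not the patch itself. It assumes the retry path is reached only when
segno < oldest_segno. XLogSegNo is a local stand-in typedef, and
log_if_segment_removed() is a hypothetical helper that formats the message into
a buffer instead of calling elog(), so it compiles outside the backend.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the backend typedef, so the sketch builds on its own. */
typedef uint64_t XLogSegNo;

/*
 * Hypothetical helper (not part of the patch): emit the debug text only when
 * the segment holding the proposed restart_lsn is older than the oldest
 * segment still available, i.e. on the path that would go on to call
 * XLogSegNoOffsetToRecPtr() and retry.
 *
 * Returns 1 if a message was written into buf, 0 otherwise.
 */
static int
log_if_segment_removed(XLogSegNo segno, XLogSegNo oldest_segno,
                       char *buf, size_t buflen)
{
    if (segno >= oldest_segno)
    {
        /* All required WAL is still there: nothing to report. */
        buf[0] = '\0';
        return 0;
    }

    /* Wording follows the message suggested upthread. */
    snprintf(buf, buflen,
             "segno: %" PRIu64 " of purposed restart_lsn for the synced slot, "
             "oldest_segno: %" PRIu64 " available",
             segno, oldest_segno);
    return 1;
}
```

With this placement the message shows up only on the retry path, so the log
stays quiet whenever the required WAL is still present.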

> 
> > And we
> > can also enable the DEBUG log in the 040 tap-test, I see we have similar
> > setting in 010_logical_decoding_timline and logging debug1 message doesn't
> > increase noticeable time on my machine. These are done in 0002.
> >
> 
> I haven't tested it but I think this can help in debugging BF
> failures, if any. I am not sure if to keep it always like that but
> till the time these tests are stabilized, this sounds like a good
> idea. So, how about just making test changes as a separate patch so
> that later if required we can revert/remove it easily? Bertrand, do
> you have any thoughts on this?

+1 on enabling the DEBUG log in the 040 tap-test until it's stabilized (IIRC we
took the same approach for 035_standby_logical_decoding.pl) and then reverting
it.

Also, I was thinking: what about adding an output to pg_sync_replication_slots()?
The output could be the number of synced slots that were created during the
execution but are not yet considered sync-ready. I think that could be a good
addition to v2-0001-Add-a-log-if-remote-slot-didn-t-catch-up-to-local.patch
proposed here (a non-zero value should trigger special attention).
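
As a rough illustration of what that return value could count, here is a small
standalone sketch. SyncedSlotResult and count_created_not_sync_ready() are
hypothetical names made up for this example; they are not existing PostgreSQL
structures or functions, and the real bookkeeping in the slot sync code may
look quite different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-slot outcome of one sync run; not a PostgreSQL struct. */
typedef struct SyncedSlotResult
{
    const char *name;             /* slot name on the standby */
    bool        created_this_run; /* slot was created during this execution */
    bool        sync_ready;       /* slot has been marked sync-ready */
} SyncedSlotResult;

/*
 * Count the slots that were created during this execution but are not yet
 * sync-ready. A non-zero result is the case that deserves attention.
 */
static int
count_created_not_sync_ready(const SyncedSlotResult *slots, size_t nslots)
{
    int         count = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        if (slots[i].created_this_run && !slots[i].sync_ready)
            count++;
    }
    return count;
}
```

A caller could then surface this number as the function's result, with zero
meaning every newly created slot caught up during the run.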

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
