On Thu, Feb 15, 2024 at 5:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Thu, Feb 15, 2024 at 05:00:18PM +0530, Amit Kapila wrote:
> > On Thu, Feb 15, 2024 at 4:29 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > > Attach the v2 patch here.
> > >
> > > Apart from the new log message. I think we can add one more debug message in
> > > reserve_wal_for_local_slot, this could be useful to analyze the failure.
> >
> > Yeah, that can also be helpful, but the added message looks naive to me.
> > + elog(DEBUG1, "segno: %ld oldest_segno: %ld", oldest_segno, segno);
> >
> > Instead of the above, how about something like: "segno: %ld of
> > purposed restart_lsn for the synced slot, oldest_segno: %ld
> > available"?
>
> Looks good to me. I'm not sure if it would make more sense to elog only if
> segno < oldest_segno means just before the XLogSegNoOffsetToRecPtr() call?
>
> But I'm fine with the proposed location too.
>
I am also fine either way but the current location gives required
information in more number of cases and could be helpful in debugging
this new facility.
> >
> > > And we
> > > can also enable the DEBUG log in the 040 tap-test, I see we have similar
> > > setting in 010_logical_decoding_timline and logging debug1 message doesn't
> > > increase noticable time on my machine. These are done in 0002.
> > >
> >
> > I haven't tested it but I think this can help in debugging BF
> > failures, if any. I am not sure if to keep it always like that but
> > till the time these tests are stabilized, this sounds like a good
> > idea. So, how, about just making test changes as a separate patch so
> > that later if required we can revert/remove it easily? Bertrand, do
> > you have any thoughts on this?
>
> +1 on having DEBUG log in the 040 tap-test until it's stabilized (I think we
> took the same approach for 035_standby_logical_decoding.pl IIRC) and then revert
> it back.
>
Good to know!
> Also I was thinking: what about adding an output to pg_sync_replication_slots()?
> The output could be the number of sync slots that have been created and are
> not considered as sync-ready during the execution.
>
Yeah, we can consider outputting some information via this function
like how many slots are synced and persisted but not sure what would
be appropriate here. Because one can anyway find that or more
information by querying pg_replication_slots. I think we can keep
discussing what makes more sense as a return value but let's commit
the debug/log patches as they will be helpful to analyze BF failures
or any other issues reported.
--
With Regards,
Amit Kapila.