Re: Collect statistics about conflicts in logical replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Collect statistics about conflicts in logical replication
Date
Msg-id CAJpy0uD9WE7iScjxTqcGAza+P6mM9vZ5F9TeU+uG0Pb=8bvhtA@mail.gmail.com
Whole thread Raw
Responses Re: Collect statistics about conflicts in logical replication
List pgsql-hackers
On Tue, Aug 27, 2024 at 3:21 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, August 27, 2024 10:59 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > ~~~
> >
> > 3.
> > +# Enable track_commit_timestamp to detect origin-differ conflicts in
> > +logical # replication. Reduce wal_retrieve_retry_interval to 1ms to
> > +accelerate the # restart of the logical replication worker after encountering a
> > conflict.
> > +$node_subscriber->append_conf(
> > + 'postgresql.conf', q{
> > +track_commit_timestamp = on
> > +wal_retrieve_retry_interval = 1ms
> > +});
> >
> > Later, after CDR resolvers are implemented, it might be good to revisit these
> > conflict test cases and re-write them to use some conflict resolvers like 'skip'.
> > Then the subscriber won't give ERRORs and restart apply workers all the time
> > behind the scenes, so you won't need the above configuration for accelerating
> > the worker restarts. In other words, running these tests might be more efficient
> > if you can avoid restarting workers all the time.
> >
> > I suggest putting an XXX comment here as a reminder that these tests should
> > be revisited to make use of conflict resolvers in the future.
>
> I think it would be too early to mention the resolution implementation detail
> in the comments considering that the resolution is still not RFC. Also, I think
> reducing wal_retrieve_retry_interval is a reasonable way to speed up the test
> in this case because the test is not letting the worker to restart all the time, the
> error causes the restart will be resolved immediately after the stats check. So, I
> think adding XXX is not very appropriate.
>
> Other comments look good to me.
> I slightly adjusted few words and merged the changes. Thanks !
>
> Here is V3 patch.
>

Thanks for the patch. Just thinking out loud, since we have names like
'apply_error_count', 'sync_error_count' which tells that they are
actually error-count, will it be better to have something similar in
conflict-count cases, like 'insert_exists_conflict_count',
'delete_missing_conflict_count' and so on. Thoughts?

I noticed that now we do mention this (as I suggested earlier):
+ Note that any conflict resulting in an apply error will be counted
in both apply_error_count and the corresponding conflict count.

But we do not mention clearly which ones are conflict-counts. As an
example, we have this:

+ insert_exists_count bigint:
+ Number of times a row insertion violated a NOT DEFERRABLE unique
constraint during the application of changes

It does not mention that it is a conflict count. So we need to either
change names or mention clearly against each that it is a conflict
count.

thanks
sHveta



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Pgoutput not capturing the generated columns
Next
From: Maxim Orlov
Date:
Subject: Re: Test 041_checkpoint_at_promote.pl faild in installcheck due to missing injection_points