Re: Conflict detection and logging in logical replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict detection and logging in logical replication
Date
Msg-id CAJpy0uCC4eLp24LH2aVMGsJkuqJpSSD4e_3m+WUjTv__CBoQ2Q@mail.gmail.com
Whole thread Raw
In response to RE: Conflict detection and logging in logical replication  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
List pgsql-hackers
On Wed, Aug 7, 2024 at 1:08 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, August 7, 2024 3:00 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:
> >
> > While playing with the 0003 patch (the patch may not be ready), I found that
> > when the insert_exists event occurred, both apply_error_count and
> > insert_exists_count was counted.
>
> Thanks for testing. 0003 is a separate feature which we might review
> after the 0001 is in a good shape or committed.
>
> >
> > ```
> > -- insert a tuple on the subscriber
> > subscriber =# INSERT INTO tab VALUES (1);
> >
> > -- insert the same tuple on the publisher, which causes insert_exists conflict
> > publisher =# INSERT INTO tab VALUES (1);
> >
> > -- after some time...
> > subscriber =# SELECT * FROM pg_stat_subscription_stats; -[ RECORD
> > 1 ]--------+------
> > subid                | 16389
> > subname              | sub
> > apply_error_count    | 16
> > sync_error_count     | 0
> > insert_exists_count  | 16
> > update_differ_count  | 0
> > update_exists_count  | 0
> > update_missing_count | 0
> > delete_differ_count  | 0
> > delete_missing_count | 0
> > stats_reset          |
> > ```
> >
> > Not tested, but I think this could also happen for the update_exists_count case,
> > or sync_error_count may be counted when the tablesync worker detects the
> > conflict.
> >
> > IIUC, the reason is that pgstat_report_subscription_error() is called in the
> > PG_CATCH part in start_apply() even after ReportApplyConflict(ERROR) is
> > called.
> >
> > What do you think of the current behavior? I wouldn't say I like that the same
> > phenomenon is counted as several events. E.g., in the case of vacuum, the
> > entry seemed to be separated based on the process by backends or
> > autovacuum.
>
> I think this is as expected. When the insert conflicts, it will report an ERROR
> so both the conflict count and error out are incremented which looks reasonable
> to me. The default behavior for each conflict could be different and is
> documented, I think It's clear that insert_exists will cause an ERROR while
> delete_missing or .. will not.
>

I had also observed this behaviour during my testing of stats patch.
But I found this behaviour to be okay. IMO, apply_error_count should
account any error caused during applying and thus should incorporate
insert-exists error-count too.

> In addition, we might support a resolution called "error" which is to report an
> ERROR When facing the specified conflict, it would be a bit confusing to me if
> the apply_error_count Is not incremented on the specified conflict, when I set
> resolution to ERROR.
>
> > I feel the spec is unfamiliar in that only insert_exists and update_exists are
> > counted duplicated with the apply_error_count.
> >
> > An easy fix is to introduce a global variable which is turned on when the conflict
> > is found.
>
> I am not sure about the benefit of changing the current behavior in the patch.
> And it will change the existing behavior, because before the conflict detection
> patch, the apply_error_count is incremented on each unique key violation, while
> after the detection patch, it stops incrementing the apply_error and only
> conflict_count is incremented.
>

thanks
Shveta



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Logical Replication of sequences
Next
From: Michael Banck
Date:
Subject: Re: Enable data checksums by default