Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoA7MiQhpTwLZX7HZ2ERDyaTJyLh76o+O_Pk42C9hpHEhw@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, May 27, 2021 at 7:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, May 26, 2021 at 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I agree with you that specifying XID could be easier and
> > > understandable for users. I was thinking and studying a bit about what
> > > other systems do in this regard. Why don't we try to provide conflict
> > > resolution methods for users? The idea could be that either the
> > > conflicts can be resolved automatically or manually. In the case of
> > > manual resolution, users can use the existing methods or the XID stuff
> > > you are proposing here and in case of automatic resolution, the
> > > in-built or corresponding user-defined functions will be invoked for
> > > conflict resolution. There are more details to figure out in the
> > > automatic resolution scheme but I see a lot of value in doing the
> > > same.
> >
> > Yeah, I also see a lot of value in automatic conflict resolution. But
> > maybe we can have both ways? For example, in case where the user wants
> > to resolve conflicts in different ways or a conflict that cannot be
> > resolved by automatic resolution (not sure there is in practice
> > though), the manual resolution would also have value.
> >
>
> Right, that is exactly what I was saying. So, even if both can be done
> as separate patches, we should try to design the manual resolution in
> a way that can be extended for an automatic resolution system. I think
> we can try to have some initial idea/design/POC for an automatic
> resolution as well to ensure that the manual resolution scheme can be
> further extended.

Totally agreed.

But perhaps we should note that the conflict resolution we're
talking about resolves conflicts at the row or column level. Such a
conflict doesn't necessarily raise an ERROR, and the granularity of
resolution is per record or column. For example, if a DELETE and an
UPDATE process the same tuple (searched by PK), the UPDATE may not find
the tuple and be ignored because the tuple has already been deleted. In
this case, no ERROR will occur (i.e., the UPDATE will be ignored), but
the user may want a different conflict resolution. On the other hand,
the feature proposed here assumes that an error has already occurred
and logical replication has already stopped, and it resolves the
situation by skipping the entire transaction.
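
To make the difference concrete, here is a minimal Python sketch of the
two cases. It is only a toy model, not how the apply worker is
implemented: the table is a plain dict keyed by PK, and the names
apply_change and UniqueViolation are made up for illustration.

```python
class UniqueViolation(Exception):
    """Stands in for the ERROR raised on a unique constraint violation."""


def apply_change(table, change):
    """Apply one replicated change to `table` (a dict keyed by PK).

    Returns True if a row was affected, False if the change was
    silently ignored. Hypothetical helper, for illustration only.
    """
    op, pk, row = change
    if op == "INSERT":
        if pk in table:
            # This kind of conflict raises an ERROR and stops replication.
            raise UniqueViolation(f"duplicate key {pk}")
        table[pk] = row
        return True
    if op == "UPDATE":
        if pk not in table:
            # Tuple already deleted: no ERROR, the UPDATE is just ignored.
            return False
        table[pk] = row
        return True
    if op == "DELETE":
        return table.pop(pk, None) is not None
    raise ValueError(f"unknown operation {op}")


table = {1: "a"}
deleted = apply_change(table, ("DELETE", 1, None))  # removes the tuple
updated = apply_change(table, ("UPDATE", 1, "b"))   # finds nothing, no error
```

Only the INSERT path can stop replication in this model; the UPDATE on
the deleted tuple passes through without any error.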

IIUC conflict resolution can be thought of as a combination of
conflict types and the resolutions that can be applied to them. For
example, if there is a conflict between INSERT and INSERT and the
latter INSERT violates the unique constraint, an ERROR is raised, so
the DBA can resolve it manually. But it could also be resolved
automatically by selecting the tuple having a newer timestamp. On the
other hand, in the DELETE and UPDATE conflict described above, it's
possible to automatically ignore the fact that the UPDATE could not
update the tuple. Or we can even generate an ERROR so that the DBA can
resolve it manually. The DBA can manually resolve the conflict in
various ways: skipping the entire transaction from the origin, choosing
the tuple having a newer/older timestamp, etc.
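
The "conflict type x resolution" view could be sketched as a simple
lookup table. This is only an illustration of the idea; the conflict
type names, the resolver functions, and the "ts" field are all
assumptions, not an existing or proposed API.

```python
from datetime import datetime


def keep_newer(local, remote):
    """Automatic resolution: keep the tuple with the newer timestamp."""
    return remote if remote["ts"] > local["ts"] else local


def ignore_missing(local, remote):
    """UPDATE found no tuple: drop the remote change, keep local state."""
    return local


def raise_error(local, remote):
    """Fall back to an ERROR so that the DBA can resolve it manually."""
    raise RuntimeError("conflict requires manual resolution")


# Hypothetical mapping from conflict type to the resolution applied.
RESOLVERS = {
    "insert_insert": keep_newer,
    "update_missing": ignore_missing,
}


def resolve(conflict_type, local, remote, resolvers=RESOLVERS):
    """Pick the resolver for this conflict type, defaulting to an ERROR."""
    return resolvers.get(conflict_type, raise_error)(local, remote)


local = {"val": "x", "ts": datetime(2021, 5, 27, 10, 0)}
remote = {"val": "y", "ts": datetime(2021, 5, 27, 11, 0)}
winner = resolve("insert_insert", local, remote)  # remote is newer
```

Mapping a conflict type to raise_error instead of an automatic resolver
corresponds to the "generate an ERROR so that the DBA can resolve it
manually" option.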

In that sense, we can think of the feature proposed here as a way to
resolve a conflict that would originally cause an ERROR, by skipping
the entire transaction. If, in the future, we add a resolution that
raises an ERROR for conflicts that don't originally raise one (like
the DELETE and UPDATE conflict), we will be able to manually skip the
transaction for all types of conflicts.
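
As a rough sketch of skipping by XID, under heavy simplification: each
transaction is a list of INSERTs, the table is a dict keyed by PK, and
apply_stream and skip_xids are hypothetical names, not the interface
of the patch.

```python
def apply_stream(table, transactions, skip_xids=frozenset()):
    """Apply (xid, changes) pairs in order; stop at the first failure.

    Returns (applied_xids, failed_xid). failed_xid is None if the whole
    stream was applied. Transactions listed in skip_xids are skipped
    entirely. Hypothetical helper, for illustration only.
    """
    applied = []
    for xid, changes in transactions:
        if xid in skip_xids:
            continue  # the entire transaction is skipped
        staged = dict(table)  # copy, so a failure leaves no partial changes
        try:
            for pk, row in changes:
                if pk in staged:
                    raise KeyError(f"duplicate key {pk}")
                staged[pk] = row
        except KeyError:
            return applied, xid  # replication stops at this transaction
        table.clear()
        table.update(staged)  # "commit" the staged changes
        applied.append(xid)
    return applied, None


table = {1: "local"}
stream = [
    (700, [(2, "a")]),
    (701, [(3, "b"), (1, "remote")]),  # conflicts with the local row 1
    (702, [(4, "c")]),
]
first_applied, failed = apply_stream(table, stream)
# The DBA decides to skip the failed transaction and resume from it.
second_applied, still_failed = apply_stream(table, stream[1:], skip_xids={failed})
```

Note that skipping transaction 701 also drops its non-conflicting
change (row 3), since the unit of skipping is the whole transaction.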

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/