Re: Conflict detection for update_deleted in logical replication - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Conflict detection for update_deleted in logical replication
Date
Msg-id CAD21AoBupmOifh4PO=+NjV9CqnQGq54JV1HpQz8qJ9tcZgBxcw@mail.gmail.com
In response to Re: Conflict detection for update_deleted in logical replication  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Conflict detection for update_deleted in logical replication
List pgsql-hackers
On Fri, Sep 13, 2024 at 12:56 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Sep 13, 2024 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > >
> > > > So in brief, this solution is only for bidirectional setup? For non-bidirectional,
> > > > feedback_slots is non-configurable and thus irrelevant.
> > >
> > > Right.
> > >
> >
> > One possible idea to address the non-bidirectional case raised by
> > Shveta is to use a time-based cut-off to remove dead tuples. As
> > mentioned earlier in my email [1], we can define a new GUC parameter
> > say vacuum_committs_age which would indicate that we will allow rows
> > to be removed only if the modified time of the tuple as indicated by
> > committs module is greater than the vacuum_committs_age. We could keep
> > this parameter a table-level option without introducing a GUC as this
> > may not apply to all tables. I checked and found that some other
> > replication solutions like GoldenGate also allowed similar parameters
> > (tombstone_deletes) to be specified at table level [2]. The other
> > advantage of allowing it at table level is that it won't hamper the
> > performance of hot-pruning or vacuum in general. Note, I am careful
> > here because to decide whether to remove a dead tuple or not we need
> > to compare its committs_time both during hot-pruning and vacuum.
>
> +1 on the idea,

I agree that this idea is much simpler than the idea originally
proposed in this thread.

IIUC vacuum_committs_age specifies a time rather than an XID age. But
how can we implement it? If it ends up affecting the vacuum cutoff, we
should be careful not to end up with the same problems as
vacuum_defer_cleanup_age, which were discussed before [1]. Also, I think
the implementation must not affect the performance of
ComputeXidHorizons().
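
To make the question concrete, here is a minimal sketch of one possible
reading of the proposal, in which the time check is a per-tuple filter
applied on top of the existing XID-based cutoff rather than something
that moves the horizon itself. This is only an assumption for
illustration, not the proposed implementation: the function name, its
parameters, and the ts_usec type are hypothetical and are not existing
PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Microseconds since an arbitrary epoch; a stand-in for a commit timestamp. */
typedef int64_t ts_usec;

/*
 * Hypothetical check: a dead tuple may be removed only when
 *   (a) it is already below the normal XID-based cutoff, and
 *   (b) the deleting transaction committed at least min_age_us ago.
 * The XID horizon itself is not changed by the time-based setting.
 */
bool
dead_tuple_removable(bool below_xid_horizon, ts_usec delete_commit_ts,
                     ts_usec now, ts_usec min_age_us)
{
    assert(min_age_us >= 0);

    if (!below_xid_horizon)
        return false;           /* the usual visibility rule still applies */

    return (now - delete_commit_ts) >= min_age_us;
}
```

With this shape the commit timestamp lookup would happen per dead tuple
during pruning and vacuum, which is the cost mentioned above, while the
horizon computation stays untouched.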

> but IIUC this value doesn’t need to be significant; it
> can be limited to just a few minutes. The one which is sufficient to
> handle replication delays caused by network lag or other factors,
> assuming clock skew has already been addressed.

I think that in a non-bidirectional case the value might need to be
large. Is that right?

Regards,

[1] https://www.postgresql.org/message-id/20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


