Re: Conflict Detection and Resolution - Mailing list pgsql-hackers

From shveta malik
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAJpy0uCCdFyp7dQdvEroXAw9WaKticomaKPikom4_bGFFDo3_w@mail.gmail.com
Whole thread Raw
In response to Re: Conflict Detection and Resolution  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Fri, Jun 7, 2024 at 6:10 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> >>> I don't understand the why should update_missing or update_deleted be
> >>> different, especially considering it's not detected reliably. And also
> >>> that even if we happen to find the row the associated TOAST data may
> >>> have already been removed. So why would this matter?
> >>
> >> Here, we are trying to tackle the case where the row is 'recently'
> >> deleted i.e. concurrent UPDATE and DELETE on pub and sub. User may
> >> want to opt for a different resolution in such a case as against the
> >> one where the corresponding row was not even present in the first
> >> place. The case where the row was deleted long back may not fall into
> >> this category as there are higher chances that they have been removed
> >> by vacuum and can be considered equivalent to the update_ missing
> >> case.
> >>
> >> Regarding "TOAST column" for deleted row cases, we may need to dig
> >> more. Thanks for bringing this case. Let me analyze more here.
> >>
> > I tested a simple case with a table with one TOAST column and found
> > that when a tuple with a TOAST column is deleted, both the tuple and
> > corresponding pg_toast entries are marked as ‘deleted’ (dead) but not
> > removed immediately. The main tuple and respective pg_toast entry are
> > permanently deleted only during vacuum. First, the main table’s dead
> > tuples are vacuumed, followed by the secondary TOAST relation ones (if
> > available).
> > Please let us know if you have a specific scenario in mind where the
> > TOAST column data is deleted immediately upon ‘delete’ operation,
> > rather than during vacuum, which we are missing.
> >
>
> I'm pretty sure you can vacuum the TOAST table directly, which means
> you'll end up with a deleted tuple with TOAST pointers, but with the
> TOAST entries already gone.
>

It is true that for a deleted row, its toast entries can be vacuumed
earlier than the original/parent row, but we do not need to be
concerned about that to raise 'update_deleted'. To raise an
'update_deleted' conflict, it is sufficient to know that the row has
been deleted and not yet vacuumed, regardless of the presence or
absence of its toast entries. Once this is determined, we need to
build the tuple from remote data and apply it (provided resolver is
such that). If the tuple cannot be fully constructed from the remote
data, the apply operation will either be skipped or an error will be
raised, depending on whether the user has chosen the apply_or_skip or
apply_or_error option.

In cases where the table has toast columns but the remote data does
not include the toast-column entry (when the toast column is
unmodified and not part of the replica identity),  the resolution for
'update_deleted' will be no worse than for 'update_missing'. That is,
for both the cases, we can not construct full tuple and thus the
operation either needs to be skipped or error needs to be raised.

thanks
Shveta



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Conflict Detection and Resolution
Next
From: Amit Kapila
Date:
Subject: Re: confirmed flush lsn seems to be move backward in certain error cases