Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date
Msg-id CA+Tgmob5gn1oKCjBUu8t7Go4h_AZfN5yy22N_Jv0OK8advJoWA@mail.gmail.com
In response to Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers
On Thu, Mar 8, 2018 at 10:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pavan Deolasee <pavan.deolasee@gmail.com> writes:
>> I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch does
>> not do anything to deal with the fact that t_ctid may no longer point to
>> itself to mark end of the chain. I just can't see how that would work.
>> ...
>> I am actually worried that we're tinkering with ip_blkid to handle one
>> corner case of detecting partition key update. This is going to change
>> on-disk format and probably need more careful attention.
>
> You know, either one of those alone would be scary as hell.  Both in
> one patch seem to me to be sufficient reason to reject it outright.
> Not only will it be an unending source of bugs, but it's chewing up
> far too much of what few remaining degrees-of-freedom we have in the
> on-disk format ... for a single purpose that hasn't even been sold as
> something we have to have.

I agree that it isn't clear that it's worth making a change to the
on-disk format for this feature.  I made the argument when it was
first proposed that we should just document that there would be
anomalies with cross-partition updates that didn't occur otherwise.
However, multiple people thought that it was worth burning one of our
precious few remaining infomask bits in order to throw an error in
that case rather than just silently having an anomaly, and that's why
this patch got written.  It's not too late to decide that we'd rather
not do that after all.

However, there's no such thing as a free lunch.  We can't use the CTID
field to point to a CTID in another table because there's no room to
include the identity of the other table in the field.  We can't widen
it to make room because that would break on-disk compatibility and
bloat our already-too-big tuple headers.  So, we cannot make it work
like it does when the updates are confined to a single partition.
Therefore, the only options are (1) ignore the problem, and let a
cross-partition update look entirely like a delete+insert, (2) try to
throw some error in the case where this introduces user-visible
anomalies that wouldn't be visible otherwise, or (3) revert update
tuple routing entirely.  I voted for (1), but the consensus was (2).
I think that (3) will make a lot of people sad; it's a very good
feature.  If we want to have (2), then we've got to have some way to
mark a tuple that was deleted as part of a cross-partition update, and
that requires a change to the on-disk format.
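
To make the space argument concrete, here is a rough sketch in C of a
6-byte tuple pointer loosely modeled on the CTID layout (a block number
plus a line offset).  All names below (SketchItemPointer, sketch_*,
SKETCH_INVALID_BLOCK) are hypothetical and are not the patch's code.
The sentinel scheme, overwriting the block number with an invalid value
to mean "this row moved to another partition", is an assumption based
only on the patch name quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CTID: a 32-bit block number stored as two
 * 16-bit halves, plus a 16-bit line offset.  There is no field that
 * could name a different table. */
typedef struct {
    uint16_t blk_hi;
    uint16_t blk_lo;
    uint16_t offset;
} SketchItemPointer;

/* All six bytes are already spoken for. */
static_assert(sizeof(SketchItemPointer) == 6, "pointer must stay 6 bytes");

/* Assumed sentinel: a block number that can never refer to a real block. */
#define SKETCH_INVALID_BLOCK 0xFFFFFFFFu

static inline uint32_t sketch_get_block(const SketchItemPointer *p)
{
    return ((uint32_t) p->blk_hi << 16) | p->blk_lo;
}

static inline void sketch_set(SketchItemPointer *p, uint32_t blk, uint16_t off)
{
    p->blk_hi = (uint16_t) (blk >> 16);
    p->blk_lo = (uint16_t) (blk & 0xFFFFu);
    p->offset = off;
}

/* Mark a deleted tuple as "moved to another partition" by overwriting
 * only the block number; the offset is left untouched. */
static inline void sketch_mark_moved(SketchItemPointer *p)
{
    sketch_set(p, SKETCH_INVALID_BLOCK, p->offset);
}

/* A concurrent updater that follows the chain and lands here would raise
 * an error (option 2) rather than treat the row as simply deleted. */
static inline bool sketch_moved_to_other_partition(const SketchItemPointer *p)
{
    return sketch_get_block(p) == SKETCH_INVALID_BLOCK;
}
```

The sketch also shows why such a marker costs something: once the block
number is overwritten, the pointer can no longer point at the tuple
itself, so any code that relied on a self-referencing pointer to detect
the end of an update chain has to be revisited.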

In short, the two things that you are claiming are prohibitively scary
if done in the same patch look to me like they're actually just one
thing, and that one thing is something which absolutely has to be done
in order to implement the design most community members favored in the
original discussion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

