From: Thomas Munro
Subject: Re: An out-of-date comment in nodeIndexonlyscan.c
Msg-id: CA+hUKGKbYe3=0tV4SgE5QOeSsTDFRZn3a7ptr2kFroXhPPSQEg@mail.gmail.com
In response to: Re: An out-of-date comment in nodeIndexonlyscan.c (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-hackers

On Mon, Jun 14, 2021 at 2:29 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Have you also thought about deferrable unique / primary key constraints?
>
> It's possible for the uniqueness to be temporarily violated during a
> transaction when the unique constraint is deferred,

Oh, yeah, very good question.  I think many scenarios involving
duplicates work correctly, but there is a danger like this:

create table t (i int primary key deferrable initially deferred, j int);
create unique index on t(j);
insert into t values (999, 999); -- avoid empty index
set enable_seqscan = off;

begin transaction isolation level serializable;
insert into t values (1, 2); -- create phantom row
select * from t where i = 1;
delete from t where j = 2; -- remove phantom row
SELECT locktype, relation::regclass, page, tuple
  FROM pg_locks WHERE mode = 'SIReadLock';
commit;

master:

 locktype | relation | page | tuple
----------+----------+------+-------
 page     | t_pkey   |    1 |
 page     | t_j_idx  |    1 |
(2 rows)

v3 patch:

 locktype | relation | page | tuple
----------+----------+------+-------
(0 rows)

In fact the lock on t_pkey's page 1 was important: it represents the
search for a tuple with i = 1 other than our temporary phantom (which
is only allowed to exist because the constraint is deferred).  If
someone else inserts a row matching i = 1, the SSI system will not
know that we tried to look for it, because our own temporary phantom
confused us.

> I think you'd just need to add a check to ensure that indimmediate is
> true around where you're checking the indisunique flag.

Yeah, that fixes the problem in this case at least.  With v4:

 locktype | relation | page | tuple
----------+----------+------+-------
 page     | t_pkey   |    1 |
(1 row)

(It's expected that t_j_idx is not locked: the delete query benefits
from the optimisation when accessing the index on t(j).)
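
To make that concrete, here is a rough sketch of the shape of the
check, assuming it sits somewhere the index-only scan code already
has the index's relcache entry (ioss_RelationDesc) at hand; it's only
illustrative, not the actual v3/v4 patch:

/*
 * Only skip predicate locking after finding a match if the index
 * enforces uniqueness immediately; a deferrable constraint lets our
 * own transaction create transient duplicates, which breaks the
 * assumption that one match proves there can be no others.
 */
Form_pg_index indexForm = node->ioss_RelationDesc->rd_index;

if (indexForm->indisunique && indexForm->indimmediate)
{
    /* one visible match is enough; no further SIRead locks needed */
}
else
{
    /* fall back to the usual page-level predicate locking */
}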

That test case is a little confusing, because at no point does it
ever actually create a duplicate.  I suspect the problem is avoided
without the deferred constraint, though, because you'd either get a
unique constraint violation on insertion, or block anyone else from
inserting until after you commit (even after you delete the phantom),
and I think that may avoid the hazard.  I could be confused about
that.  If I am wrong, then a possible general solution may be to
apply the optimisation only if we find a match that wasn't written by
this transaction, though I'm not sure how to do that given the
current layering.
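
If it came to that, the check itself might look something like this
at the heap level (match_written_by_me is a hypothetical helper, and
as noted above it isn't obvious where in the current layering the
heap tuple would actually be in reach):

#include "postgres.h"

#include "access/htup_details.h"   /* HeapTupleHeaderGetXmin() */
#include "access/xact.h"           /* TransactionIdIsCurrentTransactionId() */

/*
 * Hypothetical helper: did our own transaction write this tuple?  If
 * so, it could be a phantom permitted by a deferred constraint, and
 * finding it shouldn't let us skip predicate locking for other
 * potential matches.
 */
static bool
match_written_by_me(HeapTuple tuple)
{
    TransactionId xmin = HeapTupleHeaderGetXmin(tuple->t_data);

    return TransactionIdIsCurrentTransactionId(xmin);
}

i.e. the optimisation would apply only when !match_written_by_me(tuple).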

Attachment
