Re: Question about RI checks - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Question about RI checks
Date
Msg-id 1414183807.96948.YahooMailNeo@web122301.mail.ne1.yahoo.com
In response to Re: Question about RI checks  (Florian Pflug <fgp@phlo.org>)
Responses Re: Question about RI checks
List pgsql-hackers
Florian Pflug <fgp@phlo.org> wrote:
> On Oct 24, 2014, at 20:24, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Oct 24, 2014 at 2:12 PM, Florian Pflug <fgp@phlo.org> wrote:
>>>> What about doing one scan using SnapshotAny and then testing each
>>>> returned row for visibility under both relevant snapshots?  See
>>>> whether there is any tuple for which they disagree.
>>>
>>> See my other mail - testing whether the snapshots agree isn't enough,
>>> you'd have to check whether there could have been *any* snapshot taken
>>> between the two which would see a different result.
>>
>> Oh, hmm.  I had thought what I was proposing was strong enough to
>> handle that case, but now I see that it isn't.  However, I'm not
>> entirely sure that it's the RI code's job to prevent such cases, or at
>> least not when the transaction isolation level is less than
>> serializable.
>
> Well, the responsibility was laid onto the RI code by the decision to
> not do SIREAD locking for RI enforcement queries. Simply enabling SIREAD
> locking without doing anything else is not a good solution, I think -
> it will drive up the false-positive rate. And it would make workloads
> consisting purely of serializable transactions pay the price for the
> row-level locking done by the RI triggers, without getting anything in
> return.

Yeah, it would be far better to fix RI behavior if we can;
otherwise those using serializable transactions to protect data
integrity essentially pay the price for concurrency control around
foreign keys twice -- once using blocking locks and again using
non-blocking SIRead locks just to cover some odd corner cases.
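
To make the first of those costs concrete, here is a minimal sketch of
the sort of parent/child race the RI enforcement queries already handle
with ordinary row locks (the table and column names are mine, purely
for illustration):

    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child  (id int PRIMARY KEY,
                         parent_id int NOT NULL REFERENCES parent (id));
    INSERT INTO parent VALUES (1);

    -- Session A:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    DELETE FROM parent WHERE id = 1;
    -- The RI trigger scans child for referencing rows, taking ordinary
    -- row locks on any matches rather than SIRead locks.

    -- Session B, concurrently:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO child VALUES (1, 1);
    -- The FK check locks the parent row with id = 1, so this blocks
    -- until session A commits or rolls back, and then fails or
    -- succeeds accordingly.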

I need to spend some more time looking at it, and I have another
couple things in front of this on my personal TODO list, but I
think that if we had a row lock which was stronger than current
SELECT FOR UPDATE behavior, and the delete of a parent row (or
update of its referenced key) read the children using it, this
would be solved.  Essentially I'm thinking of something just like
FOR UPDATE except that a transaction which attempted a concurrent
UPDATE or DELETE of the row (before the locking transaction ended)
would error out with some form of serialization failure.  I believe
this would be consistent with the behavior of SELECT FOR UPDATE in
Oracle, so it would allow a path for those migrating from Oracle to
maintain their current logic (with a slight change to the FOR
strength clause).
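
If memory serves, the delete-side RI check today runs something like
the query below (simplified from what ri_triggers.c builds); under what
I'm describing, that same scan would take the new, stronger lock
instead.  The "FOR STRONG UPDATE" spelling is invented purely to show
where it would go -- it is not existing syntax:

    -- Roughly what the restrict/no-action check does today when a
    -- parent row is deleted or its referenced key is updated:
    SELECT 1 FROM ONLY child x
     WHERE x.parent_id = $1        -- $1 = the key of the parent row
       FOR KEY SHARE OF x;

    -- Under the proposal, the same scan would use a lock stronger than
    -- FOR UPDATE, so that a concurrent UPDATE or DELETE of a matching
    -- child row (before the locking transaction ends) draws a
    -- serialization failure instead of merely blocking:
    SELECT 1 FROM ONLY child x
     WHERE x.parent_id = $1
       FOR STRONG UPDATE OF x;     -- hypothetical lock strength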

>> Is there an argument that the anomaly that results is
>> unacceptable at REPEATABLE READ?
>
> I'm not aware of any case that is clearly unacceptable for REPEATABLE
> READ, but neither am I convinced that no such case exists. REPEATABLE READ
> mode is hard because there doesn't seem to be any formal definition of
> what we do and do not guarantee. The guarantees provided by the SQL
> standard seem to be so much weaker than what we offer that they're not
> really helpful :-(
>
> I believe the best way forward is to first find a solution for SERIALIZABLE
> transactions, and then check if it can be applied to REPEATABLE READ
> mode too. For SERIALIZABLE mode, it's at least clear what we're aiming
> for -- offering true serializability.

What I outline above would clearly work for both.  Arguably we
could just use FOR UPDATE in REPEATABLE READ and only use the new,
stronger lock for SERIALIZABLE.  In fact, that seems to me to be
the appropriate course.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


