Re: SSI predicate locking on heap -- tuple or row? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: SSI predicate locking on heap -- tuple or row?
Date
Msg-id 4DE3D840.7090604@enterprisedb.com
Whole thread Raw
In response to SSI predicate locking on heap -- tuple or row?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: SSI predicate locking on heap -- tuple or row?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: SSI predicate locking on heap -- tuple or row?  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On 30.05.2011 17:10, Kevin Grittner wrote:
>> Heikki Linnakangas  wrote:
>> On 26.05.2011 06:19, Kevin Grittner wrote:
>> I did some of that in the comments, and I think I understand it
>> now. See attached patch. Does that look right to you?
>
> !  * A serialization failure can only occur if there is a dangerous
> !  * structure in the dependency graph:
> !  *
> !  *        Tin ------>  Tpivot ------>  Tout
> !  *                 rw             rw
> !  *
> !  * Furthermore, Tout must commit first.
>
> I agree that this is easier to read and understand than just straight
> text; but you dropped one other condition, which we use rather
> heavily.  We should probably add back something like:
>
> * If Tin is declared READ ONLY (or commits without writing), we can
> * only have a problem if Tout committed before Tin acquired its
> * snapshot.
>
>>      * XXX: Where does that last condition come from?
>
> This optimization is an original one, not yet appearing in any
> academic papers; Dan and I are both convinced it is safe, and in
> off-list correspondence with Michael Cahill after he left Oracle, he
> said that he discussed this with Alan Fekete and they both concur
> that it is a safe and good optimization.
>
> This optimization is mentioned in the README-SSI file in the
> "Innovations" section.  Do you think that file needs to have more on
> the topic?

Oh, I see this:

>           o We can avoid ever rolling back a transaction when the
> transaction on the conflict *in* side of the pivot is explicitly or
> implicitly READ ONLY unless the transaction on the conflict *out*
> side of the pivot committed before the READ ONLY transaction acquired
> its snapshot. (An implicit READ ONLY transaction is one which
> committed without writing, even though it was not explicitly declared
> to be READ ONLY.)

Since this isn't coming from the papers, it would be nice to explain why 
that is safe.

>>       * XXX: for an anomaly to occur, T2 must've committed before W.
>>       * Couldn't we easily check that here, or does the fact that W
>>       * committed already somehow imply it?
>
> The flags and macros should probably be renamed to make it more
> obvious that this is covered.  The flag which SxactHasConflictOut is
> based on is only set when the conflict out is to a transaction which
> committed ahead of the flagged transaction.  Regarding the other
> test -- since we have two transactions (Tin and Tout) which have not
> been summarized, and transactions are summarized in order of commit,
> SxactHasSummaryConflictOut implies that the transaction with the flag
> set has not committed before the transaction to which it has a
> rw-conflict out.

Ah, got it.

> The problem is coming up with a more accurate name which isn't really
> long.  The best I can think of is SxactHasConflictOutToEarlierCommit,
> which is a little long.  If nobody has a better idea, I guess it's
> better to have a long name than a misleading one.  Do you think we
> need to rename SxactHasSummaryConflictOut or just add a comment on
> that use that a summarized transaction will always be committed ahead
> of any transactions which are not summarized?

Dunno. I guess a comment would do. Can you write a separate patch for 
that, please?

>> (note that I swapped the second and third check in the function, I
>> think it's more straightforward that way).
>
> It doesn't matter to me either way, so if it seems clearer to you,
> that's a win.

FWIW, the reason I prefer it that way is that the function is checking 
three scenarios:

R ----> W ----> T2, where W committed after T2
R ----> W ----> T2, "else"
T0 ----> R ----> W

It seems more straightforward to keep both "R ----> W ----> T2" cases 
together.

>> Should we say "Cancelled on identification as pivot, during ...",
>> or "Cancelled on identification as a pivot, during ..." ? We seem
>> to use both in the error messages.
>
> Consistency is good.  I think it sounds better with the indefinite
> article in there.

Ok, will do.

> Do you want me to post another patch based on this, or are these
> changes from what you posted small enough that you would prefer to
> just do it as part of the commit?

I've committed with the discussed changes.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Unfriendly handling of pg_hba SSL options with SSL off
Next
From: Gaetano Giunta
Date:
Subject: adding applications to the stack builder