Re: SSI patch version 14 - Mailing list pgsql-hackers

From Dan Ports
Subject Re: SSI patch version 14
Msg-id 20110131115028.GT57071@csail.mit.edu
In response to Re: SSI patch version 14  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Sun, Jan 30, 2011 at 04:01:56PM -0600, Kevin Grittner wrote:
> I'm wondering how this differs from what is discussed in Section 2.7
> ("Serialization Graph Testing") of Cahill's doctoral thesis.  That
> discusses a technique for trying to avoid false positives by testing
> the full graph for cycles, with the apparent conclusion that the
> costs of doing so are prohibitive.  The failure of attempts to
> implement that technique seems to have been part of the impetus to
> develop the SSI technique, where a particular structure involving two
> to three transactions has been proven to exist in all graphs which
> form such a cycle.

I'm not sure. My very limited understanding is that people have tried
to do concurrency control via serialization graph testing but it's (at
least thought to be) too expensive to actually use. This seems to be
saying the opposite of that, so there must be some difference...
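
For illustration only, here is a rough sketch of the two kinds of check
being contrasted in the quoted text: testing the full dependency graph
for a cycle versus looking only for the "dangerous structure", a
transaction with both an incoming and an outgoing rw-conflict. It is not
taken from the patch. The edge representation, the function names
has_cycle and pivots, and the transaction labels are all hypothetical.

```python
def has_cycle(edges):
    """Full serialization graph test: depth-first search over every edge kind."""
    graph = {}
    for src, dst, _kind in edges:
        graph.setdefault(src, []).append(dst)
    on_path, done = set(), set()

    def visit(node):
        on_path.add(node)
        for nxt in graph.get(node, ()):
            if nxt in on_path:
                return True          # back edge: the graph has a cycle
            if nxt not in done and visit(nxt):
                return True
        on_path.discard(node)
        done.add(node)
        return False

    return any(n not in done and visit(n) for n in list(graph))


def pivots(edges):
    """Cheaper check: transactions with both an rw edge in and an rw edge out."""
    rw_in = {dst for _src, dst, kind in edges if kind == "rw"}
    rw_out = {src for src, _dst, kind in edges if kind == "rw"}
    return rw_in & rw_out


# T1 -rw-> T2 -rw-> T3: the structure is present, but there is no cycle,
# so aborting on the structure alone would be a false positive here.
chain = [("T1", "T2", "rw"), ("T2", "T3", "rw")]
print(has_cycle(chain), pivots(chain))

# Adding a wr dependency T3 -> T1 closes the cycle.
closed = chain + [("T3", "T1", "wr")]
print(has_cycle(closed), pivots(closed))
```

The first check needs every dependency edge (wr and ww as well as rw) to
be tracked and kept around; the second needs only rw-conflict
information per transaction, at the price of flagging cases like the
first example.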

> I've been able to identify four causes for false positives in the
> current SSI implementation:
>  
> (1)  Collateral reads.  In particular, data skew caused by inserts
> past the end of an index between an ANALYZE and subsequent data
> access was a recurring source of performance complaints.  This was
> fixed by having the planner probe the ends of an index to correct the
> costs in such situations.  This has been effective at correcting the
> target problem, but we haven't yet excluded such index probes from
> predicate locking.

I wasn't aware of this one (which probably means you mentioned it at
some point and I dropped that packet). Seems like it would not be too
difficult to exclude these -- for 9.2.

> (3)  Dependencies other than read-write.
[...]
> one has to be very careful about assuming anything
> else; trying to explicitly track these conflicts and consider that T2
> may have appeared to run before T1 can fall apart completely in the
> face of some common and reasonable programming practices.

Yes. If you want to do precise cycle testing you'll have to track these
dependencies also, and I believe that would require quite a different
design from what we're doing.

> (4)  Length of read-write dependency (a/k/a rw-conflict) chains.
[...]
> They also, as it
> happens, provide enough data to fully trace the read-write
> dependencies and avoid some false positives where the "dangerous
> structure" SSI is looking for exists, but there is neither a complete
> rw-conflict cycle, nor any transaction in the graph which committed
> early enough to make a write-read conflict possible to any
> transaction in the graph.  Whether such rigorous tracing prevents
> enough false positives to justify the programming effort, code
> complexity, and run-time cost is anybody's guess.

I think I understand what you're getting at here, and it does sound
quite complicated for a benefit that is not clear.

> I only raise these to clarify the issue for Jeff (who is
> reviewing the patch), since he asked.  I strongly feel that none of
> them are issues which need to be addressed for 9.1, nor do I think
> they can be properly addressed within the time frame of this CF.  

Absolutely, no question about it!

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/

