Re: SSI patch version 14 - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: SSI patch version 14
Msg-id 4D458B740200002500039FD8@gw.wicourts.gov
In response to SSI patch version 14  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: SSI patch version 14  (Dan Ports <drkp@csail.mit.edu>)
List pgsql-hackers
> Dan Ports  wrote:
> On Thu, Jan 27, 2011 at 09:18:23AM -0800, Jeff Davis wrote:
>
>> Is there a reason we can't check for a real cycle, which would let
>> T1 succeed?
>
> I talked today with someone who experimented with doing exactly
> that in MySQL as part of a research project (Perfectly Serializable
> Snapshot Isolation, paper forthcoming in ICDE)
I'm wondering how this differs from what is discussed in Section 2.7
("Serialization Graph Testing") of Cahill's doctoral thesis.  That
discusses a technique for trying to avoid false positives by testing
the full graph for cycles, with the apparent conclusion that the
costs of doing so are prohibitive.  The failure of attempts to
implement that technique seems to have been part of the impetus to
develop the SSI technique, in which a particular structure involving
two to three transactions has been proven to exist in every graph
which forms such a cycle.
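To make that contrast concrete, here is a rough Python sketch (all
names invented; this is not the actual implementation): full SGT-style
cycle testing requires a graph search, while SSI only checks whether
some transaction has both an incoming and an outgoing rw-dependency
edge (a "pivot"), a structure proven to exist in every cycle:

```python
from collections import defaultdict

def has_cycle(edges, n):
    """Full SGT-style test: depth-first search for a cycle among
    transactions 0..n-1, given directed dependency edges."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n

    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY or (color[v] == WHITE and dfs(v)):
                return True
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and dfs(u) for u in range(n))

def has_dangerous_structure(rw_edges):
    """SSI's much cheaper test: is any transaction both the target and
    the source of a rw-dependency?  It can fire without an actual
    cycle (a false positive), but it never misses a real one."""
    sources = {a for a, _ in rw_edges}
    targets = {b for _, b in rw_edges}
    return bool(sources & targets)

# T0 -> T1 -> T2 with no edge back: no cycle, yet T1 is a pivot, so
# the dangerous-structure test would (conservatively) flag it.
edges = [(0, 1), (1, 2)]
```

The point of the sketch is only the asymmetry in cost: the dangerous
structure can be detected locally, per transaction, while true cycle
testing has to walk the graph.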
I've been able to identify four causes for false positives in the
current SSI implementation:
(1)  Collateral reads.  In particular, data skew caused by inserts
past the end of an index between an ANALYZE and subsequent data
access was a recurring source of performance complaints.  This was
fixed by having the planner probe the ends of an index to correct the
costs in such situations.  This has been effective at correcting the
target problem, but we haven't yet excluded such index probes from
predicate locking.
(2)  Lock granularity.  This includes vertical granularity issues,
such as granularity promotion to conserve shared memory and page
locking versus next-key locking; and it also includes the possibility
of column locking.  Many improvements are still possible in this
area, though each should be treated as a performance enhancement
patch and subjected to appropriate performance testing to justify the
extra code and complexity.  In some cases the work needed to effect
the reduction in serialization failures may cost more than retrying
the failed transactions.
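To illustrate the kind of granularity promotion involved, here is a
rough Python sketch; the threshold, names, and structure are invented
for illustration and don't reflect the actual shared-memory
representation:

```python
# Assumed promotion threshold, invented for this sketch.
TUPLE_LOCKS_PER_PAGE_LIMIT = 4

def maybe_promote(page_locks):
    """page_locks: dict mapping page number -> set of locked tuple ids,
    or None for a whole-page lock.  Once too many tuples on one page
    carry predicate locks, collapse them into a single page-level lock.
    This conserves shared memory but widens the lock, so unrelated
    writes to the same page can now trigger false-positive conflicts."""
    for page, tuples in page_locks.items():
        if tuples is not None and len(tuples) > TUPLE_LOCKS_PER_PAGE_LIMIT:
            page_locks[page] = None  # None = whole-page lock
    return page_locks
```

The trade-off in item (2) is exactly this: finer granularity means
fewer false positives but more shared memory consumed per lock.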
(3)  Dependencies other than read-write.  SSI relies on current
PostgreSQL MVCC handling of write-write conflicts -- when one of
these occurs between two transactions running at this level, one of
the transactions must fail with a serialization failure; since only
one runs to successful completion, there is no question of which ran
*first*.  Write-read dependencies (where one transaction *can* see
the work of another because they *don't* overlap) are handled in SSI
by assuming that if T1 commits before T2 acquires its snapshot, T1
will appear to have run before T2.  I won't go into it at length on
this thread, but one has to be very careful about assuming anything
else; trying to explicitly track these conflicts and consider that T2
may have appeared to run before T1 can fall apart completely in the
face of some common and reasonable programming practices.
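As a tiny illustration of that ordering assumption (the function and
timestamps are invented, not the actual code): only overlapping
transactions need rw-conflict tracking at all, because a commit that
precedes the other transaction's snapshot fixes the apparent order.

```python
def must_track_conflicts(t1_commit, t2_snapshot):
    """If T1 committed strictly before T2 took its snapshot, the two
    do not overlap: T1 simply appears to have run first, and no
    conflict edge needs to be tracked for the pair.  Otherwise they
    overlap and rw-conflicts between them must be recorded."""
    return t1_commit >= t2_snapshot

# Non-overlapping: T1 commits at "time" 5, T2 snapshots at 8.
non_overlapping = must_track_conflicts(5, 8)   # False
# Overlapping: T2's snapshot at 3 precedes T1's commit at 5.
overlapping = must_track_conflicts(5, 3)       # True
```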
(4)  Length of read-write dependency (a/k/a rw-conflict) chains.
This is a further extension of the Cahill papers in the direction of
the work which apparently failed because of the overhead of full
cycle-checking.  The 2008 paper (which won "best paper" at the 2008
ACM SIGMOD) just used inConflict and outConflict flag bits.
The 2009 paper extended this to pointers by those names, with NULL
meaning "no conflict", and a self-reference meaning "couldn't track
the detail; consider it to conflict with *all* concurrent
transactions".  The latter brought the false positive rate down
without adding too much overhead.  In the PostgreSQL effort, we
replaced those pointers with *lists* of conflicts.  We only get to
"conflict with all concurrent transactions" in certain circumstances
after summarizing transaction data to avoid blowing out shared
memory.
The lists allowed avoidance of many false positives, facilitated
earlier cleanup of much information from shared memory, and led to a
cleaner implementation of the summarization logic.  They also, as it
happens, provide enough data to fully trace the read-write
dependencies and avoid some false positives where the "dangerous
structure" SSI is looking for exists, but there is neither a complete
rw-conflict cycle, nor any transaction in the graph which committed
early enough to make a write-read conflict possible to any
transaction in the graph.  Whether such rigorous tracing prevents
enough false positives to justify the programming effort, code
complexity, and run-time cost is anybody's guess.
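A rough sketch of what the conflict lists make possible (class and
function names invented; these are not the actual data structures):
with full lists one could, in principle, walk the rw-dependency graph
from a pivot to see whether the dangerous structure really closes
into a cycle, which is the rigorous tracing speculated about above.

```python
class Xact:
    def __init__(self, name):
        self.name = name
        self.in_conflicts = []   # transactions with a rw-dependency out to us
        self.out_conflicts = []  # transactions we have a rw-dependency out to

def add_rw_conflict(reader, writer):
    """reader read data that writer overwrote: a reader -> writer edge."""
    reader.out_conflicts.append(writer)
    writer.in_conflicts.append(reader)

def is_pivot(x):
    """The SSI dangerous structure: both an in- and an out-conflict."""
    return bool(x.in_conflicts) and bool(x.out_conflicts)

def closes_cycle(x):
    """Full tracing: can we return to x by following out-conflict
    edges?  Cheap here, but in shared memory this walk has the costs
    the mail describes."""
    seen, stack = set(), list(x.out_conflicts)
    while stack:
        y = stack.pop()
        if y is x:
            return True
        if y.name not in seen:
            seen.add(y.name)
            stack.extend(y.out_conflicts)
    return False
```

With T1 -> T2 -> T3 and no edge back, T2 is a pivot (SSI would flag
it) but the trace finds no cycle, which is precisely the class of
false positive the lists could rule out.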
I only raise these to clarify the issue for Jeff (who is reviewing
the patch), since he asked.  I strongly feel that none of them are
issues which need to be addressed for 9.1, nor do I think they can be
properly addressed within the time frame of this CF.
On the other hand, perhaps an addition to the README-SSI file is
warranted?
-Kevin

