On Sat, Sep 25, 2010 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> On Thu, Sep 23, 2010 at 4:08 PM, Kevin Grittner
>> <Kevin.Grittner@wicourts.gov> wrote:
>>> One place I'm particularly interested in using such a feature is in
>>> pg_dump. Without it we have the choice of using a SERIALIZABLE
>>> transaction, which might fail or cause failures (which doesn't seem
>>> good for a backup program) or using REPEATABLE READ (to get current
>>> snapshot isolation behavior), which might capture a view of the data
>>> which contains serialization anomalies.
>
>> I'm puzzled how pg_dump could possibly have serialization anomalies.
>
> At the moment, it can't. If this patch means that it can, that's going
> to be a mighty good reason not to apply the patch.
It certainly can, as can any other read-only transaction. This has
been discussed many times here before with detailed examples, mostly
by Kevin. T0 reads A and writes B. T1 then reads B and writes C. T0
commits. pg_dump runs. T1 commits. What is the fully serial order
of execution consistent with this chronology? Clearly, T1 must be run
before T0, since it doesn't see T0's update to B. But pg_dump sees
the effects of T0 but not T1, so T0 must be run before T1. Oops. Now
you might say that this won't be a problem for most people in
practice, and I think that's true, but it's still unserializable. And
pg_dump is the reason, because otherwise T1 then T0 would be a valid
serialization.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company