
From Kevin Grittner
Subject Re: SSI and Hot Standby
Msg-id 4D385ABC0200002500039914@gw.wicourts.gov
In response to Re: SSI and Hot Standby  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: SSI and Hot Standby  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> On 20.01.2011 03:05, Kevin Grittner wrote:
>> If we don't do something like this, do we just provide REPEATABLE
>> READ on the standby as the strictest level of transaction
>> isolation?  If so, do we generate an error on a request for
>> SERIALIZABLE, warn and provide degraded behavior, or just quietly
>> give them REPEATABLE READ behavior?
> 
> +1 for generating an error.
Before I go do that, I want to be sure everyone is clear about the
state of things.
If SSI is used to provide data integrity on the master, it will
prevent any serialization anomalies from being persisted on any hot
standby *long term*.  For example, whenever the standby is at a
point in the transaction stream where no read/write transactions
were active, no anomalies can be observed.  (That isn't the *only*
safe time; it's just the simplest one to describe as an example.)
Queries on the standby can, however, see *transient* anomalies:
states which would cause a serialization failure if the same
queries were run on the master at the same point in the transaction
stream.  This can only occur when, of two concurrent transactions,
the one which *appears* to run second (because the other can't read
what it wrote) *commits* first.
The most common and alarming situation where this occurs, in my
opinion, is batch processing.  This is extremely common in financial
applications, and tends to show up in a lot of other places, too. 
(The receipting query set is an instance of this type of problem,
but I'm going to keep it more general in hopes that people can see
where it impacts them.)  Imagine an application which has some small
control record in a table, and inserts to some other table are
assigned to a batch based on the control record.  The batches are
normally identified by ascending dates or serial numbers. 
Periodically a new batch is opened and the old batch is closed by
updating a "current batch id" column in the control table.  If the
batch ID is updated and the transaction in which that update was
executed commits while a transaction which read the old batch ID is
still in flight, a read of the database will show that the batch is
closed, but if you look at the detail of the batch, it will not yet
be complete.
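A minimal sketch of that pattern, with hypothetical table and
column names of my own (just for illustration):

  CREATE TABLE control (current_batch_id int NOT NULL);
  CREATE TABLE receipt (batch_id int NOT NULL, amount numeric NOT NULL);

  -- Session 1: add a receipt to the current batch.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT current_batch_id FROM control;    -- sees, say, 41
  INSERT INTO receipt VALUES (41, 100.00);
  -- ... still in flight ...

  -- Session 2, concurrently: close the old batch.
  BEGIN ISOLATION LEVEL SERIALIZABLE;
  UPDATE control SET current_batch_id = 42;
  COMMIT;  -- commits first, though it *appears* to run second,
           -- since session 1 didn't see its update

Anyone who now reads the control record and lists batch 41 sees it
as closed but missing the still-uncommitted receipt.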
Under SSI, one of these transactions will be canceled to prevent
this.  Our implementation will always allow the update which closes
the batch to complete, and either the insert or the select of the
detail will be rolled back with a serialization failure, depending
on the timing of the actions inside those transactions.  If the
insert fails, it can be retried, and will land in the new batch --
so the listing of the old batch, which omits it, is correct.  If
the listing of the batch details is canceled, it will be because
the insert into the old batch committed before the conflict was
recognized, so an immediate retry of the select will see the
complete batch contents.
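For what it's worth, the cancellation shows up at the client as an
ordinary serialization failure (SQLSTATE 40001), something like:

  ERROR:  could not serialize access due to read/write dependencies
          among transactions
  HINT:  The transaction might succeed if retried.

so the application response is the same as for any serialization
failure: retry the whole transaction from BEGIN.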
A hot standby can't really take part in the predicate locking and
transaction cancellation on the master.
Dan and I have both come to the conclusion that the only reasonable
way to allow hot standby to work with SSI is for the WAL (when
wal_level = hot_standby) to carry information identifying which
snapshots are safe -- that is, which can never see such a transient
state.  In the above example, barring some throttling mechanism
skipping these particular snapshots, or other problematic conflicts
around the same time, the master would tell the standby that the
snapshot before either of the two problem transactions was OK, and
then that the snapshot after both had committed was OK.  It would
not suggest using the snapshot available between the commit of the
control record update and the commit of the insert into the batch.
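In terms of the batch example, the timeline the master would
communicate looks something like this (illustrative only):

  snapshot taken before either transaction commits  -> safe to use
  commit of the control record update (old batch closed)
  snapshot taken between the two commits            -> not suggested
  commit of the insert into the old batch
  snapshot taken after both commits                 -> safe to use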
This seems to me to be not completely unrelated to the snapshot
synchronization patch.  It is clearly closely related to the READ
ONLY DEFERRABLE mode, which also looks for a snapshot which is
immune to serialization anomalies without predicate locking,
conflict detection, transaction cancellation, etc.  Melding these
two things with hot standby seems to be beyond what can reasonably
happen for 9.1 without delaying the release.
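For anyone not following that part of the work, the deferrable mode
is requested as:

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;

which blocks, if necessary, until a snapshot which can't see any
anomaly is available, and then runs without predicate locking and
without any risk of serialization failure.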
If someone is using one feature and not the other, they really don't
have a problem.  Like anyone else, if a hot standby user has been
using SERIALIZABLE mode under 9.0 or earlier, they will need to
switch to REPEATABLE READ.  A SERIALIZABLE user who doesn't set up
hot standby has no issue.  Opinions so far seem to be in favor of
reporting an error on the standby if SERIALIZABLE is requested, so
that people don't silently get less protection than they expect. 
The most annoying thing about that is that if a user would *like*
to run truly serializable transactions on the standby, and will do
so when they get that in 9.2, they must switch to REPEATABLE READ
now, and switch back to SERIALIZABLE with the next release.
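Concretely, that switch is just:

  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;

or default_transaction_isolation = 'repeatable read' in
postgresql.conf.  On 9.0 and earlier the two level names request
identical snapshot isolation behavior, so nothing is lost by making
the change there.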
So, based on a more complete description of the issues, any more
opinions on whether to generate the error, as suggested by Heikki? 
Does anyone think this justifies the compatibility GUC as suggested
by Jeff?  It seems to me that this deserves documentation in the
MVCC chapter under both the "Serializable Isolation Level" and
"Enforcing Consistency With Serializable Transactions" sections.  I
think it probably deserves a note in the SET TRANSACTION reference
page, too.  Agreed?  Anywhere else?
-Kevin

