Re: SSI and Hot Standby - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: SSI and Hot Standby
Msg-id 4D394AD3020000250003998D@gw.wicourts.gov
In response to Re: SSI and Hot Standby  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jan 21, 2011 at 8:05 AM, Nicolas Barbier wrote:
>> This has been discussed before; in [1] I summarized:
>>
>> "IOW, one could say that the backup is consistent only if it were
>> never compared against the system as it continued running after
>> the dump took place."
> 
> But that's a pretty fair way to look at it, isn't it?  I mean, I
> guess it's a question of what you plan to use that backup for, but
> if it's disaster recovery, everything that happened after the dump
> is gone, so no such comparison will occur.  And that's probably
> the most common reason for taking a dump.

It's not, however, a reason for having a hot standby (versus a warm
standby or PITR backup).

> It occurs to me that focusing on how this is going to work on Hot
> Standby might be looking at the question too narrowly.  The
> general issue is - does this technique generalize to a distributed
> computing environment, with distributed transactions across
> multiple PostgreSQL databases?

No, and I can pretty much guarantee that you can't have such a
solution without blocking on all masters at commit time.  What
you're suggesting goes *way* beyond two phase commit, which just
guarantees the integrity rules of each database are honored and
that all transactions either commit or don't.  You're talking about
sharing lock information, which in SSI is communicated via LW
locking, across high-latency communication links.  Expect any such
generalized "and world peace!" solution to be rather slow.

> For example, what if the control record in Kevin's example is
> stored in another database, or on another server.  Or what if some
> tables are replicated via Slony?  I realize this is all outside
> the scope of the patch

Yep.  Again, the patch achieves true serializability with minimal
cost and *no blocking*.  Spend a few minutes thinking about how you
might coordinate what you propose, and you'll see it's going to
involve blocking based on waiting for messages from across the wire.

> but that's exactly the point: making this stuff work across
> multiple databases (even if they are replicas of each other) is
> much more complex than getting it to work on just one machine. 
> Even if we could agree on how to do it, coming up with some hack
> that can only ever possibly work in the Hot Standby case might not
> be the best thing to do.

I don't see it as a hack.  It's the logical extension of SSI onto
read-only replicas.  If you're looking for something more than that
(as the above suggests), it's not a good fit; but I suspect that
there are people besides me who would want to use hot standby for
reporting and read-only web access who would want a serializable
view.  What this proposal does is to say that there are two time
streams to look at on the standby -- how far along you are for
purposes of recovery, and how far along you are for purposes of
seeing a view of the data sure to be consistent with the later state
of the master.  With SSI they can't be the same.  If someone wants
them to be, they could implement a traditional S2PL serializable
mode, complete with blocking and deadlocks, and then you'd have it
automatically on the replicas, because with S2PL the apparent order
of execution matches the commit order.
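
To make the "apparent order versus commit order" point concrete, here
is a toy sketch in Python.  It is not how the patch works internally;
the transaction names and dependency edges are illustrative
assumptions, loosely patterned on a writer that reads a control
record, a second transaction that updates that record and commits
first, and a read-only report that runs in between.

```python
from graphlib import TopologicalSorter, CycleError


def apparent_order(txns, must_precede):
    """Return one serial order consistent with the dependencies.

    must_precede is a list of (a, b) pairs meaning "a has to appear
    before b in any equivalent serial execution".  Returns None when
    the dependencies form a cycle, i.e. no serial order exists.
    """
    # TopologicalSorter expects a mapping of node -> its predecessors.
    graph = {t: set() for t in txns}
    for before, after in must_precede:
        graph[after].add(before)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


# Hypothetical scenario: T1 read the old control record that T2 then
# updated, so T1 must appear before T2, even though T2 committed first.
commit_order = ["T2", "T1"]
serial_order = apparent_order(["T1", "T2"], [("T1", "T2")])
order_differs = serial_order != commit_order

# A read-only T3 that sees T2's update but not T1's insert adds the
# edges T2 -> T3 and T3 -> T1, which closes a cycle.
with_report = apparent_order(
    ["T1", "T2", "T3"],
    [("T1", "T2"), ("T2", "T3"), ("T3", "T1")],
)
```

In this model the two-transaction case is serializable only in the
order T1, T2, which is the reverse of the commit order, and adding the
read-only T3 leaves no serial order at all.  Under S2PL the blocking
would prevent either situation from arising in the first place.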
-Kevin

