Re: Proposal for 9.1: WAL streaming from WAL buffers - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Proposal for 9.1: WAL streaming from WAL buffers
Msg-id 4C181654.4070703@agliodbs.com
In response to Re: Proposal for 9.1: WAL streaming from WAL buffers  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal for 9.1: WAL streaming from WAL buffers  (Josh Berkus <josh@agliodbs.com>)
Re: Proposal for 9.1: WAL streaming from WAL buffers  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> I have yet to convince myself of how likely this is to occur.  I tried
> to reproduce this issue by crashing the database, but I think in 9.0
> you need an actual operating system crash to cause this problem, and I
> haven't yet set up an environment in which I can repeatedly crash the
> OS.  I believe, though, that in 9.1, we're going to want to stream
> from WAL buffers as proposed in the patch that started out this
> thread, and then I think this issue can be triggered with just a
> database crash.

Yes, but it still requires:

a) the master must crash with at least one transaction transmitted to
the slave and not yet fsync'd
b) the slave must not crash as well
c) the master must come back up without the slave ever having been
promoted to master

Note that (a) is fairly improbable to begin with due to both our
batching transactions into bundles for transmission, and network latency
vs. disk latency.
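
To make the three conditions concrete, here is a minimal sketch of them as a single predicate. The function name, its parameters, and the use of plain integers for WAL positions are assumptions made for illustration; nothing here is PostgreSQL code.

```python
def divergence_possible(master_flushed_lsn: int,
                        slave_received_lsn: int,
                        slave_crashed: bool,
                        slave_promoted: bool) -> bool:
    """Return True only if conditions (a), (b) and (c) all hold.

    Hypothetical model: WAL positions are plain integers, and the
    master is assumed to have crashed and come back up.
    """
    # (a) the slave holds WAL the master never fsync'd before crashing
    slave_is_ahead = slave_received_lsn > master_flushed_lsn
    # (b) the slave stayed up, and (c) it was never promoted to master
    return slave_is_ahead and not slave_crashed and not slave_promoted


# Master flushed up to 100, slave received up to 150, slave stayed up unpromoted.
print(divergence_possible(100, 150, slave_crashed=False, slave_promoted=False))
```

If any one of the three conditions fails, the predicate is false, which is why the combined case should be rare.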

So, is it possible?  Yes.  Will it happen anywhere but the
highest-txn-rate sites one in 10,000 times?  No.

This means that we should look for a solution which does not penalize
the common case in order to close a very improbable hole, if such a
solution exists.

> In 9.0, I think we can fix this problem by (1) only streaming WAL that
> has been fsync'd and 

I don't think this is the best solution; it would be a noticeable
performance penalty on replication.  It also would potentially result in
data loss for the user; if the user fails over to the slave in the
corner case, they can "rescue" the in-flight transaction.  At the least,
this would need to become Yet Another Configuration Option.

> (2) PANIC-ing if the problem occurs anyway.

The question is, is detecting out-of-order WAL records *sufficient* to
detect a failure?  I'm thinking there are possible sequences where there
would be no out-of-sequence, but the slave would still have a
transaction the master doesn't, which the user wouldn't know until a
page update corrupts their data.
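
A toy model of the kind of sequence I have in mind, assuming the slave's only check is that each record starts where the previous one ended. The `Record` type, the byte offsets, and the check itself are all invented for illustration and do not reflect the real WAL format or what the slave actually validates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    start: int     # toy byte offset standing in for a WAL position
    length: int
    payload: str

    @property
    def end(self) -> int:
        return self.start + self.length


def append(log, length, payload):
    """Return a new log with one record appended at the current end."""
    start = log[-1].end if log else 0
    return log + [Record(start, length, payload)]


def in_sequence(log):
    """Position-only check: every record begins where the previous one ended."""
    return all(a.end == b.start for a, b in zip(log, log[1:]))


# Master writes two records; only the first is fsync'd before the crash.
master = append([], 100, "txn A")
flushed = list(master)
master = append(master, 50, "txn B")    # streamed to the slave, never fsync'd
slave = list(master)                    # slave received both records

# Master restarts from its flushed WAL and writes a different record
# that happens to have the same length as the lost one.
master = append(flushed, 50, "txn C")
master = append(master, 80, "txn D")

# Slave resumes streaming from its own end position.
resume_from = slave[-1].end
slave = slave + [r for r in master if r.start >= resume_from]

print(in_sequence(slave))               # the position check raises no alarm
print([r.payload for r in slave])       # slave history
print([r.payload for r in master])      # master history
```

In this model the slave ends up with "txn B" where the master has "txn C", yet nothing looks out of sequence because the record boundaries happen to line up.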

> But
> in 9.1, with sync rep and the performance demands that entails, I
> think that we're going to need to rethink it.

All the more reason to avoid dealing with it now, if we can.

-- 
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com
 

