Re: Re: [COMMITTERS] pgsql: Use a latch to make startup process wake up and replay - Mailing list pgsql-hackers

From: Simon Riggs
Subject: Re: Re: [COMMITTERS] pgsql: Use a latch to make startup process wake up and replay
Msg-id: 1284707190.1733.2961.camel@ebony
In response to: Re: Re: [COMMITTERS] pgsql: Use a latch to make startup process wake up and replay  (Fujii Masao <masao.fujii@gmail.com>)
Responses: Configuring synchronous replication
List: pgsql-hackers

On Fri, 2010-09-17 at 14:33 +0900, Fujii Masao wrote:
> On Thu, Sep 16, 2010 at 4:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > We definitely have the time, so the question is, what are the best
> > ideas?
> 
> Before advancing the review of each patch, we must determine what
> should be committed in 9.1, and what's in this CF.

Thank you for starting the discussion.

> "Synchronization level on per-transaction" feature is included in Simon's
> patch, but not in mine. This is the most important difference

Agreed. It's also a very important option for users.

> which would
> have wide-reaching impact on the implementation, e.g., protocol between
> walsender and walreceiver. So, at first we should determine whether we'll
> commit the feature in 9.1. Then we need to determine how far we should
> implement in this CF. Thought?

Yes, sync rep specified per-transaction changes many things at a low
level. Basically, we have a choice of two mostly incompatible
implementations, plus some other options common to both.

There is no danger that we won't commit in 9.1. We have time for
discussion and thought. We also have time for performance testing, and
since many of my design proposals are performance related, that seems
essential to properly reviewing the patches.

I don't think we can determine how far to implement without considering
both approaches in detail. With regard to your points below, I don't
think any of those points could be committed first.

> Each patch provides "synchronization level on per-standby" feature. In
> Simon's patch, that level is specified in the standby's recovery.conf.
> In mine, it's in the master's standbys.conf. I think that the former is simpler.
> But if we support the capability to register the standbys, the latter would
> be required. Which is the best?

Either approach is OK for me. Providing both options is also possible.
My approach was just less code and less change to existing mechanisms,
so I did it that way.

There are some small optimisations possible on standby if the standby
knows what role it's being asked to play. It doesn't matter to me
whether we let standby tell master or master tell standby and the code
is about the same either way.

> Simon's patch seems to include simple quorum commit feature (correct
> me if I'm wrong). That is, when there are multiple synchronous standbys,
> the master waits until ACK has arrived from at least one standby. OTOH,
> in my patch, the master waits until ACK has arrived from all the synchronous
> standbys. Which should we choose? I think that we should commit my
> straightforward approach first, and enable the quorum commit on that.
> Thought?

Yes, my approach is simple. For those with Oracle knowledge, my approach
(first-reply-releases-waiter) is equivalent to Oracle's Maximum
Protection mode (= 'fsync' in my design). Providing even higher levels
of protection would not be the most common case.

Your approach of waiting for all replies is much slower and requires
more complex code, since we need to track intermediate states. It also
has additional complexities of behaviour, such as how long do we wait
for second acknowledgement when we already have one, and what happens
when a second ack is not received? More failure modes == less stable.
ISTM that it would require more effort to do this also, since every ack
needs to check all WAL sender data to see if it is the last ack. None of
that seems straightforward.
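
As an illustration only (this is not code from either patch), the two release policies discussed above might be sketched in Python as follows. The names `Waiter`, `on_ack_first_reply` and `on_ack_wait_for_all` are hypothetical, and LSNs are modelled as plain integers. The sketch assumes that an ack at a given LSN covers every commit at or below that LSN.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    """A backend waiting for its commit record (at commit_lsn) to be acknowledged."""
    commit_lsn: int
    released: bool = False


def _release_up_to(waiters, lsn):
    """Release every still-waiting backend whose commit LSN is <= lsn."""
    released = [w for w in waiters if not w.released and w.commit_lsn <= lsn]
    for w in released:
        w.released = True
    return released


def on_ack_first_reply(waiters, ack_lsn):
    # First-reply-releases-waiter: a single ack is enough, so no
    # per-standby state has to be kept or consulted.
    return _release_up_to(waiters, ack_lsn)


def on_ack_wait_for_all(waiters, acked_lsn, standby, ack_lsn):
    # acked_lsn must already hold an entry for every synchronous standby.
    acked_lsn[standby] = max(acked_lsn[standby], ack_lsn)
    # Every ack has to look at all standbys to find the slowest one;
    # only commits covered by that minimum can be released.
    return _release_up_to(waiters, min(acked_lsn.values()))
```

In the wait-for-all variant the waiter stays blocked for as long as any one standby lags, which is where the questions about timeouts and missing acks come from.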

I don't agree we should commit your approach to that aspect.

In my proposal, such additional features would be possible as a plugin.
The majority of users would not need this facility, and the plugin leaves the
way open for high-end users that need this.

> Simon proposes to invoke walwriter in the standby. This is not included
> in my patch, but looks like a good idea. ISTM that this is not an essential
> feature for synchronous replication, so how about detaching the walwriter
> part from the patch and reviewing it independently?

I regard it as an essential feature for implementing 'recv' mode of sync
rep, which is the fastest mode. At present WALreceiver does all of
these: receive, write and fsync. Of those the fsync is the slowest and
increases response time significantly.
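
To make the point concrete, here is a minimal Python sketch of which steps sit on the acknowledgement path in each mode. Only the step names (receive, write, fsync) and the mode names 'recv' and 'fsync' come from this thread. The mapping `ACK_AFTER` and the function `split_steps` are hypothetical and not taken from either patch.

```python
# Steps performed on incoming WAL on the standby, in order.
STEPS = ("receive", "write", "fsync")

# Last step completed before the standby acknowledges. Only 'recv' and
# 'fsync' are mode names from this thread; the mapping is an assumption.
ACK_AFTER = {"recv": "receive", "fsync": "fsync"}


def split_steps(mode):
    """Return (steps done before the ack, steps deferred until after it)."""
    cut = STEPS.index(ACK_AFTER[mode]) + 1
    return STEPS[:cut], STEPS[cut:]
```

Under these assumptions, `split_steps("recv")` leaves write and fsync off the acknowledgement path, so some other process on the standby has to perform them, which is the role proposed for walwriter.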

Of course, the 'recv' option doesn't need to be part of the first commit, but
splitting commits doesn't seem likely to make this go quicker or easier
in the early stages. In particular, splitting some features out could
make it much harder to put them back in again later. That point is why my
patch even exists.


I would like to express my regret that the main feature proposal from me
necessitates low level changes that cause our two patches to be in
conflict. Nobody should take this as a sign that there is a personal or
professional problem between Fujii-san and myself.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


