Re: [ADMIN] pg_basebackup blocking all queries with horrible performance - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: [ADMIN] pg_basebackup blocking all queries with horrible performance
Msg-id CAHGQGwHAb1KdipmdTJsXKPp3H4oqiJFdn02jrgnnrfKMLY08Xw@mail.gmail.com
In response to Re: [ADMIN] pg_basebackup blocking all queries with horrible performance  (Magnus Hagander <magnus@hagander.net>)
Responses Re: [ADMIN] pg_basebackup blocking all queries with horrible performance
List pgsql-hackers
On Tue, Jun 12, 2012 at 12:47 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, Jun 11, 2012 at 5:37 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Mon, Jun 11, 2012 at 3:24 AM, Magnus Hagander <magnus@hagander.net> wrote:
>>> On Sun, Jun 10, 2012 at 6:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>> On Sun, Jun 10, 2012 at 11:45 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>>>> On Sun, Jun 10, 2012 at 4:29 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>>> On Sun, Jun 10, 2012 at 11:10 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>>>>>> On Sun, Jun 10, 2012 at 4:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>>>>> On Sun, Jun 10, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>>>>>> On Sun, Jun 10, 2012 at 9:25 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>>>>>>>>> On Sun, Jun 10, 2012 at 7:43 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>>>>>>>>>> On Sat, Jun 9, 2012 at 2:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>>>>>>>>> Fujii Masao <masao.fujii@gmail.com> writes:
>>>>>>>>>>>>> This seems a bug. I think we should prevent pg_basebackup from
>>>>>>>>>>>>> becoming synchronous standby. Thought?
>>>>>>>>>>>>
>>>>>>>>>>>> Absolutely.  If we have replication clients that are not actually
>>>>>>>>>>>> capable of being standbys, there *must* be a way for the master
>>>>>>>>>>>> to know that.
>>>>>>>>>>>
>>>>>>>>>>> I thought we fixed this already by sending InvalidXLogRecPtr as flush
>>>>>>>>>>> location? And that this only applied in 9.2?
>>>>>>>>>>>
>>>>>>>>>>> Are you saying we picked pg_basebackup *in backup mode* (not log
>>>>>>>>>>> streaming) as synchronous standby?
>>>>>>>>>>
>>>>>>>>>> Yes.
>>>>>>>>>>
>>>>>>>>>>> If so then yes, that is
>>>>>>>>>>> *definitely* a bug that should be fixed. We should never select a
>>>>>>>>>>> connection that's not even streaming log as standby!
>>>>>>>>>>
>>>>>>>>>> Agreed. Attached patch prevents pg_basebackup from becoming sync
>>>>>>>>>> standby. Also this patch fixes another problem: currently only walsender
>>>>>>>>>> which reaches STREAMING state can become sync walsender. OTOH,
>>>>>>>>>> sync walsender thinks that walsender with higher priority will be sync one
>>>>>>>>>> whether its state is STREAMING, and switches to potential sync walsender.
>>>>>>>>>> So when the standby with higher priority connects to the master, we
>>>>>>>>>> might have no sync standby until it reaches the STREAMING state.
>>>>>>>>>> To fix this problem, the patch switches walsender's state from sync to
>>>>>>>>>> potential *after* walsender with higher priority has reached the
>>>>>>>>>> STREAMING state.
>>>>>>>>>>
>>>>>>>>>> We also should not select (1) background stream process forked from
>>>>>>>>>> pg_basebackup and (2) pg_receivexlog as sync standby because they
>>>>>>>>>> don't send back replication progress. To address this, I'm thinking to
>>>>>>>>>> introduce new option "NOSYNC" in "START_REPLICATION" command
>>>>>>>>>> as follows, and to change (1) and (2) so that they specify NOSYNC.
>>>>>>>>>>
>>>>>>>>>>    START_REPLICATION XXX/XXX [NOSYNC]
>>>>>>>>>>
>>>>>>>>>> If the standby specifies NOSYNC option, it's never assigned as sync
>>>>>>>>>> standby even if its name is in synchronous_standby_names. Thought?
>>>>>>>>>
>>>>>>>>> The standby which always sends InvalidXLogRecPtr back should not
>>>>>>>>> become sync one. So instead of NOSYNC option, by checking whether
>>>>>>>>> InvalidXLogRecPtr is sent, we can avoid problematic sync standby.
>>>>>>>>
>>>>>>>> We should not do this because Magnus is proposing the patch
>>>>>>>> (http://archives.postgresql.org/pgsql-hackers/2012-06/msg00348.php)
>>>>>>>> which breaks the above assumption at all. So we should introduce
>>>>>>>> something like NOSYNC option.
>>>>>>>
>>>>>>> Wouldn't the better choice there in that case be to give a switch to
>>>>>>> pg_receivexlog if you *want* it to be able to become a sync replica,
>>>>>>> and by default disallow it? And then keep the backend just treating
>>>>>>> InvalidXLogRecPtr as don't-become-sync-replica.
>>>>>>
>>>>>> I don't object to making pg_receivexlog as sync standby at all. So at least
>>>>>> for me, that switch is not necessary. What I'm worried about is the
>>>>>> background stream process forked from pg_basebackup. I think that
>>>>>> it should not run as sync standby but sending back its replication progress
>>>>>> seems helpful because a user can see the progress from pg_stat_replication.
>>>>>> So I'm thinking that something like NOSYNC option is required.
>>>>>
>>>>> On principle, no. By default, yes.
>>>>>
>>>>> How about:
>>>>> pg_basebackup background: *never* sends flush location, and therefore
>>>>> won't become sync replica
>>>>> pg_receivexlog *optionally* sends flush location. By default won't
>>>>> become sync replica, but can be made so with a switch
>>>>
>>>> Wouldn't a user who sees NULL in flush_location from pg_stat_replication
>>>> misunderstand that pg_receivexlog (in default mode) and pg_basebackup
>>>> background don't flush WAL files at all?
>>>
>>> That sounds like a "documentable issue".
>>>
>>> But maybe you're right, and we need the "never become sync" as a flag.
>>
>> You agreed to add something like NOSYNC option into START_REPLICATION command?
>
> I'm on the fence. I was hoping somebody else would chime in with an
> opinion as well.

+1

Regards,

--
Fujii Masao
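
[Editor's illustration, not part of the original message.] The selection
rule discussed in the thread can be sketched roughly as follows. This is a
simplified model, not the PostgreSQL implementation: the WalSender class,
the pick_sync_standby function and the nosync field are hypothetical names
invented for this sketch, and NOSYNC itself is only a proposal at this
point in the thread. It assumes that a lower positive priority is
preferred and that priority 0 means the client is not listed in
synchronous_standby_names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class WalSndState(Enum):
    STARTUP = "startup"
    BACKUP = "backup"        # e.g. pg_basebackup taking a base backup
    CATCHUP = "catchup"
    STREAMING = "streaming"


@dataclass
class WalSender:
    # Hypothetical, simplified stand-in for a walsender's shared state.
    name: str
    state: WalSndState
    sync_priority: int       # 0 = not a candidate; lower positive value wins
    nosync: bool = False     # models the proposed (not committed) NOSYNC option


def pick_sync_standby(senders: Sequence[WalSender]) -> Optional[WalSender]:
    """Return the walsender that should act as the synchronous standby.

    Only walsenders that have reached STREAMING and did not ask for NOSYNC
    are eligible, so a higher-priority client that is still in BACKUP or
    CATCHUP does not displace the current sync standby.
    """
    candidates = [
        s for s in senders
        if s.sync_priority > 0
        and s.state is WalSndState.STREAMING
        and not s.nosync
    ]
    return min(candidates, key=lambda s: s.sync_priority, default=None)


senders = [
    WalSender("pg_basebackup", WalSndState.BACKUP, sync_priority=1),
    WalSender("pg_receivexlog", WalSndState.STREAMING, sync_priority=1, nosync=True),
    WalSender("standby_a", WalSndState.CATCHUP, sync_priority=2),
    WalSender("standby_b", WalSndState.STREAMING, sync_priority=3),
]
chosen = pick_sync_standby(senders)
print(chosen.name if chosen else None)
```

In this model the base backup connection and the NOSYNC client are never
chosen, and standby_a only takes over from standby_b once it reaches
STREAMING, which matches the "switch to potential *after* the
higher-priority walsender is streaming" behaviour described in the patch.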
