Re: [ADMIN] pg_basebackup blocking all queries with horrible performance - Mailing list pgsql-hackers
From | Magnus Hagander |
---|---|
Subject | Re: [ADMIN] pg_basebackup blocking all queries with horrible performance |
Date | |
Msg-id | CABUevEwh70t0t19PNpfGxPmXdcT82G61_pyZ5LtWvdf0Ptci3A@mail.gmail.com Whole thread Raw |
In response to | Re: [ADMIN] pg_basebackup blocking all queries with horrible performance (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 2, 2012 at 8:17 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Jul 2, 2012 at 4:01 AM, Magnus Hagander <magnus@hagander.net> wrote: >> On Sun, Jul 1, 2012 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>> >>> On Fri, Jun 29, 2012 at 7:22 PM, Magnus Hagander <magnus@hagander.net> wrote: >>>> On Wed, Jun 27, 2012 at 7:24 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>>> On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>>>>> On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>>>>>>>> You agreed to add something like NOSYNC option into START_REPLICATION command? >>>>>>>>> >>>>>>>>> I'm on the fence. I was hoping somebody else would chime in with an >>>>>>>>> opinion as well. >>>>>>>> >>>>>>>> +1 >>>>>>> >>>>>>> Nobody else with any opinion on this? :( >>>>>> >>>>>> I don't think we really need a NOSYNC flag at this point. Just not >>>>>> setting the flush location in clients that make a point of flushing in >>>>>> a timely fashion seems fine. >>>>> >>>>> Okay, I'm in the minority, so I'm writing the patch that way. WIP >>>>> patch attached. >>>>> >>>>> In the patch, pg_basebackup background process and pg_receivexlog always >>>>> return invalid location as flush one, and will never become sync standby even >>>>> if their name is in synchronous_standby_names. The timing of their sending >>>> >>>> That doesn't match with the patch, afaics. The patch always sets the >>>> correct write location, which means it can become a remote_write >>>> synchronous standby, no? It will only send it back when timeout >>>> expires, but it will be sent back. >>> >>> No. Though correct write location is sent back, they don't become sync standby >>> because flush location is always invalid. While flush location is >>> invalid, the master >>> will never regard the remote server as sync one even if synchronous_commit is >>> set to remote_write. >> >> Oh. I wasn't aware of that part. >> >> >>>> I wonder if that might actually be a more reasonable mode of operation >>>> in general: >>>> >>>> * always send back the write position, at the write interval >>>> * always send back the flush position, when we're flushing (meaning >>>> when we switch xlog) >>>> >>>> have an option that makes it possible to: >>>> * always send back the write position as soon as it changes (making >>>> for a reasonable remote_write sync standby) >>>> * actually flush the log after each write instead of end of file >>>> (making for a reasonable full sync standby) >>>> >>>> meaning you'd have something like "pg_receivexlog --sync=write" and >>>> "pg_receivexlog --sync=flush" controlling it instead. >>> >>> Yeah, in this way, pg_receivexlog can become sync even if >>> synchronous_commit is on, which seems more useful. But >>> I'm thinking that the synchronous pg_receivexlog stuff should >>> be postponed to 9.3 because its patch seems to become too >>> big to apply at this beta stage. So, in 9.2, to fix the problem, >>> what about just applying the simple patch which prevents >>> pg_basebackup background process and pg_receivexlog from >>> becoming sync standby whatever synchronous_standby_names >>> and synchronous_commit are set to? >> >> Agreed. >> >> With the addition that we should set the write location, because >> that's very useful and per what you said above should be perfectly >> safe. >> >> >>>> And deal with the "user put * in synchronous_standby_names and >>>> accidentally got pg_receivexlog as the sync standby" by more clearly >>>> warning people not to use * for that parameter... Since it's simply >>>> dangerous :) >>> >>> Yep. >> >> What would be good wording? Something along the line of "Using the * >> entry is not recommended since it can lead to unexpected results when >> new standbys are added" or something like that? >> >> >>>>> the reply depends on the standby_message_timeout specified in -s option. So >>>>> the write position may lag behind the true position. >>>>> >>>>> pg_receivexlog accepts new option -S (better option character?). If this option >>>>> is specified, pg_receivexlog returns true flush position, and can become sync >>>>> standby. It sends back the reply to the master each time the write position >>>>> changes or the timeout passes. If synchronous_commit is set to remote_write, >>>>> synchronous replication to pg_receivexlog would work well. >>>> >>>> Yeah, I hadn't considered the remote_write mode, but I guess that's >>>> why you have to track the current write position across loads, which >>>> first confused me. >>> >>> The patch has to track the current write location to decide whether to send >>> back the reply to the master, IOW to know whether the write location >>> has changed, IOW to know whether we've already sent the reply about >>> the latest write location. >> >> Yeha, makes perfect sense. >> >> >>>> Looking at some other usecases for this, I wonder if we should also >>>> force a status message whenever we switch xlog files, even if we >>>> aren't running in sync mode, even if the timeout hasn't expired. I >>>> think that would be a reasonable thing to do, since you often want to >>>> track things based on files. >>> >>> You mean that the pg_receivexlog should send back the correct flush >>> location whenever it switches xlog files? >> >> No, I mean just send back a status message. Meaning that without >> specifiying the sync modes per above, it would send back the *write* >> location. This would be useful for tracking xlog filenames between >> master and pg_receivexlog, without extra delay. >> >>>>> The patch needs more documentation. But I think that it's worth reviewing the >>>>> code in advance, so I attached the WIP patch. Comments? Objections? >>>> >>>> Looking at the code, what exactly prompts the changes to the backend >>>> side? That seems unrelated? Are we actually considering picking a >>>> standby with InvalidXlogRecPtr as a sync standby today? >>>> >>>> Isn't it enough to just send the proper write and flush locations from >>>> the frontend? >>> >>> No, unless I'm missing something. >>> >>> The problem that we should address first is that the master can pick up >>> pg_basebackup background process and pg_receivexlog as a sync >>> standby even if they always return an invalid write and flush positions. >>> Since they don't send any correct write and flush positions, if they are >>> accidentally regarded as sync standby, all transactions can get blocked >>> infinitely. So the patch needed to change the walsender code so that it >>> doesn't pick up the remote server as sync one while its flush position >>> is invalid. >> >> Yeah, that is clearly wrong. I think I missed this behaviour, and got >> confused by the fact that the patch was trying to fix two different >> things - only one of which I was aware of. >> >> So yes, per above, let's isolate out this part as one patch and get >> that into 9.2, along with the "set the proper write location", but >> leave everything else for 9.3. > > Agreed. The attached patch always sets the correct write location and > prevents the remote server sending back invalid flush location from > becoming sync standby. Thanks, applied. I also put the prevent invalid flush location from becoming a sync standby part on 9.1 (required only a minor change - GetXLogReplayRecPtr() didn't take an argument back then). -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
pgsql-hackers by date: