Re: [HACKERS] Quorum commit for multiple synchronous replication. - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: [HACKERS] Quorum commit for multiple synchronous replication. |
Date | |
Msg-id | CAHGQGwE95S5GM9UZh0F3ef2D3iEwJ59skh=EwW5HmDJPe2aXog@mail.gmail.com |
In response to | Re: [HACKERS] Quorum commit for multiple synchronous replication. (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: [HACKERS] Quorum commit for multiple synchronous replication. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
List | pgsql-hackers |
On Tue, Apr 18, 2017 at 7:02 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Tue, Apr 18, 2017 at 6:40 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> At Tue, 18 Apr 2017 14:58:50 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBqSjUGx0LCDrjEDLB-yx2EvgLMdT8Nz4ZR_xpxrbMU+Q@mail.gmail.com>
>>> On Tue, Apr 18, 2017 at 3:04 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> > On Wed, Apr 12, 2017 at 2:36 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>> >> On Thu, Apr 6, 2017 at 4:17 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>> >>> On Thu, Apr 6, 2017 at 10:51 AM, Noah Misch <noah@leadboat.com> wrote:
>>> >>>> On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>> >>>>> On Wed, Apr 5, 2017 at 3:45 PM, Noah Misch <noah@leadboat.com> wrote:
>>> >>>>> > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>>> >>>>> >> Regarding this feature, there are some loose ends. We should work on
>>> >>>>> >> and complete them until the release.
>>> >>>>> >>
>>> >>>>> >> (1)
>>> >>>>> >> Which synchronous replication method, priority or quorum, should be
>>> >>>>> >> chosen when neither FIRST nor ANY is specified in s_s_names? Right now,
>>> >>>>> >> a priority-based sync replication is chosen for keeping backward
>>> >>>>> >> compatibility. However some hackers argued to change this decision
>>> >>>>> >> so that a quorum commit is chosen because they think that most users
>>> >>>>> >> prefer to a quorum.
>>> >>>>> >>
>>> >>>>> >> (2)
>>> >>>>> >> There will be still many source comments and documentations that
>>> >>>>> >> we need to update, for example, in high-availability.sgml. We need to
>>> >>>>> >> check and update them throughly.
>>> >>>>> >>
>>> >>>>> >> (3)
>>> >>>>> >> The priority value is assigned to each standby listed in s_s_names
>>> >>>>> >> even in quorum commit though those priority values are not used at all.
>>> >>>>> >> Users can see those priority values in pg_stat_replication.
>>> >>>>> >> Isn't this confusing? If yes, it might be better to always assign 1 as
>>> >>>>> >> the priority, for example.
>>> >>>>> >
>>> >>>>> > [Action required within three days. This is a generic notification.]
>>> >>>>> >
>>> >>>>> > The above-described topic is currently a PostgreSQL 10 open item. Fujii,
>>> >>>>> > since you committed the patch believed to have created it, you own this open
>>> >>>>> > item. If some other commit is more relevant or if this does not belong as a
>>> >>>>> > v10 open item, please let us know. Otherwise, please observe the policy on
>>> >>>>> > open item ownership[1] and send a status update within three calendar days of
>>> >>>>> > this message. Include a date for your subsequent status update. Testers may
>>> >>>>> > discover new open items at any time, and I want to plan to get them all fixed
>>> >>>>> > well in advance of shipping v10. Consequently, I will appreciate your efforts
>>> >>>>> > toward speedy resolution. Thanks.
>>> >>>>> >
>>> >>>>> > [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
>>> >>>>>
>>> >>>>> Thanks for the notice!
>>> >>>>>
>>> >>>>> Regarding the item (2), Sawada-san told me that he will work on it after
>>> >>>>> this CommitFest finishes. So we would receive the patch for the item from
>>> >>>>> him next week. If there will be no patch even after the end of next week
>>> >>>>> (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
>>> >>>>
>>> >>>> Sounds reasonable; I will look for your update on 14Apr or earlier.
>>> >>>>
>>> >>>>> The items (1) and (3) are not bugs. So I don't think that they need to be
>>> >>>>> resolved before the beta release. After the feature freeze, many users
>>> >>>>> will try and play with many new features including quorum-based syncrep.
>>> >>>>> Then if many of them complain about (1) and (3), we can change the code
>>> >>>>> at that timing. So we need more time that users can try the feature.
>>> >>>>
>>> >>>> I've moved (1) to a new section for things to revisit during beta. If someone
>>> >>>> feels strongly that the current behavior is Wrong and must change, speak up as
>>> >>>> soon as you reach that conclusion. Absent such arguments, the behavior won't
>>> >>>> change.
>>> >>>>
>>> >>>>> BTW, IMO (3) should be fixed so that pg_stat_replication reports NULL
>>> >>>>> as the priority if quorum-based sync rep is chosen. It's less confusing.
>>> >>>>
>>> >>>> Since you do want (3) to change, please own it like any other open item,
>>> >>>> including the mandatory status updates.
>>> >>>
>>> >>> I agree to report NULL as the priority. I'll send a patch for this as well.
>>> >>>
>>> >>> Regards,
>>> >>>
>>> >>
>>> >> Attached two draft patches. The one makes pg_stat_replication.sync
>>> >> priority report NULL if in quorum-based sync replication. To prevent
>>> >> extra change I don't change so far the code of setting standby
>>> >> priority. The another one improves the comment and documentation. If
>>> >> there is more thing what we need to mention in documentation please
>>> >> give me feedback.
>>> >
>>> > Attached is the modified version of the doc improvement patch.
>>> > Barring any objection, I will commit this version.
>>>
>>> Thank you for updating the patch.
>>>
>>> >
>>> > + In term of performance there is difference between two synchronous
>>> > + replication method. Generally quorum-based synchronous replication
>>> > + tends to be higher performance than priority-based synchronous
>>> > + replication. Because in quorum-based synchronous replication, the
>>> > + transaction can resume as soon as received the specified number of
>>> > + acknowledgement from synchronous standby servers without distinction
>>> > + of standby servers. On the other hand in priority-based synchronous
>>> > + replication, the standby server that the primary server must wait for
>>> > + is fixed until a synchronous standby fails. Therefore, if a server on
>>> > + low-performance machine a has high priority and is chosen as a
>>> > + synchronous standby server it can reduce performance for database
>>> > + applications.
>>> >
>>> > This description looks misleading. A quorum-based sync rep is basically
>>> > more efficient when there are multiple standbys in s_s_names and you want
>>> > to replicate the transactions to some of them synchronously. I think that
>>> > this assumption should be documented explicitly. So I modified this
>>> > description. Please see the modified version in the attached patch.
>>>
>>> You're right. The modified version looks good to me, thanks.
>>
>> It looks better to me, too. But (even I'm not sure, of course)
>> the sentences seem to need improvement.
>>
>> | <para>
>> | Quorum-based synchronous replication is basically more
>> | efficient than priority-based one when you specify multiple
>> | standbys in <varname>synchronous_standby_names</> and want
>> | to synchronously replicate transactions to two or more of
>> | them. In the priority-based case, the replication master
>> | must wait for a reply from the slowest standby in the
>> | required number of standbys in priority order, which may
>> | slower than the rest.
>
> I supposed that Fujii-san pointed out that quorum-based sync
> replication could be more efficient when we want to replicate the
> transaction to "part of" standbys listed in s_s_names.

Yes.

Anyway, I pushed the patch except this paragraph.
Regarding this paragraph, the patch for better descriptions is welcome.

Regards,

--
Fujii Masao
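
Illustration (not part of the original message or of the patch): the performance point debated in this thread can be sketched with a toy model. With `synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'` the primary waits for the two highest-priority standbys, while with `'ANY 2 (s1, s2, s3)'` it waits for whichever two reply first. The Python sketch below models only that waiting rule. The function names and latency figures are hypothetical, and the model assumes every standby is connected and eventually replies, so it ignores standby failure and priority reassignment.

```python
def priority_commit_wait(ack_latency_ms, num_sync):
    """Model of FIRST num_sync (...): the commit waits for the num_sync
    highest-priority standbys, so the slowest of those sets the wait.

    ack_latency_ms is listed in priority order (index 0 = highest priority).
    """
    return max(ack_latency_ms[:num_sync])


def quorum_commit_wait(ack_latency_ms, num_sync):
    """Model of ANY num_sync (...): the commit waits until any num_sync
    standbys have replied, i.e. for the num_sync-th fastest reply.
    """
    return sorted(ack_latency_ms)[num_sync - 1]


# Hypothetical ack latencies (ms) for standbys s1, s2, s3 in priority order.
latencies = [5, 40, 8]

print(priority_commit_wait(latencies, 2))  # waits for s1 and s2
print(quorum_commit_wait(latencies, 2))    # waits for the two fastest replies
```

In this model the two methods differ only when the number of required standbys is smaller than the number listed; when every listed standby must reply, both wait for the slowest one. That matches the assumption Fujii asks to have documented explicitly.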