Re: Synchronous replay take III - Mailing list pgsql-hackers

From	Dmitry Dolgov
Subject	Re: Synchronous replay take III
Date	November 30, 2018 23:06:57
Msg-id	CA+q6zcV6463xcLXhN9F1p0vLie8JRs=2jZ27XDW5x_62SUesBg@mail.gmail.com
In response to	Re: Synchronous replay take III (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses	Re: Synchronous replay take III (Thomas Munro <thomas.munro@enterprisedb.com>)
List	pgsql-hackers

> On Thu, Nov 15, 2018 at 6:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Thu, Mar 1, 2018 at 10:40 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> >
> > In previous threads[1][2][3] I called this feature proposal "causal
> > reads".  That was a terrible name, borrowed from MySQL.  While it is
> > probably a useful term of art, for one thing people kept reading it as
> > "casual"

Yeah, it was rather annoying that I couldn't get rid of this while playing
with the "take II" version :)

> To be clear, what did you mean by read-mostly workloads?
>
> I think there are two kinds of reads on standbys: a read that happens
> after writes, and a direct read (e.g. reporting). The former usually
> requires causal reads, as you mentioned, in order to read its own
> writes, but the latter might be different: it often wants to read the
> latest data on the master at that time. IIUC, even if we send a
> read-only query directly to a synchronous replay server, we could get a
> stale result if the standby is delayed by less than
> synchronous_replay_max_lag. So this synchronous replay feature would
> be helpful for the former case (i.e. a few writes and many reads that
> want to see them), whereas for the latter case perhaps keeping the
> reads waiting on the standby seems a reasonable solution.
>
> Also, I think it's worth considering the cost of both causal reads
> *and* non-causal reads.
>
> I've considered a mixed workload (transactions requiring causal reads
> and transactions not requiring them) on the current design. IIUC, in
> the current design we create something like a consistent-reads group
> by specifying servers. For example, if a transaction doesn't need
> causal reads, it can send its query to any server with
> synchronous_replay = off, but if it does, it should select a
> synchronous replay server. It also means that client applications or
> routing middleware such as pgpool are required to be aware of the
> available synchronous replay standbys. That is, this design would
> impose a cost on read-only transactions requiring causal reads. On the
> other hand, with token-based causal reads we can send a read-only query
> to any standby if we can wait for the change to be replayed. Of course,
> if we don't want to wait forever, we can time out and switch to either
> another standby or the master to execute the query, but we don't need
> to choose a particular server among the standby servers.
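
For readers following along, the token-based alternative described in the
quoted text can be sketched roughly as below. This is only a toy model in
Python, not code from the patch: the Standby class, its replay_lsn and
replay_rate fields, wait_for_lsn() and route_causal_read() are all
hypothetical names, and the "waiting" is simulated arithmetically rather
than done against a real server.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Standby:
    """Toy model of a standby server; every field here is hypothetical."""
    name: str
    replay_lsn: int    # last WAL position replayed so far
    replay_rate: int   # WAL units replayed per second (simulated)

    def wait_for_lsn(self, lsn: int, timeout_s: float) -> bool:
        """Simulate waiting up to timeout_s for lsn to be replayed."""
        reachable = self.replay_lsn + int(self.replay_rate * timeout_s)
        if reachable >= lsn:
            # The standby caught up (or was already past the token).
            self.replay_lsn = max(self.replay_lsn, lsn)
            return True
        # Timed out: the standby advanced, but not far enough.
        self.replay_lsn = reachable
        return False


def route_causal_read(token_lsn: int, standbys: List[Standby],
                      timeout_s: float) -> str:
    """Return the name of the server that may run the read.

    token_lsn is the position of the client's last write. Any standby is
    acceptable as long as it replays up to the token before the timeout;
    otherwise try the next standby, and finally fall back to the primary.
    """
    for standby in standbys:
        if standby.wait_for_lsn(token_lsn, timeout_s):
            return standby.name
    return "primary"


if __name__ == "__main__":
    servers = [Standby("s1", 100, 10), Standby("s2", 190, 10)]
    # s1 cannot reach LSN 200 within one second, s2 can.
    print(route_causal_read(200, servers, 1.0))  # prints "s2"
```

The point of the sketch is that the client only carries a token and a
timeout; it does not need to know in advance which standbys belong to a
synchronous replay group.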

Unfortunately, cfbot says that the patch can't be applied without conflicts.
Could you please post a rebased version and address the comments from Masahiko?
