Re: synchronous_commit = remote_flush - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: synchronous_commit = remote_flush
Date
Msg-id CAEepm=1EbM7P4YUg0rPB6h1qS8gC2pT2+WpUO-L7MUg5w+gWCw@mail.gmail.com
In response to Re: synchronous_commit = remote_flush  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: synchronous_commit = remote_flush  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Aug 19, 2016 at 6:30 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> I'm wondering if we've hit the point where trying to put all of this in a
> single GUC is a bad idea... changing that probably means a config
> compatibility break, but I don't think that's necessarily a bad thing at
> this point...

Aside from the (IMHO) slightly confusing way that "on" works, which is
the smaller issue I was raising in this thread, I agree that we might
eventually want to escape from the assumption that "local apply" (=
off), local flush, remote write, remote flush, remote apply happen in
that order and therefore a single linear control knob can describe
which of those to wait for.
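
To make the "single linear knob" assumption concrete, here is a rough sketch of how I picture it. This is an illustrative Python model, not PostgreSQL code: the stage names and the `stages_waited_for` helper are made up for this example, and the mapping for "on" assumes at least one synchronous standby is configured (without one, "on" only waits for local flush).

```python
# Illustrative model only; not PostgreSQL source code.
# Stages in the order the single linear knob assumes they happen.
STAGES = ["local_apply", "local_flush", "remote_write",
          "remote_flush", "remote_apply"]

# Last stage each synchronous_commit value waits for, assuming at least one
# synchronous standby is configured (otherwise "on" stops at local_flush).
WAITS_UNTIL = {
    "off": "local_apply",
    "local": "local_flush",
    "remote_write": "remote_write",
    "on": "remote_flush",
    "remote_apply": "remote_apply",
}


def stages_waited_for(setting):
    """Return every stage implied by a setting under the linear assumption."""
    last = STAGES.index(WAITS_UNTIL[setting])
    return STAGES[: last + 1]
```

Under this model, choosing a later stage always drags in every earlier one, so for example `stages_waited_for("remote_apply")` includes both flushes. That implied prefix is exactly what a set of independent knobs would let you drop.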

Some pie-in-the-sky thoughts: we currently can't reach
"group-safe"[1], where you wait only for N servers to have the WAL in
memory (let's say that for us that means write but not flush).  The
closest we can get is "1-safe and group-safe", using remote_write to
wait for the standbys to write (= "group-safe"), which implies local
flush (= "1-safe").  Pure group-safe would be a terrible level to use
unless your recovery procedure included cluster-wide communication to
straighten things out.  Without any such clusterware it makes a lot of
sense to have the master flush before sending, and I'm not actually
proposing we change that; I'm just speculating that someone might
eventually want it.

We also can't have standbys apply before they flush.  As far as I know
there is no theoretical reason why that shouldn't be allowed, except
maybe for some special synchronisation steps around checkpoint records
so that recovery doesn't get too far ahead.  That'd mirror what happens
on the master more closely.

Imagine if you wanted to wait for your transaction to become visible
on certain other servers, but didn't want to wait for any disks:
that'd be the distributed equivalent of today's "off", but today's
"remote_apply" implies local flush and remote flush.  Or more likely
you'd want some combination: 2-safe or group-safe on some subset of
servers to satisfy your durability requirements, and applied on some
other, perhaps larger, subset of servers for consistency.  But this is
just water cooler handwaving.
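
For what it's worth, the kind of combination described above could be modelled roughly as follows. This is a hypothetical Python sketch, not a proposed GUC syntax or anything that exists in PostgreSQL: `Requirement`, `commit_may_return`, the stage names and the server names `s1`..`s3` are all invented for illustration. The point is only that each requirement names its own stage, server subset and count, with no ordering assumed between stages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    stage: str          # "write", "flush" or "apply" (hypothetical names)
    servers: frozenset  # candidate servers for this requirement
    count: int          # how many of them must have reached the stage


def commit_may_return(requirements, acks):
    """Check all requirements; acks maps server name -> set of stages reported."""
    for req in requirements:
        reached = sum(1 for s in req.servers if req.stage in acks.get(s, set()))
        if reached < req.count:
            return False
    return True


# Hypothetical "distributed off": visible on two standbys, no disk waits.
visible_only = [Requirement("apply", frozenset({"s1", "s2"}), 2)]

# Hypothetical mix: flushed on 1 of {s1, s2} for durability,
# applied on all of {s1, s2, s3} for consistency.
mixed = [
    Requirement("flush", frozenset({"s1", "s2"}), 1),
    Requirement("apply", frozenset({"s1", "s2", "s3"}), 3),
]
```

With acknowledgements where every standby has applied but none has flushed, `visible_only` would be satisfied while `mixed` would keep waiting until `s1` or `s2` reports a flush.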

[1] https://infoscience.epfl.ch/record/49936/files/WS03

-- 
Thomas Munro
http://www.enterprisedb.com


