Re: Request for further clarification on synchronous_commit - Mailing list pgsql-docs

From Kasper Kondzielski
Subject Re: Request for further clarification on synchronous_commit
Date
Msg-id CAFv2VPQRT=8d2Q3ipTXZTaOdEY+taR8gBE77kL6dkk8gE09Xnw@mail.gmail.com
In response to Re: Request for further clarification on synchronous_commit  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Request for further clarification on synchronous_commit  (Bruce Momjian <bruce@momjian.us>)
List pgsql-docs
> On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> > Hi, thanks for the reply.
> >
> > To be honest I don't think it is better. Previously paragraph about
> > remote_apply was after paragraph about `on` and before remote_write which
> > followed natural order in terms of how strict these parameters are (i.e. how
> > strong are the guarantees they provide). Because of that I think that
> > remote_apply should return to its previous position.

> Uh, not really --- see below.

OK, I see, thanks. Shouldn't we then stick to this order wherever possible (it may be reversed in some places)?
So, in the proposed patch, I would suggest putting remote_apply first. (Of course, before that we can mention that the default option is `on`, but without going too much into the details.)

> Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync.  If you want to go in order of severity, with the most severe
> first, it is:
>
>        remote_apply
>        on
>        remote_write
>        local

Wouldn't a table be helpful for highlighting these differences?
+-----------------------------+---------------------------------------------------------+
|                             | synchronous_commit                                      |
+-----------------------------+--------------+-------------------+--------------+-------+
| operation on standby server | remote_apply | on (remote_flush) | remote_write | local |
+-----------------------------+--------------+-------------------+--------------+-------+
| write to WAL                | Yes          | Yes               | Yes          | No    |
+-----------------------------+--------------+-------------------+--------------+-------+
| fsync                       | Yes          | Yes               | No           | No    |
+-----------------------------+--------------+-------------------+--------------+-------+
| apply WAL data              | Yes          | No                | No           | No    |
+-----------------------------+--------------+-------------------+--------------+-------+
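
To make the intent of the table concrete, here is a rough C sketch of the same information. It assumes the enum ordering quoted further down in this message; the three helper functions are hypothetical names of mine, not anything in the PostgreSQL source. Each helper corresponds to one row of the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Enum as quoted from the PostgreSQL source later in this thread. */
typedef enum
{
    SYNCHRONOUS_COMMIT_OFF,          /* asynchronous commit */
    SYNCHRONOUS_COMMIT_LOCAL_FLUSH,  /* wait for local flush only */
    SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flush and remote write */
    SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local and remote flush */
    SYNCHRONOUS_COMMIT_REMOTE_APPLY  /* wait for local flush and remote apply */
} SyncCommitLevel;

/* 'on' is an alias for remote_flush, as in the #define quoted in this thread. */
#define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH

/* Hypothetical helper, row "write to WAL": does commit wait for the standby write? */
bool
standby_wal_written(SyncCommitLevel level)
{
    assert(level <= SYNCHRONOUS_COMMIT_REMOTE_APPLY);
    return level >= SYNCHRONOUS_COMMIT_REMOTE_WRITE;
}

/* Hypothetical helper, row "fsync": does commit wait for the standby flush? */
bool
standby_wal_flushed(SyncCommitLevel level)
{
    assert(level <= SYNCHRONOUS_COMMIT_REMOTE_APPLY);
    return level >= SYNCHRONOUS_COMMIT_REMOTE_FLUSH;
}

/* Hypothetical helper, row "apply WAL data": does commit wait for standby apply? */
bool
standby_wal_applied(SyncCommitLevel level)
{
    assert(level <= SYNCHRONOUS_COMMIT_REMOTE_APPLY);
    return level >= SYNCHRONOUS_COMMIT_REMOTE_APPLY;
}
```

Because each stricter level implies everything the weaker ones do, every row reduces to a single comparison against the enum, which is the same ordering argument made for the documentation.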

> and this defines the 'on' behavior:
>
>        /* Define the default setting for synchronous_commit */
>        #define SYNCHRONOUS_COMMIT_ON   SYNCHRONOUS_COMMIT_REMOTE_FLUSH

Is there any valid reason to hide this behavior behind the `on` alias? In my opinion, `remote_flush` does a much better job of describing what the setting does. Maybe we could rename `on` to `remote_flush` and keep `on` as an alias for `remote_flush` to preserve backward compatibility?
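
Just to illustrate the idea, here is a minimal sketch of a name table in which both spellings resolve to the same level. Everything except the enum is hypothetical (the `SyncCommitName` struct, the `sync_commit_names` array, and `parse_sync_commit` are my own names), and I am not claiming this is how the server parses the setting today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enum as quoted from the PostgreSQL source in this thread. */
typedef enum
{
    SYNCHRONOUS_COMMIT_OFF,
    SYNCHRONOUS_COMMIT_LOCAL_FLUSH,
    SYNCHRONOUS_COMMIT_REMOTE_WRITE,
    SYNCHRONOUS_COMMIT_REMOTE_FLUSH,
    SYNCHRONOUS_COMMIT_REMOTE_APPLY
} SyncCommitLevel;

/* Hypothetical mapping from a user-visible name to a level. */
typedef struct
{
    const char     *name;
    SyncCommitLevel level;
} SyncCommitName;

static const SyncCommitName sync_commit_names[] = {
    {"off", SYNCHRONOUS_COMMIT_OFF},
    {"local", SYNCHRONOUS_COMMIT_LOCAL_FLUSH},
    {"remote_write", SYNCHRONOUS_COMMIT_REMOTE_WRITE},
    {"remote_flush", SYNCHRONOUS_COMMIT_REMOTE_FLUSH}, /* proposed descriptive name */
    {"on", SYNCHRONOUS_COMMIT_REMOTE_FLUSH},           /* alias kept for compatibility */
    {"remote_apply", SYNCHRONOUS_COMMIT_REMOTE_APPLY},
};

/* Returns 1 and stores the level in *out if name is known, otherwise 0. */
int
parse_sync_commit(const char *name, SyncCommitLevel *out)
{
    size_t i;

    assert(name != NULL && out != NULL);
    for (i = 0; i < sizeof(sync_commit_names) / sizeof(sync_commit_names[0]); i++)
    {
        if (strcmp(name, sync_commit_names[i].name) == 0)
        {
            *out = sync_commit_names[i].level;
            return 1;
        }
    }
    return 0;
}
```

With a table like this, existing configurations that say `on` would keep working, while the documentation could describe the behavior under the more descriptive name.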

+         Finally, when set to <literal>remote_apply</literal>, commits
+         will wait until replies from the current synchronous standby(s)
+         indicate they have received the commit record of the transaction
+         and applied it, so that it has become visible to queries on the
+         standby(s), and also written to durable storage on the standbys.

"and also written to durable storage on the standbys." -> Do you mean flushed? It might be better to stick to consistent terminology so as not to introduce any confusion.


> Well, there is a doc section that talks about WAL:
>
>        https://www.postgresql.org/docs/12/wal.html
>
> and other parts of the config docs that talk about WAL.

Yes, I know what WAL is for. I just don't understand what kind of operation you mean by 'WAL replay'. The only thing I can think of is the process of restoring the database after a crash, when changes from the WAL are applied to data pages that had not yet been flushed to disk, but I don't think that is what you mean here. Basically, what I wonder is how WAL replay can influence a transaction commit.

On Tue, Aug 18, 2020 at 19:17, Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
> Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync.  If you want to go in order of severity, with the most severe
> first, it is:
>
>       remote_apply
>       on
>       remote_write
>       local
>
> This is seen in the C enum ordering for synchronous_commit, but in
> reverse order:
>
>       typedef enum
>       {
>           SYNCHRONOUS_COMMIT_OFF,     /* asynchronous commit */
>           SYNCHRONOUS_COMMIT_LOCAL_FLUSH, /* wait for local flush only */
>           SYNCHRONOUS_COMMIT_REMOTE_WRITE,    /* wait for local flush and remote
>                                                * write */
>           SYNCHRONOUS_COMMIT_REMOTE_FLUSH,    /* wait for local and remote flush */
>           SYNCHRONOUS_COMMIT_REMOTE_APPLY /* wait for local flush and remote apply */
>       }           SyncCommitLevel;

Also, there is some logic to say that the postgresql.conf
synchronous_commit options list should be reordered from:

        #synchronous_commit = on                # synchronization level;
                                                # off, local, remote_write, remote_apply, or on

to

        #synchronous_commit = on                # synchronization level;
                                                # off, local, remote_write, on, or remote_apply

I think we should backpatch the doc changes, but maybe not the
postgresql.conf one --- I am not sure.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee
