Re: Need to know more about pg_test_fsync utility - Mailing list pgsql-general

From Thomas Munro
Subject Re: Need to know more about pg_test_fsync utility
Date
Msg-id CA+hUKGKk0PKS4d_ng2VCbE3qPBm9kMz3HFBgsacEsRUhetc0mQ@mail.gmail.com
Whole thread Raw
In response to Re: Need to know more about pg_test_fsync utility  (PGSQL DBA <pgsqldba.1987@gmail.com>)
Responses Re: Need to know more about pg_test_fsync utility  (PGSQL DBA <pgsqldba.1987@gmail.com>)
List pgsql-general
On Mon, Dec 13, 2021 at 3:04 PM PGSQL DBA <pgsqldba.1987@gmail.com> wrote:
> As you mentioned in question-8, "I'd investigate whether data is being cached unexpectedly, perhaps indicating that
committedtransactions be lost in a system crash event." So, I would like to know that if we configure the disk for the
WALswith read+write disk cache then will it create any performance issue and show the attached output? 

Which OS and file system are you running and what kind of storage
system is it?  If you can turn write caching on and off in your
storage layer without affecting the ops/sec that's a pretty good clue
that it's respecting the cache control commands that the OS sends.
The numbers you showed are reassuringly low.  Hmm, I wonder why you
have such a low number for non-sync'd writes (the last line).

I have a concrete example of using this tool to learn something about
an unfamiliar-to-me operating system, from this morning: I saw a
system that reported ~6k IOPS for open_datasync and only ~600 IOPS for
fdatasync.  With some limited knowledge of file systems I expect the
latter to involve sending a SYNCHRONIZE CACHE command to flush device
caches, and the former to do that, or use a more efficient FUA flag to
flush just particular writes and not the whole write cache.  I didn't
expect it to go 10x faster, so something had to be wrong.  With some
more investigation and reading I learned that the storage drivers I
was using do indeed ignore FUA cache control flags, so that
wal_sync_method is not crash safe; if you lose power and come back up,
you might lose a bunch of committed transactions from the end of the
WAL.  I turned off write caches in the storage device, and then
open_datasync became a lot slower, matching my expectations.

These are primitive tools, but can help to check some assumptions...

Obligatory remark: PostgreSQL 9.5 is out of support, please see
https://www.postgresql.org/support/versioning/.

> I also would like to know is there any best Practice from PostgreSQL which mentions what is the disk latency required
forthe WAL & DATA disk? 

No particular latency is required by PostgreSQL, and I don't have a
general answer to this.  Is it fast enough for you?  What sort of
workload is it running, OLTP, reporting, ...?



pgsql-general by date:

Previous
From: PGSQL DBA
Date:
Subject: Re: Need to know more about pg_test_fsync utility
Next
From: PGSQL DBA
Date:
Subject: Re: Need to know more about pg_test_fsync utility