Re: Need to know more about pg_test_fsync utility - Mailing list pgsql-general

From PGSQL DBA
Subject Re: Need to know more about pg_test_fsync utility
Msg-id CAKaKWS9LMbTXYoeOHUtsukqJS47qR-9urCxFAN_sEQR43BahJA@mail.gmail.com
In response to Re: Need to know more about pg_test_fsync utility  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Need to know more about pg_test_fsync utility  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-general
Hi Thomas,

Thank you for your reply. 

In your answer to question 8, you wrote: "I'd investigate whether data is being cached unexpectedly, perhaps indicating that committed transactions could be lost in a system crash event." If we configure the WAL disk with a read+write disk cache, will that cause any performance issue, and would it produce the attached output?

I would also like to know whether PostgreSQL has any best-practice guidance on the disk latency required for the WAL and data disks.

On Fri, 10 Dec 2021 at 10:56, Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Dec 10, 2021 at 3:20 PM PGSQL DBA <pgsqldba.1987@gmail.com> wrote:
> 1) How to interpret the output of pg_test_fsync?

The most interesting area is probably the top section, which compares
the different wal_sync_method settings.  For example, it's useful for
verifying the claim that fdatasync() is faster than fsync() (because it
only flushes data, not metadata such as the file modification time).  It
may also be useful for measuring the effects of different caching
settings on your OS and storage.  Unfortunately the open_datasync result
is a bit misleading; we don't actually use O_DIRECT with open_datasync
anymore, unless you set wal_level=minimal, which almost nobody ever does.
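
To make the fsync() versus fdatasync() comparison concrete, here is a minimal Python sketch of the same kind of measurement. It is not pg_test_fsync itself: the function name time_sync, the 0.2 second run time and the use of a temporary file are assumptions made for illustration, and the temporary directory may sit on a different device (or on tmpfs) than your WAL disk.

```python
import os
import tempfile
import time


def time_sync(sync_fn, seconds=0.2, block=b"\0" * 8192):
    """Rewrite one 8kB block at offset 0 and flush it with sync_fn
    repeatedly for roughly `seconds`; return the flush rate in ops/sec."""
    fd, path = tempfile.mkstemp()
    try:
        ops = 0
        start = time.perf_counter()
        while time.perf_counter() - start < seconds:
            os.pwrite(fd, block, 0)  # write into the OS page cache
            sync_fn(fd)              # wait until the OS reports it flushed
            ops += 1
        elapsed = time.perf_counter() - start
        return ops / elapsed
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    methods = {"fsync": os.fsync}
    # os.fdatasync is not available on every platform (e.g. macOS).
    if hasattr(os, "fdatasync"):
        methods["fdatasync"] = os.fdatasync
    for name, fn in methods.items():
        rate = time_sync(fn)
        print(f"{name:10s} {rate:12.0f} ops/sec {1e6 / rate:10.0f} usecs/op")
```

Python adds some per-call overhead, so treat the numbers only as a rough comparison between the two calls on the same file system.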

> 2) What is the meaning of ops/sec & usecs/op?

Number of times it managed to flush data to disk per second
sequentially, and the same information expressed as microseconds per
flush.
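
The two columns carry the same information, so one can be derived from the other. A small sketch of the arithmetic follows; the function names and the figure of 2000 ops/sec are made up for illustration and are not taken from any real run.

```python
def usecs_per_op(ops_per_sec: float) -> float:
    """Convert a flush rate into microseconds per flush."""
    return 1_000_000 / ops_per_sec


def ops_per_sec(usecs: float) -> float:
    """Convert microseconds per flush back into a flush rate."""
    return 1_000_000 / usecs


# A hypothetical device that manages 2000 flushes per second
# spends 500 microseconds on each flush.
print(usecs_per_op(2000.0))  # 500.0
```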

> 3) How does this utility work internally?

It just does a loop over some system calls, or to be more precise, see:

https://github.com/postgres/postgres/blob/master/src/bin/pg_test_fsync/pg_test_fsync.c

> 4) What is the IO pattern of this utility? Serial/sequential IO, or multiple threads with parallel IO?

Sequential, no threads.

> 5) Can we change the testing like FIO with multiple threads and parallel IO?

Nope.  This is a simple tool.  Fio is much more general and useful.

> 6) How does a commit happen in the background while this utility is executing?

Nothing happens in the background, it uses synchronous system calls
from one thread.

> 7) How can we use this tool to measure the I/O issue?

It's a type of micro-benchmark that gives you an idea of a sort of
baseline you can expect from a single PostgreSQL session committing to
the WAL.

> 8) In which area or section in the output do we need to focus while troubleshooting I/O issues?

If PostgreSQL couldn't commit small sequential transactions about as
fast as pg_test_fsync reports, I'd be interested in finding out why.  If
fdatasync is performing faster than the device's published IOPS suggest
should be possible, then I'd investigate whether data is being cached
unexpectedly, perhaps indicating that committed transactions could be
lost in a system crash event.
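
That second check can be written down as a simple comparison. The sketch below is only an illustration: the function name looks_cached, the margin of 1.5 and the example figures are all assumptions, and the device IOPS value has to come from your vendor's data sheet or your own measurements.

```python
def looks_cached(measured_ops_per_sec: float,
                 device_write_iops: float,
                 margin: float = 1.5) -> bool:
    """Return True when the measured flush rate is well above what the
    device is rated for, which suggests a volatile cache is absorbing
    the flushes instead of the storage itself."""
    return measured_ops_per_sec > device_write_iops * margin


# Hypothetical numbers: a device rated at about 150 write IOPS that
# reports 20000 flushes/sec deserves a closer look at its cache settings.
print(looks_cached(20000, 150))  # True
print(looks_cached(140, 150))    # False
```

A True result is only a prompt to inspect the controller and drive cache configuration, not proof that commits are unsafe.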

> 9) What is the meaning of “Non-sync’ed 8kB writes”?

Calling the pwrite() system call, which writes into your operating
system's page cache but (usually) doesn't wait for any I/O.  Should be
somewhere north of 1 million/sec.
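
For comparison with the synced tests, here is a rough Python sketch of a non-synced write loop. The function name time_unsynced_writes and the count of 20000 are assumptions for illustration; because of interpreter overhead, the rate will be lower than what a C loop such as pg_test_fsync achieves.

```python
import os
import tempfile
import time


def time_unsynced_writes(count=20000, block=b"\0" * 8192):
    """Call pwrite() `count` times on the same 8kB block without any
    fsync; return the rate in writes per second."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(count):
            # Data lands in the OS page cache; no flush is requested.
            os.pwrite(fd, block, 0)
        elapsed = time.perf_counter() - start
        return count / elapsed
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(f"{time_unsynced_writes():.0f} non-synced 8kB writes/sec")
```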
