Re: Performance implications of 8K pread()s - Mailing list pgsql-performance

From	Thomas Munro
Subject	Re: Performance implications of 8K pread()s
Date	July 11, 2023 20:12:45
Msg-id	CA+hUKG+Zr8FBi3pojzU3x4k5XyenHhZ_mdWQ7pKwGeFT2+Tq+Q@mail.gmail.com
In response to	Performance implications of 8K pread()s (Dimitrios Apostolou <jimis@gmx.net>)
Responses	Re: Performance implications of 8K pread()s (Thomas Munro <thomas.munro@gmail.com>) Re: Performance implications of 8K pread()s (Dimitrios Apostolou <jimis@gmx.net>)
List	pgsql-performance

On Wed, Jul 12, 2023 at 1:11 AM Dimitrios Apostolou <jimis@gmx.net> wrote:
> Note that I suspect my setup being related, (btrfs compression behaving
> suboptimally) since the raw device can give me up to 1GB/s rate. It is however
> evident that reading in bigger chunks would mitigate such setup inefficiencies.
> On a system that reads are already optimal and the read rate remains the same,
> then bigger block size would probably reduce the sys time postgresql consumes
> because of the fewer system calls.

I don't know about btrfs but maybe it can be tuned to prefetch
sequential reads better...

> So would it make sense for postgres to perform reads in bigger blocks? Is it
> easy-ish to implement (where would one look for that)? Or must the I/O unit be
> tied to postgres' page size?

It is hard to implement, but people are working on it.  One of the
problems is that the 8KB blocks that we want to read data into aren't
necessarily contiguous in memory, so you can't just do bigger pread()
calls without solving a lot more problems first.  The project at
https://wiki.postgresql.org/wiki/AIO aims to deal with both the
"clustering" you seek and the "gathering" required for non-contiguous
buffers.  It allows multiple block-sized reads to be prepared and
collected on a pending list, up to some size that triggers merging and
submission to the operating system at a sensible rate, so we can build
something like a single large preadv() call.  In the current
prototype, if io_method=worker then that becomes a literal preadv()
call running in a background "io worker" process, but depending on
settings it could also be OS-specific stuff (io_uring, ...) that
starts an asynchronous IO.  If you take that branch and run your test
you should see 128KB-sized preadv() calls.
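
To make the "gathering" idea concrete, here is a minimal standalone
sketch, not code from the AIO branch.  It reads several consecutive
8KB file blocks into buffers that are deliberately not adjacent in
memory, using a single preadv() call.  The names BLCKSZ_DEMO,
MAX_GATHER, gather_read_blocks() and gather_read_demo() are made up
for this illustration, and the limit of 16 blocks is only an
assumption chosen so that 16 x 8KB matches the 128KB figure above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BLCKSZ_DEMO 8192   /* assumed block size: PostgreSQL's default 8KB */
#define MAX_GATHER  16     /* hypothetical limit: 16 * 8KB = 128KB per call */

/*
 * Read nblocks consecutive file blocks, starting at block first_block,
 * into separate buffers with one preadv() call.  Returns the number of
 * bytes read (which may be short) or -1 on error.
 */
ssize_t
gather_read_blocks(int fd, off_t first_block, char *bufs[], int nblocks)
{
    struct iovec iov[MAX_GATHER];

    assert(nblocks >= 1 && nblocks <= MAX_GATHER);

    for (int i = 0; i < nblocks; i++)
    {
        iov[i].iov_base = bufs[i];      /* destination may be anywhere */
        iov[i].iov_len = BLCKSZ_DEMO;
    }
    return preadv(fd, iov, nblocks, first_block * (off_t) BLCKSZ_DEMO);
}

/*
 * Demo: write four blocks filled with 'A'..'D' to a temporary file, then
 * read them back into scattered buffers.  Returns 0 on success, -1 on
 * failure.
 */
int
gather_read_demo(void)
{
    char    path[] = "/tmp/gather_demo_XXXXXX";
    char    block[BLCKSZ_DEMO];
    char   *bufs[4];
    char   *pool;
    int     ok = 1;
    int     fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */

    for (int i = 0; i < 4; i++)
    {
        memset(block, 'A' + i, sizeof(block));
        if (pwrite(fd, block, sizeof(block),
                   (off_t) i * BLCKSZ_DEMO) != (ssize_t) sizeof(block))
        {
            close(fd);
            return -1;
        }
    }

    /* Pick slots out of order so the targets are not contiguous. */
    pool = malloc(8 * BLCKSZ_DEMO);
    if (pool == NULL)
    {
        close(fd);
        return -1;
    }
    bufs[0] = pool + 5 * BLCKSZ_DEMO;
    bufs[1] = pool + 0 * BLCKSZ_DEMO;
    bufs[2] = pool + 7 * BLCKSZ_DEMO;
    bufs[3] = pool + 2 * BLCKSZ_DEMO;

    if (gather_read_blocks(fd, 0, bufs, 4) != 4 * BLCKSZ_DEMO)
        ok = 0;
    for (int i = 0; ok && i < 4; i++)
    {
        if (bufs[i][0] != 'A' + i || bufs[i][BLCKSZ_DEMO - 1] != 'A' + i)
            ok = 0;
    }

    free(pool);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling gather_read_demo() from a main() and running the program under
strace should show one preadv() covering 32KB instead of four separate
8KB pread() calls.  A real implementation would also have to handle
short reads and decide which pending reads are adjacent on disk, which
is part of the "lot more problems" mentioned above.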
