pgsql: read_stream: Only increase read-ahead distance when waiting for - Mailing list pgsql-committers

From Andres Freund
Subject pgsql: read_stream: Only increase read-ahead distance when waiting for
Date
Msg-id E1w9Fwo-0031ca-3C@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
read_stream: Only increase read-ahead distance when waiting for IO

This avoids increasing the distance to the maximum in cases where the I/O
subsystem is already keeping up. This turns out to be important for
performance for two reasons:

- Pinning a lot of buffers is not cheap. If additional pins allow us to avoid
  IO waits, it's definitely worth it, but if we can already do all the
  necessary readahead at a distance of 16, reading ahead 512 buffers can
  increase the CPU overhead substantially.  This is particularly noticeable
  when the to-be-read blocks are already in the kernel page cache.

- If the read stream is read to completion, reading in data earlier than
  needed is of limited consequences, leaving aside the CPU costs mentioned
  above. But if the read stream will not be fully consumed, e.g. because it is
  on the inner side of a nested loop join, the additional IO can be a serious
  performance issue. This is not that commonly a problem for current read
  stream users, but the upcoming work, to use a read stream to fetch table
  pages as part of an index scan, frequently encounters this.

Note that this commit would have substantial performance downsides without
earlier commits:

- Commit 6e36930f9aa, which avoids decreasing the readahead distance when
  there was recent IO, is crucial, as otherwise we very often would end up not
  reading ahead aggressively enough anymore with this commit, due to
  increasing the distance less often.

- "read stream: Split decision about look ahead for AIO and combining" is
  important as we would otherwise not perform IO combining when the IO
  subsystem can keep up.

- "aio: io_uring: Trigger async processing for large IOs" is important to
  continue to benefit from memory copy parallelism when using fewer IOs.

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Tested-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu
Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f63ca3379025ee4547865182da6cae14aec35d58

Modified Files
--------------
src/backend/storage/aio/read_stream.c | 89 ++++++++++++++++++++++++++---------
1 file changed, 68 insertions(+), 21 deletions(-)


pgsql-committers by date:

Previous
From: John Naylor
Date:
Subject: pgsql: Fix unused function warning on Arm platforms
Next
From: Thomas Munro
Date:
Subject: pgsql: aio: Simplify pgaio_worker_submit().