Re: Blocking I/O, async I/O and io_uring - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Blocking I/O, async I/O and io_uring |
Date | |
Msg-id | b43db30b-8657-7b6d-cef0-fd5520f4b132@oss.nttdata.com |
In response to | Blocking I/O, async I/O and io_uring (Craig Ringer <craig.ringer@enterprisedb.com>) |
List | pgsql-hackers |
On 2020/12/08 11:55, Craig Ringer wrote:
> Hi all
>
> A new kernel API called io_uring has recently come to my attention. I assume some of you (Andres?) have been following it for a while.
>
> io_uring appears to offer a way to make system calls including reads, writes, fsync()s, and more in a non-blocking, batched and pipelined manner, with or without O_DIRECT. Basically async I/O with usable buffered I/O and fsync support. It has ordering support which is really important for us.
>
> This should be on our radar. The main barriers to benefiting from linux-aio based async I/O in postgres in the past has been its reliance on direct I/O, the various kernel-version quirks, platform portability, and its maybe-async-except-when-it's-randomly-not nature.
>
> The kernel version and portability remain an issue with io_uring so it's not like this is something we can pivot over to completely. But we should probably take a closer look at it.
>
> PostgreSQL spends a huge amount of time waiting, doing nothing, for blocking I/O. If we can improve that then we could potentially realize some major increases in I/O utilization especially for bigger, less concurrent workloads. The most obvious candidates to benefit would be redo, logical apply, and bulk loading.
>
> But I have no idea how to even begin to fit this into PostgreSQL's executor pipeline. Almost all PostgreSQL's code is synchronous-blocking-imperative in nature, with a push/pull executor pipeline. It seems to have been recognised for some time that this is increasingly hurting our performance and scalability as platforms become more and more parallel.
>
> To benefit from AIO (be it POSIX, linux-aio, io_uring, Windows AIO, etc) we have to be able to dispatch I/O and do something else while we wait for the results. So we need the ability to pipeline the executor and pipeline redo.
>
> I thought I'd start the discussion on this and see where we can go with it. What incremental steps can be done to move us toward parallelisable I/O without having to redesign everything?
>
> I'm thinking that redo is probably a good first candidate. It doesn't depend on the guts of the executor. It is much less sensitive to ordering between operations in shmem and on disk since it runs in the startup process. And it hurts REALLY BADLY from its single-threaded blocking approach to I/O - as shown by an extension written by 2ndQuadrant that can double redo performance by doing read-ahead on btree pages that will soon be needed.
>
> Thoughts anybody?

I was wondering if async I/O might be helpful for the performance improvement of walreceiver. In physical replication, walreceiver receives, writes and fsyncs WAL data. Also it does tasks like keepalive. Since walreceiver is a single process, for example, currently it cannot do other tasks while fsyncing WAL to the disk. OTOH, if walreceiver can do other tasks even while fsyncing WAL by using async I/O, ISTM that it might improve the performance of walreceiver.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
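
To make the walreceiver idea above concrete, here is a minimal sketch of overlapping an fsync with other work. It is not PostgreSQL code and does not use io_uring: it emulates asynchronous completion by handing the blocking write and fsync to a single worker thread, which is only a stand-in for kernel async I/O. The names `write_and_fsync`, `receive_loop`, and `send_keepalive` are hypothetical and exist only for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, wait


def write_and_fsync(fd, data):
    """Write all of data to fd, then flush it to stable storage (blocking)."""
    written = 0
    while written < len(data):
        written += os.write(fd, data[written:])
    os.fsync(fd)
    return written


def receive_loop(fd, chunks, send_keepalive):
    """Hypothetical receiver loop.

    Each chunk is written and fsynced in a worker thread. While that
    flush is in flight, the calling thread stays free to do other work
    (here: calling send_keepalive). A single worker keeps the writes in
    submission order.
    """
    flushed = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        for chunk in chunks:
            future = pool.submit(write_and_fsync, fd, chunk)
            while not future.done():
                send_keepalive()              # other work during the flush
                wait([future], timeout=0.01)  # brief wait, avoids spinning
            flushed += future.result()        # re-raises any I/O error
    return flushed


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    keepalives = []
    try:
        total = receive_loop(fd, [b"wal-1", b"wal-2"],
                             lambda: keepalives.append(1))
    finally:
        os.close(fd)
        os.remove(path)
    print("flushed bytes:", total, "keepalives sent:", len(keepalives))
```

How many keepalives get sent depends on how long each fsync takes, so the count varies between runs and storage devices. With io_uring the same shape would presumably be expressed as submitting the write and fsync and polling for completions, rather than using a helper thread.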