Re: Asynchronous and "direct" IO support for PostgreSQL. - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Asynchronous and "direct" IO support for PostgreSQL.
Date
Msg-id CAM-w4HPpgd5TO2t1fXhNDg62EnYiY0aqW+Xa=MDkn0nFqOjrCA@mail.gmail.com
Whole thread Raw
In response to Asynchronous and "direct" IO support for PostgreSQL.  (Andres Freund <andres@anarazel.de>)
Responses Re: Asynchronous and "direct" IO support for PostgreSQL.
List pgsql-hackers
On Tue, 23 Feb 2021 at 05:04, Andres Freund <andres@anarazel.de> wrote:
>
> ## Callbacks
>
> In the core AIO pieces there are two different types of callbacks at the
> moment:
>
> Shared callbacks, which can be invoked by any backend (normally the issuing
> backend / the AIO workers, but can be other backends if they are waiting for
> the IO to complete). For operations on shared resources (e.g. shared buffer
> reads/writes, or WAL writes) these shared callback needs to transition the
> state of the object the IO is being done for to completion. E.g. for a shared
> buffer read that means setting BM_VALID / unsetting BM_IO_IN_PROGRESS.
>
> The main reason these callbacks exist is that they make it safe for a backend
> to issue non-blocking IO on buffers (see the deadlock section above). As any
> blocked backend can cause the IO to complete, the deadlock danger is gone.

So firstly this is all just awesome work and I have questions but I
don't want them to come across in any way as criticism or as a demand
for more work. This is really great stuff, thank you so much!

The callbacks make me curious about two questions:

1) Is there a chance that a backend issues i/o, the i/o completes in
some other backend and by the time this backend gets around to looking
at the buffer it's already been overwritten again? Do we have to
initiate I/O again or have you found a way to arrange that this
backend has the buffer pinned from the time the i/o starts even though
it doesn't handle the comletion?

2) Have you made (or considered making) things like sequential scans
(or more likely bitmap index scans) asynchronous at a higher level.
That is, issue a bunch of asynchronous i/o and then handle the pages
and return the tuples as the pages arrive. Since sequential scans and
bitmap scans don't guarantee to read the pages in order they're
generally free to return tuples from any page in any order. I'm not
sure how much of a win that would actually be since all the same i/o
would be getting executed and the savings in shared buffers would be
small but if there are mostly hot pages you could imagine interleaving
a lot of in-memory pages with the few i/os instead of sitting idle
waiting for the async i/o to return.



> ## Stats
>
> There are two new views: pg_stat_aios showing AIOs that are currently
> in-progress, pg_stat_aio_backends showing per-backend statistics about AIO.

This is impressive. How easy is it to correlate with system aio stats?


-- 
greg



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Asynchronous and "direct" IO support for PostgreSQL.
Next
From: Andres Freund
Date:
Subject: Re: Asynchronous and "direct" IO support for PostgreSQL.