Re: ice-broker scan thread - Mailing list pgsql-hackers

From	Gavin Sherry
Subject	Re: ice-broker scan thread
Date	November 28, 2005 23:53:55
Msg-id	Pine.LNX.4.58.0511291429330.18112@linuxworld.com.au
In response to	ice-broker scan thread (Qingqing Zhou <zhouqq@cs.toronto.edu>)
Responses	Re: ice-broker scan thread Re: ice-broker scan thread Re: ice-broker scan thread Re: ice-broker scan thread
List	pgsql-hackers

On Mon, 28 Nov 2005, Qingqing Zhou wrote:

>
> I am considering adding an "ice-broker scan thread" to accelerate PostgreSQL
> sequential scan IO speed. The basic idea of this thread is just like the
> "read-ahead" method, but the difference is this one does not read the data
> into shared buffer pool directly, instead, it reads the data into file
> system cache, which makes the integration easy and this is unique to
> PostgreSQL.
>

MySQL, Oracle and others implement read-ahead threads to simulate async IO
'pre-fetching'. I've been experimenting with two ideas. The first is to
increase the readahead when we're doing sequential scans (see prototype
patch using posix_fadvise() attached). I've not got any hardware at the
moment which I can test this patch on but I am waiting on some dbt-3
results which should indicate whether fadvise is a good idea or a bad one.
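As a rough illustration of the fadvise idea (this is not the attached patch,
which is not reproduced here), a scan could hint the kernel as below. The
function name hint_sequential_scan and the window parameter are hypothetical;
only posix_fadvise() and its POSIX_FADV_* flags are real. The advice is
advisory, so how much readahead actually changes depends on the kernel.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: tell the kernel that fd will be read sequentially,
 * then ask it to start fetching the first `window` bytes in the background.
 * Returns 0 on success or an errno value (posix_fadvise does not set errno).
 */
int hint_sequential_scan(int fd, off_t window)
{
    int rc;

    assert(window >= 0);

    /* offset 0, length 0 means "the whole file" */
    rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc != 0)
        return rc;

    /* non-blocking request to pull the first window into the OS cache */
    return posix_fadvise(fd, 0, window, POSIX_FADV_WILLNEED);
}
```

Note that this only warms the file system cache; the blocks still have to be
copied into shared buffers by an ordinary read() later.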

The second idea is using POSIX async IO at key points within the system
to better parallelise CPU and IO work. The areas where I think we could use
async IO are: during sequential scans, use async IO to do pre-fetching of
blocks; inside WAL, begin flushing WAL buffers to disk before we commit;
and, inside the background writer/checkpoint process, asynchronously
write out pages and, potentially, asynchronously build new checkpoint segments.
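The sequential scan case could look roughly like the sketch below: while the
CPU works on block N, an aio_read() for block N+1 is already in flight. This
is a standalone toy, not executor code. BLOCK_SIZE, scan_with_prefetch and the
byte-summing "work" are all made up for illustration; aio_read, aio_error,
aio_suspend and aio_return are the standard POSIX AIO calls. On older glibc
it may need to be linked with -lrt.

```c
#define _POSIX_C_SOURCE 200112L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* hypothetical block size for this sketch */

/* Queue an asynchronous read of block `blk` into `buf`. */
static int start_block_read(struct aiocb *cb, int fd, char *buf, long blk)
{
    memset(cb, 0, sizeof *cb);
    cb->aio_fildes = fd;
    cb->aio_buf = buf;
    cb->aio_nbytes = BLOCK_SIZE;
    cb->aio_offset = (off_t) blk * BLOCK_SIZE;
    return aio_read(cb);
}

/*
 * Read nblocks blocks, always keeping one read ahead of the block being
 * processed. Returns the sum of all bytes read, or -1 on error.
 */
long scan_with_prefetch(int fd, long nblocks)
{
    char bufs[2][BLOCK_SIZE];   /* double buffer: one in use, one filling */
    struct aiocb cb;
    long sum = 0;
    long blk;

    assert(nblocks >= 0);
    if (nblocks == 0)
        return 0;
    if (start_block_read(&cb, fd, bufs[0], 0) != 0)
        return -1;

    for (blk = 0; blk < nblocks; blk++) {
        const struct aiocb *wait_list[1] = { &cb };
        char *cur = bufs[blk % 2];
        ssize_t got, i;
        int err;

        /* wait for the read of the current block to complete */
        while ((err = aio_error(&cb)) == EINPROGRESS)
            aio_suspend(wait_list, 1, NULL);
        if (err != 0)
            return -1;
        got = aio_return(&cb);
        if (got < 0)
            return -1;

        /* kick off the next read before doing any CPU work */
        if (blk + 1 < nblocks &&
            start_block_read(&cb, fd, bufs[(blk + 1) % 2], blk + 1) != 0)
            return -1;

        /* stand-in for per-block processing */
        for (i = 0; i < got; i++)
            sum += (unsigned char) cur[i];
    }
    return sum;
}
```

The point of the double buffer is that the request for the next block is
issued before the current block is processed, so the disk and the CPU are
busy at the same time.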

The motivation for using async IO is twofold: first, the results of this
paper[1] are compelling; second, modern OSs support async IO. I know that
Linux[2], Solaris[3], AIX and Windows all have async IO and I presume that
all their rivals have it as well.

The fundamental premise of the paper mentioned above is that if the
database is busy, IO should be busy. With our current block-at-a-time
processing, this isn't always the case. This is why Qingqing's read-ahead
thread makes sense. My reason for mailing is, however, that the async IO
results are more compelling than the read-ahead thread.

I haven't had time to prototype whether we can easily implement async IO
but I am planning to work on it in December. The two main goals will be to
a) integrate and utilise async IO, at least within the executor context,
and b) build a primitive kind of scheduler so that we stop prefetching
when we know that there are a certain number of outstanding IOs for a
given device.
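Goal (b) could be as simple as a per-device counter that the prefetcher
consults before issuing a request. The sketch below assumes one such struct
per device; IoThrottle, throttle_try_start and throttle_finish are invented
names, and a real version would need locking or atomics if more than one
backend shares the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device bookkeeping for outstanding prefetch requests. */
typedef struct IoThrottle {
    int outstanding;       /* requests issued but not yet completed */
    int max_outstanding;   /* stop prefetching once this many are in flight */
} IoThrottle;

/* Returns true and counts the request if the device has room, else false. */
bool throttle_try_start(IoThrottle *t)
{
    if (t->outstanding >= t->max_outstanding)
        return false;      /* device is busy enough: skip this prefetch */
    t->outstanding++;
    return true;
}

/* Call when a previously started request has completed. */
void throttle_finish(IoThrottle *t)
{
    assert(t->outstanding > 0);
    t->outstanding--;
}
```

A prefetcher would call throttle_try_start() before each asynchronous read
and throttle_finish() once the read has been reaped; when the former returns
false it simply falls back to an ordinary synchronous read later.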

Thanks,

Gavin

[1] http://www.vldb2005.org/program/paper/wed/p1116-hall.pdf
[2] http://lse.sourceforge.net/io/aionotes.txt
[3] http://developers.sun.com/solaris/articles/event_completion.html - I'm
fairly sure they have a posix AIO wrapper around these routines, but I
cannot see it documented anywhere :-(
