Re: Extended Prefetching using Asynchronous IO - proposal and patch - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Extended Prefetching using Asynchronous IO - proposal and patch
Msg-id 53FB03CA.2070804@vmware.com
In response to Re: Extended Prefetching using Asynchronous IO - proposal and patch  (johnlumby <johnlumby@hotmail.com>)
List pgsql-hackers
On 08/25/2014 12:49 AM, johnlumby wrote:
> On 08/19/14 18:27, Heikki Linnakangas wrote:
>> Please write the patch without atomic CAS operation. Just use a spinlock.
>
> Umm, this is a new criticism I think.

Yeah. Be prepared that new issues will crop up as the patch gets slimmer 
and easier to review :-). Right now there's still so much chaff that 
it's difficult to see the wheat.

> I use CAS for things other than locking, such as add/remove from
> shared queue. I suppose maybe a spinlock on the entire queue can be
> used equivalently, but with more code (extra confusion) and worse
> performance (coarser serialization). What is your objection to using
> gcc's atomic ops? Portability?

Yeah, portability.

Atomic ops might make sense, but at this stage it's important to slim 
down the patch to the bare minimum, so that it's easier to review. Once 
the initial patch is in, you can improve it with additional patches.

>> There's a patch in the commitfest to add support for that,
>
> sorry, support for what? There are already spinlocks in postgresql,
> you mean some new kind? please point me at it with hacker msgid or
> something.

Atomic ops: https://commitfest.postgresql.org/action/patch_view?id=1314

Once that's committed, you can use the new atomic ops in your patch. But 
until then, stick to spinlocks.
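
To illustrate, here's a rough, untested sketch of how the shared free-list
add/remove could be guarded with a plain spinlock instead of CAS. The
BufferAiocb stand-in and its free_next field are made up for the example;
only slock_t and the SpinLock* macros (storage/spin.h) are the existing
primitives:

#include "postgres.h"
#include "storage/spin.h"

/* Stand-in for the patch's BufferAiocb; only the free-list link matters
 * here, the real structure would carry the aiocb and the rest. */
typedef struct BufferAiocb
{
	struct BufferAiocb *free_next;
} BufferAiocb;

typedef struct BufferAiocbList
{
	slock_t		mutex;		/* protects the whole list; SpinLockInit() at shmem init */
	BufferAiocb *head;		/* first free control block, or NULL */
} BufferAiocbList;

static BufferAiocb *
BufferAiocbPop(BufferAiocbList *list)
{
	BufferAiocb *item;

	SpinLockAcquire(&list->mutex);
	item = list->head;
	if (item != NULL)
		list->head = item->free_next;
	SpinLockRelease(&list->mutex);

	return item;
}

static void
BufferAiocbPush(BufferAiocbList *list, BufferAiocb *item)
{
	SpinLockAcquire(&list->mutex);
	item->free_next = list->head;
	list->head = item;
	SpinLockRelease(&list->mutex);
}

The critical sections are just a couple of instructions, which is exactly
what spinlocks are meant for.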

> On 08/19/14 19:10, Claudio Freire wrote:
>> On Tue, Aug 19, 2014 at 7:27 PM, Heikki Linnakangas
>> <hlinnakangas@vmware.com> wrote:
>>> Also, please split prefetching of regular index scans into a separate patch. ...
>> That patch already happened on the list, and it wasn't a win in many
>> cases. I'm not sure it should be proposed independently of this one.
>> Maybe a separate patch, but it should be considered dependent on this.
>>
>> I don't have an archive link at hand atm, but I could produce one later.
> Several people have asked to split this patch into several smaller
> ones and I have thought about it. It would introduce some awkward
> dependencies. E.g. splitting the scanner code (index, relation-heap)
> into separate patch from aio code would mean that the scanner patch
> becomes dependent on the aio patch. They are not quite orthogonal.

Right now, please focus on the main AIO patch. That should show a 
benefit for bitmap heap scans too, so to demonstrate the benefits of 
AIO, you don't need to prefetch regular index scans. The main AIO patch 
can be written, performance tested, and reviewed without caring about 
regular index scans at all.

> The reason is that the scanners call a new function,
> DiscardBuffer(blockid) when aio is in use. We can get around it by
> providing a stub for that function in the scanner patch, but then
> there is some risk of someone getting the wrong version of that
> function in their build. It just adds yet more complexity and
> breakage opportunities.

Regardless of the regular index scans, we'll need to discuss the new API 
of PrefetchBuffer and DiscardBuffer.

It would be simpler for the callers if you didn't require the 
DiscardBuffer calls. I think it would be totally feasible to write the 
patch that way. Just drop the buffer pin after the asynchronous read 
finishes. When the caller actually needs the page, it will call 
ReadBuffer which will pin it again. You'll get a little bit more bufmgr 
traffic that way, but I think it'll be fine in practice.
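
For illustration, the caller side could then look roughly like this
(untested sketch; PrefetchBuffer, ReadBuffer and ReleaseBuffer are the
existing bufmgr functions, the rest is made up):

#include "postgres.h"
#include "storage/bufmgr.h"

static void
scan_one_block(Relation rel, BlockNumber blkno, BlockNumber next_blkno)
{
	Buffer		buf;

	/* hint/start the read of the block we expect to need next */
	PrefetchBuffer(rel, MAIN_FORKNUM, next_blkno);

	/*
	 * Pin the block we need right now; if its asynchronous read already
	 * completed, this just re-pins an already-resident buffer.
	 */
	buf = ReadBuffer(rel, blkno);

	/* ... examine the page ... */

	ReleaseBuffer(buf);
}

No per-prefetch state for the caller to track, and no extra call to make
on the paths that end up not reading the block after all.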

> One further comment concerning these BufferAiocb and aiocb control
> blocks being in shared memory:
>
> I explained above that the BufferAiocb must be in shared memory for
> wait/post. The aiocb does not necessarily have to be in shared memory,
> but since there is a one-to-one relationship between BufferAiocb and
> aiocb, it makes the code much simpler, and the operation much more
> efficient, if the aiocb is embedded in the BufferAiocb as I have it
> now. E.g. if the aiocb is in private-process memory, then an
> additional allocation scheme is needed (fixed number? palloc()'ing
> extra ones as needed? ...) which adds complexity,

Yep, use palloc or a fixed pool. There's nothing wrong with that.
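
For example, a backend-private pool could be as simple as something like
this (rough sketch; the pool size, slot struct and function name are
invented, struct aiocb is the POSIX control block from <aio.h>):

#include "postgres.h"
#include "utils/memutils.h"
#include <aio.h>

#define BACKEND_AIOCB_POOL_SIZE 32		/* arbitrary for the example */

typedef struct AiocbSlot
{
	struct aiocb cb;
	bool		in_use;
} AiocbSlot;

static AiocbSlot *aiocb_pool = NULL;

static struct aiocb *
get_backend_aiocb(void)
{
	int			i;

	/* allocate lazily, once per backend, in a long-lived context */
	if (aiocb_pool == NULL)
		aiocb_pool = (AiocbSlot *)
			MemoryContextAllocZero(TopMemoryContext,
								   BACKEND_AIOCB_POOL_SIZE * sizeof(AiocbSlot));

	for (i = 0; i < BACKEND_AIOCB_POOL_SIZE; i++)
	{
		if (!aiocb_pool[i].in_use)
		{
			aiocb_pool[i].in_use = true;
			return &aiocb_pool[i].cb;
		}
	}

	return NULL;				/* pool exhausted: fall back to a synchronous read */
}

Since it's backend-private, no locking is needed at all.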

> and probably some inefficiency since a shared pool is usually more
> efficient (allows higher maximum per process etc), and there would
> have to be some pointer de-referencing from BufferAiocb to aiocb
> adding some (small) overhead.

I think you're falling into the domain of premature optimization. A few 
pointer dereferences are totally insignificant, and the amount of memory 
you're saving pales in comparison to other similar non-shared pools and 
caches we have (catalog caches, for example). And on the other side of 
the coin, with a shared pool you'll waste memory when async I/O is not 
used (e.g. because everything fits in RAM), and when it is used, you'll 
have more contention on locks and cache lines when multiple processes 
use the same objects.

The general guideline in PostgreSQL is that everything is 
backend-private, except structures used to communicate between backends.

- Heikki



