Re: Extended Prefetching using Asynchronous IO - proposal and patch - Mailing list pgsql-hackers
From | johnlumby |
---|---|
Subject | Re: Extended Prefetching using Asynchronous IO - proposal and patch |
Date | |
Msg-id | BLU436-SMTP6AB1FF61C05A13A944357A3210@phx.gbl Whole thread Raw |
In response to | Re: Extended Prefetching using Asynchronous IO - proposal and patch (Claudio Freire <klaussfreire@gmail.com>) |
Responses |
Re: Extended Prefetching using Asynchronous IO - proposal
and patch
Re: Extended Prefetching using Asynchronous IO - proposal and patch |
List | pgsql-hackers |
On 05/30/14 09:36, Claudio Freire wrote:
>> I see no such a rule in the man pages of any of the functions involved. And it wouldn't matter anyway; the behavior is exactly the same if you aio_error() first.On Fri, May 30, 2014 at 4:15 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
You are completely correct and I was wrong.
For the case of different-process,
It is not the order of the aio calls that makes it work/not work,
it is whether it polls with a timeout.
I think I was confusing this with a rule concerning order for
calling aio_return.
We don't want polling... And even if we did, calling aio_suspend() in a way that's known to be broken, in a loop, is a pretty crappy way of polling.
I have made a change to the complete_aio functions
to use polling *only* if the caller is not the originator of the aio_read.
If it is the originator, it now calls aio_suspend with no timeout
(i.e. wait till complete).
The loop also now checks for the EINTR case which someone
pointed out.
In my test runs, with a debug message in FileCompleteaio to tell me
whether caller is or is not the originator of the aio_read,
I see > 99.8% of calls are from originator and only < 0.2% are not.
e.g. (samples from two backends)
different : 10
same : 11726
different : 38
same : 12105
new patch based on today 140531 is attached,
This improves one of my benchmarks by about 10% throughput,
and now shows an overall 23% improvement relative to existing code with posix_fadvise.
So hopefully this addresses your performance concern.
If you look at the new patch, you'll see that for the different-pid case,
I still call aio_suspend with a timeout.
As you or Claudio pointed out earlier, it could just as well sleep for the same timeout,
but the small advantage of calling aio_suspend is if the io completed just between
the aio_error returning EINPROGRESS and the aio_suspend call.
Also it makes the code simpler. In fact this change is quite small, just a few lines
in backend/storage/buffer/buf_async.c and backend/storage/file/fd.c
Based on this, I think it is not necessary to get rid of the polling altogether
(and in any case, as far as I can see, very difficult).
Well, as mentioned earlier, it is not broken. Whether it is efficient I am not sure. I have looked at the mutex in aio_suspend that you mentioned and I am not quite convinced that, if caller is not the original aio_read process, it renders the suspend() into an instant timeout. I will see if I can verify that.I don't see the point of pursuing this design further. Surely we don't want to use polling here, and you're relying on undefined behavior anyway. I'm pretty sure aio_return/aio_error won't work from a different process on all platforms, even if it happens to work on Linux. Even on Linux, it might stop working if the underlying implementation changes from the glibc pthread emulation to something kernel-based.
Good point. I have included the guts of your little test program
(modified to do polling) into the existing autoconf test program
that decides on the
#define USE_AIO_ATOMIC_BUILTIN_COMP_SWAP.
See config/c-library.m4.
I hope this goes some way to answer your concern about robustness,
as at least now if the implementation changes in some way that
renders the polling ineffective, it will be caught in configure.
Claudio, I am not quite sure if what I am submitting now isI'll try to do some measuring of performance with: a) git head b) git head + patch as-is c) git head + patch without aio_suspend in foreign processes (just re-read) d) git head + patch with a lwlock (or whatever works) instead of aio_suspend a-c will be the fastest, d might take some while. I'll let you know of the results as I get them.
quite the same as any of yours. As I promised before, but have
not yet done, I will package one or two of my benchmarks and
send them in.
Attachment
pgsql-hackers by date: