Re: fallocate / posix_fallocate for new WAL file creation (etc...) - Mailing list pgsql-hackers

From: Jeff Davis
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Msg-id: 1372697751.19747.51.camel@jdavis
In response to: Re: fallocate / posix_fallocate for new WAL file creation (etc...)  (Greg Smith <greg@2ndQuadrant.com>)
List: pgsql-hackers

On Sun, 2013-06-30 at 18:55 -0400, Greg Smith wrote:
> This makes platform level testing a lot easier, thanks.  Attached is an 
> updated copy of that program with some error checking.  If the files it 
> creates already existed, the code didn't notice, and a series of write 
> errors happened.  If you set the test up right it's not a problem, but 
> it's better if a bad setup is caught.  I wrapped the whole test with a 
> shell script, also attached, which ensures the right test sequence and 
> checks.

Thank you.

> That's glibc helpfully converting your call to posix_fallocate into 
> small writes, because the OS doesn't provide a better way in that 
> kernel.  It's not hard to imagine this being slower than what the WAL 
> code is doing right now.  I'm not worried about correctness issues 
> anymore, but my gut paranoia about this not working as expected on older 
> systems was justified.  Everyone who thought I was just whining owes me 
> a cookie.

So your theory is that it may be slower because there are twice as many
syscalls (one per 4K page rather than one per 8K page)? Interesting
observation.
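
For readers following the thread, here is a minimal sketch of the two preallocation strategies being compared: filling a new file with explicit zero writes of a fixed block size, versus asking the C library to reserve the space with posix_fallocate(). The helper names, the 0600 file mode, and the use of O_EXCL (so that an already-existing file is reported as an error rather than silently reused) are assumptions made for illustration. This is not the patch under discussion and not PostgreSQL's actual WAL code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create "path" and fill it with "size" zero bytes,
 * issuing one write() per "block" bytes (block must be 1..8192).
 * Returns 0 on success, -1 on failure.
 */
static int
prealloc_zero_writes(const char *path, off_t size, size_t block)
{
    char    buf[8192];
    off_t   done = 0;
    int     fd;

    if (block == 0 || block > sizeof(buf))
        return -1;
    memset(buf, 0, sizeof(buf));

    /* O_EXCL: fail if the file already exists instead of overwriting it */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    while (done < size)
    {
        size_t  want = block;
        ssize_t n;

        if ((off_t) want > size - done)
            want = (size_t) (size - done);

        n = write(fd, buf, want);
        if (n <= 0)
        {
            close(fd);
            unlink(path);
            return -1;
        }
        done += n;
    }
    return close(fd);
}

/*
 * Hypothetical helper: create "path" and reserve "size" bytes with
 * posix_fallocate(). Note that posix_fallocate() returns an error number
 * directly (it does not set errno). Returns 0 on success, -1 on failure.
 */
static int
prealloc_posix_fallocate(const char *path, off_t size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    int rc;

    if (fd < 0)
        return -1;

    rc = posix_fallocate(fd, 0, size);
    if (rc != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}
```

Timing the two helpers on the same file size, or counting their system calls with a tracing tool, is one way to check the syscall-count theory on a given kernel and C library. Both helpers leave a file of the requested size behind, so only the cost of getting there differs.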

> This is what I plan to benchmark specifically next.

In the interest of keeping this patch moving forward, do you have an
estimate for when this testing will be complete?

> If the posix_fallocate approach is actually slower than what's done
> now when it's not getting kernel acceleration, which is the case on
> RHEL5-era kernels, we might need to make the configure-time test more
> complicated.  Whether posix_fallocate is defined isn't sensitive
> enough; on Linux it may be the case that this is only usable when
> fallocate() is also there.

I'd say that if posix_fallocate is slower than the existing code on
pretty much any platform, we shouldn't commit the patch at all. I would
be quite surprised if that were the case, however.

Regards,
	Jeff Davis




