Re: Commitfest patches - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Commitfest patches
Date
Msg-id 47ED35D2.4000201@enterprisedb.com
Whole thread Raw
In response to Re: Commitfest patches  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
Gregory Stark wrote:
> I want to know if we're interested in the more invasive patch restructuring
> the buffer manager. My feeling is that we probably are eventually. But I
> wonder if people wouldn't feel more comfortable taking baby steps at first
> which will have less impact in cases where it's not being heavily used.
> 
> It's a lot more work to do the invasive patch and there's not much point in
> doing it if we're not interested in taking it when it's ready.

I like the simplicity of this patch. My biggest worry is the impact on 
performance when the posix_fadvise calls are not helping. I'd like to 
see some testing of that. How expensive are the extra bufmgr hash table 
lookups, compared to all the other stuff that's involved in a bitmap 
heap scan?

Another thing I wonder is whether this approach can easily be extended 
to other types of scans than bitmap heap scans. Normal index scans seem 
most interesting; they can generate a lot of random I/O. I don't see any 
big problems there, except again the worst-case performance: If you 
issue AdviseBuffer calls for all the heap tuples in an index page, in 
the naive way, you can issue hundreds of posix_fadvise calls. But what 
if they're all actually on the same page? Surely you only want to go 
through the buffer manager once (like we do in the scan, thanks to 
ReleaseAndReadBuffer()) in that case, and only call posix_fadvise once. 
But again, maybe you can convince me that it's a non-issue by some 
benchmarking.

> Here's an updated patch. It's mainly just a CVS update but it also includes
> some window dressing: an autoconf test, some documentation, and some fancy
> arithmetic to auto-tune the amount of prefetching based on a more real-world
> parameter "effective_spindle_count". 

I like the "effective_spindle_count". That's very intuitive.

The formula to turn that into preread_pages looks complex, though. I can 
see the reasoning behind it, but I doubt thing behave that beautifully 
in the real world. (That's not important right now, we have plenty of 
time to tweak that after more testing.)

> It also includes printing out how much
> the prefetching target got ramped up to in the explain output -- though I'm
> not sure we really want that in the tree, but it's nice for performance
> testing.

I don't understand the ramp up logic. Can you explain what that's for 
and how it works?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Tomas Doran
Date:
Subject: Re: [PATCHES] Implemented current_query
Next
From: Alvaro Herrera
Date:
Subject: Re: [PATCHES] Implemented current_query