On Fri, 2005-02-25 at 00:34 -0800, Jeff Davis wrote:
> I had an idea that might improve parallel seqscans on the same relation.
>
> If you have lots of concurrent seqscans going on a large relation, the
> cache hit ratio is very low. But, if the seqscans are concurrent on the
> same relation, there may be something to gain by starting a seqscan near
> the page being accessed by an already-in-progress seqscan, and wrapping
> back around to that start location. That would make some use of the
> shared buffers, which would otherwise just be cache pollution.

This is cool and was on my list of would-like-to-implement features.
It's usually known as Synchronised Scanning. AFAIK it is free of any
patent restriction: it has already been implemented by both Teradata and
RedBrick.
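
To make the idea concrete, here is a minimal sketch in Python of the
scan order described above. It is not the patch itself and not how
PostgreSQL does it; ScanPositions, circular_scan and the "big_table"
name are hypothetical, made up for illustration. It assumes one shared
hint per relation, and it ignores locking and clearing the hint when a
scan finishes.

```python
from typing import Dict, Iterator


class ScanPositions:
    """Hypothetical shared table: relation name -> block a scan last read."""

    def __init__(self) -> None:
        self._last_block: Dict[str, int] = {}

    def report(self, relation: str, block: int) -> None:
        # A running scan records the block it has just read.
        self._last_block[relation] = block

    def start_block(self, relation: str) -> int:
        # With no hint recorded, start at block 0 like a plain seqscan.
        return self._last_block.get(relation, 0)


def circular_scan(positions: ScanPositions, relation: str,
                  nblocks: int) -> Iterator[int]:
    """Yield every block exactly once, starting at the shared hint
    and wrapping around to the block just before the start."""
    if nblocks <= 0:
        return
    start = positions.start_block(relation) % nblocks
    for offset in range(nblocks):
        block = (start + offset) % nblocks
        positions.report(relation, block)
        yield block


# An earlier scan is assumed to be at block 3 of a 5-block relation.
positions = ScanPositions()
positions.report("big_table", 3)
order = list(circular_scan(positions, "big_table", 5))
```

A scan that joins while another is at block 3 reads blocks 3, 4, 0, 1, 2,
so it still covers the whole relation but begins where the pages are most
likely to be in shared buffers already.
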
> This is the first time I've really modified the PG source code to do
> anything that looked promising, so this is more of a question than
> anything else. Is it promising? Is this a potentially good approach? I'm
> happy to post more test data and more documentation, and I'd also be
> happy to bring the code to production quality.

I'll be happy to help you do this, at least for design and code review.
I'll come back later with more detailed comments on your thoughts so
far.

> However, before I spend
> too much more time on that, I'd like to get a general response from a
> 3rd party to let me know if I'm off base.

Third party?

Best Regards, Simon Riggs