Re: Synchronized scans versus relcache reinitialization - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Synchronized scans versus relcache reinitialization
Date	May 30, 2012 23:50:22
Msg-id	21415.1338432596@sss.pgh.pa.us
In response to	Re: Synchronized scans versus relcache reinitialization (Jeff Davis <pgsql@j-davis.com>)
List	pgsql-hackers

Jeff Davis <pgsql@j-davis.com> writes:
> On Sat, 2012-05-26 at 15:14 -0400, Tom Lane wrote:
>> 3. Having now spent a good deal of time poking at this, I think that the
>> syncscan logic is in need of more tuning, and I am wondering whether we
>> should even have it turned on by default.  It appears to be totally
>> useless for fully-cached-in-RAM scenarios, even if most of the relation
>> is out in kernel buffers rather than in shared buffers.  The best case
>> I saw was less than 2X speedup compared to N-times-the-single-client
>> case, and that wasn't very reproducible, and it didn't happen at all
>> unless I hacked BAS_BULKREAD mode to use a ring buffer size many times
>> larger than the current 256K setting (otherwise the timing requirements
>> are too tight for multiple backends to stay in sync --- a seqscan can
>> blow through that much data in a fraction of a millisecond these days,
>> if it's reading from kernel buffers).  The current tuning may be all
>> right for cases where you're actually reading from spinning rust, but
>> that seems to be a decreasing fraction of real-world use cases.

> Do you mean that the best case you saw ever was 2X, or the best case
> when the table is mostly in kernel buffers was 2X?

I was only examining a fully-cached-in-RAM case.

> I clearly saw better than 2X when the table was on disk, so if you
> aren't, we should investigate.

I don't doubt that syncscan can provide better than 2X speedup if you
have more than 2 concurrent readers for a syncscan traversing data
that's too big to fit in RAM.  What I'm questioning is whether such
cases represent a sufficiently large fraction of our userbase to justify
having syncscan on by default.  I would be happier about having it on
if it seemed to be useful for fully-cached scenarios, but it doesn't.

> One thing we could do is drive the threshold from effective_cache_size
> rather than shared_buffers, which was discussed during 8.3 development.

If we were going to do that, I think that we'd need to consider having
different thresholds for using bulkread access strategy and using
syncscan, because not using bulkread is going to blow out the
shared_buffers cache.  We originally avoided that on the grounds of
not wanting to have to optimize more than 2 behaviors, but maybe it's
time to investigate more.
        regards, tom lane
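
The threshold idea in the last paragraph can be illustrated with a small model. In the PostgreSQL sources of this era, heap scan setup applies one size test (relation larger than a quarter of shared_buffers) to enable both the bulkread access strategy and syncscan. The Python below is a hypothetical sketch, not PostgreSQL code: `choose_scan_plan`, `ScanPlan`, and the choice of effective_cache_size itself as the syncscan cutoff are assumptions made for illustration, since the thread does not settle on an exact rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanPlan:
    use_bulkread: bool   # scan through a small ring buffer (BAS_BULKREAD)
    use_syncscan: bool   # join other backends scanning the same relation


def choose_scan_plan(rel_blocks: int,
                     shared_buffers_blocks: int,
                     effective_cache_blocks: int,
                     syncscan_enabled: bool = True) -> ScanPlan:
    """Hypothetical model of two separate size thresholds.

    All sizes are in blocks (8 kB each by default).
    """
    # Bulkread keeps the existing cutoff: a relation larger than a quarter
    # of shared_buffers would otherwise flush the shared buffer cache.
    bulkread_threshold = shared_buffers_blocks // 4

    # ASSUMPTION: syncscan only pays off once the relation cannot be served
    # from RAM, so its cutoff is driven by effective_cache_size instead.
    syncscan_threshold = effective_cache_blocks

    return ScanPlan(
        use_bulkread=rel_blocks > bulkread_threshold,
        use_syncscan=syncscan_enabled and rel_blocks > syncscan_threshold,
    )


if __name__ == "__main__":
    # 128 MB of shared_buffers and 4 GB of effective_cache_size, in blocks.
    shared, cache = 16384, 524288
    for blocks in (1_000, 100_000, 1_000_000):
        print(blocks, choose_scan_plan(blocks, shared, cache))
```

With these settings a relation of 100,000 blocks (about 780 MB) would still get the ring buffer, protecting shared_buffers, but would skip syncscan because it is expected to fit in the kernel cache. That is the fully-cached case in which no syncscan benefit was observed above.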
