Re: Non-linear Performance - Mailing list pgsql-general

From	Tom Lane
Subject	Re: Non-linear Performance
Date	May 31, 2002 11:05:33
Msg-id	18241.1022857534@sss.pgh.pa.us
In response to	Re: Non-linear Performance (Curt Sampson <cjs@cynic.net>)
Responses	Re: Non-linear Performance
List	pgsql-general

Curt Sampson <cjs@cynic.net> writes:
> On Thu, 30 May 2002, Tom Lane wrote:
>> I guess that the smaller datasets would get proportionally more benefit
>> from kernel disk caching.

> Actually, I re-did the 100m row and 500m row queries from a cold
> start of the machine, and I still get the same results: 10 sec. vs
> 70 sec. (Thus, 7x as long to query only 5x as much data.) So I
> don't think caching is an issue here.

But even from a cold start, there would be cache effects within the
query, viz. fetching the same table block more than once when it is
referenced from different places in the index.  On the smaller table,
the block is more likely to still be in kernel cache when it is next
wanted.

On a pure random-chance basis, you'd not expect that fetching 5k rows
out of 100m would hit the same table block twice --- but I'm wondering
if the data was somewhat clustered.  Do the system usage stats on your
machine reflect the difference between physical reads and reads
satisfied from kernel buffer cache?
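
As a rough check on the random-chance estimate above, here is a minimal sketch
that computes how many of the 5k row fetches would be expected to land on a
table block already fetched earlier in the same query. It assumes the rows are
spread uniformly at random over the table, and it assumes 100 rows per block;
that figure is not given in the thread and is purely illustrative, as are the
function names.

```python
import math


def expected_distinct_blocks(num_fetches: int, num_blocks: int) -> float:
    """Expected number of distinct blocks touched when num_fetches rows are
    drawn uniformly at random from a table of num_blocks blocks.

    A given block is missed by every fetch with probability
    (1 - 1/num_blocks) ** num_fetches, so the expected number of blocks hit
    is num_blocks * (1 - that). log1p/expm1 keep the result accurate when
    1/num_blocks is tiny.
    """
    log_miss = num_fetches * math.log1p(-1.0 / num_blocks)
    return -num_blocks * math.expm1(log_miss)


def expected_repeat_fetches(num_fetches: int, num_blocks: int) -> float:
    """Expected number of fetches that hit a block already fetched earlier."""
    return num_fetches - expected_distinct_blocks(num_fetches, num_blocks)


if __name__ == "__main__":
    ROWS_FETCHED = 5_000        # from the message: "fetching 5k rows"
    TABLE_ROWS = 100_000_000    # from the message: "out of 100m"
    ROWS_PER_BLOCK = 100        # ASSUMPTION: not stated anywhere in the thread

    blocks = TABLE_ROWS // ROWS_PER_BLOCK
    repeats = expected_repeat_fetches(ROWS_FETCHED, blocks)
    print(f"{blocks} blocks: about {repeats:.1f} repeat fetches "
          f"({100.0 * repeats / ROWS_FETCHED:.2f}% of {ROWS_FETCHED})")
```

Under those assumptions only about a dozen of the 5,000 fetches would be
repeats, so in-query cache hits would be negligible for uniformly scattered
rows. A noticeably higher hit rate in practice would point toward clustered
data.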

Or maybe your idea about extra seek time is correct.

            regards, tom lane
