Re: Non-linear Performance - Mailing list pgsql-general

From Tom Lane
Subject Re: Non-linear Performance
Msg-id 18241.1022857534@sss.pgh.pa.us
In response to Re: Non-linear Performance  (Curt Sampson <cjs@cynic.net>)
Responses Re: Non-linear Performance
List pgsql-general
Curt Sampson <cjs@cynic.net> writes:
> On Thu, 30 May 2002, Tom Lane wrote:
>> I guess that the smaller datasets would get proportionally more benefit
>> from kernel disk caching.

> Actually, I re-did the 100m row and 500m row queries from a cold
> start of the machine, and I still get the same results: 10 sec. vs
> 70 sec. (Thus, 7x as long to query only 5x as much data.) So I
> don't think caching is an issue here.

But even from a cold start, there would be cache effects within the
query, viz. fetching the same table block more than once when it is
referenced from different places in the index.  On the smaller table,
the block is more likely to still be in kernel cache when it is next
wanted.
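A toy simulation can illustrate the within-query cache effect described above. This is only a sketch under stated assumptions: the kernel cache is modelled as a plain LRU of fixed size, block references are uniformly random, and the function name, block counts, and cache size are all hypothetical rather than taken from the thread.

```python
import random
from collections import OrderedDict


def simulate_hit_rate(num_blocks, num_fetches, cache_blocks, seed=0):
    """Fraction of block fetches served from a fixed-size LRU cache.

    Toy model (assumption): each fetch picks one of num_blocks table
    blocks uniformly at random, and the cache starts empty (cold start).
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # keys are block numbers, ordered oldest to newest
    hits = 0
    for _ in range(num_fetches):
        block = rng.randrange(num_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / num_fetches


# Hypothetical sizes: same cache and same number of fetches, with the
# second table 5x the size of the first.
small_table = simulate_hit_rate(num_blocks=10_000, num_fetches=50_000, cache_blocks=5_000)
large_table = simulate_hit_rate(num_blocks=50_000, num_fetches=50_000, cache_blocks=5_000)
print(f"small table hit rate: {small_table:.2f}")
print(f"large table hit rate: {large_table:.2f}")
```

With the cache size held constant, the smaller table should show the higher hit rate, since a re-referenced block is less likely to have been evicted in the meantime. A real kernel cache and a real index scan will not behave this simply.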

On a pure random-chance basis, you'd not expect that fetching 5k rows
out of 100m would hit the same table block twice --- but I'm wondering
if the data was somewhat clustered.  Do the system usage stats on your
machine reflect the difference between physical reads and reads
satisfied from kernel buffer cache?
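The pure random-chance estimate can be put in rough numbers. The sketch below assumes rows are spread uniformly over the table and that each block holds 100 rows; that rows-per-block figure is an illustrative guess, not something stated in the thread, and the function name is hypothetical.

```python
import math


def expected_distinct_blocks(num_blocks, rows_fetched):
    """Expected number of distinct blocks touched when rows_fetched rows
    are drawn uniformly at random from a table of num_blocks blocks.

    Uses N * (1 - (1 - 1/N)**k), written with log1p/expm1 for accuracy
    when 1/N is tiny.
    """
    return num_blocks * -math.expm1(rows_fetched * math.log1p(-1.0 / num_blocks))


ROWS_PER_BLOCK = 100         # assumption: not given in the thread
table_rows = 100_000_000     # the 100m-row table
rows_fetched = 5_000         # the 5k rows mentioned above

num_blocks = table_rows // ROWS_PER_BLOCK
distinct = expected_distinct_blocks(num_blocks, rows_fetched)
repeats = rows_fetched - distinct
print(f"expected distinct blocks: {distinct:.1f}")
print(f"expected repeat fetches:  {repeats:.1f}")
```

Under these assumptions, well under 1% of the fetches would revisit a block already read, so any substantial re-fetching would have to come from clustering in the data rather than from chance.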

Or maybe your idea about extra seek time is correct.

            regards, tom lane
