Re: Running a query twice to ensure cached results. - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Running a query twice to ensure cached results.
Msg-id 20060613124208.GC19212@svana.org
In response to Re: Running a query twice to ensure cached results.  ("Luke Lonergan" <llonergan@greenplum.com>)
List pgsql-hackers
On Tue, Jun 13, 2006 at 04:54:05AM -0700, Luke Lonergan wrote:
> > Experimental results here suggest that for larger tables Linux seems
> > to detect a seq-scan and not bother caching.   It's very reproducible
> > for me here to do a reboot and not see the full speedup on a seq_scan
> > until the third time I run a query.
>
> What you are seeing is the now infamous "Postgres writes a table one more
> time after loading" behavior.
>
> Simon Riggs once dug into it to find the root cause, and I no longer recall
> exactly why, but after you've loaded data, the first seq scan will re-write
> some large portion of the data while doing the initial scan. This wreaks
> havoc on normal benchmarking practices.

Could it have something to do with the hint bits? There are a bunch of
bits in the tuple header that speed up MVCC tests. Maybe changing those
bits marks the page dirty and forces a write?
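
To make the guess concrete, here is a toy model in Python. It is only a sketch of the idea, not PostgreSQL's code: the Tuple and Page classes, the seq_scan function, and the flag value are all made up for illustration, and a set lookup stands in for checking the commit log. The first scan after a load sets a hint bit on every tuple and so dirties every page; the second scan finds the bits already set and dirties nothing.

```python
from dataclasses import dataclass

XMIN_COMMITTED = 0x0100  # illustrative flag value for this toy model


@dataclass
class Tuple:
    xmin: int          # id of the inserting transaction
    infomask: int = 0  # hint bits start out unset after a load


@dataclass
class Page:
    tuples: list
    dirty: bool = False


def seq_scan(pages, committed_xids):
    """Scan every page; return how many pages the scan dirtied."""
    dirtied = 0
    for page in pages:
        page.dirty = False  # pretend the page was clean before the scan
        for tup in page.tuples:
            if tup.infomask & XMIN_COMMITTED:
                continue  # hint already set: no lookup, no write
            if tup.xmin in committed_xids:  # stands in for a commit-log lookup
                tup.infomask |= XMIN_COMMITTED
                page.dirty = True  # the page now has to be written back
        if page.dirty:
            dirtied += 1
    return dirtied


# Four freshly loaded pages, all tuples inserted by transaction 100.
pages = [Page([Tuple(xmin=100) for _ in range(50)]) for _ in range(4)]
first = seq_scan(pages, committed_xids={100})
second = seq_scan(pages, committed_xids={100})
print(first, second)
```

If the real behavior is anything like this, the first scan after a load would pay for a full rewrite of the table, which would fit what Luke describes.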

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
