Re: Running a query twice to ensure cached results. - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Running a query twice to ensure cached results.
Msg-id 20060613130019.GE19212@svana.org
In response to Re: Running a query twice to ensure cached results.  ("Luke Lonergan" <llonergan@greenplum.com>)
Responses Re: Running a query twice to ensure cached results.  ("Luke Lonergan" <llonergan@greenplum.com>)
Re: Running a query twice to ensure cached results.  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Tue, Jun 13, 2006 at 05:46:23AM -0700, Luke Lonergan wrote:
> Martin,
>
> On 6/13/06 5:42 AM, "Martijn van Oosterhout" <kleptog@svana.org> wrote:
>
> > Is it possible it may have something to do with the hint bits? There
> > are a bunch of bits in the header to deal with speeding up of MVCC
> > tests. Maybe changing those bits marks the page dirty and forces a
> > write?
>
> Yes, that's it.  Writing data twice (three times including WAL) is a Very
> Bad Thing for large data work - how should we fix it?

Well, I don't think you want to get rid of it entirely, because
otherwise every lookup in that table will forever require a check
to see whether the transaction is committed. So at some point the hint
bits need to be set and/or the xmin frozen (vacuum freeze does that,
for example).

What you might be able to do is to reduce its effect. The thing that
occurs to me is to make hint bit changes only mark a page half-dirty.
If the page is evicted because the space is needed in the buffer cache,
it can be just dropped. However, the bgwriter will write it as normal.
Hence, setting hint bits will become a sort of background operation,
done when there's time.
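
To make the half-dirty idea concrete, here is a toy model in Python. It
is only a sketch of the proposal: ToyBufferPool, PageState and the method
names are made up for illustration and do not correspond to anything in
the actual PostgreSQL buffer manager.

```python
from enum import Enum


class PageState(Enum):
    CLEAN = 0
    HALF_DIRTY = 1  # only hint bits changed; losing them is harmless
    DIRTY = 2       # real data changed; must reach disk


class ToyBufferPool:
    """Hypothetical buffer pool that tracks a third, half-dirty state."""

    def __init__(self):
        self.state = {}   # page number -> PageState
        self.writes = []  # page numbers written to "disk", in order

    def load(self, page):
        self.state[page] = PageState.CLEAN

    def set_hint_bits(self, page):
        # Never downgrade a page that is already fully dirty.
        if self.state[page] is PageState.CLEAN:
            self.state[page] = PageState.HALF_DIRTY

    def modify(self, page):
        self.state[page] = PageState.DIRTY

    def evict(self, page):
        # Only a fully dirty page forces a write at eviction time;
        # a half-dirty page is simply dropped.
        if self.state.pop(page) is PageState.DIRTY:
            self.writes.append(page)

    def bgwriter_pass(self):
        # The background writer treats half-dirty pages like dirty ones.
        for page in sorted(self.state):
            if self.state[page] is not PageState.CLEAN:
                self.writes.append(page)
                self.state[page] = PageState.CLEAN
```

In this model a page whose only change is its hint bits costs no I/O if
it is evicted under memory pressure, but still gets written if the
bgwriter reaches it first.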

It seems to me that if a large table is loaded in a single transaction,
the check for whether the transaction is committed should be cheap,
because it's checking the same transaction id over and over.
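
A rough illustration of why that repeated check could be cheap: remember
the last transaction id that was found committed and skip the slow
lookup when the same id comes up again. This is a sketch under my own
assumptions, not the real code; CommitStatusLookup and the dict standing
in for the commit log are hypothetical.

```python
class CommitStatusLookup:
    """Hypothetical one-entry cache in front of a slow commit-status lookup."""

    def __init__(self, clog):
        self.clog = clog        # dict: xid -> True if committed (stand-in)
        self.last_xid = None    # most recent xid known to be committed
        self.slow_lookups = 0   # how often the slow path was taken

    def is_committed(self, xid):
        if xid == self.last_xid:
            return True  # fast path: same committed xid as last time
        self.slow_lookups += 1
        committed = self.clog.get(xid, False)
        # Only remember a committed result; a transaction that is not
        # committed yet may still commit later, so that answer can change.
        if committed:
            self.last_xid = xid
        return committed
```

With every row in the table carrying the same xid, only the first row
takes the slow path in this model.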

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
