Re: Turning off HOT/Cleanup sometimes - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Turning off HOT/Cleanup sometimes
Msg-id 552E7FB3.7090801@iki.fi
In response to Re: Turning off HOT/Cleanup sometimes  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Turning off HOT/Cleanup sometimes
List pgsql-hackers
On 04/15/2015 05:44 PM, Alvaro Herrera wrote:
> Simon Riggs wrote:
>> On 15 April 2015 at 09:10, Andres Freund <andres@anarazel.de> wrote:
>
>>> I don't really see the downside to this suggestion.
>>
>> The suggestion makes things better than they are now but is still less
>> than I have proposed.
>>
>> If what you both mean is "IMHO this is an acceptable compromise", I
>> can accept it also, at this point in the CF.
>
> Let me see if I understand things.
>
> What we have now is: when reading a page, we also HOT-clean it.  This
> runs HOT-cleanup a large number of times, and causes many pages to
> become dirty.
>
> Your patch is "when reading a page, HOT-clean it, but only 5 times in
> each scan".  This runs HOT-cleanup at most 5 times, and causes at most 5
> pages to become dirty.
>
> Robert's proposal is "when reading a page, if dirty HOT-clean it; if not
> dirty, also HOT-clean it but only 5 times in each scan".  This runs
> HOT-cleanup some number of times (as many as there are dirty), and
> causes at most 5 pages to become dirty.
>
>
> Am I right in thinking that HOT-clean in a dirty page is something that
> runs completely within CPU cache?  If so, it would be damn fast and
> would have benefits for future readers, for very little cost.

If there are many tuples on the page, it takes some CPU effort to scan 
all the HOT chains and move tuples around. Also, it creates a WAL 
record, which isn't free.
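
To make the trade-off concrete, here is a toy model of the three policies as Alvaro summarizes them in the quoted text. It is only a sketch, not PostgreSQL code: the Page class, the simulate function and the policy names are made up for illustration, and it assumes every prune attempt on a prunable page costs one pass over the page plus one WAL record. The limit of 5 is the figure quoted above.

```python
from dataclasses import dataclass

PRUNE_LIMIT = 5  # per-scan cap quoted in the thread


@dataclass
class Page:
    dirty: bool      # page was already dirty before the scan reached it
    prunable: bool   # page has dead HOT tuples that pruning could remove


def simulate(pages, policy):
    """Return (prune_calls, newly_dirtied) for one scan.

    Hypothetical policy names:
      "current"    - prune every prunable page that is read
      "limit"      - prune at most PRUNE_LIMIT pages per scan
      "dirty_free" - always prune already-dirty pages; prune at most
                     PRUNE_LIMIT clean pages per scan
    """
    prune_calls = 0
    newly_dirtied = 0
    budget = PRUNE_LIMIT

    for page in pages:
        if not page.prunable:
            continue

        if policy == "current":
            do_prune, uses_budget = True, False
        elif policy == "limit":
            do_prune, uses_budget = budget > 0, True
        elif policy == "dirty_free":
            if page.dirty:
                do_prune, uses_budget = True, False
            else:
                do_prune, uses_budget = budget > 0, True
        else:
            raise ValueError(f"unknown policy: {policy}")

        if do_prune:
            prune_calls += 1          # CPU work plus one WAL record
            if not page.dirty:
                newly_dirtied += 1    # a clean page now has to be written out
            if uses_budget:
                budget -= 1

    return prune_calls, newly_dirtied


# 40 prunable pages, alternating dirty and clean.
pages = [Page(dirty=(i % 2 == 0), prunable=True) for i in range(40)]
results = {p: simulate(pages, p) for p in ("current", "limit", "dirty_free")}
```

In this model the "dirty_free" policy caps newly dirtied pages at 5, but the number of prune calls still grows with the number of already-dirty pages, and each of those calls pays the CPU and WAL cost mentioned above.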

Another question is whether the patch can reliably detect whether it's
doing a "read-only" scan or not. I haven't tested, but I suspect it
would not prune for something like "INSERT INTO foo SELECT * FROM
foo WHERE blah", i.e. when the target relation is referenced twice in
the same statement: once as the target and a second time as a source.
Maybe that's OK, though.

- Heikki



