Re: Turning off HOT/Cleanup sometimes - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Turning off HOT/Cleanup sometimes
Date
Msg-id CANP8+jLCGAjL9GU8Qn-ojtzjJx-wtsp2tKkAcXfjRou-o2JC3A@mail.gmail.com
In response to Re: Turning off HOT/Cleanup sometimes  (Andres Freund <andres@anarazel.de>)
Responses Re: Turning off HOT/Cleanup sometimes
List pgsql-hackers
On 16 April 2015 at 15:21, Andres Freund <andres@anarazel.de> wrote:
> On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
>> I think you're failing to consider that in the patch there is a
>> distinction between read-only page accesses and page updates.  During a
>> page update, HOT cleanup is always done even with the patch, so there
>> won't be any additional bloat that would not be there without the
>> patch.
>
> That's not really true (and my benchmark upthread proves it). The fact
> that hot pruning only happens when we can get a cleanup lock means that
> we can end up with more pages that are full, if we prune on select less
> often. Especially if SELECTs are more frequent than write accesses -
> pretty darn common - the likelihood of SELECTs getting the lock is
> correspondingly higher.

Your point that we *must* do *some* HOT cleanup on SELECTs is proven beyond question. Alvaro has not disputed that; ISTM you misread him. Pavan has questioned that point, but the results upthread are there; he explains he hasn't read them yet.

The only question is "how much cleanup on SELECT?" Having one SELECT hit 10,000 cleanups while another hits 0 creates unfairness and unpredictability in the way we work. Maybe some people running a backup actually like the fact that it cleans the database; others think that is a bad thing. Few people issuing large queries think it is good behaviour. Anybody running replication also knows that this causes a huge slam of WAL, which can increase replication delay, and that is a concern for HA.

That is how we arrive at the idea of a cleanup limit, further enhanced by a limit that applies only to dirtying clean blocks, which has (I think) 4 recent votes in favour.
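
To make the idea concrete, here is a minimal sketch in C of a limit that counts only clean blocks dirtied by a read-only scan. It is an illustration of the concept, not code from the patch: the names ScanCleanupBudget, scan_may_prune and simulate_scan are hypothetical, and the sketch assumes every page visited has something to prune and that the cleanup lock is always available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-scan state; not taken from the patch. */
typedef struct ScanCleanupBudget
{
    int dirty_limit;    /* max clean pages this scan may dirty by pruning */
    int dirtied;        /* clean pages dirtied by this scan so far */
} ScanCleanupBudget;

/*
 * Decide whether a read-only scan should prune the current page.
 * Pruning a page that is already dirty causes no extra page write,
 * so it is always allowed and is not charged to the budget.
 */
bool
scan_may_prune(ScanCleanupBudget *budget, bool page_already_dirty)
{
    assert(budget != NULL);

    if (page_already_dirty)
        return true;
    if (budget->dirtied >= budget->dirty_limit)
        return false;           /* budget spent: leave the clean page alone */
    budget->dirtied++;
    return true;
}

/*
 * Toy scan over npages pages, where page_dirty[i] says whether page i
 * was already dirty. Returns the number of pages the scan pruned.
 */
int
simulate_scan(ScanCleanupBudget *budget, const bool *page_dirty, size_t npages)
{
    int pruned = 0;

    for (size_t i = 0; i < npages; i++)
    {
        if (scan_may_prune(budget, page_dirty[i]))
            pruned++;
    }
    return pruned;
}
```

With a limit of 4, a scan over 10 clean pages prunes only the first 4 and leaves the rest for a later scan, an update, or VACUUM, so no single SELECT absorbs all of the cleanup cost.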

I would personally be in favour of a parameter to control the limit, since whatever we choose is right or wrong depending upon circumstances. I am, however, comfortable with not having a parameter if people think it would be hard to tune, which I agree it would be; hence no parameter in the patch.
 
