Re: Turning off HOT/Cleanup sometimes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Turning off HOT/Cleanup sometimes
Date
Msg-id 3329.1389288097@sss.pgh.pa.us
In response to Re: Turning off HOT/Cleanup sometimes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Turning off HOT/Cleanup sometimes  (Robert Haas <robertmhaas@gmail.com>)
Re: Turning off HOT/Cleanup sometimes  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 8, 2014 at 3:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> We also make SELECT clean up blocks as it goes. That is useful in OLTP
>> workloads, but it means that large SQL queries and pg_dump effectively
>> do much the same work as VACUUM, generating huge amounts of I/O and
>> WAL on the master, the cost and annoyance of which is experienced
>> directly by the user. That is avoided on standbys.

> On a pgbench workload, though, essentially all page cleanup happens as
> a result of HOT cleanups, like >99.9%.  It might be OK to have that
> happen for write operations, but it would be a performance disaster if
> updates didn't try to HOT-prune.  Our usual argument for doing HOT
> pruning even on SELECT cleanups is that not doing so pessimizes
> repeated scans, but there are clearly cases that end up worse off as a
> result of that decision.

My recollection of the discussion when HOT was developed is that it works
that way not because anyone thought it was beneficial, but simply because
we didn't see an easy way to know when first fetching a page whether we're
going to try to UPDATE some tuple on the page.  (And we can't postpone the
pruning, because the query will have tuple pointers into the page later.)
Maybe we should work a little harder on passing that information down.
It seems reasonable to me that SELECTs shouldn't be tasked with doing
HOT pruning.
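
For illustration only, here is a toy model of what "passing that information down" could look like. It is a sketch under stated assumptions, not PostgreSQL code: the names Page, AccessIntent, prune, and fetch_page are all hypothetical, and a real page holds far more state than two lists. The point it shows is that pruning has to be decided at fetch time, before tuple pointers are handed out, so the fetch path needs to be told whether the caller might modify the page.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessIntent(Enum):
    """Hypothetical hint passed down from the executor to the page fetch."""
    READ_ONLY = "read_only"    # e.g. a plain SELECT or a dump
    MAY_MODIFY = "may_modify"  # e.g. UPDATE or DELETE


@dataclass
class Page:
    """Toy page: live tuples plus dead versions awaiting pruning."""
    live: list = field(default_factory=list)
    dead: list = field(default_factory=list)
    prune_count: int = 0       # how many times this page was actually pruned


def prune(page: Page) -> None:
    """Discard dead versions; a no-op when there is nothing to remove."""
    if page.dead:
        page.dead.clear()
        page.prune_count += 1


def fetch_page(page: Page, intent: AccessIntent) -> list:
    """Return the live tuples, pruning first only if the caller may modify.

    Pruning cannot be deferred until after this call, because the caller
    will be holding references into the page from here on.
    """
    if intent is AccessIntent.MAY_MODIFY:
        prune(page)
    return list(page.live)
```

In this model a read-only scan leaves the dead versions in place and generates no cleanup work, while the first fetch on behalf of a modifying statement prunes the page once.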

> I'm not entirely wild about adding a parameter in this area because it
> seems that we're increasingly choosing to further expose what arguably
> ought to be internal implementation details.

I'm -1 for a parameter as well, but I think that just stopping SELECTs
from doing pruning at all might well be a win.  It's at least worthy
of some investigation.
        regards, tom lane


