Re: Do we need so many hint bits? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Do we need so many hint bits?
Date
Msg-id 2718.1353036070@sss.pgh.pa.us
In response to Do we need so many hint bits?  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Do we need so many hint bits?  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Re: Do we need so many hint bits?  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Do we need so many hint bits?  (Jeff Davis <pgsql@j-davis.com>)
Re: Do we need so many hint bits?  (Merlin Moncure <mmoncure@gmail.com>)
Re: Do we need so many hint bits?  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Jeff Davis <pgsql@j-davis.com> writes:
> It occurred to me recently that many of the hint bits aren't terribly
> important (at least it's not obvious to me). HEAP_XMIN_COMMITTED clearly
> has a purpose, and we'd expect it to be used many times following the
> initial CLOG lookup.

Right.

> But the other tuple hint bits seem to be there just for symmetry,
> because they shouldn't last long. If HEAP_XMIN_INVALID or
> HEAP_XMAX_COMMITTED is set, then it's (hopefully) going to be vacuumed
> soon, and gone completely. And if HEAP_XMAX_INVALID is set, then it
> should just be changed to InvalidTransactionId.

Hm.  It is not cheaper to change xmax to 0 than it is to set the hint
bit --- you still need a write, and there are also added locking and
atomicity worries --- so I'm not convinced by your argument there.
But you might be right that the expected number of wins from the other
two bits is a lot less.

> Removing those 3 hints would give us 3 more flag bits (eventually, after
> we are sure they aren't just leftover), and it would also reduce the
> chance that a page is dirtied for no other reason than to set them.

We aren't pressed for flag bits particularly.  I think the main
attraction of this idea is precisely to reduce unnecessary page dirties,
and so that leads me to suggest a variant: keep the four bits defined as
now, but do not attempt to set XMIN_INVALID or XMAX_COMMITTED unless the
page is already dirty.  This would make it a straight-up trade of more
clog consultation for fewer page dirties.
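
To make the variant concrete, here is a minimal sketch of that rule on a toy model. It is not actual PostgreSQL code: ToyPage, ToyTuple and toy_set_hint_bits are hypothetical names invented for illustration, and the real hint-bit path works on buffers and has other conditions that are ignored here. Only the four flag names and their bit values are meant to mirror the heap tuple header definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names and values mirror PostgreSQL's t_infomask hint bits. */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMAX_COMMITTED 0x0400
#define HEAP_XMAX_INVALID   0x0800

/* Hypothetical stand-ins for a heap page and a tuple header. */
typedef struct ToyPage  { bool dirty; } ToyPage;
typedef struct ToyTuple { uint16_t infomask; } ToyTuple;

/*
 * Set a hint bit, except that XMIN_INVALID and XMAX_COMMITTED are only
 * set when the page is already dirty.  Returns true if the bit was set.
 * When it returns false, the next reader has to consult clog again.
 */
static bool
toy_set_hint_bits(ToyPage *page, ToyTuple *tuple, uint16_t hint)
{
    const uint16_t low_value = HEAP_XMIN_INVALID | HEAP_XMAX_COMMITTED;

    /* Never dirty a clean page just to record a short-lived hint. */
    if ((hint & low_value) != 0 && !page->dirty)
        return false;

    tuple->infomask |= hint;
    page->dirty = true;     /* setting a hint means the page needs a write */
    return true;
}
```

In this sketch a clean page refuses HEAP_XMAX_COMMITTED and stays clean, while HEAP_XMIN_COMMITTED and HEAP_XMAX_INVALID are always set and dirty the page as they do today. Once the page is dirty for any reason, the two low-value hints are set as well, since the write is being paid for anyway.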

> Also, I am wondering about PD_ALL_VISIBLE. It was originally introduced
> in the visibility map patch, apparently as a way to know when to clear
> the VM bit when doing an update. It was then also used for scans, which
> showed a significant speedup. But I wonder: why not just use the
> visibilitymap directly from those places?

Added contention for access to the visibility map pages would be a
problem.
        regards, tom lane


