Re: Visibility map, partial vacuums - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Visibility map, partial vacuums
Date
Msg-id 26361.1227467112@sss.pgh.pa.us
Whole thread Raw
In response to Re: Visibility map, partial vacuums  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Visibility map, partial vacuums  (Jeff Davis <pgsql@j-davis.com>)
Re: Visibility map, partial vacuums  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> I committed the changes to FSM truncation yesterday, that helps with the 
> truncation of the visibility map as well. Attached is an updated 
> visibility map patch.

I looked over this patch a bit ...

> 1. The bits in the visibility map are set in the 1st phase of lazy 
> vacuum. That works, but it means that after a delete or update, it takes 
> two vacuums until the bit in the visibility map is set. The first vacuum 
> removes the dead tuple, and only the second sees that there's no dead 
> tuples and sets the bit.

I think this is probably not a big issue really.  The point of this change
is to optimize things for pages that are static over the long term; one
extra vacuum cycle before the page is deemed static doesn't seem like a
problem.  You could even argue that this saves I/O because we don't set
the bit (and perhaps later have to clear it) until we know that the page
has stayed static across a vacuum cycle and thus has a reasonable
probability of continuing to do so.

A possible problem is that if a relation is filled all in one shot,
autovacuum would trigger a single vacuum cycle on it and then never have
a reason to trigger another; leading to the bits never getting set (or
at least not till an antiwraparound vacuum occurs).  We might want to
tweak autovac so that an extra vacuum cycle occurs in this case.  But
I'm not quite sure what a reasonable heuristic would be.

Some other points:

* ISTM that the patch is designed on the plan that the PD_ALL_VISIBLE
page header flag *must* be correct, but it's really okay if the backing
map bit *isn't* correct --- in particular we don't trust the map bit
when performing antiwraparound vacuums.  This isn't well documented.

* Also, I see that vacuum has a provision for clearing an incorrectly
set PD_ALL_VISIBLE flag, but shouldn't it fix the map too?

* It would be good if the visibility map fork were never created until
there is occasion to set a bit in it; this would for instance typically
mean that temp tables would never have one.  I think that
visibilitymap.c doesn't get this quite right --- in particular
vm_readbuf seems willing to create/extend the fork whether its extend
argument is true or not, so it looks like an inquiry operation would
cause the map fork to be created.  It should be possible to act as
though a nonexistent fork just means "all zeroes".

* heap_insert's all_visible_cleared variable doesn't seem to get
initialized --- didn't your compiler complain?

* You missed updating SizeOfHeapDelete and SizeOfHeapUpdate
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: portability of "designated initializers"
Next
From: Devrim GÜNDÜZ
Date:
Subject: Logging auto_explain outputs to another log file