Re: Write visibility map during CLUSTER/VACUUM FULL - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Write visibility map during CLUSTER/VACUUM FULL
Date
Msg-id 20210830002642.GT26465@telsasoft.com
Whole thread Raw
In response to Re: Write visibility map during CLUSTER/VACUUM FULL  (Anna Akenteva <a.akenteva@postgrespro.ru>)
Responses Re: Write visibility map during CLUSTER/VACUUM FULL
Re: Write visibility map during CLUSTER/VACUUM FULL
List pgsql-hackers
On Mon, Jun 28, 2021 at 11:22:01AM +0300, Anna Akenteva wrote:
> On 2019-11-29 05:32, Michael Paquier wrote:
> > These comments are unanswered for more than 2 months, so I am marking
> > this entry as returned with feedback.
> 
> I'd like to revive this patch. To make further work easier, attaching a
> rebased version of the latest patch by Alexander.
> 
> As for having yet another copy of logic determining visibility: the
> visibility check was mainly taken from heap_page_is_all_visible(), so I
> refactored the code to make sure that logic is not duplicated. The updated
> patch is attached too.

I agree that it's strange that VACUUM(FREEZE) freezes tuples but not VM bits,
nor page-level PD_ALL_VISIBLE (which is implied by all frozen).  After FULL
runs (taking an exclusive lock on the table), it's necessary to then vacuum the
table again to get what's intended.

Rebased on f10f0ae420ee62400876ab34dca2c09c20dcd030.

And rephrased Anna's two independent/alternate patches as a 2nd patch on top of
the 1st, as that helps me to read it and reduces its total size.

I noticed in testing the patch that autovacuum is still hitting the relation
shortly after vacuum full.  This is due to n_ins_since_autovacuum, which is new
in pg13.  I don't know how to address that (or even if it should be addressed
at all).

Also, pg_class.relallvisible is not set unless vacuum/analyze is run again
(which is mitigated by the n_ins behavior above).  It seems like this might be
important: an plan which uses index-only scan might be missed in favour of
something else until autovacuum runs (it will use cost-based delay, and might
not be scheduled immediately, could be interrupted, or even diasbled).

I'm testing like this:
CREATE EXTENSION IF NOT EXISTS pg_visibility ; DROP TABLE t; CREATE TABLE t AS SELECT generate_series(1,99999); VACUUM
FULLt; ANALYZE t; SELECT n_ins_since_vacuum, n_tup_upd, n_tup_del, n_tup_hot_upd FROM pg_stat_all_tables WHERE
relname='t';SELECT relpages, relallvisible FROM pg_class WHERE oid='t'::regclass; SELECT COUNT(1), COUNT(1)FILTER(WHERE
all_visible='t')allvis, COUNT(1)FILTER(WHERE all_frozen='t') allfrz, COUNT(1)FILTER(WHERE pd_all_visible='t') allvispd
FROMpg_visibility('t');
 

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Yura Sokolov
Date:
Subject: jff: checksum algorithm is not as intended
Next
From: Michael Paquier
Date:
Subject: Re: [PATCH] Tab completion for ALTER TABLE … ADD …