Thread: Why doesn't Vacuum FULL update the VM

Why doesn't Vacuum FULL update the VM

From
Melanie Plageman
Date:
Hi,

I noticed that VACUUM FULL actually does freeze the tuples in the
rewritten table (heap_freeze_tuple()) but then it doesn't mark them
all visible or all frozen in the visibility map. I don't understand
why. It seems like it would save us future work.

Here is an example:

create extension pg_visibility;
drop table if exists foo;
create table foo(a int) with (autovacuum_enabled=false);
insert into foo select i%3 from generate_series(1,300)i;
update foo set a = 5 where a = 2;
select * from pg_visibility_map_summary('foo');
vacuum (verbose) foo;
select * from pg_visibility_map_summary('foo');
vacuum (full, verbose) foo;
select * from pg_visibility_map_summary('foo');

I don't see why the visibility map shouldn't be updated so that all of
the pages show all visible and all frozen for this relation after the
vacuum full.

- Melanie



Re: Why doesn't Vacuum FULL update the VM

From
Vik Fearing
Date:
On 9/1/23 21:34, Melanie Plageman wrote:
> Hi,
> 
> I noticed that VACUUM FULL actually does freeze the tuples in the
> rewritten table (heap_freeze_tuple()) but then it doesn't mark them
> all visible or all frozen in the visibility map. I don't understand
> why. It seems like it would save us future work.

I have often wondered this as well, but obviously I haven't done 
anything about it.

> I don't see why the visibility map shouldn't be updated so that all of
> the pages show all visible and all frozen for this relation after the
> vacuum full.

It cannot just blindly mark everything all visible and all frozen 
because it will copy over dead tuples that concurrent transactions are 
still allowed to see.
-- 
Vik Fearing




Re: Why doesn't Vacuum FULL update the VM

From
Peter Geoghegan
Date:
On Fri, Sep 1, 2023 at 12:34 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I don't see why the visibility map shouldn't be updated so that all of
> the pages show all visible and all frozen for this relation after the
> vacuum full.

There was a similar issue with COPY FREEZE. It was fixed relatively
recently -- see commit 7db0cd21.

--
Peter Geoghegan



Re: Why doesn't Vacuum FULL update the VM

From
Melanie Plageman
Date:
On Fri, Sep 1, 2023 at 8:38 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Fri, Sep 1, 2023 at 12:34 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I don't see why the visibility map shouldn't be updated so that all of
> > the pages show all visible and all frozen for this relation after the
> > vacuum full.
>
> There was a similar issue with COPY FREEZE. It was fixed relatively
> recently -- see commit 7db0cd21.

Thanks for digging that up for me!

My first thought after looking a bit at the vacuum full/cluster code
is that we could add an all_visible flag to the RewriteState and set
it to false in heapam_relation_copy_for_cluster() in roughly the same
cases as heap_page_is_all_visible(), then, if rwstate->all_visible is
true in raw_heap_insert(), when we need to advance to the next block,
we set the page all visible and update the VM. Either way, we reset
all_visible to true since we are advancing to the next block.

I wrote a rough outline of that idea in the attached patches. It
doesn't emit WAL for the VM update or handle toast tables or anything
(it is just a rough sketch), but I just wondered if this was in the
right direction.

- Melanie

Attachment