Re: should vacuum's first heap pass be read-only? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: should vacuum's first heap pass be read-only?
Date
Msg-id CA+TgmoY233jGJphik-hLb56JEDpW0Bks23zi8rq-jmAyiF-L3Q@mail.gmail.com
In response to Re: should vacuum's first heap pass be read-only?  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: should vacuum's first heap pass be read-only?  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers

On Tue, Apr 5, 2022 at 4:30 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Apr 5, 2022 at 1:10 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > I had assumed that this would not be the case, because if the page is
> > being accessed by the workload, it can be pruned - and probably frozen
> > too, if we wanted to write code for that and spend the cycles on it -
> > and if it isn't, pruning and freezing probably aren't needed.
>
> [ a lot of things ]

I don't understand what any of this has to do with the point I was raising here.

> > > But, these same LP_DEAD-heavy tables *also* have a very decent
> > > chance of benefiting from a better index vacuuming strategy, something
> > > *also* enabled by the conveyor belt design. So overall, in either scenario,
> > > VACUUM concentrates on problems that are particular to a given table
> > > and workload, without being hindered by implementation-level
> > > restrictions.
> >
> > Well, this is what I'm not sure about. We need to demonstrate that
> > there are at least some workloads where retiring the LP_DEAD line
> > pointers doesn't become the dominant concern.
>
> It will eventually become the dominant concern. But that could take a
> while, compared to the growth in indexes.
>
> An LP_DEAD line pointer stub in a heap page is 4 bytes. The smallest
> possible B-Tree index tuple is 20 bytes on mainstream platforms (16
> bytes + 4 byte line pointer). Granted, deduplication makes this less
> true, but that's far from guaranteed to help. Also, many tables have
> way more than one index.
>
> Of course it isn't nearly as simple as comparing the bytes of bloat in
> each case. More generally, I don't claim that it's easy to
> characterize which factor is more important, even in the abstract,
> even under ideal conditions -- it's very hard. But I'm sure that there
> are routinely very large differences among indexes and the heap
> structure.

Yeah, I think we need to better understand how this works out.
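
As a rough starting point, the byte counts quoted above can be put side by side. The sketch below is only back-of-the-envelope arithmetic under loud assumptions: every index tuple is the minimum size quoted (16 bytes plus a 4 byte line pointer), deduplication does not help at all, and one dead row leaves exactly one stale tuple in every index. The function name and constants are made up for illustration and are not PostgreSQL identifiers.

```python
# Back-of-the-envelope comparison of per-dead-row bloat, using only the
# sizes quoted in the thread. All names here are hypothetical.

LP_DEAD_STUB_BYTES = 4       # heap: LP_DEAD line pointer stub
MIN_BTREE_TUPLE_BYTES = 16   # index: smallest B-Tree tuple (assumed worst case)
LINE_POINTER_BYTES = 4       # index: line pointer for that tuple


def bloat_per_dead_row(n_indexes):
    """Return (heap_bytes, index_bytes) left behind by one dead row.

    Assumes no deduplication and one stale tuple per index.
    """
    heap_bytes = LP_DEAD_STUB_BYTES
    index_bytes = n_indexes * (MIN_BTREE_TUPLE_BYTES + LINE_POINTER_BYTES)
    return heap_bytes, index_bytes


if __name__ == "__main__":
    for n in (1, 3, 5):
        heap, index = bloat_per_dead_row(n)
        print(f"{n} index(es): heap {heap} B, indexes {index} B "
              f"({index // heap}x)")
```

With one index this gives 20 bytes of index bloat against 4 bytes in the heap, and the gap widens linearly with the number of indexes. It says nothing about which kind of bloat hurts more in practice, which is the part that still needs to be understood.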

-- 
Robert Haas
EDB: http://www.enterprisedb.com