Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From John Naylor
Subject Re: New strategies for freezing, advancing relfrozenxid early
Msg-id CAFBsxsE4yxzj=vdbWVgEZsr3u12qw6SdXgRLr7uVGyqdnrxpCg@mail.gmail.com
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Wed, Sep 14, 2022 at 12:53 AM Peter Geoghegan <pg@bowt.ie> wrote:

> This is still only scratching the surface of what is possible with
> dead_items. The visibility map snapshot concept can enable a far more
> sophisticated approach to resource management in vacuumlazy.c. It
> could help us to replace a simple array of item pointers (the current
> dead_items array) with a faster and more space-efficient data
> structure. Masahiko Sawada has done a lot of work on this recently, so
> this may interest him.

I don't quite see how it helps "enable" that. It'd be more logical to
me to say the VM snapshot *requires* you to think harder about
resource management, since a palloc'd snapshot should surely be
counted as part of the configured memory cap that admins control.
(Commonly, it'll be less than a few dozen MB, so I'll leave that
aside.) Since Masahiko hasn't (to my knowledge) gone as far as
integrating his ideas into vacuum, I'm not sure if the current state
of affairs has some snag that a snapshot will ease, but if there is,
you haven't described what it is.

I do remember your foreshadowing in the radix tree thread a while
back, and I do think it's an intriguing idea to combine pages-to-scan
and dead TIDs in the same data structure. The devil is in the details,
of course. It's worth looking into.

> VM snapshots could also make it practical for the new data structure
> to spill to disk to avoid multiple index scans/passed by VACUUM.

I'm not sure spilling to disk is solving the right problem (as opposed
to the hash join case, or to the proposed conveyor belt system which
has a broader aim). I've found several times that a customer will ask
if raising maintenance_work_mem from 1GB to 10GB will make vacuum
faster. Looking at the count of index scans, it's pretty much always
"1", so even if the current approach could scale above 1GB, the answer
is "no": raising that limit wouldn't help.

Your mileage may vary, of course.
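
To make the arithmetic behind that "no" concrete, here is a rough sketch of my own (not code from the tree). It assumes each dead TID costs 6 bytes in the flat array, as with ItemPointerData, and it ignores the array header and the fact that the real allocation limit is slightly under 1GB. The names tid_capacity and index_passes are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each dead TID occupies 6 bytes, as in ItemPointerData. */
#define TID_BYTES 6

/* How many dead TIDs fit in a flat array of mem_bytes (hypothetical helper). */
static uint64_t
tid_capacity(uint64_t mem_bytes)
{
    return mem_bytes / TID_BYTES;
}

/*
 * Index passes needed for a given number of dead TIDs: one pass per
 * array fill, i.e. ceil(dead_tids / capacity). Zero dead TIDs means
 * no index pass at all.
 */
static uint64_t
index_passes(uint64_t dead_tids, uint64_t mem_bytes)
{
    uint64_t    cap = tid_capacity(mem_bytes);

    assert(cap > 0);
    if (dead_tids == 0)
        return 0;
    return (dead_tids + cap - 1) / cap;
}
```

Under these assumptions a 1GB array holds roughly 179 million TIDs, so a vacuum that collects 100 million dead TIDs needs one index pass at 1GB and still one at 10GB. Only a table exceeding the capacity in a single vacuum would see the count go to 2 or more.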

Continuing my customer example, searching the dead TID list faster
*will* make vacuum faster. The proposed tree structure is more memory
efficient, and IIUC could scale beyond 1GB automatically since each
node is a separate allocation, so the answer will be "yes" in the rare
case the current setting is in fact causing multiple index scans.
Furthermore, it doesn't have to anticipate the maximum size: there
is no up-front calculation assuming max-tuples-per-page, so it
automatically uses less memory for less demanding tables.
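
For comparison, the lookup that a tree would be replacing is, as far as I understand it, a binary search over a sorted flat array of TIDs, done once per index tuple. A minimal sketch, where DeadTid, cmp_tid and tid_is_dead are hypothetical stand-ins and not the names used in vacuumlazy.c:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ItemPointerData: heap block plus line pointer. */
typedef struct DeadTid
{
    uint32_t    block;
    uint16_t    offset;
} DeadTid;

/* Order by block number first, then by offset within the block. */
static int
cmp_tid(const void *a, const void *b)
{
    const DeadTid *l = (const DeadTid *) a;
    const DeadTid *r = (const DeadTid *) b;

    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/* True if key is in the sorted array tids[0..n-1]; O(log n) per probe. */
static bool
tid_is_dead(const DeadTid *tids, size_t n, DeadTid key)
{
    return bsearch(&key, tids, n, sizeof(DeadTid), cmp_tid) != NULL;
}
```

Every index tuple pays that O(log n) probe against the whole array, which is why a structure with cheaper lookups can speed up vacuum even when the array never fills.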

(But +1 for changing that calculation for as long as we do have the
single array.)
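
For anyone following along, a simplified sketch of the kind of up-front sizing being discussed. This is my own approximation, not the actual function: it assumes 6 bytes per TID and 291 as the max-tuples-per-page figure for 8kB pages, and array_slots is a made-up name.

```c
#include <stdint.h>

/* Assumptions: 6-byte TIDs, and 291 line pointers at most per 8kB heap page. */
#define TID_BYTES 6
#define MAX_TUPLES_PER_PAGE 291

/*
 * Slots reserved for the dead TID array: bounded by the memory budget
 * and by the worst case of every page being full of dead line pointers,
 * with a floor of one page's worth.
 */
static uint64_t
array_slots(uint64_t mem_bytes, uint64_t rel_pages)
{
    uint64_t    by_mem = mem_bytes / TID_BYTES;
    uint64_t    by_rel = rel_pages * MAX_TUPLES_PER_PAGE;
    uint64_t    slots = (by_mem < by_rel) ? by_mem : by_rel;

    if (slots < MAX_TUPLES_PER_PAGE)
        slots = MAX_TUPLES_PER_PAGE;
    return slots;
}
```

With these numbers a 1000-page table reserves 291,000 slots (about 1.7MB) regardless of how many dead tuples it really has, and a table of 10 million pages simply hits the memory cap. A structure that grows node by node needs no such worst-case estimate.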

--
John Naylor
EDB: http://www.enterprisedb.com


