Home > mailing lists

Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers

From	Matthias van de Meent
Subject	Re: Disabling Heap-Only Tuples
Date	September 19, 2023 17:33:22
Msg-id	CAEze2WiDqwVsTa_Ln=wG535=GSNF+SKDeeTJAXT_vmTYsYptmg@mail.gmail.com Whole thread
In response to	Re: Disabling Heap-Only Tuples (Andres Freund <andres@anarazel.de>)
Responses	Re: Disabling Heap-Only Tuples
List	pgsql-hackers

Tree view

On Tue, 19 Sept 2023 at 18:56, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-09-19 18:30:44 +0200, Alvaro Herrera wrote:
> > This makes me think that maybe the logic needs to be a little more
> > complex to avoid the problem you describe: if an UPDATE is prevented
> > from being HOT because of this setting, but then it goes and consults
> > FSM and it gives the update a higher block number than the tuple's
> > current block (or it fails to give a block number at all so it is forced
> > to extend the relation), then the update should give up on that strategy
> > and use a HOT update after all.  (I have not read the actual patch;
> > maybe it already does this?  It sounds kinda obvious.)
>
> Yea, a setting like what's discussed here seems, uh, not particularly useful
> for achieving the goal of compacting tables.  I don't think guiding this
> through SQL makes a lot of sense. For decent compaction you'd want to scan the
> table backwards, and move rows from the end to earlier, but stop once
> everything is filled up. You can somewhat do that from SQL, but it's going to
> be awkward and slow.  I doubt you even want to use the normal UPDATE WAL
> logging.

We can't move tuples around (or, not that I know of) without using a
transaction ID to control the visibility of the two locations of that
tuple. Doing table compaction would thus likely require using
transactions to move these tuples around. Using a single backend and
bulk operations, it'll still lock each tuple that is being moved, and
that can be noticed by user DML queries. I'd rather make the user's
queries move the data around than this long-duration, locking
background operation.

> I think having explicit compaction support in VACUUM or somewhere similar
> would make sense, but I don't think the proposed GUC is a useful stepping
> stone.

The point of this GUC is that the compaction can happen organically in
the user's UPDATE workflow, so that there is no long locking operation
going on (as you would see with VACUUM FULL / CLUSTER / pg_repack).

> > > But without any kind of auto-tuning, in my opinion, it's a fairly poor
> > > feature. Sure, some people will get use out of it, if they're
> > > sufficiently knowledgeable and sufficiently determined. But I think
> > > for most people in most situations, it will be a struggle.
>
> Indeed. I think it'd often just explode table and index sizes, because HOT
> pruning won't be able to make usable space in pages anymore (due to dead
> items).

You seem to misunderstand the latest patch. It explicitly only blocks
local updates if the update can then move the new tuple to an earlier
page. If that is not possible, then it'll insert locally (assuming
that is still possible) and HOT can then still apply.

And yes, moving tuples to earlier pages will indeed increase index
bloat, because it does create dead tuples where previously we could've
applied HOT. But we do have VACUUM and REINDEX CONCURRENTLY to clean
that up without serious long-duration stop-the-world actions, while
the other builtin cleanup methods don't.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

pgsql-hackers by date:

From: Robert Haas
Date: 19 September 2023, 17:10:29
Subject: Re: Projection pushdown to index access method

From: Tom Lane
Date: 19 September 2023, 17:46:36
Subject: Re: Projection pushdown to index access method

Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers

Previous

Next