Re: Disabling Heap-Only Tuples - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: Disabling Heap-Only Tuples |
Date | |
Msg-id | CAEze2WiDqwVsTa_Ln=wG535=GSNF+SKDeeTJAXT_vmTYsYptmg@mail.gmail.com Whole thread Raw |
In response to | Re: Disabling Heap-Only Tuples (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Disabling Heap-Only Tuples
|
List | pgsql-hackers |
On Tue, 19 Sept 2023 at 18:56, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-09-19 18:30:44 +0200, Alvaro Herrera wrote: > > This makes me think that maybe the logic needs to be a little more > > complex to avoid the problem you describe: if an UPDATE is prevented > > from being HOT because of this setting, but then it goes and consults > > FSM and it gives the update a higher block number than the tuple's > > current block (or it fails to give a block number at all so it is forced > > to extend the relation), then the update should give up on that strategy > > and use a HOT update after all. (I have not read the actual patch; > > maybe it already does this? It sounds kinda obvious.) > > Yea, a setting like what's discussed here seems, uh, not particularly useful > for achieving the goal of compacting tables. I don't think guiding this > through SQL makes a lot of sense. For decent compaction you'd want to scan the > table backwards, and move rows from the end to earlier, but stop once > everything is filled up. You can somewhat do that from SQL, but it's going to > be awkward and slow. I doubt you even want to use the normal UPDATE WAL > logging. We can't move tuples around (or, not that I know of) without using a transaction ID to control the visibility of the two locations of that tuple. Doing table compaction would thus likely require using transactions to move these tuples around. Using a single backend and bulk operations, it'll still lock each tuple that is being moved, and that can be noticed by user DML queries. I'd rather make the user's queries move the data around than this long-duration, locking background operation. > I think having explicit compaction support in VACUUM or somewhere similar > would make sense, but I don't think the proposed GUC is a useful stepping > stone. The point of this GUC is that the compaction can happen organically in the user's UPDATE workflow, so that there is no long locking operation going on (as you would see with VACUUM FULL / CLUSTER / pg_repack). > > > But without any kind of auto-tuning, in my opinion, it's a fairly poor > > > feature. Sure, some people will get use out of it, if they're > > > sufficiently knowledgeable and sufficiently determined. But I think > > > for most people in most situations, it will be a struggle. > > Indeed. I think it'd often just explode table and index sizes, because HOT > pruning won't be able to make usable space in pages anymore (due to dead > items). You seem to misunderstand the latest patch. It explicitly only blocks local updates if the update can then move the new tuple to an earlier page. If that is not possible, then it'll insert locally (assuming that is still possible) and HOT can then still apply. And yes, moving tuples to earlier pages will indeed increase index bloat, because it does create dead tuples where previously we could've applied HOT. But we do have VACUUM and REINDEX CONCURRENTLY to clean that up without serious long-duration stop-the-world actions, while the other builtin cleanup methods don't. Kind regards, Matthias van de Meent Neon (https://neon.tech)
pgsql-hackers by date: