Re: Emit fewer vacuum records by reaping removable tuples during pruning - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Emit fewer vacuum records by reaping removable tuples during pruning |
Date | |
Msg-id | CA+TgmoaMAxnm7kPW8auJ-T1-nZ+D8LrBxkAHZS75utvsk_ZfiA@mail.gmail.com Whole thread Raw |
In response to | Re: Emit fewer vacuum records by reaping removable tuples during pruning (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Emit fewer vacuum records by reaping removable tuples during pruning
Re: Emit fewer vacuum records by reaping removable tuples during pruning |
List | pgsql-hackers |
On Fri, Jan 5, 2024 at 3:34 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > Yes, attached is a patch set which does this. My previous patchset > already reduced the number of places we unlock the buffer and update > the freespace map in lazy_scan_heap(). This patchset combines the > lazy_scan_prune() and lazy_scan_noprune() FSM update cases. I also > have a version which moves the freespace map updates into > lazy_scan_prune() and lazy_scan_noprune() -- eliminating all of these > from lazy_scan_heap(). This is arguably more clear. But Andres > mentioned he wanted fewer places unlocking the buffer and updating the > FSM. Hmm, interesting. I haven't had time to study this fully today, but I think 0001 looks fine and could just be committed. Hooray for killing useless variables with dumb names. This part of 0002 makes me very, very uncomfortable: + /* + * Update all line pointers per the record, and repair fragmentation. + * We always pass no_indexes as true, because we don't know whether or + * not this option was used when pruning. This reduces the validation + * done on replay in an assert build. + */ + heap_page_prune_execute(buffer, true, redirected, nredirected, nowdead, ndead, nowunused, nunused); The problem that I have with this is that we're essentially saying that it's ok to lie to heap_page_prune_execute because we know how it's going to use the information, and therefore we know that the lie is harmless. But that's not how things are supposed to work. We should either find a way to tell the truth, or change the name of the parameter so that it's not a lie, or change the function so that it doesn't need this parameter in the first place, or something. I can occasionally stomach this sort of lying as a last resort when there's no realistic way of adapting the code being called, but that's surely not the case here -- this is a newborn parameter, and it shouldn't be a lie on day 1. Just imagine if some future developer thought that the no_indexes parameter meant that the relation actually didn't have indexes (the nerve of them!). I took a VERY fast look through the rest of the patch set and I think that the overall direction looks like it's probably reasonable, but that's a very preliminary conclusion which I reserve the right to revise after studying further. @Andres: Are you planning to review/commit any of this? Are you hoping that I'm going to do that? Somebody else want to jump in here? -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: