Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date | |
Msg-id | CAD21AoCbnN6JiXm4sWqAh5kH656oiHhi=HjFwtzKTr=g53YmnA@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>) |
Responses |
Re: [PoC] Improve dead tuple storage for lazy vacuum
(Masahiko Sawada <sawada.mshk@gmail.com>)
Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>) |
List | pgsql-hackers |
On Fri, Mar 10, 2023 at 3:42 PM John Naylor <john.naylor@enterprisedb.com> wrote: > > On Thu, Mar 9, 2023 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > I've attached the new version patches. I merged improvements and fixes > > I did in the v29 patch. > > I haven't yet had a chance to look at those closely, since I've had to devote time to other commitments. I remember I wasn'tparticularly impressed that v29-0008 mixed my requested name-casing changes with a bunch of other random things. Separatingthose out would be an obvious way to make it easier for me to look at, whenever I can get back to this. I needto look at the iteration changes as well, in addition to testing memory measurement (thanks for the new results, theylook encouraging). Okay, I'll separate them again. > > > Apart from the memory measurement stuff, I've done another todo item > > on my list; adding min max classes for node3 and node125. I've done > > This didn't help us move us closer to something committable the first time you coded this without making sure it was agood idea. It's still not helping and arguably makes it worse. To be fair, I did speak positively about _considering_ additionalsize classes some months ago, but that has a very obvious maintenance cost, something we can least afford rightnow. > > I'm frankly baffled you thought this was important enough to work on again, yet thought it was a waste of time to try toprove to ourselves that autovacuum in a realistic, non-deterministic workload gave the same answer as the current tid lookup.Even if we had gone that far, it doesn't seem like a good idea to add non-essential code to critical paths right now. I didn't think that proving tidstore and the current tid lookup return the same result was a waste of time. I've shared a patch to do that in tidstore before. I agreed not to add it to the tree but we can test that using this patch. Actually I've done a test that ran pgbench workload for a few days. IIUC it's still important to consider whether to have node1 since it could be a good alternative for the path compression. The prototype also implemented it. Of course we can leave it for future improvement. But considering this item with the performance tests helps us to prove our decoupling approach is promising. > We're rapidly running out of time, and we're at the point in the cycle where it's impossible to get meaningful review fromanyone not already intimately familiar with the patch series. I only want to see progress on addressing possible (especiallyarchitectural) objections from the community, because if they don't notice them now, they surely will after commit. Right, we've been making many design decisions. Some of them are agreed just between you and me and some are agreed with other hackers. There are some irrevertible design decisions due to the remaining time. > I have my own list of possible objections as well as bikeshedding points, which I'll clean up and share next week. Thanks. > I plan to invite Andres to look at that list and give his impressions, because it's a lot quicker than reading the patches.Based on that, I'll hopefully be able to decide whether we have enough time to address any feedback and do remainingpolishing in time for feature freeze. > > I'd suggest sharing your todo list in the meanwhile, it'd be good to discuss what's worth doing and what is not. Apart from more rounds of reviews and tests, my todo items that need discussion and possibly implementation are: * The memory measurement in radix trees and the memory limit in tidstores. I've implemented it in v30-0007 through 0009 but we need to review it. This is the highest priority for me. * Additional size classes. It's important for an alternative of path compression as well as supporting our decoupling approach. Middle priority. * Node shrinking support. Low priority. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: