Re: Unhappy about API changes in the no-fsm-for-small-rels patch - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Unhappy about API changes in the no-fsm-for-small-rels patch |
Date | |
Msg-id | CAA4eK1Lsi=njVNHCngM9g0gnWZ4BXusznK21-eFNNdd3vrof3A@mail.gmail.com Whole thread Raw |
In response to | Re: Unhappy about API changes in the no-fsm-for-small-rels patch (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Wed, Apr 17, 2019 at 9:46 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2019-04-17 15:49:29 +0530, Amit Kapila wrote: > > > *and* stash the bitmap of > > > pages that we think are used/free as a bitmap somewhere below the > > > relcache. > > > > I think maintaining at relcache level will be tricky when there are > > insertions and deletions happening in the small relation. We have > > considered such a case during development wherein we don't want the > > FSM to be created if there are insertions and deletions in small > > relation. The current mechanism addresses both this and normal case > > where there are not many deletions. Sure there is some overhead of > > building the map again and rechecking each page. The first one is a > > memory operation and takes a few cycles > > Yea, I think creating / resetting the map is basically free. > > I'm not sure I buy the concurrency issue - ISTM it'd be perfectly > reasonable to cache the local map (in the relcache) and use it for local > FSM queries, and rebuild it from scratch once no space is found. That'd > avoid a lot of repeated checking of relation size for small, but > commonly changed relations. > Okay, so you mean to say that we need to perform additional system call (to get a number of blocks) only when no space is found in the existing set of pages? I think that is a fair point, but can't we achieve that by updating relpages in relation after a call to RelationGetNumberofBlocks? > Add a pre-check of smgr_fsm_nblocks (if > > 0, there has to be an fsm), and there should be fewer syscalls. > Yes, that check is a good one and I see that we already do this check in fsm code before calling smgrexists. > > > and for the second we optimized by checking the pages alternatively > > which means we won't check more than two pages at-a-time. This cost > > is paid by not checking FSM and it could be somewhat better in some > > cases [1]. > > Well, it's also paid by potentially higher bloat, because the > intermediate pages aren't tested. > > > > > If we cleared that variable at truncations, I think we should > > > be able to make that work reasonably well? > > > > Not only that, I think it needs to be cleared whenever we create the > > FSM as well which could be tricky as it can be created by the vacuum. > > ISTM normal invalidation logic should just take of that kind of thing. > Do you mean to say that we don't need to add any new invalidation call and the existing invalidation calls will automatically take care of same? > > > OTOH, if we want to extend it later for whatever reason to a relation > > level cache, it shouldn't be that difficult as the implementation is > > mostly contained in freespace.c (fsm* functions) and I think the > > relation is accessible in most places. We might need to rip out some > > calls to clearlocalmap. > > But it really isn't contained to freespace.c. That's my primary > concern. Okay, I get that point. I think among that also the need to call FSMClearLocalMap seems to be your main worry which is fair, but OTOH, the places where it should be called shouldn't be a ton. > You added new parameters (undocumented ones!), > I think this is mostly for compatibility with the old code. I agree that is a wart, but without much inputs during development, it doesn't seem advisable to change old behavior, that is why we have added a new parameter to GetPageWithFreeSpace. However, if we want we can remove that parameter or add document it in a better way. > FSMClearLocalMap() needs to be called by callers and xlog, etc. > Agreed, that this is an additional requirement, but we have documented the cases atop of this function where it needs to be called. We might have missed something, but we tried to cover all cases that we are aware of. Can we make it more clear by adding the comments atop freespace.c API where this map is used? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: