Re: Unhappy about API changes in the no-fsm-for-small-rels patch - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Date
Msg-id CAA4eK1Lsi=njVNHCngM9g0gnWZ4BXusznK21-eFNNdd3vrof3A@mail.gmail.com
Whole thread Raw
In response to Re: Unhappy about API changes in the no-fsm-for-small-rels patch  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Apr 17, 2019 at 9:46 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-04-17 15:49:29 +0530, Amit Kapila wrote:
> > > *and* stash the bitmap of
> > > pages that we think are used/free as a bitmap somewhere below the
> > > relcache.
> >
> > I think maintaining at relcache level will be tricky when there are
> > insertions and deletions happening in the small relation.  We have
> > considered such a case during development wherein we don't want the
> > FSM to be created if there are insertions and deletions in small
> > relation.  The current mechanism addresses both this and normal case
> > where there are not many deletions.  Sure there is some overhead of
> > building the map again and rechecking each page.  The first one is a
> > memory operation and takes a few cycles
>
> Yea, I think creating / resetting the map is basically free.
>
> I'm not sure I buy the concurrency issue - ISTM it'd be perfectly
> reasonable to cache the local map (in the relcache) and use it for local
> FSM queries, and rebuild it from scratch once no space is found. That'd
> avoid a lot of repeated checking of relation size for small, but
> commonly changed relations.
>

Okay, so you mean to say that we need to perform additional system
call (to get a number of blocks) only when no space is found in the
existing set of pages?  I think that is a fair point, but can't we
achieve that by updating relpages in relation after a call to
RelationGetNumberofBlocks?

>  Add a pre-check of smgr_fsm_nblocks (if >
> 0, there has to be an fsm), and there should be fewer syscalls.
>

Yes, that check is a good one and I see that we already do this check
in fsm code before calling smgrexists.

>
> > and for the second we optimized by checking the pages alternatively
> > which means we won't check more than two pages at-a-time.  This cost
> > is paid by not checking FSM and it could be somewhat better in some
> > cases [1].
>
> Well, it's also paid by potentially higher bloat, because the
> intermediate pages aren't tested.
>
>
> > >  If we cleared that variable at truncations, I think we should
> > > be able to make that work reasonably well?
> >
> > Not only that, I think it needs to be cleared whenever we create the
> > FSM as well which could be tricky as it can be created by the vacuum.
>
> ISTM normal invalidation logic should just take of that kind of thing.
>

Do you mean to say that we don't need to add any new invalidation call
and the existing invalidation calls will automatically take care of
same?

>
> > OTOH, if we want to extend it later for whatever reason to a relation
> > level cache, it shouldn't be that difficult as the implementation is
> > mostly contained in freespace.c (fsm* functions) and I think the
> > relation is accessible in most places.  We might need to rip out some
> > calls to clearlocalmap.
>
> But it really isn't contained to freespace.c. That's my primary
> concern.

Okay, I get that point.  I think among that also the need to call
FSMClearLocalMap seems to be your main worry which is fair, but OTOH,
the places where it should be called shouldn't be a ton.

> You added new parameters (undocumented ones!),
>

I think this is mostly for compatibility with the old code.  I agree
that is a wart,  but without much inputs during development, it
doesn't seem advisable to change old behavior, that is why we have
added a new parameter to GetPageWithFreeSpace.  However, if we want we
can remove that parameter or add document it in a better way.

> FSMClearLocalMap() needs to be called by callers and xlog, etc.
>

Agreed, that this is an additional requirement, but we have documented
the cases atop of this function where it needs to be called.  We might
have missed something, but we tried to cover all cases that we are
aware of.  Can we make it more clear by adding the comments atop
freespace.c API where this map is used?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: bgwriter_lru_maxpages limits in PG 10 sample conf
Next
From: Amit Kapila
Date:
Subject: Re: Unhappy about API changes in the no-fsm-for-small-rels patch