Re: Tid scan improvements - Mailing list pgsql-hackers

From Edmund Horner
Subject Re: Tid scan improvements
Date
Msg-id CAMyN-kBaD+fE5SwFhMbaV1Lk6FKaVvSSfHy-yHr6QHY7K-YVmQ@mail.gmail.com
Whole thread Raw
In response to Re: Tid scan improvements  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Tid scan improvements  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Fri, 23 Nov 2018 at 07:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 11/22/18 8:41 AM, David Rowley wrote:
> > ...
> > 3. I'd rather see EnsureTidRangeSpace() keep doubling the size of the
> > allocation until it reaches the required size. See how
> > MakeSharedInvalidMessagesArray() does it.  Doing it this way ensures
> > we always have a power of two sized array which is much nicer if we
> > ever reach the palloc() limit as if the array is sized at the palloc()
> > limit / 2 + 1, then if we try to double it'll fail.  Of course, it's
> > unlikely to be a problem here, but... the question would be how to
> > decide on the initial size.
>
> I think it kinda tries to do that in some cases, by doing this:
>
>      *numAllocRanges *= 2;
>      ...
>      tidRanges = (TidRange *)
>          repalloc(tidRanges,
>                   *numAllocRanges * sizeof(TidRange));
>
> The problem here is that what matters is not numAllocRanges being 2^N,
> but the number of bytes allocated being 2^N. Because that's what ends up
> in AllocSet, which keeps lists of 2^N chunks.
>
> And as TidRange is 12B, so this is guaranteed to waste memory, because
> no matter what the first factor is, the result will never be 2^N.

For simplicity, I think making it a strict doubling of capacity each
time is fine.  That's what we see in numerous other places in the
backend code.

What we don't really see is intentionally setting the initial capacity
so that each subsequent capacity is close-to-but-not-exceeding a power
of 2 bytes.  You can't really do that optimally if working in terms of
whole numbers of items that aren't each a power of 2 size.  This step,
there may be 2/3 of an item spare; next step, we'll have a whole item
spare that we're not going to use.  So we could keep track in terms of
bytes allocated, and then figure out how many items we can fit at the
current time.

In my opinion, such complexity is overkill for Tid scans.

Currently, we try to pick an initial size based on the number of
expressions.  We assume each expression will yield one range, and
allow that a saop expression might require us to enlarge the array.

Again, for simplicity, we should scrap that and pick something like
floor(256/sizeof(TidRange)) = 21 items, with about 1.5% wastage.


pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
Next
From: Michael Paquier
Date:
Subject: Re: Add extension options to control TAP and isolation tests