Re: Pre-alloc ListCell's optimization - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Pre-alloc ListCell's optimization
Date
Msg-id 25997.1306430043@sss.pgh.pa.us
Whole thread Raw
In response to Re: Pre-alloc ListCell's optimization  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Pre-alloc ListCell's optimization  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Stephen Frost <sfrost@snowman.net> writes:
> Basically, I added a ListCell array into the List structure and then
> added a bitmap to keep track of which positions in the array were
> filled.

Hm.  I've gotten the impression from previous testing that there are an
awful lot of extremely short lists (1 or 2 elements) running around in a
typical query.  (One source for those is the argument lists for
operators and functions.)  I'm worried that this type of approach would
bloat the storage required in those cases to a degree that would make
the patch unattractive.  ISTM the first thing we'd need to have before
we could think about this rationally is some measurements about the
frequencies of different List lengths in a typical workload.

When Neil redid the List infrastructure a few years ago, there was some
discussion of special-casing the very first ListCell, and allocating
just that cell along with the List header.  That'd be sort of the
minimal version of what you've done here, and would be guaranteed to
never eat any wasted space (since a list that has a header isn't empty).
We should probably compare the behavior of that minimalistic version to
versions with different sizes of preallocated arrays.

> An alternative approach that I was already considering would be to
> just allocate ListCell's in bulk (kind of a poor-man's slab allocator, I
> believe).  To do that we'd have to make the bitmap be a variable length
> array of bitmaps and then have a list of pointers to the ListCell block
> allocations.  Seems like that's probably overkill for this, however.

That would be pointing in the direction of trying to save space for very
long Lists, which is a use-case that I'm not sure occurs often enough
for us to be worth spending effort on, and in any case is a distinct
issue from that of saving palloc time for very short Lists.  Again, some
statistics about actual list lengths would be really nice to have ...
        regards, tom lane


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: LOCK DATABASE
Next
From: "Kevin Grittner"
Date:
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum