Re: Pre-alloc ListCell's optimization - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Pre-alloc ListCell's optimization
Date
Msg-id 20110527040847.GQ4548@tamriel.snowman.net
Whole thread Raw
In response to Re: Pre-alloc ListCell's optimization  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
* Greg Stark (gsstark@mit.edu) wrote:
> On Thu, May 26, 2011 at 11:57 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> > While I agree that there is some bloat that'll happen with this
> > approach, we could reduce it by just having a 4-entry cache instead of
> > an 8-entry cache.  I'm not really sure that saving those 64 bytes per
> > list is really worth it though.
>
> First off this whole direction seems a bit weird to me. It sounds like
> you're just reimplementing palloc inside the List data structure with
> its allocator and everything. Why not just improve the memory
> allocator in palloc instead of layering a second one on top of it?

I do think it'd be great to improve palloc(), but having looked through
that code, figuring out how to improve it for the small case (such as
with the lists) while keeping it working well for larger and other cases
doesn't exactly look trivial.

> But assuming there's an advantage I've missed there's another
> possibility here: Are most of these small lists constructed with
> list_makeN?

Looks like we've got 306 cases of list_make1(), 82 cases of list_makeN()
(where N > 1), but that said, one can make a list w/ just lappend(), and
that seems to happen with some regularity.

> But all this seems odd to me. The only reason for any of this is for
> api convenience so we can pass around lists instead of passing arrays.
> If the next links are really a big source of overhead we should just
> fix our apis to take arrays of the right length or arrays with a
> separate length argument.

I'm not really sure I agree with this..  Lists are pretty useful and
easier to manage when you don't know the size.  I expect quite a few of
these lists are small for simple queries and can get pretty large for
complex queries.  Also, in many cases it's natural to step through the
list and not need random access into it, which at least reduces the
reasons to go to the effort of having a variable length array.

> Or if it's just palloc we should fix our memory allocator to behave
> the way the callers need it to. Heikki long ago suggested adding a
> stack allocator for the parser to use for its memory context to reduce
> overhead of small allocations which won't be freed until the context
> is freed for example.

Much of this originated from Greg's oprofile and Tom's further
commentary on it here:

http://archives.postgresql.org/pgsql-hackers/2011-04/msg00714.php
Thanks,
    Stephen

pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: Pre-alloc ListCell's optimization
Next
From: Stephen Frost
Date:
Subject: Re: Pre-alloc ListCell's optimization