Thread: RFC: list API / memory allocations

RFC: list API / memory allocations

From
Andres Freund
Date:
Hi List,

In many scenarios memory allocation is one of the top 3 functions showing up 
in profiles. Looking at hierarchical profiles (-fno-omit-frame-pointer) at least 
during parsing, planning and executor startup most of that is spent around the 
list API.

Many - especially in the parse-analyze phase  - of those allocations can be 
avoided because the lists are immutable and their size is known upfront.

Some examples:
parser: list_make$n, buildRelationAliases
planner: expression_tree_mutator, get_relation_info;
executor: ExecInitExpr

For that I added new functions/defines which allocate all the needed memory in 
one hunk:
list_immutable_make$n(),
List *list_new_immutable_n(NodeTag t, size_t n);
List *list_make_n(NodeTag t, size_t n, ...);

With those I could avoid about 1/3 of the allocations in some example 
scenarios (pgbench, custom benchmarks) by replacing a few notable callsites to 
use statically allocated lists.

The obvious problem with that approach is that approach is that those List's 
and ListCell's cannot be individually deallocated. Which especially in the 
planner isn't done anyway.
To check that I added an allocate_only to both when USE_ASSERT_CHECKING.

Using that approach I did measure improvements between 0.5-20% depending on 
the statement (using -M simple). Complex statements which are quick to execute 
and not too complicated to plan are the ones benefiting most. Which is not 
surprising.

The questions I have here:
* is that approach acceptable? I converted a the most notable callsites and 
the benefit is quite measurable. On the other hand one has to be rather careful 
when analyzing whether lists will be deallocated later on. Also its touching 
rather much code

* I plan to add list_copy_immutable as another function as that is the next 
biggest scenario where memory is allocated in forseeable scenarios.

* any suggestions what name to use instead of immutable? That doesn't really 
cut the real meaning of "allocate only". I didn't find a name though.


Then there is the 2/3 of calls which I couldn't beat with that approach. 
Mostly because the size of the lists is too hard to figure out.
My current approach to that would be to preallocate listcells by some caller 
defined amount.
A *very* quick hack which always overallocates Lists to contain 10 elements 
yields promising results (around another 2-20% of parse only time).

The problem with that is that is that the most sensible way seems to be to 
define the amount of preallocation per list. Which is currently not easily 
representable because empty lists are just 0/NILL.

So I have 3 solutions. All of which are not that great:
* allow empty Lists. This possibly requires a bit of code churn but seems 
somewhat sensible.
* Save the amount of preallocation needed in NIL by defining that no pointer 
can be less than say 0x100. That also requires some churn and gets points for 
uglyness
* always overallocate new lists. This is rather wasteful.

Any ideas?

if anybody wants the preliminary patches I am happy to send them but I would 
much rather prefer to make them somewhat after input.


Andres




Re: RFC: list API / memory allocations

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> In many scenarios memory allocation is one of the top 3 functions showing up 
> in profiles. Looking at hierarchical profiles (-fno-omit-frame-pointer) at least 
> during parsing, planning and executor startup most of that is spent around the 
> list API.

> Many - especially in the parse-analyze phase  - of those allocations can be 
> avoided because the lists are immutable and their size is known upfront.

The fundamental problem with all of those proposals is that now you have
some lists in the system that aren't like other lists, and will result
in dumping core if the wrong sorts of operations are applied to them.
I don't particularly care for introducing that kind of fragility into
the system in return for marginal speed gains.  I'm not impressed by
Asserts showing that no such thing happens in the cases you tested;
the test coverage won't be complete, and even if it is, innocent-looking
code changes later on could create new problems.

Now, if you could do something that *doesn't* restrict what operations
could be applied to the lists, that would be good.

I've wished for a long while that we could allocate the list header and
the first list cell in a single palloc cycle.  This would basically
require getting list_delete_cell to refrain from pfree'ing a cell that
got allocated that way, which is easy as long as you have the list
header at hand, but what happens if the list is later concat'd to
another?  A subsequent delete operation would be referring to the other
list header and would come to the wrong conclusion.

While thinking about this just now, it occurred to me that maybe the
issues could be dodged if the cell, not the header, were first in the
combined palloc block.  list_concat is then no longer a problem, as long
as it doesn't try to throw away the second list's header.  But I haven't
thought long enough to be sure how well that would work.
        regards, tom lane


Re: RFC: list API / memory allocations

From
Andres Freund
Date:
On Friday, November 18, 2011 10:11:29 PM Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > In many scenarios memory allocation is one of the top 3 functions showing
> > up in profiles. Looking at hierarchical profiles
> > (-fno-omit-frame-pointer) at least during parsing, planning and executor
> > startup most of that is spent around the list API.
> > 
> > Many - especially in the parse-analyze phase  - of those allocations can
> > be avoided because the lists are immutable and their size is known
> > upfront.
> 
> The fundamental problem with all of those proposals is that now you have
> some lists in the system that aren't like other lists, and will result
> in dumping core if the wrong sorts of operations are applied to them.
> I don't particularly care for introducing that kind of fragility into
> the system in return for marginal speed gains.  I'm not impressed by
> Asserts showing that no such thing happens in the cases you tested;
> the test coverage won't be complete, and even if it is, innocent-looking
> code changes later on could create new problems.
Yes. I dislike that myself (as noted). It seems rather fragile although I 
think at least during parsing we could simply generally refrain from parsing

I don't think that the gains are marginal though. After covering only a small 
number of cases there are not uncommon/artificial cases gaining more than 20% 
parsing speed.
One prime example of workloads benefiting hugely is something like SELECT * 
FROM pg_class WHERE oid = ...;
Essentially all queries which request few rows with a large number of columns 
benefit rather measurably. 

> Now, if you could do something that *doesn't* restrict what operations
> could be applied to the lists, that would be good.
If every list cell/header would grow another field "allocate_only" (which I 
currently added for cassert only) those could just get skipped when deleting.
Some places are directly freeing list headers but that seems to be a bad idea 
anyway.
The disadvantage is that those would essentially be there till context reset 
unless done via some special api. Also not that nice... On the other hand only 
very few callsites free list(-cells) during parsing anyway.
Without looking I didn't see measurable increase in memory usage due to the 
new field with that approach.

The only way out of that seems to be to add refcounted list cells/headers :(. 
I fear that would be rather complex, expensive and failure prone so I don't 
like to go there.

> I've wished for a long while that we could allocate the list header and
> the first list cell in a single palloc cycle. 
Yea. Although that is only a rather small portion of the problems/allocations.

> This would basically
> require getting list_delete_cell to refrain from pfree'ing a cell that
> got allocated that way, which is easy as long as you have the list
> header at hand, but what happens if the list is later concat'd to
> another?  A subsequent delete operation would be referring to the other
> list header and would come to the wrong conclusion.
I don't think any such scheme is safe.

> While thinking about this just now, it occurred to me that maybe the
> issues could be dodged if the cell, not the header, were first in the
> combined palloc block.  list_concat is then no longer a problem, as long
> as it doesn't try to throw away the second list's header.  But I haven't
> thought long enough to be sure how well that would work.
I don't think that would work without carefully revising list usage all 
around... Several places remove nodes from a list and then do list_free() on 
the remainder.


Something aside:
For my POC memory allocator I added "intrusive" lists which have the next, 
prev elements embedded in the stored element. I wonder if some of the list 
usage could be replaced by such a scheme. Obviously for every embeded 
list_node a Node can only be in one list...


Andres


Re: RFC: list API / memory allocations

From
Stephen Frost
Date:
Andres,

* Andres Freund (andres@anarazel.de) wrote:
> For that I added new functions/defines which allocate all the needed memory in
> one hunk:
> list_immutable_make$n(),
> List *list_new_immutable_n(NodeTag t, size_t n);
> List *list_make_n(NodeTag t, size_t n, ...);

A while back, I posted a patch to try and address this same issue.  The
approach that I took was to always pre-allocate a certain (#defined)
amount (think it was 5 or 10 elements).  There were a number of places
that caused problems with that approach because they hacked on the list
element structures directly (instead of using the macros/functions)-
you'll want to watch out for those areas in any work on lists.

That patch is here:
http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php

The thread on it might also be informative.

I do like your approach of being able to pass the ultimate size of the
list in..  Perhaps the two approaches could be merged?  I was able to
make everything work with my approach, provided all the callers used the
list API (I did that by making sure the links, etc, actually pointed to
the right places in the pre-allocated array).  One downside was that the
size ended up being larger that it might have been in some cases.
Thanks,
    Stephen

Re: RFC: list API / memory allocations

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Now, if you could do something that *doesn't* restrict what operations
> could be applied to the lists, that would be good.

If the API is followed, I believe my previous patch works for
everything, but it wasn't variable about the size of the new list.
Perhaps we could combine the two approaches, though there would be more
overhead from dealing with a variable bitmap for what's currently used.

> I've wished for a long while that we could allocate the list header and
> the first list cell in a single palloc cycle.

You've mentioned that before and, to be honest, I could have sworn that
we're doing that already..
Thanks,
    Stephen

Re: RFC: list API / memory allocations

From
Robert Haas
Date:
On Sat, Nov 19, 2011 at 12:33 PM, Stephen Frost <sfrost@snowman.net> wrote:
> You've mentioned that before and, to be honest, I could have sworn that
> we're doing that already..

I tried to write a patch for that at one point, but it crashed and
burned over the exact same set of issues discussed upthread, which I
wasn't able to resolve satisfactorily.  It's just really difficult to
change the API for something like memory allocation after the fact;
it's too hard to find the bits of code that do whatever naughty thing
you don't want them to.

One random idea... would there by any sense in having a palloc-like
function that is defined to allocate multiple objects at once?  In
other words, if you need 4 list cells, then instead of asking palloc
for them one at a time, you make one function call and get four
pointers back at one go.  I'm not sure whether that would help at all;
palloc might not be able to take advantage of the additional
information usefully.

To some extent I feel like this is all optimizing something that's
likely already so well-optimized that future gains, if any, are likely
to be small.  I feel like the only way we're likely to get much of a
win here is if we can reduce the amount of memory that has to be
allocated in the first place (allocate fewer data structures, don't
copy them as much, etc.).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: RFC: list API / memory allocations

From
Bruce Momjian
Date:
Did we make any headway on this?

---------------------------------------------------------------------------

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (andres@anarazel.de) wrote:
> > For that I added new functions/defines which allocate all the needed memory in 
> > one hunk:
> > list_immutable_make$n(),
> > List *list_new_immutable_n(NodeTag t, size_t n);
> > List *list_make_n(NodeTag t, size_t n, ...);
> 
> A while back, I posted a patch to try and address this same issue.  The
> approach that I took was to always pre-allocate a certain (#defined)
> amount (think it was 5 or 10 elements).  There were a number of places
> that caused problems with that approach because they hacked on the list
> element structures directly (instead of using the macros/functions)-
> you'll want to watch out for those areas in any work on lists.
> 
> That patch is here:
> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
> 
> The thread on it might also be informative.
> 
> I do like your approach of being able to pass the ultimate size of the
> list in..  Perhaps the two approaches could be merged?  I was able to
> make everything work with my approach, provided all the callers used the
> list API (I did that by making sure the links, etc, actually pointed to
> the right places in the pre-allocated array).  One downside was that the
> size ended up being larger that it might have been in some cases.
> 
>     Thanks,
> 
>         Stephen



--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: RFC: list API / memory allocations

From
Bruce Momjian
Date:
Is this a TODO?

---------------------------------------------------------------------------

On Sat, Nov 19, 2011 at 12:31:09PM -0500, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (andres@anarazel.de) wrote:
> > For that I added new functions/defines which allocate all the needed memory in 
> > one hunk:
> > list_immutable_make$n(),
> > List *list_new_immutable_n(NodeTag t, size_t n);
> > List *list_make_n(NodeTag t, size_t n, ...);
> 
> A while back, I posted a patch to try and address this same issue.  The
> approach that I took was to always pre-allocate a certain (#defined)
> amount (think it was 5 or 10 elements).  There were a number of places
> that caused problems with that approach because they hacked on the list
> element structures directly (instead of using the macros/functions)-
> you'll want to watch out for those areas in any work on lists.
> 
> That patch is here:
> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01213.php
> 
> The thread on it might also be informative.
> 
> I do like your approach of being able to pass the ultimate size of the
> list in..  Perhaps the two approaches could be merged?  I was able to
> make everything work with my approach, provided all the callers used the
> list API (I did that by making sure the links, etc, actually pointed to
> the right places in the pre-allocated array).  One downside was that the
> size ended up being larger that it might have been in some cases.
> 
>     Thanks,
> 
>         Stephen



--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: RFC: list API / memory allocations

From
Stephen Frost
Date:
Bruce,

* Bruce Momjian (bruce@momjian.us) wrote:
> Did we make any headway on this?

I've got the code written but I need to test it in a stable environment
to see what kind of improvment it provides.  I've actually been fighting
for the past 2 months with the box that I want to use and think I may
just give up and steal a different server.

It hasn't fallen off my list of things to do, just hasn't been a high
priority.
Thanks,
    Stephen

Re: RFC: list API / memory allocations

From
Bruce Momjian
Date:
On Thu, Aug 16, 2012 at 09:51:28PM -0400, Stephen Frost wrote:
> Bruce,
> 
> * Bruce Momjian (bruce@momjian.us) wrote:
> > Did we make any headway on this?
> 
> I've got the code written but I need to test it in a stable environment
> to see what kind of improvment it provides.  I've actually been fighting
> for the past 2 months with the box that I want to use and think I may
> just give up and steal a different server.
> 
> It hasn't fallen off my list of things to do, just hasn't been a high
> priority.

OK, thanks for the status update.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +