RFC: list API / memory allocations - Mailing list pgsql-hackers

From Andres Freund
Subject RFC: list API / memory allocations
Date
Msg-id 201111182147.19104.andres@anarazel.de
Whole thread Raw
Responses Re: RFC: list API / memory allocations
Re: RFC: list API / memory allocations
List pgsql-hackers
Hi List,

In many scenarios memory allocation is one of the top 3 functions showing up 
in profiles. Looking at hierarchical profiles (-fno-omit-frame-pointer) at least 
during parsing, planning and executor startup most of that is spent around the 
list API.

Many - especially in the parse-analyze phase  - of those allocations can be 
avoided because the lists are immutable and their size is known upfront.

Some examples:
parser: list_make$n, buildRelationAliases
planner: expression_tree_mutator, get_relation_info;
executor: ExecInitExpr

For that I added new functions/defines which allocate all the needed memory in 
one hunk:
list_immutable_make$n(),
List *list_new_immutable_n(NodeTag t, size_t n);
List *list_make_n(NodeTag t, size_t n, ...);

With those I could avoid about 1/3 of the allocations in some example 
scenarios (pgbench, custom benchmarks) by replacing a few notable callsites to 
use statically allocated lists.

The obvious problem with that approach is that approach is that those List's 
and ListCell's cannot be individually deallocated. Which especially in the 
planner isn't done anyway.
To check that I added an allocate_only to both when USE_ASSERT_CHECKING.

Using that approach I did measure improvements between 0.5-20% depending on 
the statement (using -M simple). Complex statements which are quick to execute 
and not too complicated to plan are the ones benefiting most. Which is not 
surprising.

The questions I have here:
* is that approach acceptable? I converted a the most notable callsites and 
the benefit is quite measurable. On the other hand one has to be rather careful 
when analyzing whether lists will be deallocated later on. Also its touching 
rather much code

* I plan to add list_copy_immutable as another function as that is the next 
biggest scenario where memory is allocated in forseeable scenarios.

* any suggestions what name to use instead of immutable? That doesn't really 
cut the real meaning of "allocate only". I didn't find a name though.


Then there is the 2/3 of calls which I couldn't beat with that approach. 
Mostly because the size of the lists is too hard to figure out.
My current approach to that would be to preallocate listcells by some caller 
defined amount.
A *very* quick hack which always overallocates Lists to contain 10 elements 
yields promising results (around another 2-20% of parse only time).

The problem with that is that is that the most sensible way seems to be to 
define the amount of preallocation per list. Which is currently not easily 
representable because empty lists are just 0/NILL.

So I have 3 solutions. All of which are not that great:
* allow empty Lists. This possibly requires a bit of code churn but seems 
somewhat sensible.
* Save the amount of preallocation needed in NIL by defining that no pointer 
can be less than say 0x100. That also requires some churn and gets points for 
uglyness
* always overallocate new lists. This is rather wasteful.

Any ideas?

if anybody wants the preliminary patches I am happy to send them but I would 
much rather prefer to make them somewhat after input.


Andres




pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Core Extensions relocation
Next
From: Tom Lane
Date:
Subject: Re: RFC: list API / memory allocations