Re: Improving the memory allocator - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Improving the memory allocator
Date
Msg-id 201104260212.11833.andres@anarazel.de
In response to Re: Improving the memory allocator  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tuesday, April 26, 2011 01:39:37 AM Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > So after all this my question basically is: How important do we think the
> > mctx.c abstraction is?
> 
> I think it's pretty important.  As a specific example, a new context
> type in which pfree() is a no-op would be fine with me.  A new context
> type in which pfree() dumps core will be useless, or close enough to
> useless.
Well, what I suggested for that would be using a different API for such small,
static-size allocations.
But as I said, I would prefer to exhaust other possibilities first.

> That means you can't get rid of the per-chunk back-link to the
> context, but you might be able to get rid of the other overhead such as
> per-chunk size data.  (It might be practical to not support
> GetMemoryChunkSize for such contexts, or if it's a slab allocator then
> you could possibly know that all the chunks in a given block have size X.)
For the slab allocator design I have in mind I would need a back pointer to
the block, not the context... That's another reason why I started thinking
about removing the abstraction.
So far I couldn't envision a clean design where you can intermix two
implementations with such different interpretations of that pointer. One could
have the blocks and contexts carry an 'allocator' node tag in the first element
and switch over that, but I don't really like that.
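
A minimal sketch of the tag-switch idea described above, not PostgreSQL's
actual code. All names (AllocTag, GenericContext, SlabBlock, sketch_pfree) are
hypothetical, and malloc/free stand in for real block management:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tags; these are not PostgreSQL's NodeTag values. */
typedef enum { ALLOC_TAG_CONTEXT = 1, ALLOC_TAG_SLAB_BLOCK = 2 } AllocTag;

/* Both kinds of owner start with the tag, so a chunk's back pointer can be
 * inspected without knowing which allocator produced the chunk. */
typedef struct { AllocTag tag; size_t live_chunks; } GenericContext;
typedef struct { AllocTag tag; size_t chunk_size; size_t nfree; } SlabBlock;

/* Header stored immediately before the memory handed to the caller. */
typedef union {
    void        *owner;   /* GenericContext * or SlabBlock * */
    max_align_t  align;   /* keeps the user data maximally aligned */
} ChunkHeader;

/* Allocate a chunk whose back pointer refers to the given owner. */
static void *chunk_alloc(void *owner, size_t size)
{
    ChunkHeader *hdr = malloc(sizeof(ChunkHeader) + size);

    if (hdr == NULL)
        return NULL;
    hdr->owner = owner;
    return hdr + 1;
}

/* Free a chunk by switching on the tag found behind its back pointer.
 * Returns the tag so callers can see which path was taken. */
static AllocTag sketch_pfree(void *ptr)
{
    ChunkHeader *hdr = (ChunkHeader *) ptr - 1;
    AllocTag     tag = *(AllocTag *) hdr->owner;

    switch (tag)
    {
        case ALLOC_TAG_CONTEXT:
            ((GenericContext *) hdr->owner)->live_chunks--;
            break;
        case ALLOC_TAG_SLAB_BLOCK:
            ((SlabBlock *) hdr->owner)->nfree++;
            break;
        default:
            assert(0 && "unknown allocator tag");
    }
    free(hdr);
    return tag;
}
```

The cost of this design is the extra load and branch on every free, which is
part of why the switch is unattractive for a hot path.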

And I don't see a way with that abstraction to let the compiler do expensive
stuff like the index offset determination at compile time instead of run time.
And I think pulling such computations out of runtime is quite an important part
of improvements in that area.
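
To illustrate what "index offset determination at compile time" could look
like, here is a sketch assuming power-of-two size classes starting at 8 bytes.
The macro, the class layout, and SketchCell are hypothetical, not the existing
allocator's code:

```c
#include <stddef.h>

#define SKETCH_MINBITS 3    /* assumed: smallest class is 1 << 3 = 8 bytes */

/* Smallest k such that (8 << k) >= size, for sizes up to 1024 bytes; -1
 * otherwise. Written as a pure expression so that it is an integer constant
 * expression whenever size is one (e.g. a sizeof). */
#define SIZE_CLASS_INDEX(size) \
    ((size) <= 8 ? 0 : (size) <= 16 ? 1 : (size) <= 32 ? 2 : \
     (size) <= 64 ? 3 : (size) <= 128 ? 4 : (size) <= 256 ? 5 : \
     (size) <= 512 ? 6 : (size) <= 1024 ? 7 : -1)

/* Run-time equivalent: the loop a generic entry point has to execute when
 * the size is only known at the call. */
static int size_class_index_runtime(size_t size)
{
    int    idx = 0;
    size_t cap = (size_t) 1 << SKETCH_MINBITS;

    while (cap < size)
    {
        cap <<= 1;
        idx++;
    }
    return idx;
}

/* A hypothetical fixed-size object allocated very frequently. */
typedef struct { void *a; void *b; int c; } SketchCell;

/* The class is fixed at compile time; an enum initializer must be an integer
 * constant expression, so nothing is computed at run time here. */
enum { SKETCH_CELL_CLASS = SIZE_CLASS_INDEX(sizeof(SketchCell)) };
_Static_assert(SKETCH_CELL_CLASS >= 0, "SketchCell fits a small size class");
```

A call site that knows the type could pass SKETCH_CELL_CLASS straight to the
allocator, whereas a function-pointer interface taking only a size has to
redo the computation on each call.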

> Another point worth making is that it's a nonstarter to propose running
> any large part of the system in memory context types that are incapable
> of supporting all the debugging options we rely on (freed-memory-reuse
> detection and write-past-end-of-chunk in particular).  It's okay if a
> production build hasn't got that support, not okay for debug builds.
Totally with you. I have absolutely no problem with enlarging the chunk header
for debug builds, and I can't envision a design where that would be a major
problem.
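
As a sketch of how a chunk header might grow only in debug builds, the
following covers write-past-end-of-chunk detection with a sentinel byte. It
does not cover freed-memory-reuse detection. The SKETCH_DEBUG switch, the
sentinel value, and all function names are assumptions made for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_DEBUG 1          /* hypothetical build switch; 0 = production */
#define SENTINEL_BYTE 0x7E      /* assumed marker value */

typedef union {
    struct {
        size_t capacity;        /* usable bytes following the header */
#if SKETCH_DEBUG
        size_t requested;       /* debug only: size the caller asked for */
#endif
    } h;
    max_align_t align;          /* keeps the user data maximally aligned */
} DebugChunkHeader;

/* Allocate size bytes; debug builds add one byte for the sentinel. */
static void *debug_alloc(size_t size)
{
    size_t capacity = size;
    DebugChunkHeader *hdr;

#if SKETCH_DEBUG
    capacity += 1;
#endif
    hdr = malloc(sizeof(DebugChunkHeader) + capacity);
    if (hdr == NULL)
        return NULL;
    hdr->h.capacity = capacity;
#if SKETCH_DEBUG
    hdr->h.requested = size;
    ((unsigned char *) (hdr + 1))[size] = SENTINEL_BYTE;
#endif
    return hdr + 1;
}

/* Returns true if the sentinel is intact (always true without debugging). */
static bool debug_chunk_ok(const void *ptr)
{
#if SKETCH_DEBUG
    const DebugChunkHeader *hdr = (const DebugChunkHeader *) ptr - 1;

    return ((const unsigned char *) ptr)[hdr->h.requested] == SENTINEL_BYTE;
#else
    (void) ptr;
    return true;
#endif
}

/* Release a chunk obtained from debug_alloc. */
static void debug_free(void *ptr)
{
    free((DebugChunkHeader *) ptr - 1);
}
```

With SKETCH_DEBUG set to 0 the header shrinks back to a single field, so the
same allocator code runs in both builds and only the header layout differs.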

> Perhaps you'll propose using completely different context
> implementations in the two cases, but I'd be suspicious of that because
> it'd mean the "fast" context code doesn't get any developer testing.
I don't like that option either. It's *way* too easy to screw up slightly in
that area.

> > Especially as I hope its possible to write a single allocator
> > which is "good enough" for everything.
> I'll lay a side bet that that approach is a dead end.  If one size fits
> all were good enough, we'd not be having this conversation at all.  The
> point of the mctx interface layer from the beginning was to support
> multiple allocator policies, and that's the direction I think we want to
> go to improve this.
I don't think I am with you here. I have not been around that long, so I might
be missing something, but I haven't found much evidence of somebody trying to
improve the allocator as a whole, instead of improving currently problematic
pieces, for 10+ years. So I don't see enough evidence proving that it is
impossible to envision an allocator that's good enough for all needs.

I very much fear having to figure out which allocator to use where. I don't
see that working very well.

> BTW, what your numbers actually suggest to me is not that we need a
> better allocator, but that we need a better implementation of List.
> We speculated back when Neil redid List the first time about aggregating
> list cells to reduce palloc traffic, but it was left out to keep the
> patch complexity down.  Now that the bugs have been shaken out it might
> be time to have another go at that.  In particular, teaching List to
> allocate the list head and first cell together would alone remove a
> third of your runtime ...
That's certainly true for that workload. The hotspot is somewhere else entirely,
though, if you start doing even mildly more complex statements than the default
read-only statements from pgbench.
It's actually not easy to find any workload that's not totally IO bound
where memory allocation is not in the top 5 in a profile...

But sure. Improving that point is a good idea independent of the allocator.
One less allocation won't hurt any allocator ;)
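
For illustration, allocating the list head and first cell together could look
roughly like the sketch below. IntList, IntCell, and counted_alloc are
hypothetical names, not the actual List implementation, and malloc stands in
for palloc:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct IntCell {
    int             value;
    struct IntCell *next;
} IntCell;

/* The first cell is embedded in the head, so a one-element list needs a
 * single allocation instead of two. */
typedef struct {
    int      length;
    IntCell *tail;
    IntCell  first;
} IntList;

/* Counts allocator calls so the saving is observable. */
static size_t alloc_calls = 0;

/* Stand-in for palloc: never returns NULL, aborts on failure. */
static void *counted_alloc(size_t n)
{
    void *p = malloc(n);

    if (p == NULL)
        abort();
    alloc_calls++;
    return p;
}

/* Append to a list; a NULL list means "empty" and creates head plus first
 * cell in one allocation. */
static IntList *list_append(IntList *l, int v)
{
    IntCell *c;

    if (l == NULL)
    {
        l = counted_alloc(sizeof(IntList));
        l->first.value = v;
        l->first.next = NULL;
        l->tail = &l->first;
        l->length = 1;
        return l;
    }
    c = counted_alloc(sizeof(IntCell));
    c->value = v;
    c->next = NULL;
    l->tail->next = c;
    l->tail = c;
    l->length++;
    return l;
}

/* Free every separately allocated cell, then the head with its embedded
 * first cell. */
static void list_free(IntList *l)
{
    IntCell *c = l->first.next;

    while (c != NULL)
    {
        IntCell *next = c->next;

        free(c);
        c = next;
    }
    free(l);
}
```

Deleting the first cell becomes more awkward with this layout, since it cannot
be freed on its own. That is the kind of complexity such a change has to deal
with.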

Greetings,

Andres

