Thread: Multiple sorts in a query

Multiple sorts in a query

From
Simon Riggs
Date:
Just wanted to check some thoughts about how memory allocation works in
complex queries. Been thinking some more about recent Solaris testing
results that *seemed* to show issues with multiple concurrent queries
that have multiple sorts.

If we have a query that uses multiple sorts, we may have a top-level
sort, with child nodes that contain sorts also. In some cases we may
find with sub-nodes that have both inner and outer sub-trees that
contain sorts also.

If we allocate large chunks of memory we use malloc(). So complex
queries can have multiple mallocs, followed by multiple reallocs. That
in itself seems likely to end up with roughly double memory use, since
realloc won't work properly/quickly with multiple mallocs. (Double since
we allocate X bytes, then 2X bytes etc until we hit the limit.)

When we later free() the memory, do we always free() it in the reverse
order in which it was allocated? If not, how does that effect reducing
the sbrk point, or other aspects of reusing allocated memory?

Is it possible that Solaris's default malloc isn't appropriate for
repeated use in complex queries that use multiple sorts?
http://developers.sun.com/solaris/articles/multiproc/multiproc.html
and recent OpenSolaris bug reports.

Anyway, feel free to jump in.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Martijn van Oosterhout
Date:
On Tue, May 19, 2009 at 12:32:13PM +0100, Simon Riggs wrote:
> If we allocate large chunks of memory we use malloc(). So complex
> queries can have multiple mallocs, followed by multiple reallocs. That
> in itself seems likely to end up with roughly double memory use, since
> realloc won't work properly/quickly with multiple mallocs. (Double since
> we allocate X bytes, then 2X bytes etc until we hit the limit.)

I don't know about Solaris, but glibc has a threshold above which it
starts using mmap() instead of sbrk(). Thus, once you start using very
large blocks, freeing always returns the memory to the kernel,
irrespective of other allocations.

The threshold is dynamic apparently, but starts at 128KB.

Just a thought,

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Tue, 2009-05-19 at 09:17 -0400, Merlin Moncure wrote:
> On Tue, May 19, 2009 at 7:44 AM, Martijn van Oosterhout
> >
> > The threshold is dynamic apparently, but starts at 128KB.
> 
> I just read an article that suggests assuming that can be dangerous
> (by one of the authors of jemalloc)...an interesting read.

> http://www.canonware.com/~ttt/2009/05/mr-malloc-gets-schooled.html

Thanks both, interesting read. Hmmm...

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Greg Stark
Date:
On Tue, May 19, 2009 at 12:32 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> If we have a query that uses multiple sorts, we may have a top-level
> sort, with child nodes that contain sorts also. In some cases we may
> find with sub-nodes that have both inner and outer sub-trees that
> contain sorts also.

Well a top-level sort and a child sort wouldn't both be accumulating
rows at the same time. The child could still be alive behaving like a
tuplestore though.

> If we allocate large chunks of memory we use malloc(). So complex
> queries can have multiple mallocs, followed by multiple reallocs. That
> in itself seems likely to end up with roughly double memory use, since
> realloc won't work properly/quickly with multiple mallocs. (Double since
> we allocate X bytes, then 2X bytes etc until we hit the limit.)

I think it's even worse than that since the old and new allocation
have to briefly coexist. So at least transiently we use 3x the size of
the actual array.

> When we later free() the memory, do we always free() it in the reverse
> order in which it was allocated? If not, how does that effect reducing
> the sbrk point, or other aspects of reusing allocated memory?
>
> Is it possible that Solaris's default malloc isn't appropriate for
> repeated use in complex queries that use multiple sorts?

Well anything's possible. Do you have any specific ideas? I would
expect any decent malloc library to shrink sbrk based on statically
analyzing where its allocations actually are, so I wouldn't expect the
pattern of frees to matter on that front. It might still fragment
memory if we allocate a bunch of large tuplestore/tuplesorts and then
allocate one object in a longer lived memory context.

What problems have you seen?

-- 
greg


Re: Multiple sorts in a query

From
Merlin Moncure
Date:
On Tue, May 19, 2009 at 7:44 AM, Martijn van Oosterhout
<kleptog@svana.org> wrote:
> On Tue, May 19, 2009 at 12:32:13PM +0100, Simon Riggs wrote:
>> If we allocate large chunks of memory we use malloc(). So complex
>> queries can have multiple mallocs, followed by multiple reallocs. That
>> in itself seems likely to end up with roughly double memory use, since
>> realloc won't work properly/quickly with multiple mallocs. (Double since
>> we allocate X bytes, then 2X bytes etc until we hit the limit.)
>
> I don't know about Solaris, but glibc has a threshold above which it
> starts using mmap() instead of sbrk(). Thus, once you start using very
> large blocks, freeing always returns the memory to the kernel,
> irrespective of other allocations.
>
> The threshold is dynamic apparently, but starts at 128KB.

I just read an article that suggests assuming that can be dangerous
(by one of the authors of jemalloc)...an interesting read.

"Update in 2006:
The above was written in 2001. Since then the world has changed a lot.
Memory got bigger. Applications got bigger. The virtual address space
layout in 32 bit linux changed.

In the new situation, brk() and mmap space is shared and there are no
artificial limits on brk size imposed by the kernel. What is more,
applications have started using transient allocations larger than the
128Kb as was imagined in 2001."

http://www.canonware.com/~ttt/2009/05/mr-malloc-gets-schooled.html

merlin


Re: Multiple sorts in a query

From
Chuck McDevitt
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Simon Riggs
> Sent: Tuesday, May 19, 2009 4:32 AM
> To: pgsql-hackers
> Subject: [HACKERS] Multiple sorts in a query
>
>
> Just wanted to check some thoughts about how memory allocation works in
> complex queries. Been thinking some more about recent Solaris testing
> results that *seemed* to show issues with multiple concurrent queries
> that have multiple sorts.
>
> If we have a query that uses multiple sorts, we may have a top-level
> sort, with child nodes that contain sorts also. In some cases we may
> find with sub-nodes that have both inner and outer sub-trees that
> contain sorts also.
>
> If we allocate large chunks of memory we use malloc(). So complex
> queries can have multiple mallocs, followed by multiple reallocs. That
> in itself seems likely to end up with roughly double memory use, since
> realloc won't work properly/quickly with multiple mallocs. (Double
> since
> we allocate X bytes, then 2X bytes etc until we hit the limit.)
>
> When we later free() the memory, do we always free() it in the reverse
> order in which it was allocated? If not, how does that effect reducing
> the sbrk point, or other aspects of reusing allocated memory?
>
> Is it possible that Solaris's default malloc isn't appropriate for
> repeated use in complex queries that use multiple sorts?
> http://developers.sun.com/solaris/articles/multiproc/multiproc.html
> and recent OpenSolaris bug reports.

Solaris default malloc always uses sbrk(), and never ever tried to reduce the sbrk point.

If you want a malloc that uses mmap, there is an non-default malloc that does that (libumem or something?)


Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Tue, 2009-05-19 at 09:33 -0700, Chuck McDevitt wrote:

> > Is it possible that Solaris's default malloc isn't appropriate for
> > repeated use in complex queries that use multiple sorts?
> > http://developers.sun.com/solaris/articles/multiproc/multiproc.html
> > and recent OpenSolaris bug reports.
> 
> Solaris default malloc always uses sbrk(), and never ever tried to
> reduce the sbrk point.
> 
> If you want a malloc that uses mmap, there is an non-default malloc
> that does that (libumem or something?)

OK, thanks Chuck. Doesn't sound good.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Tue, 2009-05-19 at 13:52 +0100, Greg Stark wrote:

> So at least transiently we use 3x the size of the actual array.

I was conjecturing, prior to investigation. Are you saying you know
this/have seen this already?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Greg Stark
Date:
Well I'm just saying if you realloc a x kilobyte block into a 2x block  
and the allocator can't expand it and has to copy then it seems  
inevitable.

-- 
Greg


On 19 May 2009, at 14:11, Simon Riggs <simon@2ndQuadrant.com> wrote:

>
> On Tue, 2009-05-19 at 13:52 +0100, Greg Stark wrote:
>
>> So at least transiently we use 3x the size of the actual array.
>
> I was conjecturing, prior to investigation. Are you saying you know
> this/have seen this already?
>
> -- 
> Simon Riggs           www.2ndQuadrant.com
> PostgreSQL Training, Services and Support
>


Re: Multiple sorts in a query

From
Zdenek Kotala
Date:
Chuck McDevitt píše v út 19. 05. 2009 v 09:33 -0700:

> 
> Solaris default malloc always uses sbrk(), and never ever tried to reduce the sbrk point.
> 
> If you want a malloc that uses mmap, there is an non-default malloc that does that (libumem or something?)
> 

There are severals memory allocator on Solaris. You can choose what you
need. See

mapalloc (it uses mmap insted of srbk)
mtmalloc (optimized fro multi threaded apps)
mumem_alloc
watchmalloc
bsdmalloc
maybe more.

What I heart is that standard malloc is not good, but it is still here
for compatibility reason with old application which depends on some
functionality.
Zdenek 



Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Tue, 2009-05-19 at 22:19 +0200, Zdenek Kotala wrote:
> Chuck McDevitt píše v út 19. 05. 2009 v 09:33 -0700:

> > Solaris default malloc always uses sbrk(), and never ever tried to reduce the sbrk point.
> > 
> > If you want a malloc that uses mmap, there is an non-default malloc that does that (libumem or something?) 
> 
> There are severals memory allocator on Solaris. You can choose what you
> need. See
> 
> mapalloc (it uses mmap insted of srbk)
> mtmalloc (optimized fro multi threaded apps)
> mumem_alloc
> watchmalloc
> bsdmalloc
>  
> maybe more.
> 
> What I heart is that standard malloc is not good, but it is still here
> for compatibility reason with old application which depends on some
> functionality.

Which one is used in the default PostgreSQL build for Solaris? If you
use default malloc, have you tested the others and would you recommend
one in particular?

Which one has Dimitri used in his performance testing?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Tue, 2009-05-19 at 16:49 -0400, Greg Stark wrote:

> Well I'm just saying if you realloc a x kilobyte block into a 2x block  
> and the allocator can't expand it and has to copy then it seems  
> inevitable.

OK, understood.

So there is grounds at least for an investigation into how that works
and whether it is as inefficient as we think it might be.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



Re: Multiple sorts in a query

From
Andres Freund
Date:
On 05/20/2009 10:14 AM, Simon Riggs wrote:
> On Tue, 2009-05-19 at 22:19 +0200, Zdenek Kotala wrote:
>> Chuck McDevitt píše v út 19. 05. 2009 v 09:33 -0700:
>> What I heart is that standard malloc is not good, but it is still here
>> for compatibility reason with old application which depends on some
>> functionality.
>
> Which one is used in the default PostgreSQL build for Solaris? If you
> use default malloc, have you tested the others and would you recommend
> one in particular?
You don't even need to recompile it most of the time (unless statically 
compiled or similar things). LD_PRELOAD'ing another malloc library 
should normally be enough.

Andres


Re: Multiple sorts in a query

From
Zdenek Kotala
Date:
Simon Riggs píše v st 20. 05. 2009 v 09:14 +0100:

> > 
> > What I heart is that standard malloc is not good, but it is still here
> > for compatibility reason with old application which depends on some
> > functionality.
> 
> Which one is used in the default PostgreSQL build for Solaris? If you
> use default malloc, have you tested the others and would you recommend
> one in particular?

We use default one. I did not tested difference between them, but IIRC
that Jignesh did some testing with umem. I will ask him. However if you
give me test scenario I can test it.
    Zdenek



Re: Multiple sorts in a query

From
Simon Riggs
Date:
On Wed, 2009-05-20 at 23:01 -0400, Zdenek Kotala wrote:

> We use default one. I did not tested difference between them, but IIRC
> that Jignesh did some testing with umem. I will ask him. However if you
> give me test scenario I can test it.

Talk with Dimitri from Sun who is doing scalability benchmarks on
pgsql-perform list now.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support