Thread: Re: Proposal for fixing intra-query memory leaks

Re: Proposal for fixing intra-query memory leaks

From
Karel Zak
Date:
On Sat, 29 Apr 2000, Tom Lane wrote:

> Background
> ----------
> 
> We already do most of our memory allocation in "memory contexts", which
> are usually AllocSets as implemented by backend/utils/mmgr/aset.c.
> (Is there any value in allowing for other memory context types?  We could
I have "aset" variant, that allow create and work with memory context in
a shared memory. 

> save some cycles by getting rid of a level of indirection here.)  What
> we need to do is create more contexts and define proper rules about when
> they can be freed.
> 
> The basic operations on a memory context are:
> 
> * create a context
A questuon, need we really any name for context? I mean an example
"CreateGlobalMemory(char *name)". Global memory routines expect that
a context name is static. It is a litle problem if contexts are created
dynamic. I suggest change it and allocate context name in context memory.  

About blocks deallocation, now if context is destroyed for all blocks are
call free(). And if some context needs new block the aset.c call malloc(). 
What create a "BlockFreeList" with limited size (and allow set size for 
this pool as new postmaster command line switch), and instead call the free()
for each block, remove a block (standard block, not a large) to this 
BlockFreeList and instead the malloc() try first look at to this list?  
> in that context.  The MemoryContextSwitchTo() operation selects a new
> current context (and returns the previous context, so that the caller can
> restore the previous context before exiting).
> 
> Note: there is no really good reason for pfree() to be tied to the current
> memory context; it ought to be possible to pfree() a chunk of memory no
> matter which context it was allocated from.  Currently we cannot do that
I not sure if I understent here. 

What the chunk->aset pointer in all chunks? It is pointer to a setData and 
the AllocSetFree() need this pointer only. The AllocSetFree() not use full 
context struct.  
IMHO is very easy implement pfree() and repalloc() independent on 
CurrentMemoryContext setting. The MemoryContextSwitchTo() needs palloc() 
only. Or not?


> because of the possibility that there is more than one kind of memory
> context.  If they were all AllocSets then the problem goes away, which is
> one reason I'd like to eliminate the provision for other kinds of
> contexts.


> Additions to the memory-context mechanism
> -----------------------------------------
> 
> If we are going to have more contexts, we need more mechanism for keeping
> track of them; else we risk leaking whole contexts under error conditions.
> We can do this as follows:
> 
> 1. There will be two kinds of contexts, "permanent" and "temporary".
> Permanent contexts are never reset or deleted except by explicit caller
> command (in practice, they probably won't ever be, period).  There will
> not be very many of these --- perhaps only the existing TopMemoryContext
> and CacheMemoryContext.  We should avoid having very much code run with
> CurrentMemoryContext pointing at a permanent context, since any forgotten
> palloc() represents a permanent memory leak.
And what you mean about total persistent contexts in shared memory? 
I create for the QueryCache specific top context that is stored in the 
QueryCache pool (shmem) and is independent on standard TopMemoryContext. 

We probably don't know what we will need in future save to shared memory.
IMHO is good create some common mechanism for shmem, that allow use current
routines (an example copyObject()) for a work with shmem.


> 2. Temporary contexts are remembered by the context manager and are
> guaranteed to be deleted at transaction end.  (If we ever have nested
> transactions, we'd probably want to tie each temporary context to a
> particular transaction, but for now that's not necessary.)  Most activity
> will happen in temporary contexts.

What is the "context manager"?

Your suggestion/planns are very interesting. I agree that more contexts
(for error, query, transactions, ..etc.) is a good idea. 
                        Karel



Re: Proposal for fixing intra-query memory leaks

From
Tom Lane
Date:
Karel Zak <zakkr@zf.jcu.cz> writes:
>> (Is there any value in allowing for other memory context types?  We could

> I have "aset" variant, that allow create and work with memory context in
> a shared memory. 

Hmm.  Well, maybe nailing things down to a single kind of context is
pushing it too far.  What I mainly want is to be able to free a palloc'd
chunk without assuming that it came from the currently active context.
With AllocSets as aset.c does them, this is possible because every chunk
has a back-link to its owning context.  We could still make it work with
multiple context types if we require that they all have the same kind of
back-link in the same place.  Essentially, the overhead data for an
allocated chunk would have to be the same for all context types.  But
we could still have different context types with different allocation
methods for each type.  pfree macro would get the back-link from 8 bytes
before the given chunk address, verify that it points at something with
the right node type to be a memory context header, and then call through
a function pointer in the context header to get to the routine that
actually knows how to free a chunk of that context type.

>  A questuon, need we really any name for context? I mean an example
> "CreateGlobalMemory(char *name)". Global memory routines expect that
> a context name is static. It is a litle problem if contexts are created
> dynamic. I suggest change it and allocate context name in context memory.  

Good point.  I don't believe that contexts have names at all at the
moment (portals do, but not contexts).  But it seems like it would be
helpful for debugging to have some kind of name for each context.  It
wouldn't have to be unique or anything, but when you were trying to
figure out where your memory leak is, being able to see which context
has gotten bloated would be useful...

> About blocks deallocation, now if context is destroyed for all blocks are
> call free(). And if some context needs new block the aset.c call malloc(). 
> What create a "BlockFreeList" with limited size (and allow set size for 
> this pool as new postmaster command line switch), and instead call the free()
> for each block, remove a block (standard block, not a large) to this 
> BlockFreeList and instead the malloc() try first look at to this list?  

I think this is going in the wrong direction.  We are not trying to
write a better malloc than malloc.  What we are trying to do is make
allocation of small chunks very fast and make it possible to release a
whole bunch of related chunks without keeping track of them individually.
I don't believe that we need to improve on malloc at the level of
grabbing or releasing whole blocks.

>  IMHO is very easy implement pfree() and repalloc() independent on 
> CurrentMemoryContext setting. The MemoryContextSwitchTo() needs palloc() 
> only. Or not?

Yes, but *only if the chunk you are being asked to pfree came from an
AllocSet-style context*.  Currently it could have come from a different
kind of context that isn't based on AllocSets.  (I think we have some
contexts where palloc is just a direct malloc, for example.)

>  And what you mean about total persistent contexts in shared memory? 
> I create for the QueryCache specific top context that is stored in the 
> QueryCache pool (shmem) and is independent on standard TopMemoryContext. 

QueryCache would probably be a permanent context ...
        regards, tom lane


Re: Proposal for fixing intra-query memory leaks

From
Karel Zak
Date:

On Tue, 2 May 2000, Tom Lane wrote:

> Karel Zak <zakkr@zf.jcu.cz> writes:
> >> (Is there any value in allowing for other memory context types?  We could
> 
> > I have "aset" variant, that allow create and work with memory context in
> > a shared memory. 
> 
> Hmm.  Well, maybe nailing things down to a single kind of context is
> pushing it too far.  What I mainly want is to be able to free a palloc'd
> chunk without assuming that it came from the currently active context.
With my QueryCache contexts it will possible too. I use same AllocSet 
structs - different is only a block spring (not from malloc, but from 
QueryCache shmem pool). Change chunks headers will easy if it will need. 
A problem will, how select right method for free/realloc. This information
is only in context struct, but not in a chunk header or in AllocSet struct
(now).

> With AllocSets as aset.c does them, this is possible because every chunk
> has a back-link to its owning context.  We could still make it work with
> multiple context types if we require that they all have the same kind of
> back-link in the same place.  Essentially, the overhead data for an
> allocated chunk would have to be the same for all context types.  But
> we could still have different context types with different allocation
> methods for each type.  pfree macro would get the back-link from 8 bytes
> before the given chunk address, verify that it points at something with
OK. You need in chunk header chunk-size too, not the back-to-context-link 
only.

> the right node type to be a memory context header, and then call through
> a function pointer in the context header to get to the routine that
> actually knows how to free a chunk of that context type.
The idea it is very good, because if we will work with more context and
allocations types we need identify chunks. 
The chunk header must be relevant and same for all allocation methods and
must be independent on AllocSel (aset.c) code. Now, it is not. Am I right?

> Good point.  I don't believe that contexts have names at all at the
> moment (portals do, but not contexts).  But it seems like it would be
> helpful for debugging to have some kind of name for each context.  It
> wouldn't have to be unique or anything, but when you were trying to
> figure out where your memory leak is, being able to see which context
> has gotten bloated would be useful...
Well, but not expect that header must be static string, better will
_allocate_ it in a CreateContext() function.

                    Karel



Re: Proposal for fixing intra-query memory leaks

From
Tom Lane
Date:
Karel Zak <zakkr@zf.jcu.cz> writes:
>  The chunk header must be relevant and same for all allocation methods and
> must be independent on AllocSel (aset.c) code. Now, it is not. Am I right?

Right.  Actually it would work to have additional data before the
standard link-to-context-plus-chunk-size fields, if a particular method
really needed more per-chunk data.  But I doubt we'd ever need that.

And, as you saw, the links to the pfree and prealloc routines would need
to be present in standard places in the context header.
        regards, tom lane


Re: Proposal for fixing intra-query memory leaks

From
Karel Zak
Date:
On Tue, 2 May 2000, Tom Lane wrote:

> Karel Zak <zakkr@zf.jcu.cz> writes:
> >  The chunk header must be relevant and same for all allocation methods and
> > must be independent on AllocSel (aset.c) code. Now, it is not. Am I right?
> 
> Right.  Actually it would work to have additional data before the
> standard link-to-context-plus-chunk-size fields, if a particular method
> really needed more per-chunk data.  But I doubt we'd ever need that.
The chunk:
 +--------------------+----------------------+-------------------+ | depend-header      | standard-header      |
data.............|  +--------------------+----------------------+-------------------+ ^                    ^
         ^ |                    |                      | |                    |              pointer to chunk  |
           | |          begin of standard chunk header                           |particular method header, begin of
thisheader is unknown for common routines. 
 

IMHO in standard chunk header not must be chunk size. If something needs
chunk size it cat use depend-header. 
It will very nice :-)
                        Karel



Re: Proposal for fixing intra-query memory leaks

From
Tom Lane
Date:
Karel Zak <zakkr@zf.jcu.cz> writes:
>  IMHO in standard chunk header not must be chunk size. If something needs
> chunk size it cat use depend-header. 

We might as well put the size in the standard header, though, because
(a) all context management methods are going to need it (how you gonna
do repalloc otherwise?) and (b) there are alignment considerations.
The size of the headers has to be a multiple of MAXALIGN which is 8
on quite a few architectures.  If you just make each of the
standard-header and depend-header a multiple of MAXALIGN then you end
up wasting 8 bytes per alloc chunk on these machines.  You could get
around that with some notational ugliness (ie, standard header is not
declared as part of depend-header but you compute the pointer to it
separately).  But I don't see the value of working that hard to keep
the size out of the standard header.

Also, since aset.c is going to be our standard memory allocator for
the foreseeable future, there's no good reason to make its life more
difficult by having to work with both a standard-header and a
depend-header.  If it can get along with only a standard-header,
why not let it do so?
        regards, tom lane