Re: [HACKERS] aggregation memory leak and fix - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] aggregation memory leak and fix |
Date | |
Msg-id | 199903220507.AAA04750@candle.pha.pa.us Whole thread Raw |
In response to | Re: [HACKERS] aggregation memory leak and fix (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] aggregation memory leak and fix
|
List | pgsql-hackers |
> Bruce Momjian <maillist@candle.pha.pa.us> writes: > >> What we need to do is work out what the best set of memory context > >> definitions is, and then decide on a strategy for making sure that > >> lower-level routines allocate their return values in the right context. > > > Let's suppose that we want to free all the memory used as expression > > intermediate values after each row is processed. > > It is my understanding that all these are created in utils/adt/*.c > > files, and that the entry point to all those functions via > > fmgr()/fmgr_c(). > > That's probably the bulk of the specific calls of palloc(). Someone > (Jan?) did a scan of the code a while ago looking for palloc() calls, > and there aren't that many outside of the data-type-specific functions. > But we'd have to look individually at all the ones that are elsewhere. > > > So, if we go into an expression memory context before calling > > fmgr/fmgr_c in the executor, and return to the normal context after the > > function call, all our intermediates are trapped in the expression > > memory context. > > OK, so you're saying we leave the data-type-specific functions as is > (calling palloc() to allocate their result areas), and make each call > site specifically responsible for setting the context that palloc() will > allocate from? That could work, I think. We'd need to see what side > effects it'd have on other uses of palloc(). > > What we'd probably want is to use a stack discipline for the current > palloc-target memory context: when you set the context, you get back the > ID of the old context, and you are supposed to restore that old context > before returning. > > > At the end of each row, we just free the expression memory context. In > > almost all cases, the data is stored in tuples, and we can free it. In > > a few cases like aggregates, we have to save off the value we need to > > keep before freeing the expression context. > > Actually, nodeAgg would just have to set an appropriate context before > calling fmgr to execute the aggregate's transition functions, and then > it wouldn't need an extra copy step. The results would come back in the > right context already. > > > In fact, you could even optimize the cleanup to only do free'ing if > > some expression memory was allocated. In most cases, it is not. > > Jan's stuff should already fall through pretty quickly if there's > nothing in the context, I think. Note that what we want to do between > tuples is a "context clear" of the expression context, not a "context > delete" and then "context create" a new expression context. Context > clear should be a pretty quick no-op if nothing's been allocated in that > context... > > > In fact the nodeAgg.c patch that I backed out attempted to do that, > > though because there wasn't code that checked if the Datum was > > pg_type.typbyval, it didn't work 100%. > > Right. But if we approach it this way (clear the context at appropriate > times) rather than thinking in terms of explicitly pfree'ing individual > objects, life gets much simpler. Also, if we insist on being able to > pfree individual objects inside a context, we can't use Jan's faster > allocator! Remember, the reason it is faster and lower overhead is that > it doesn't keep track of individual objects, only pools. > > I'd like to see us head in the direction of removing most of the > explicit pfree calls that exist now, and instead rely on clearing > memory contexts at appropriate times in order to manage memory. > The fewer places where we need pfree, the more contexts can be run > with the low-overhead space allocator. Also, the fewer explicit > pfrees we need, the simpler and more reliable the code gets. Tom, are you saying you agree with my approach, and I should give it a try? -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: