Thread: Patch for fixing a few memory leaks

Patch for fixing a few memory leaks

From
Teodor Sigaev
Date:
Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c .
This leaks is very significant with massive update/insert tables with gist
indexes in one transaction or with following sequence of commands:
1. COPY in table large number of row
2. CREATE GiST index on table
3. VACUUM ANALYZE
On third step postgres eats very big number of memory.
This patch fix it.

BTW
Tom, I want to notice that initGISTstate is called for every inserting value
(for each row). I think it's not good, because this function called 'fmgr_info'
7 times. 'fmgr_info' call a  'load_external_function' with execution of sequence
search on library name. Any suggestion?

--
Teodor Sigaev
teodor@stack.net


Attachment

Re: Patch for fixing a few memory leaks

From
Bruce Momjian
Date:
Patch applied.  Thanks.

> Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c .
> This leaks is very significant with massive update/insert tables with gist 
> indexes in one transaction or with following sequence of commands:
> 1. COPY in table large number of row
> 2. CREATE GiST index on table
> 3. VACUUM ANALYZE
> On third step postgres eats very big number of memory.
> This patch fix it.
> 
> BTW
> Tom, I want to notice that initGISTstate is called for every inserting value 
> (for each row). I think it's not good, because this function called 'fmgr_info' 
> 7 times. 'fmgr_info' call a  'load_external_function' with execution of sequence 
> search on library name. Any suggestion?
> 
> -- 
> Teodor Sigaev
> teodor@stack.net
> 

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Patch for fixing a few memory leaks

From
Tom Lane
Date:
Teodor Sigaev <teodor@stack.net> writes:
> Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c .

Applied, thanks.  (Looks like the leaks were introduced fairly
recently by the dynamic-search-path feature.)

> Tom, I want to notice that initGISTstate is called for every inserting
> value (for each row). I think it's not good, because this function
> called 'fmgr_info' 7 times. 'fmgr_info' call a
> 'load_external_function' with execution of sequence search on library
> name. Any suggestion?

fmgr_info shouldn't be all that expensive; I'm not really inclined to
worry about it.  Do you have evidence to the contrary?
        regards, tom lane


Re: Patch for fixing a few memory leaks

From
Peter Eisentraut
Date:
Tom Lane writes:

> Applied, thanks.  (Looks like the leaks were introduced fairly
> recently by the dynamic-search-path feature.)

Is there some sort of a system behind which places are subject to leaks
and which places are just too lazy to call pfree()?

I know that index support procedures must not leak, hmm, I guess this
would include the function manager...

(If that was not the right explanation, stop reading here.)

Why aren't index support procedures called with an appropriate memory
context set up?  Since the functions currently do all the cleaning
themselves, couldn't it work like this:

1. set up memory context
2. call index procedure
3. clean out memory context

(This could even be slightly more efficient.)

Then again, I'm probably oversimplifying things...

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: Patch for fixing a few memory leaks

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Is there some sort of a system behind which places are subject to leaks
> and which places are just too lazy to call pfree()?

> I know that index support procedures must not leak, hmm, I guess this
> would include the function manager...

Yeah, that's basically why there's a problem here --- if this weren't
getting called from the index support area, I don't think the leak would
matter.

> Why aren't index support procedures called with an appropriate memory
> context set up?

I looked at recovering space after index operations but decided it would
take more work than I could invest at the time.  The trouble is that
several of the index AMs allocate space that they expect to stick around
across operations, so they'd have to be fixed to use a special context
for such things.  Eventually it'd be nice to fix it properly, ie, run
index support routines with CurrentMemoryContext = a short-term context,
just as you say.
        regards, tom lane


Re: Patch for fixing a few memory leaks

From
Teodor Sigaev
Date:
>>Tom, I want to notice that initGISTstate is called for every inserting
>>value (for each row). I think it's not good, because this function
>>called 'fmgr_info' 7 times. 'fmgr_info' call a
>>'load_external_function' with execution of sequence search on library
>>name. Any suggestion?
>>
>
> fmgr_info shouldn't be all that expensive; I'm not really inclined to
> worry about it.  Do you have evidence to the contrary?


Tom, I make some test with this ugly patch which makes structure giststate
static (pls, don't commit this patch :) ).

Test:
1. install contrib/btree_gist
2. create 'sql.cmd' file contains:
DROP TABLE tbl;
BEGIN TRANSACTION;
CREATE TABLE tbl (v INT);
CREATE INDEX tblidx ON tbl USING GIST (v);
COPY tbl FROM '/tmp/data';
END TRANSACTION;
3. create /tmp/data with 10000 random values.

Result:
1. Original gist.c
% time psql wow < sql.cmd
psql wow < sql.cmd  0.00s user 0.02s system 0% cpu 7.170 total
2. Patched gist.c
% time psql wow < sql.cmd
psql wow < sql.cmd  0.02s user 0.00s system 2% cpu 0.699 total

We can see that calling fmgr_info for 70000 times may be very expensive.


--
Teodor Sigaev
teodor@stack.net


Attachment

Re: Patch for fixing a few memory leaks

From
Tom Lane
Date:
Teodor Sigaev <teodor@stack.net> writes:
>>> Tom, I want to notice that initGISTstate is called for every inserting
>>> value (for each row). I think it's not good, because this function
>>> called 'fmgr_info' 7 times. 'fmgr_info' call a
>>> 'load_external_function' with execution of sequence search on library
>>> name. Any suggestion?
>> 
>> fmgr_info shouldn't be all that expensive; I'm not really inclined to
>> worry about it.  Do you have evidence to the contrary?

> Result:
> 1. Original gist.c
> % time psql wow < sql.cmd
> psql wow < sql.cmd  0.00s user 0.02s system 0% cpu 7.170 total
> 2. Patched gist.c
> % time psql wow < sql.cmd
> psql wow < sql.cmd  0.02s user 0.00s system 2% cpu 0.699 total

> We can see that calling fmgr_info for 70000 times may be very expensive.

Okay, I've done something about this: fmgr_info results for index
support functions are now kept in the relcache.  I now get roughly
the same timings for loading 10000 tuples into either a plain
btree index or a btree_gist index, rather than a factor-of-7 penalty
for btree_gist as before ...
        regards, tom lane