Thread: Patch for fixing a few memory leaks
Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c . This leaks is very significant with massive update/insert tables with gist indexes in one transaction or with following sequence of commands: 1. COPY in table large number of row 2. CREATE GiST index on table 3. VACUUM ANALYZE On third step postgres eats very big number of memory. This patch fix it. BTW Tom, I want to notice that initGISTstate is called for every inserting value (for each row). I think it's not good, because this function called 'fmgr_info' 7 times. 'fmgr_info' call a 'load_external_function' with execution of sequence search on library name. Any suggestion? -- Teodor Sigaev teodor@stack.net
Attachment
Patch applied. Thanks. > Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c . > This leaks is very significant with massive update/insert tables with gist > indexes in one transaction or with following sequence of commands: > 1. COPY in table large number of row > 2. CREATE GiST index on table > 3. VACUUM ANALYZE > On third step postgres eats very big number of memory. > This patch fix it. > > BTW > Tom, I want to notice that initGISTstate is called for every inserting value > (for each row). I think it's not good, because this function called 'fmgr_info' > 7 times. 'fmgr_info' call a 'load_external_function' with execution of sequence > search on library name. Any suggestion? > > -- > Teodor Sigaev > teodor@stack.net > [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Teodor Sigaev <teodor@stack.net> writes: > Patch fix memory leaks in src/backend/utils/fmgr/dfmgr.c . Applied, thanks. (Looks like the leaks were introduced fairly recently by the dynamic-search-path feature.) > Tom, I want to notice that initGISTstate is called for every inserting > value (for each row). I think it's not good, because this function > called 'fmgr_info' 7 times. 'fmgr_info' call a > 'load_external_function' with execution of sequence search on library > name. Any suggestion? fmgr_info shouldn't be all that expensive; I'm not really inclined to worry about it. Do you have evidence to the contrary? regards, tom lane
Tom Lane writes: > Applied, thanks. (Looks like the leaks were introduced fairly > recently by the dynamic-search-path feature.) Is there some sort of a system behind which places are subject to leaks and which places are just too lazy to call pfree()? I know that index support procedures must not leak, hmm, I guess this would include the function manager... (If that was not the right explanation, stop reading here.) Why aren't index support procedures called with an appropriate memory context set up? Since the functions currently do all the cleaning themselves, couldn't it work like this: 1. set up memory context 2. call index procedure 3. clean out memory context (This could even be slightly more efficient.) Then again, I'm probably oversimplifying things... -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Peter Eisentraut <peter_e@gmx.net> writes: > Is there some sort of a system behind which places are subject to leaks > and which places are just too lazy to call pfree()? > I know that index support procedures must not leak, hmm, I guess this > would include the function manager... Yeah, that's basically why there's a problem here --- if this weren't getting called from the index support area, I don't think the leak would matter. > Why aren't index support procedures called with an appropriate memory > context set up? I looked at recovering space after index operations but decided it would take more work than I could invest at the time. The trouble is that several of the index AMs allocate space that they expect to stick around across operations, so they'd have to be fixed to use a special context for such things. Eventually it'd be nice to fix it properly, ie, run index support routines with CurrentMemoryContext = a short-term context, just as you say. regards, tom lane
>>Tom, I want to notice that initGISTstate is called for every inserting >>value (for each row). I think it's not good, because this function >>called 'fmgr_info' 7 times. 'fmgr_info' call a >>'load_external_function' with execution of sequence search on library >>name. Any suggestion? >> > > fmgr_info shouldn't be all that expensive; I'm not really inclined to > worry about it. Do you have evidence to the contrary? Tom, I make some test with this ugly patch which makes structure giststate static (pls, don't commit this patch :) ). Test: 1. install contrib/btree_gist 2. create 'sql.cmd' file contains: DROP TABLE tbl; BEGIN TRANSACTION; CREATE TABLE tbl (v INT); CREATE INDEX tblidx ON tbl USING GIST (v); COPY tbl FROM '/tmp/data'; END TRANSACTION; 3. create /tmp/data with 10000 random values. Result: 1. Original gist.c % time psql wow < sql.cmd psql wow < sql.cmd 0.00s user 0.02s system 0% cpu 7.170 total 2. Patched gist.c % time psql wow < sql.cmd psql wow < sql.cmd 0.02s user 0.00s system 2% cpu 0.699 total We can see that calling fmgr_info for 70000 times may be very expensive. -- Teodor Sigaev teodor@stack.net
Attachment
Teodor Sigaev <teodor@stack.net> writes: >>> Tom, I want to notice that initGISTstate is called for every inserting >>> value (for each row). I think it's not good, because this function >>> called 'fmgr_info' 7 times. 'fmgr_info' call a >>> 'load_external_function' with execution of sequence search on library >>> name. Any suggestion? >> >> fmgr_info shouldn't be all that expensive; I'm not really inclined to >> worry about it. Do you have evidence to the contrary? > Result: > 1. Original gist.c > % time psql wow < sql.cmd > psql wow < sql.cmd 0.00s user 0.02s system 0% cpu 7.170 total > 2. Patched gist.c > % time psql wow < sql.cmd > psql wow < sql.cmd 0.02s user 0.00s system 2% cpu 0.699 total > We can see that calling fmgr_info for 70000 times may be very expensive. Okay, I've done something about this: fmgr_info results for index support functions are now kept in the relcache. I now get roughly the same timings for loading 10000 tuples into either a plain btree index or a btree_gist index, rather than a factor-of-7 penalty for btree_gist as before ... regards, tom lane