Thread: Postgres eats up memory when using cursors
Hello, I would like to bring the subject to your attention again. The problem is the following: I need to keep a cursor open for a long time. I declare a cursor on a table of approx. 1 million rows, and start fetching data 1000 rows at a time. Data processing can take quite a long time (3-4 days). In theory the postgres process should remain the same size, but it grows... By the end of the 3rd day it had grown to 256 MB!!!! And this is a REAL problem. I would also like to mention that this problem was raised by another person as well. I would like to hear any comments on this issue.

-- 
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------
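[For reference, the batched-fetch pattern described above can be sketched as follows. This is an illustrative stand-in using Python's built-in sqlite3 module in place of Postgres (the table contents are made up), showing why client memory should stay bounded: only one batch of rows is materialized at a time.]

```python
import sqlite3

# Build a small stand-in table (in Postgres this would be the
# million-row "senders" table).
conn = sqlite3.connect(":memory:")
conn.execute("create table senders (email text, first_name text, last_name text)")
conn.executemany(
    "insert into senders values (?, ?, ?)",
    [(f"user{i}@example.com", "First", "Last") for i in range(5000)],
)

# Drain the result set in fixed-size batches, as DECLARE/FETCH would.
cur = conn.execute("select email, first_name, last_name from senders order by email")
batch_size = 1000
total = 0
while True:
    rows = cur.fetchmany(batch_size)   # at most one batch held in memory
    if not rows:
        break
    total += len(rows)                 # ... process the batch here ...

print(total)  # prints 5000
```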
Denis Perchine <dyp@perchine.com> writes:
> I declare a cursor on a table of approx. 1 million rows,
> and start fetching data 1000 rows at a time.
> Data processing can take quite a long time (3-4 days).
> In theory the postgres process should remain the same size,
> but it grows... By the end of the 3rd day it had grown to 256 MB!!!!

Query details please?  You can't expect any results from such a vague report.

			regards, tom lane
On Thursday 01 March 2001 21:33, Tom Lane wrote:
> Query details please?  You can't expect any results from such a
> vague report.

:-))) That's right.

declare senders_c cursor for select email, first_name, last_name
from senders order by email

fetch 1000 from senders_c

db=# explain declare senders_c cursor for select email, first_name, last_name
from senders order by email;
NOTICE:  QUERY PLAN:

Index Scan using senders_email_key on senders  (cost=0.00..197005.37
rows=928696 width=36)

db=# \d senders
          Table "senders"
 Attribute  |   Type    | Modifier
------------+-----------+----------
 email      | text      |
 first_name | text      |
 last_name  | text      |
 stamp      | timestamp |
Index: senders_email_key

db=# \d senders_email_key
Index "senders_email_key"
 Attribute | Type
-----------+------
 email     | text
unique btree

That's all. I could not imagine anything simpler...

-- 
Denis Perchine
Denis Perchine <dyp@perchine.com> writes:
> declare senders_c cursor for select email, first_name, last_name
> from senders order by email
> fetch 1000 from senders_c
> [EXPLAIN output and table definitions snipped]
> That's all. I could not imagine anything simpler...

Looks pretty simple to me too; I was expecting that you were doing expression evaluations or some such. I cannot reproduce any backend leakage in this query, either in 7.0.* or current sources. I did

create table senders (email text, first_name text, last_name text);
\copy senders from 'todo'    -- some junk data
create index sendersi on senders(email);
insert into senders select * from senders;   -- repeat a few times

begin;
explain
declare senders_c cursor for select email, first_name, last_name
from senders order by email;
NOTICE:  QUERY PLAN:

Index Scan using sendersi on senders  (cost=0.00..102.78 rows=2161 width=36)

end;

then made a script file

begin;
declare senders_c cursor for select email, first_name, last_name
from senders order by email;
fetch 10000 from senders_c
\g zzz
fetch backward 10000 from senders_c
\g zzz
fetch 10000 from senders_c
\g zzz
fetch backward 10000 from senders_c
\g zzz
fetch 10000 from senders_c
\g zzz
... repeat a few thousand times ...
end;

Backend memory usage is absolutely rock steady.

Curiously, there does seem to be a leak in psql, or possibly libpq, when running this script. It's per-query, not per-tuple, so it's easier to see if you make the fetches shorter:

fetch 100 from senders_c
\g zzz
fetch backward 100 from senders_c
\g zzz

Repeating these like mad, psql grows about 1 KB/sec on my machine. This occurs with current sources but NOT with 7.0.2 psql. Peter, any thoughts about that?

			regards, tom lane
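[The "repeat a few thousand times" step lends itself to generating the script file mechanically rather than by hand. A small sketch — the output file name and repeat count are arbitrary choices, not from the thread:]

```python
# Generate a psql script that opens the cursor once and then issues
# paired forward/backward fetches a few thousand times.
repeats = 2000
lines = [
    "begin;",
    "declare senders_c cursor for select email, first_name, last_name",
    "from senders order by email;",
]
for _ in range(repeats):
    lines += [
        "fetch 100 from senders_c",
        "\\g zzz",
        "fetch backward 100 from senders_c",
        "\\g zzz",
    ]
lines.append("end;")

script = "\n".join(lines)
with open("cursor_test.sql", "w") as f:
    f.write(script)
```

Feeding the generated file to psql (`psql db -f cursor_test.sql`) while watching the process in `top` reproduces the per-query growth described above.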
> Looks pretty simple to me too; I was expecting that you were doing
> expression evaluations or some such. I cannot reproduce any backend
> leakage in this query, either in 7.0.* or current sources.
> [test setup snipped]
> Backend memory usage is absolutely rock steady.

Actually, you should also consider that there are really intensive inserts happening at the same time. The cursor is only closed after 3-4 days...

-- 
Denis Perchine
Tom Lane writes:

> Curiously, there does seem to be a leak in psql, or possibly libpq,
> when running this script. It's per-query, not per-tuple, so it's
> easier to see if you make the fetches shorter:
>
> fetch 100 from senders_c
> \g zzz
> fetch backward 100 from senders_c
> \g zzz
>
> Repeating these like mad, psql grows about 1 KB/sec on my machine.
> This occurs with current sources but NOT with 7.0.2 psql. Peter,
> any thoughts about that?

Well, here's a memory leak:

Index: print.c
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/bin/psql/print.c,v
retrieving revision 1.16
diff -c -r1.16 print.c
*** print.c	2001/02/27 08:13:27	1.16
--- print.c	2001/03/01 17:30:00
***************
*** 1116,1121 ****
--- 1116,1122 ----
  		free(footers[0]);
  		free(footers);
  	}
+ 	free(align);
  }

This leaks (columns + 1) bytes per query, 4 bytes in your case. But it has been there in 7.0 as well. I couldn't find anything suspicious in the CVS diffs of psql or libpq since 7.0.

So let's think: What if you use COPY? \copy? A plain SELECT? What's the leakage per query cycle (not per second)? Is it related to the size of the data?

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
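[As a sanity check on the magnitude: a leak of (columns + 1) = 4 bytes per query matches the observed growth of roughly 1 KB/sec only if psql completes on the order of 250 fetch queries per second. That query rate is an inference, not something measured in the thread, though it is plausible for short fetches against a warm cache.]

```python
# Cross-check: leak per query vs. observed growth rate.
columns = 3                      # email, first_name, last_name
leak_per_query = columns + 1     # bytes leaked by the un-freed 'align' buffer
observed_growth = 1024           # bytes/sec, Tom's "about 1 KB/sec"

# Query rate implied by the observed growth:
implied_qps = observed_growth / leak_per_query
print(implied_qps)               # prints 256.0
```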
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Repeating these like mad, psql grows about 1 KB/sec on my machine.
>> This occurs with current sources but NOT with 7.0.2 psql. Peter,
>> any thoughts about that?

> Well, here's a memory leak:

Good catch. I confirm this stops the leak in my test.

> This leaks (columns + 1) bytes per query, 4 bytes in your case. But it
> has been there in 7.0 as well.

Ooops, my bad --- repeating my test shows process growth in 7.0 as well. I had missed it the first time because I'd previously done a plain "select *", which ballooned the process footprint of psql. So the leakage was happening within previously-grabbed memory and wasn't obvious in "top".

This doesn't seem to actually be related to Denis' problem, but a leak is a leak ...

			regards, tom lane
> > This leaks (columns + 1) bytes per query, 4 bytes in your case. But it
> > has been there in 7.0 as well.
>
> Ooops, my bad --- repeating my test shows process growth in 7.0 as well.
>
> This doesn't seem to actually be related to Denis' problem, but a
> leak is a leak ...

Is this the same leak? I mean, I do not use psql at all...

-- 
Denis Perchine
Denis Perchine <dyp@perchine.com> writes:
>> This doesn't seem to actually be related to Denis' problem, but a
>> leak is a leak ...

> Is this the same leak? I mean, I do not use psql at all...

It isn't.

I'm now suspecting that your problem must be related to doing inserts while a cursor is held open. Are the inserts done by the same backend, or a different one?

Could you experiment and see if you can devise a complete example? I'm kinda busy right now looking at WAL...

			regards, tom lane
> I'm now suspecting that your problem must be related to doing inserts
> while a cursor is held open. Are the inserts done by the same backend,
> or a different one?

By different ones. This backend just does fetches.

> Could you experiment and see if you can devise a complete example? I'm
> kinda busy right now looking at WAL...

Inserts are the only operations done besides fetching, so this already is the complete example: there are no other actions taken on this table, only inserts made by another backend, and fetches through a single cursor. I mean that this is the only backend doing fetches.

-- 
Denis Perchine