"Chris Ruprecht" <chrup999@yahoo.com> writes:
> on a daily basis, I load records from an external source into my database. =
> During the load, the records go through a validation process during which I=
> have to read anywhere between 2 and 15 records from the database.
> The postmaster process which does this, starts out with 10 MB memory usage =
> but soon swells up to about 500 MB (on a normal day). Some days, I receive =
> a larger amount of data and the process swells up to about 1.4 GB before it=
> bombs out with a 'memory exhausted' message.
[ Chris was kind enough to send me his function and sample data ]
Well, the good news is that CVS-tip sources leak no memory on your
example: total memory consumption on my machine stabilizes at about 4Mb
with 2Mb resident. Most of the credit goes to changes Jan made a couple
months ago, but I did some additional cleanup just now.
The bad news is that the changes associated with this are too extensive
to consider back-patching into 7.1.*. The older code is just too
cavalier about allowing transient memory allocated during plpgsql
function execution to be left unreclaimed until the function exits.
That doesn't matter for a function that doesn't run very long, but
since you have a plpgsql function that iterates over tens of thousands
of records, any intrafunction leak will kill you.
What you might consider doing as a hack solution until 7.2 comes out
is to move the body of the main loop of your function out to a separate
plpgsql function, ie, make the main loop look something like
for ipt in select * from dayload2 where type is null loop
n := n + process_one_record(ipt);
end loop;
return n;
This'd be a tad slower, but memory used within the process_one_record
function will be freed when it exits, so that should hold down the
leakage to a tolerable level.
regards, tom lane