On Tue, Mar 11, 2003 at 10:34:08AM -0500, Tom Lane wrote:
> Magnus =?iso-8859-1?Q?M=E5nsson?= <ganja@0x63.nu> writes:
> > Making the script wasnt any big problem, the problem is the amount of memory
> > it wants to use, if I had a terabyte of memory I do not think that would be
> > enough, if I try to convert a 100th of the database at a time it uses all
> > the swap, 1G and the 512M memory.
>
> It would help if you showed us the function and told us what version of
> Postgres you're talking about. This could just be a garden-variety
> memory leakage problem.
>
> regards, tom lane
>
I have included my function below, a short explaination:
The connectionstats_yyyy_mm_dd tables contains information about connections
on the network, one tables for every day. hosts_yyyy_mm_dd contains
information about the hosts that made the connections, also one table for
every day.
connectionstats_2003_01_01 contains 1.7 million rows and hosts_2003_01_01
98.000 rows.
My goal is to get all connectionstats in one table and all hosts in one
table, though the connectionstats are referring to hosts using the id I have
to recreate the ids and give the connection-row the right id. I also have to
check that I haven't already inserted the host in the hosts-table from
another day (ip and hostname together should be a unique key).
If I run this function it will in some minutes eat up all the avail memory
(512+1024M) and after that the postmaster-process will be killed by the
kernel.
My workaround that I have made now is a version where I have starthour as an
parameter and it runs it in 1 hour-intervals to us less memory (1 day takes
24 function-calls).
The version of postgresql running on the server is 7.1 and if needed I can
test it on data I have duplicæted to another server with 7.3.
--create or replace function constats()
drop function constats()
create function constats()
returns int2 as '
declare
rec1_ RECORD;
rec2_ RECORD;
rec3_ RECORD;
client_ int;
server_ int;
begin
for rec1_ in select * from connectionstats_2003_01_01 loop
select into rec2_ address, hostname, is_local from hosts_2003_01_01 where id=rec1_.client;
if not found then
raise notice ''foo: %'', rec1_.client;
end if;
select into rec3_ id from new_hosts where address=rec2_.address and hostname=rec2_.hostname;
if not found then
--The host was not found in the new table, insert it.
insert into new_hosts (address, hostname, is_local) values (rec2_.address, rec2_.hostname, rec2_.is_local);
get diagnostics client_ = result_oid;
else
client_ := rec3_.id;
end if;
select into rec2_ address, hostname, is_local from hosts_2003_01_01 where id=rec1_.server;
if not found then
raise notice ''bar'';
end if;
select into rec3_ id from new_hosts where address=rec2_.address and hostname=rec2_.hostname;
if not found then
--The host was not found in the new table, insert it.
insert into new_hosts (address, hostname, is_local) values (rec2_.address, rec2_.hostname, rec2_.is_local);
get diagnostics server_ = result_oid;
else
server_ := rec3_.id;
end if;
insert into new_connectionstats (time, client, server, server_port, protocol, service, inbound, outbound) values
(rec1_.time, client_, server_, rec1_.server_port, rec1_.protocol, rec1_.service, rec1_.inbound, rec1_.outbound);
end loop;
return (0);
end;'
LANGUAGE 'plpgsql';
--
Magnus Månsson