Re: commiting in/between functions - Mailing list pgsql-general

From Magnus Månsson
Subject Re: commiting in/between functions
Date
Msg-id 20030311185252.GA13086@freija
Whole thread Raw
In response to Re: commiting in/between functions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Tue, Mar 11, 2003 at 10:34:08AM -0500, Tom Lane wrote:
> Magnus =?iso-8859-1?Q?M=E5nsson?= <ganja@0x63.nu> writes:
> > Making the script wasnt any big problem, the problem is the amount of memory
> > it wants to use, if I had a terabyte of memory I do not think that would be
> > enough, if I try to convert a 100th of the database at a time it uses all
> > the swap, 1G and the 512M memory.
>
> It would help if you showed us the function and told us what version of
> Postgres you're talking about.  This could just be a garden-variety
> memory leakage problem.
>
>             regards, tom lane
>

I have included my function below, a short explaination:
The connectionstats_yyyy_mm_dd tables contains information about connections
on the network, one tables for every day. hosts_yyyy_mm_dd contains
information about the hosts that made the connections, also one table for
every day.
connectionstats_2003_01_01 contains 1.7 million rows and hosts_2003_01_01
98.000 rows.

My goal is to get all connectionstats in one table and all hosts in one
table, though the connectionstats are referring to hosts using the id I have
to recreate the ids and give the connection-row the right id. I also have to
check that I haven't already inserted the host in the hosts-table from
another day (ip and hostname together should be a unique key).

If I run this function it will in some minutes eat up all the avail memory
(512+1024M) and after that the postmaster-process will be killed by the
kernel.

My workaround that I have made now is a version where I have starthour as an
parameter and it runs it in 1 hour-intervals to us less memory (1 day takes
24 function-calls).

The version of postgresql running on the server is 7.1 and if needed I can
test it on data I have duplicæted to another server with 7.3.


--create or replace function constats()
drop function constats()
create function constats()
returns int2 as '
declare
  rec1_ RECORD;
  rec2_ RECORD;
  rec3_ RECORD;
  client_ int;
  server_ int;
begin
  for rec1_ in select * from connectionstats_2003_01_01 loop
    select into rec2_ address, hostname, is_local from hosts_2003_01_01 where id=rec1_.client;
    if not found then
      raise notice ''foo: %'', rec1_.client;
    end if;
    select into rec3_ id from new_hosts where address=rec2_.address and hostname=rec2_.hostname;
    if not found then
        --The host was not found in the new table, insert it.
      insert into new_hosts (address, hostname, is_local) values (rec2_.address, rec2_.hostname, rec2_.is_local);
      get diagnostics client_ = result_oid;
    else
      client_ := rec3_.id;
    end if;
    select into rec2_ address, hostname, is_local from hosts_2003_01_01 where id=rec1_.server;
    if not found then
      raise notice ''bar'';
    end if;
    select into rec3_ id from new_hosts where address=rec2_.address and hostname=rec2_.hostname;
    if not found then
        --The host was not found in the new table, insert it.
      insert into new_hosts (address, hostname, is_local) values (rec2_.address, rec2_.hostname, rec2_.is_local);
      get diagnostics server_ = result_oid;
    else
      server_ := rec3_.id;
    end if;
    insert into new_connectionstats (time, client, server, server_port, protocol, service, inbound, outbound) values
    (rec1_.time, client_, server_, rec1_.server_port, rec1_.protocol, rec1_.service, rec1_.inbound, rec1_.outbound);
  end loop;
  return (0);
end;'
LANGUAGE 'plpgsql';

--
Magnus Månsson

pgsql-general by date:

Previous
From: "Johnson, Shaunn"
Date:
Subject: list of non-printing table
Next
From: Yohans Mendoza
Date:
Subject: unsubscribe