Re: overload - Mailing list pgsql-sql

From Viktor Bojović
Subject Re: overload
Date
Msg-id CAJu1cLYZC+TEXnh+s_jVUCBzPm6C1SgHtgd+=hOhPRDvAXuEKw@mail.gmail.com
Whole thread Raw
In response to Re: overload  (lists-pgsql@useunix.net)
Responses Re: overload  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: overload  (lists-pgsql@useunix.net)
List pgsql-sql
Thanx Wayne,
at the end i did it that way and it works. 
The code is below.

CREATE FUNCTION pattern_counter1("patLength" integer) RETURNS character varying
LANGUAGE plperl
AS $_X$
my $rvCnt = spi_exec_query("select count(1) as cnt from entry");
#my $rowCountAll = $rvCnt->{processed};
my $row = $rvCnt->{rows}[0];
my $rowCountAll = $row->{cnt};
my $windowSize = 500000;
my %patterns=();
for (my $p=0;$p<$rowCountAll;$p+=$windowSize){
my $sql="select sequence from entry limit $windowSize offset $p";

my $rv = spi_exec_query($sql);
my $rowCount = $rv->{processed};
my $patLen = $_[0];
my $patt = '';

foreach my $rn (0 .. $rowCount -1){
my $row = $rv->{rows}[$rn];
my $seq = $row->{sequence};
for (my $x = 1;$x<=length($seq) - $patLen;$x++){
$patt=substr($seq,$x,$patLen);
if (! defined $patterns{$patt}) {
$patterns{$patt}=1;
}else{
$patterns{$patt}++;
}
}
}
}

foreach $patt (keys %patterns){
my $sql="insert into patterns values('".$patt."',".$patterns{$patt}.")";
spi_exec_query($sql);
}
return $tmp;
$_X$;


On Fri, Jul 8, 2011 at 8:50 PM, <lists-pgsql@useunix.net> wrote:
I'm have the same situation with large tables.  Take a look at using a
cursor to fetch several thousand rows at a time.  I presume what's
happening is that perl is attempting to create a massive list/array in
memory.  If you use a cursor the list should only contain X number of
rows where X in the number specified at each fetch execution.  You'll
need to define the cursor inside a transaction block.

- begin transaction
- define the cursor
- fetch rows from cursor
- while row count from previous step > 0, execute previous step
- terminate transaction

Or you could use plpgsql instead of plperl, FOR loops over result sets in
plpgsql implicitly use cursors... it's just a little less code.

Hope that helps,
Wayne

On Tue, Jul 05, 2011 at 10:29:03PM +0200, Viktor Bojovi?? wrote:
> Hi,
> while reading 20GB table through PL/PERL function , it constantly grows in
> RAM.
> I wanted to ask you which is the best way to read table inside that
> function without such memory consumption.
> Thanks in advance
>
> Code is here:
>
> CREATE  FUNCTION pattern_counter("patLength" integer)
>   RETURNS varchar AS
> $BODY$
>     my $rv = spi_exec_query("select sequence from entry");
>     my $rowCount = $rv->{processed};
>     my $patLen = $_[0];
>     my $patt = '';
>     my %patterns=();
>     foreach my $rn (0 .. $rowCount -1){
>     my $row = $rv->{rows}[$rn];
>     my $seq = $row->{sequence};
>     for (my $x = 1;$x<=length($seq) - $patLen;$x++){
>         $patt=substr($seq,$x,$patLen);
>         if (! defined $patterns{$patt}) {
>         $patterns{$patt}=1;
>         }else{
>         $patterns{$patt}++;
>         }
>     }
>     }
>     foreach $patt (keys %patterns){
>     my $sql="insert into patterns values('".$patt."',".$patterns{$patt}.")";
>     spi_exec_query($sql);
>     }
> return '';
> $BODY$
>   LANGUAGE plperl VOLATILE
>   COST 100;
>
>
>
> --
> ---------------------------------------
> Viktor Bojovi??
> ---------------------------------------
> Wherever I go, Murphy goes with me



--
---------------------------------------
Viktor Bojović
---------------------------------------
Wherever I go, Murphy goes with me

pgsql-sql by date:

Previous
From: lists-pgsql@useunix.net
Date:
Subject: Re: overload
Next
From: Pavel Stehule
Date:
Subject: Re: overload