Thread: out of memory for query result

out of memory for query result

From
Allen
Date:
I am trying to select a result set from a 2-table join, which should be
returning 5,045,358 rows. I receive this error:

     DBD::Pg::st execute failed: out of memory for query result

I am using Perl with a DBI cursor (so I think) to retrieve the data
(prepare, execute, fetchrow_hashref, ..., finish). Perhaps the DBD
driver, libpq, or something else is buffering the whole result instead
of passing individual rows from the server (which runs on the same
machine as the application).

I am using Postgres 7.4.7 under FreeBSD 4.11-RELEASE, with perl v5.8.6
and DBD-Pg-1.32_1.

Any suggestions on how to avoid this? Should I be using the API
differently with Perl?

Thanks,
Allen

Re: out of memory for query result

From
Martijn van Oosterhout
Date:
On Sat, Oct 22, 2005 at 03:46:18PM -0400, Allen wrote:
> I am trying to select a result set from a 2-table join, which should be
> returning 5,045,358 rows. I receive this error:
>
>     DBD::Pg::st execute failed: out of memory for query result

AFAIK, DBD::Pg never uses a cursor unless you ask it to. So you probably
want to code a loop like:

DECLARE blah CURSOR FOR ...
while( FETCH 1000 FROM blah )
{
  process rows...
}

If you don't use a cursor in the backend, then DBI will try to pull the
*entire* result and store it in memory, which is why you don't have
enough...
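
In plain SQL the sequence is roughly this (a sketch; the cursor name
"blah" and the fetch size are placeholders, and the cursor has to live
inside a transaction):

     BEGIN;
     DECLARE blah CURSOR FOR SELECT ...;    -- your 2-table join
     FETCH 1000 FROM blah;                  -- repeat until it returns no rows
     CLOSE blah;
     COMMIT;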

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: out of memory for query result

From
Allen Fair
Date:
From my googling, it seems the Perl DBD driver for Postgres does *not*
support the cursor (see below). I hope someone can refute this!

I am otherwise looking for code to use Postgres cursors from Perl. I
cannot find "DECLARE CURSOR" mentioned in the Perl DBI documentation
either. Thanks, Martijn, for your reply; it helped me dig deeper.

The following code does not work, but I'll keep trying! (I just added
the declare phrase.)
     $dbh = DBI->connect("DBI:Pg:dbname=$dbName;host=$host",
       $dbUser, $dbPassword,
       { RaiseError => 0, AutoCommit => 0, PrintError => 1 });
     $sth = $dbh->prepare("declare csr cursor for $sqlstatement");
     $sth->execute(@statement_parms) or die $DBI::errstr;
     while (my $hr = $sth->fetchrow_hashref) {
       # do something wonderful
     }
     $sth->finish();
This fails with:

     DBD::Pg::st fetchrow_hashref failed: no statement executing

From the DBD-Pg-1.32 module documentation on CPAN...
http://search.cpan.org/~rudy/DBD-Pg-1.32/Pg.pm#Cursors
"Although PostgreSQL has a cursor concept, it has *not* been used in the
current implementation. Cursors in PostgreSQL can only be used inside a
transaction block. Because only one transaction block at a time is
allowed, this would have implied the restriction, not to use any nested
SELECT statements. Hence the execute method fetches all data at once
into data structures located in the frontend application. This has to be
considered when selecting large amounts of data!"

Is this a Perl-only restriction? How about Python or Ruby?

Thanks,
Allen

Martijn van Oosterhout wrote:
> On Sat, Oct 22, 2005 at 03:46:18PM -0400, Allen wrote:
>
>>I am trying to select a result set from a 2-table join, which should be
>>returning 5,045,358 rows. I receive this error:
>>
>>    DBD::Pg::st execute failed: out of memory for query result
>
>
> AFAIK, DBD::Pg never uses a cursor unless you ask it to. So you probably
> want to code a loop like:
>
> DECLARE blah CURSOR FOR ...
> while( FETCH 1000 FROM blah )
> {
>   process rows...
> }
>
> If you don't use a cursor in the backend, then DBI will try to pull the
> *entire* result and store it in memory, which is why you don't have
> enough...
>
> Hope this helps,

Re: out of memory for query result

From
Douglas McNaught
Date:
Allen Fair <allen@cyberdesk.com> writes:

>  From my googling, it seems the Perl DBD driver for Postgres does
>  *not* support the cursor (see below). I hope someone can refute this!
>
> I am otherwise looking for code to implement Postgres cursors in
> Perl. I can not find the "DECLARE CURSOR" defined in the Perl DBI
> documentation either. Thanks Martijn for your reply, it helped me dig
> deeper.
>
> The following code does not work, but I'll keep trying! (I just added
> the declare phrase.)
>      $dbh = DBI->connect("DBI:Pg:dbname=$dbName;host=$host",
>        $dbUser, $dbPassword,
>        { RaiseError => 0, AutoCommit => 0, PrintError => 1 });
>      $sth = $dbh->prepare("declare csr cursor for $sqlstatement");
>      $sth->execute(@statement_parms) or die $DBI::errstr;

I think you need to DECLARE the cursor first, then for your loop do:

<loop>
  FETCH 100 FROM csr;
  <loop calling fetchrow_hashref() 100 times or until it returns undef>
    <process the row>
  </loop>
</loop>

You can execute FETCH once for each row, but it'll be faster to batch
it up as above.

Read up on DECLARE and FETCH in the SQL docs.  I don't know of any
reason why you can't use them from Perl; it's just not done
automatically behind the scenes.

-Doug


Re: out of memory for query result

From
Martijn van Oosterhout
Date:
On Sat, Oct 22, 2005 at 06:15:59PM -0400, Allen Fair wrote:
> From my googling, it seems the Perl DBD driver for Postgres does *not*
> support the cursor (see below). I hope someone can refute this!
>
> I am otherwise looking for code to implement Postgres cursors in Perl. I
> can not find the "DECLARE CURSOR" defined in the Perl DBI documentation
> either. Thanks Martijn for your reply, it helped me dig deeper.

Well, DBI doesn't support doing the cursor bit for you. But you can use
cursors just fine. Your code is almost there:

>     $dbh = DBI->connect("DBI:Pg:dbname=$dbName;host=$host",
>       $dbUser, $dbPassword,
>       { RaiseError => 0, AutoCommit => 0, PrintError => 1 });
>     $sth = $dbh->prepare("declare csr cursor for $sqlstatement");
>     $sth->execute(@statement_parms) or die $DBI::errstr;

      for(;;)
      {
        $sth = $dbh->prepare("fetch 1000 from csr");
        $sth->execute();
        last if( $sth->rows == 0 );

>     while (my $hr = $sth->fetchrow_hashref) {
>       # do something wonderful
>     }
>     $sth->finish();

      }
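
Put together in one piece (an untested sketch; the cursor name "csr"
and the batch size of 1000 are arbitrary), the whole thing reads:

      $sth = $dbh->prepare("declare csr cursor for $sqlstatement");
      $sth->execute(@statement_parms) or die $DBI::errstr;
      $sth->finish();

      for(;;)
      {
        # pull the next batch of rows from the cursor
        $sth = $dbh->prepare("fetch 1000 from csr");
        $sth->execute() or die $DBI::errstr;
        last if $sth->rows == 0;           # cursor exhausted

        while (my $hr = $sth->fetchrow_hashref) {
          # do something wonderful
        }
        $sth->finish();
      }

      $dbh->prepare("close csr")->execute();
      $dbh->commit();    # end the transaction the cursor lived in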

> Is this a Perl only restriction? How about Python or Ruby?

It's not a restriction. No other language does it automatically either.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: out of memory for query result

From
Allen
Date:
Thanks, everyone. I got it to work! Here is my solution, in the hope
that it is useful to the next programmer.

PROBLEM: The Perl DBD driver for Postgres does not use cursors
implicitly. execute() caches the entire query result in memory, so for
very large result sets this gives the "out of memory for query result"
message.

The prepare(select_statement)/execute(@parms) calls did not open a
cursor as I had thought; the cursor must be coded explicitly. This
technique is only useful for processing large result sets that you do
not want cached completely in memory. I am not sure whether it works
with nested cursors.

SOLUTION: Run raw "DECLARE ... CURSOR" and "FETCH nnn FROM cursor_name"
commands to extract your data.
* Connect with AutoCommit => 0 to enable transactions (cursors can only
  be used inside a transaction block).
* prepare/execute DECLARE cursor_name CURSOR FOR select...
   * Parameters to the SQL are bound here.
* Loop
   * prepare/execute FETCH nnn FROM cursor_name, which buffers only the
     next 'nnn' rows from the cursor. Use a number large enough to
     reduce server/client overhead, and small enough to co-exist with
     other apps/threads.
   * Loop
     * fetchrow_hashref until undef (end of the current FETCH set)
     * do something wonderful with the row
* prepare/execute CLOSE cursor_name

EXAMPLE: This may not be the cleanest code, but it works!

#!/usr/local/bin/perl -w
use strict;
use DBI;

my $dbName='allen';
my $host='localhost';
my $dbUser=$dbName;
my $dbPassword='';

# AutoCommit => 0 is required: the cursor only exists inside a transaction.
my $dbh = DBI->connect("DBI:Pg:dbname=$dbName;host=$host", $dbUser,
         $dbPassword,
         { RaiseError => 0, AutoCommit => 0, PrintError => 1 })
         or die "Can't connect to db\n";

my $csrnum = 0;    # counter used to build a unique cursor name per call

loopCursor(
   sub
   {
     my ($row) = @_;
     print "$row->{name}\n";
   },
   "SELECT name from population where ssn>=?", 1
);
$dbh->commit();
$dbh->disconnect();
exit;

# Declare a cursor for $sql, FETCH 1000 rows at a time, and call $proc
# on each row hashref. Returns the number of rows processed.
sub loopCursor
{
   my ($proc, $sql, @parms) = @_;
   my ($sth, $row);
   ++$csrnum;
   my $count=0;
   eval {
     runSQL("declare csr_$csrnum cursor for $sql", @parms);
     for(;;) {
        $sth = $dbh->prepare("fetch 1000 from csr_$csrnum")
           or die "fetch 1000 from csr  $DBI::errstr\n";
        $sth->execute() or die "loopCursor fetch  $DBI::errstr\n";
        last if $sth->rows == 0;    # no rows left: the cursor is exhausted

        while ($row = $sth->fetchrow_hashref) {
          ++$count;
          &$proc($row);
        }
        $sth->finish();
     }
     runSQL("close csr_$csrnum");
   };
   die join(' ', "Error $@ during", $sql, @parms,
         $DBI::errstr||'', "\n") if $@;
   return $count;
}

# prepare/execute a statement (used here for DECLARE and CLOSE).
sub runSQL
{
   my ($sql, @parms) = @_;
   my $sth;
   eval {
     $sth = $dbh->prepare($sql);
     $sth->execute(@parms) or die $DBI::errstr;
   };
   die "Error $@ during $sql @parms\n" if $@;
   $sth->finish();
   return $sth->rows;
}