Thread: DBD::Pg large processor load.

DBD::Pg large processor load.

From
Chris Gamache
Date:
Linux 2.4.20 & PostgreSQL 7.2.3 & DBD::Pg 1.22.

I was using PgSQL and PgSQL::Cursor with decent results. However, it is no
longer supported, and was causing some problems. So! I switched to DBI. I
immediately noticed a jump in processor usage. I primarily use 
$db->prepare($sql), $rs->execute, and $rs->fetchrow_arrayref.

Are there any DBI experts out there with some advice to cut down on processor usage?

__________________________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo
http://search.yahoo.com



Re: DBD::Pg large processor load.

From
Rudy Lippan
Date:
On Fri, 18 Apr 2003, Chris Gamache wrote:

> Linux 2.4.20 & PostgreSQL 7.2.3 & DBD::Pg 1.22.
> 
> I was using PgSQL and PgSQL::Cursor with decent results. However, it is no
> longer supported, and was causing some problems. So! I switched to DBI. I
> immediately noticed a jump in processor usage. I primarily use 
> $db->prepare($sql), $rs->execute, and $rs->fetchrow_arrayref.
> 
> Are there any DBI experts out there with some advice to cut down on processor usage?
> 

How much of a processor load are you talking about? Is it a 2%, 20%, 200%
increase in processor usage?

1. Does PgSQL::Cursor use cursors?
2. Do you have code that shows the problem?


-r



Re: DBD::Pg large processor load.

From
Chris Gamache
Date:
Procps reports that the 6 simultaneous postmasters (spawned by 6 instances of
the same perl script) each have 20% load with DBI over 6% load each for 6
postmasters with PgSQL::Cursor. (Running Dual Xeons at 2GHz apiece w/o
hyperthreading) 

I can't seem to reproduce the problem with test code (posted in pgsql-general
under a similar thread; it hasn't made it to Google yet, otherwise I'd include
a link). I think it has to do with the large dataset it is dealing with. 

As far as PgSQL::Cursor using cursors, I guess that might be the case. I was
hoping someone would suggest using a different DBI method or DBI setting, to
tone down DBI's (perceived) voracity for processor time! 

PgSQL::Cursor was alpha software when I started using it. It was simple to
implement, and did the trick. It was eclipsed by DBI and as a result was not 
updated. PgSQL.pm didn't cause a problem until recently. Porting to DBI was
simple: different method calls yielding the same results. The only difference
that I can see is the extra load. I've never gone digging in the sources, so
I'm not sure how the two differ. 

I'll settle for extra processor load as a trade off for stability. I was hoping
there would be a well-known and quick fix. My next task is to turn on some
verbose debug logs to perhaps shed some light on what's going on, and let me
know why I can't reproduce this with my test code!

Thanks for the response. Let me know if I've sparked a clue or two...

--- Rudy Lippan <rlippan@remotelinux.com> wrote:
> On Fri, 18 Apr 2003, Chris Gamache wrote:
> 
> > Linux 2.4.20 & PostgreSQL 7.2.3 & DBD::Pg 1.22.
> > 
> > I was using PgSQL and PgSQL::Cursor with decent results. However, it is no
> > longer supported, and was causing some problems. So! I switched to DBI. I
> > immediately noticed a jump in processor usage. I primarily use 
> > $db->prepare($sql), $rs->execute, and $rs->fetchrow_arrayref.
> > 
> > Are there any DBI experts out there with some advice to cut down on
> processor usage?
> > 
> 
> How much of a processor load are you talking about? Is it a 2%, 20%, 200%
> increase in processor usage?
> 
> 1. Does PgSQL::Cursor uses Cursors?
> 2. Do you have code that shows the problem?
> 
> 
> -r
> 
> 





Re: DBD::Pg large processor load.

From
Rudy Lippan
Date:
On Mon, 21 Apr 2003, Chris Gamache wrote:

CCing dbi-users.

> Procps reports that the 6 simultaneous postmasters (spawned by 6 instances of
> the same perl script) each have 20% load with DBI over 6% load each for 6
> postmasters with PgSQL::Cursor. (Running Dual Xeons at 2GHz apiece w/o
> hyperthreading) 

What type of load were you seeing with PgSQL?
> 
> I can't seem to reproduce the problem with test code (posted in pgsql-general
> with a similar thread. Hasn't made it to google yet, otherwise I'd include a
> link) I think it has to do with the large dataset it is dealing with. 
> 
Sample code would be nice ;)

> As far as PgSQL::Cursor using cursors, I guess that might be the case. I was
> hoping someone would suggest using a different DBI method or DBI setting, to
> tone down DBI's (perceived) voracity for processor time! 

It does not seem to. In fact, PgSQL does not seem to do much at all -- 
which might explain why it is faster at some things; but then again, much of 
DBI & DBD::Pg is written in C, which "should" make them faster :)

> 
> PgSQL::Cursor was alpha software when I started using it. It was simple to
> impliment, and did the trick. It was eclipsed by DBI and as a result was not 
> updated. PgSQL.pm didn't cause a problem until recently. Porting to DBI was
> simple: different method calls yielding the same results. The only difference
> that I can see is the extra load. I've never gone digging in the sources, so
> I'm not sure how the two differ. 
> 
> I'll settle for extra processor load as a trade off for stability. I was hoping
> there would be a well-known and quick fix. My next task is to turn on some
> verbose debug logs to perhaps shed some light on what's going on, and let me
> know why I can't reproduce this with my test code!
> 
1. You might want to try $sth->fetchall_arrayref(). This will suck
everything into one big arrayref, which is not as bad as it sounds: if you
are not using cursors, all of the data is already returned by postgres, and
fetch() just gives you the next row. Of course this will cause a momentary
increase in memory usage, because you duplicate all of the rows at once
instead of one at a time. I would be interested in seeing if this causes
any change in performance.
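
A minimal sketch of that change, using the queue table from the schema posted
later in this thread (connection parameters are made up, and this needs a live
PostgreSQL server, so treat it as illustrative only):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumed connection parameters -- substitute your own.
my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "myuser", "mypass",
                       { RaiseError => 1 });

my $sth = $dbh->prepare("SELECT id, my_type, my_data FROM queue WHERE status = 0");
$sth->execute;

# One call instead of a fetchrow_arrayref() loop: DBD::Pg already holds
# the whole result set client-side, so this mainly trades a temporary
# memory bump for fewer per-row method calls.
my $rows = $sth->fetchall_arrayref;
for my $row (@$rows) {
    my ($id, $my_type, $my_data) = @$row;
    print "VALUES: $my_type $my_data\n";
}

$dbh->disconnect;
```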

2. If you are doing many prepare()s/execute()s, you might want to try
running the code out of CVS. DBD::Pg has to preparse the statement that you
give it, changing '?' into :p1 and allocating memory and data structs for
handling placeholders.  The preparse code for prepare()/execute() in CVS 
is much leaner than what is in 1.22.  Also, try to keep prepare()s out of 
loops; this will save a walk or two of the statement.
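
Keeping prepare() out of the loop looks like this in practice (statement text
and ids are invented for the sketch; it assumes a reachable database):

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "myuser", "mypass",
                       { RaiseError => 1 });

# Prepare once, outside the loop, with a '?' placeholder ...
my $upd = $dbh->prepare("UPDATE queue SET status = 1 WHERE id = ?");

# ... then execute many times. The statement text is parsed and walked
# only once instead of on every iteration.
for my $id (1 .. 100) {
    $upd->execute($id);
}

$dbh->disconnect;
```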

3. If memory usage is a problem, try using DECLARE CURSOR and FETCH; this 
will give you a fetchrow_arrayref() that takes much less memory 
(depending on the size of the dataset), but it will use more CPU time. 
[I know you are worried about CPU time.]
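
A rough sketch of the DECLARE/FETCH approach (the cursor name and batch size
are made up; note that cursors must run inside a transaction):

```perl
use strict;
use warnings;
use DBI;

# AutoCommit off: a cursor only lives inside a transaction.
my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "myuser", "mypass",
                       { RaiseError => 1, AutoCommit => 0 });

$dbh->do("DECLARE q_cur CURSOR FOR SELECT id, my_type, my_data FROM queue");

my $fetch = $dbh->prepare("FETCH 100 FROM q_cur");
while (1) {
    $fetch->execute;
    my $got = 0;
    while (my $row = $fetch->fetchrow_arrayref) {
        $got++;
        my ($id, $my_type, $my_data) = @$row;
        print "VALUES: $my_type $my_data\n";
    }
    last unless $got;    # an empty FETCH means the cursor is exhausted
}

$dbh->do("CLOSE q_cur");
$dbh->commit;
$dbh->disconnect;
```

Only 100 rows at a time sit in client memory, at the cost of an extra round
trip per batch.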

4. One way to get some pretty large performance increases, if you are 
execute()ing in a tight loop, is to use postgres's server-side prepare. The 
code in CVS can do this for you automatically; however, the way it is 
written, it breaks anything that cannot be prepared like PREPARE 
'DBD::Pg Statement 1' (varchar,varchar,varchar) AS SELECT * FROM t1 WHERE 
a=$1 AND b=$2 AND c=$3.  Basically, the (varchars) will cause some SQL 
statements to break (e.g. WHERE foo BETWEEN $1 AND $2).  Right now CVS head 
has this feature disabled because it breaks too many things, but if you 
want I can make it configurable.


You also might want to look at the DBI profiler. 

If you are interested in seeing what DBI is doing try taking a look at the
output of DBI->trace(9).
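
For reference, turning those knobs on is a one-liner each (the log filename is
just an example; the environment-variable form of profiling assumes a DBI
recent enough to ship DBI::Profile):

```perl
use DBI;

# Full-firehose trace of everything DBI and the driver do,
# written to a file instead of STDERR:
DBI->trace(9, "dbi.trace.log");

# Or trace a single handle at a lower, more readable level:
# $dbh->trace(2);

# The profiler can also be enabled without touching the code at all:
#   DBI_PROFILE=2 perl myscript.pl
```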

-r



Re: DBD::Pg large processor load.

From
Chris Gamache
Date:
Thank you very much for the response. I'll follow your lead and CC to dbi-users
as well.

That sample code never did make it to google... Here it is:

[------TABLE SCHEMA-----]

CREATE TABLE queue (
    id      int4 DEFAULT nextval('"queue_id_seq"'::text) NOT NULL,
    my_type varchar(50),
    my_data varchar(50),
    status  int8 DEFAULT 0,
    CONSTRAINT queue_id_key UNIQUE (id)
) WITH OIDS;

[---END TABLE SCHEMA----]


[----BEGIN TEST CODE----]

#!/usr/local/bin/perl

use strict;
use warnings;
use DBI;

#
# Set process id...
#
my $procid = $$;

my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "myuser", "mypass");

my $finished = 0;

while (!$finished) {
    #
    # Begin transaction...
    # Claim a record for ourselves
    #
    my $mysql_update = <<EOS;
BEGIN;
LOCK TABLE queue IN EXCLUSIVE MODE;
UPDATE queue
   SET status=$procid
 WHERE id = (SELECT min(id) FROM queue WHERE status=0);
COMMIT;
EOS
    my $rs_update = $dbh->prepare($mysql_update);
    $rs_update->execute;

    #
    # Get the row from the batch_trans_queue
    #
    my $mysql_get = <<EOS;
SELECT id, my_type, my_data
  FROM queue
 WHERE status=$procid;
EOS
    my $rs_get = $dbh->prepare($mysql_get);
    $rs_get->execute;

    #
    # We got a record...
    #
    while (my $row_get = $rs_get->fetchrow_arrayref) {
        #
        # Get queue values
        #
        my ($id, $my_type, $my_data) = @$row_get;

        print "VALUES: $my_type $my_data\n";

        #
        # Set record completed
        #
        $mysql_update = "UPDATE queue SET status=1 WHERE id=$id;";
        $rs_update = $dbh->prepare($mysql_update);
        $rs_update->execute;
    }
    $rs_get->finish;
    sleep(1);
}

#
# disconnect from server
#
$dbh->disconnect;

[----END TEST CODE----]

I'm really not "preparing" any of my statements... It seemed that in my
conversion to DBI, prepare() was a required formality. What is the alternative
to executing a pre-prepared SQL statement other than prepare; execute;?
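
(One DBI shorthand worth knowing here: $dbh->do() wraps prepare/execute/finish
in a single call for statements that return no rows. A sketch against the
queue table above, with made-up connection parameters:)

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "myuser", "mypass",
                       { RaiseError => 1 });

# do() = prepare + execute + finish in one step, for non-SELECT statements.
# The undef fills the \%attr slot; values after it bind to '?' placeholders.
my $procid = $$;
$dbh->do("UPDATE queue SET status = ? WHERE id = (SELECT min(id) FROM queue WHERE status = 0)",
         undef, $procid);

$dbh->disconnect;
```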

I'm also not cleaning up after my non-row-returning SQL executes. Should I be
cleaning them up somehow?

Here are the things on my list to attempt:
1. Use DBI->trace(9).
2. Convert $sth->fetchrow_arrayref() to $sth->fetchall_arrayref() and see what
happens.
3. Look into the DBI profiler you mentioned...

Any other thoughts?

CG

--- Rudy Lippan <rlippan@remotelinux.com> wrote:
> On Mon, 21 Apr 2003, Chris Gamache wrote:
> 
> CCing dbi-users.
> 
> > Procps reports that the 6 simultaneous postmasters (spawned by 6 instances
> of
> > the same perl script) each have 20% load with DBI over 6% load each for 6
> > postmasters with PgSQL::Cursor. (Running Dual Xeons at 2GHz apiece w/o
> > hyperthreading) 
> 
> What type of load were you seeing with PgSQL?
> > 
> > I can't seem to reproduce the problem with test code (posted in
> pgsql-general
> > with a similar thread. Hasn't made it to google yet, otherwise I'd include
> a
> > link) I think it has to do with the large dataset it is dealing with. 
> > 
> Sample code would be nice ;)
> 
> > As far as PgSQL::Cursor using cursors, I guess that might be the case. I
> was
> > hoping someone would suggest using a different DBI method or DBI setting,
> to
> > tone down DBI's (perceived) voracity for processor time! 
> 
> It does not seem to. In fact, PgSQL does not seem to do much at all -- 
> which might explain why it is faster at some things; but then again, much of 
> DBI & DBD::Pg is written in C, which "should" make them faster :)
> 
> > 
> > PgSQL::Cursor was alpha software when I started using it. It was simple to
> > implement, and did the trick. It was eclipsed by DBI and as a result was
> not 
> > updated. PgSQL.pm didn't cause a problem until recently. Porting to DBI was
> > simple: different method calls yielding the same results. The only
> difference
> > that I can see is the extra load. I've never gone digging in the sources,
> so
> > I'm not sure how the two differ. 
> > 
> > I'll settle for extra processor load as a trade off for stability. I was
> hoping
> > there would be a well-known and quick fix. My next task is to turn on some
> > verbose debug logs to perhaps shed some light on what's going on, and let
> me
> > know why I can't reproduce this with my test code!
> > 
> 1. You might want to try $sth->fetchall_arrayref(). This will suck
> everything into one big arrayref, which is not as bad as it sounds: if you
> are not using cursors, all of the data is already returned by postgres, and
> fetch() just gives you the next row. Of course this will cause a momentary
> increase in memory usage, because you duplicate all of the rows at once
> instead of one at a time. I would be interested in seeing if this causes
> any change in performance.
> 
> 2. If you are doing many prepare()s/execute()s, you might want to try
> running the code out of CVS. DBD::Pg has to preparse the statement that you
> give it, changing '?' into :p1 and allocating memory and data structs for
> handling placeholders.  The preparse code for prepare()/execute() in CVS 
> is much leaner than what is in 1.22.  Also, try to keep prepare()s out of 
> loops; this will save a walk or two of the statement.
> 
> 3. If memory usage is a problem, try using DECLARE CURSOR and FETCH; this 
> will give you a fetchrow_arrayref() that takes much less memory 
> (depending on the size of the dataset), but it will use more CPU time. 
> [I know you are worried about CPU time.]
> 
> 4. One way to get some pretty large performance increases, if you are 
> execute()ing in a tight loop, is to use postgres's server-side prepare. The 
> code in CVS can do this for you automatically; however, the way it is 
> written, it breaks anything that cannot be prepared like PREPARE 
> 'DBD::Pg Statement 1' (varchar,varchar,varchar) AS SELECT * FROM t1 WHERE 
> a=$1 AND b=$2 AND c=$3.  Basically, the (varchars) will cause some SQL 
> statements to break (e.g. WHERE foo BETWEEN $1 AND $2).  Right now CVS head 
> has this feature disabled because it breaks too many things, but if you 
> want I can make it configurable.
> 
> 
> You also might want to look at the DBI profiler. 
> 
> If you are interested in seeing what DBI is doing try taking a look at the
> output of DBI->trace(9).
> 
> -r
> 


