Thread: Perl/DBI vs Native
Hi,

I have run quite a few tests comparing how long a query takes to execute from Perl/DBI as compared to psql/libpq. No matter how many times I run the test, the results are always the same.

I run a SELECT all on a fairly big table and enabled the log_min_duration_statement option. With psql, postgres consistently logs half a second, while the exact same query executed with Perl/DBI consistently takes 2 seconds.

If I were timing the applications themselves I wouldn't have been surprised by these results; obviously, processing with Perl would be slower than in a native application. But it's the postmaster that reports these timings. Could it be that the DBI module is slower at assimilating the data?

Any light on the subject would be greatly appreciated.

Regards,
Val
Valentin Bogdanov wrote:
> Hi,
>
> I have run quite a few tests comparing how long a query takes to
> execute from Perl/DBI as compared to psql/libpq. No matter how many
> times I run the test, the results are always the same.
>
> I run a SELECT all on a fairly big table and enabled the
> log_min_duration_statement option. With psql, postgres consistently
> logs half a second, while the exact same query executed with Perl/DBI
> consistently takes 2 seconds.
>
> Any light on the subject would be greatly appreciated.

Random guess: Perl's DBI is using parameterized prepared statements, preventing the optimizer from using its knowledge about common values in the table to decide whether or not index use is appropriate. When you're writing the query in psql, you're not using prepared statements, so the optimizer can be cleverer.

Try comparing:

    SELECT statement

to:

    PREPARE test(params) AS statement;
    EXECUTE test(params);

e.g.:

    SELECT x + 44 FROM t;

vs:

    PREPARE test(int) AS SELECT x + $1 FROM t;
    EXECUTE test(44);

Use EXPLAIN ANALYZE to better understand the changes in the query plan.

--
Craig Ringer
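[Ed.: to make that comparison concrete, here is one way to run it from a psql session. This is only a sketch: "t" and its integer column "x" stand in for the real table and query, and the timings require a live server.]

    -- Literal form: planned with the actual value in place.
    EXPLAIN ANALYZE SELECT x + 44 FROM t;

    -- Prepared form: planned with a parameter placeholder.
    PREPARE test(int) AS SELECT x + $1 FROM t;
    EXPLAIN ANALYZE EXECUTE test(44);
    DEALLOCATE test;

If the two plans differ, the parameterized prepared statement is a likely culprit for the slowdown.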
On Jul 21, 2008, at 5:19 AM, Valentin Bogdanov wrote:
> I run a SELECT all on a fairly big table and enabled the
> log_min_duration_statement option. With psql, postgres consistently
> logs half a second, while the exact same query executed with Perl/DBI
> consistently takes 2 seconds.
>
> Could it be that the DBI module is slower at assimilating the data?

Hi Val,

Yes, DBI can be slower than the native C interface. The speed depends on how the data is being returned inside of Perl. Returning hashes is a slower method than returning arrays, from what I've found, due to the overhead of creating the objects in Perl. So:

    my $r = $dbh->selectall_arrayref("select * from table", { Columns => {} });

is slower than:

    my $r = $dbh->selectall_arrayref("select * from table", undef);

Secondly, if you're returning a lot of rows you may want to look into using a cursor, so that you can fetch the rows 1000 at a time in a tight loop and then discard them once you are done with them. This will hopefully prevent the system from having to continually allocate memory for all of your rows. For each field in each row, Perl allocates memory to store the value from Postgres, so if your table has many fields this can be a large number of allocations, depending on the number of rows.

Any userland profiling tool should help you debug what's going on here.

Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com - http://www.gearbuyer.com - http://www.footwearbuyer.com
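[Ed.: the cursor approach described above might look roughly like this in DBI. A sketch only, assuming an already-connected handle $dbh and a table named "mytable"; it needs a live PostgreSQL connection to run.]

    # Cursors only live inside a transaction.
    $dbh->begin_work;
    $dbh->do("DECLARE csr CURSOR FOR SELECT * FROM mytable");
    while (1) {
        # Pull the next batch of 1000 rows into the client.
        my $rows = $dbh->selectall_arrayref("FETCH 1000 FROM csr");
        last unless @$rows;
        for my $row (@$rows) {
            # ... process @$row here ...
        }
        # $rows is reassigned on the next iteration, so each batch
        # can be freed before the next one is fetched.
    }
    $dbh->do("CLOSE csr");
    $dbh->commit;

Memory usage then stays bounded by the batch size rather than the full result set.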
Valentin Bogdanov wrote:
> I have run quite a few tests comparing how long a query takes to
> execute from Perl/DBI as compared to psql/libpq. No matter how many
> times I run the test, the results are always the same.
>
> I run a SELECT all on a fairly big table and enabled the
> log_min_duration_statement option. With psql, postgres consistently
> logs half a second, while the exact same query executed with Perl/DBI
> consistently takes 2 seconds.

The problem may be that your two tests are not equivalent. When Perl executes a statement, it copies the *entire* result set back to the client before it returns the first row. The following program might appear to just be fetching the first row:

    $sth = $dbh->prepare("select item from mytable");
    $sth->execute();
    $item = $sth->fetchrow_array();

But in fact, before Perl returns from the $sth->execute() statement, it has already run the query and copied all of the rows into a hidden, client-side cache. Each $sth->fetchrow_array() merely copies the data from the hidden cache into your local variable.

By contrast, psql executes the query and starts returning the data a page at a time. So it may appear to be much faster.

This also means that Perl has trouble with very large tables. If the "mytable" in the above example is very large, say a hundred billion rows, you simply can't execute this statement in Perl. It will try to copy 100 billion rows into memory before returning the first answer.

The reason for Perl's behind-the-scenes caching is that it allows multiple connections to a single database, and multiple statements on each database handle. By executing each statement completely, it gives the appearance that multiple concurrent queries are supported. The downside is that it can be a huge memory hog.

Craig
Craig James <craig_james@emolecules.com> writes:
> The problem may be that your two tests are not equivalent. When Perl
> executes a statement, it copies the *entire* result set back to the
> client before it returns the first row.

Sure, but so does psql (unless you've turned on the magic FETCH_COUNT setting). I think the theories about prepared versus literal statements were more promising; but I don't know DBI well enough to know exactly what it was sending to the server.

			regards, tom lane
Tom Lane wrote:
> Sure, but so does psql (unless you've turned on the magic FETCH_COUNT
> setting). I think the theories about prepared versus literal statements
> were more promising; but I don't know DBI well enough to know exactly
> what it was sending to the server.

Almost certainly a prepared statement, unless no placeholders were being used at all. Another way to test (from the DBI side) is to set $sth->{pg_server_prepare} = 0, which will send the SQL directly to the backend, just as if you'd typed it in at a command prompt.

You can also use the tracing mechanism of DBI to see what's going on behind the scenes. For example:

    $dbh->trace('SQL');
    $dbh->do("SELECT 1234 FROM pg_class WHERE relname = 'bob'");
    $dbh->do("SELECT 1234 FROM pg_class WHERE relname = ?", undef, 'mallory');
    $sth = $dbh->prepare("SELECT 4567 FROM pg_class WHERE relname = ?");
    $sth->execute('alice');
    $sth->{pg_server_prepare} = 0;
    $sth->execute('eve1');
    $sth->{pg_server_prepare} = 1;
    $sth->execute('eve2');
    $dbh->commit;

Outputs:

    ===
    begin;
    SELECT 1234 FROM pg_class WHERE relname = 'bob';
    EXECUTE SELECT 1234 FROM pg_class WHERE relname = $1 ( $1: mallory );
    PREPARE dbdpg_p22988_1 AS SELECT 4567 FROM pg_class WHERE relname = $1;
    EXECUTE dbdpg_p22988_1 ( $1: alice );
    SELECT 4567 FROM pg_class WHERE relname = 'eve1';
    EXECUTE dbdpg_p22988_1 ( $1: eve2 );
    commit;
    DEALLOCATE dbdpg_p22988_1;
    ===

You can even view exactly which libpq calls are being used at each point with:

    $dbh->trace('SQL,libpq');

To get back to the original poster's complaint, you may want to figure out why the difference is so great for a prepared plan. It may be that you need to cast the placeholder(s) to a specific type, for example.
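[Ed.: a sketch of that casting suggestion, with a hypothetical table "mytable" and integer column "id"; it needs a live connection to run. A placeholder can be given an explicit type either in the SQL itself or when binding:]

    # Cast in the SQL so the planner sees the intended type:
    my $sth = $dbh->prepare("SELECT * FROM mytable WHERE id = ?::int");
    $sth->execute(42);

    # Or tell DBD::Pg the type when binding the parameter:
    use DBD::Pg qw(:pg_types);
    $sth = $dbh->prepare("SELECT * FROM mytable WHERE id = ?");
    $sth->bind_param(1, 42, { pg_type => PG_INT4 });
    $sth->execute;

Without the cast, the parameter may be sent as text or "unknown", which can produce a different (and sometimes worse) plan than the literal query gets.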
--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200807211637
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Thanks to everyone who replied. There were some really good points.

However, I found what is causing the difference. The Perl program was connecting to the database via a TCP socket while the C version was using a Unix socket. I changed the connect in my Perl script so that it now uses Unix sockets as well, ran the tests again, and got identical results for both programs.

In case someone is wondering, the way to force DBI to use Unix sockets is by not specifying a host and port in the connect call.

Cheers,
Val
> In case someone is wondering, the way to force DBI to use unix
> sockets is by not specifying a host and port in the connect call.

Actually, the host defaults to the local socket. The port may still be needed: if you leave it out, the default value (5432) is used. Thus, for most purposes, just leaving the host out is enough to cause a socket connection on the default port.

For completeness in the archives, you can also specify the complete path to the unix socket directory in the host line, for those cases in which the socket is not where you expect it to be:

    $dbh = DBI->connect('dbi:Pg:dbname=test;host=/var/local/sockets',
                        $user, $pass, {AutoCommit=>0, RaiseError=>1});

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200807221248
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
On Tue, Jul 22, 2008 at 9:48 AM, Greg Sabino Mullane <greg@turnstep.com> wrote:
>> In case someone is wondering, the way to force DBI to use unix
>> sockets is by not specifying a host and port in the connect call.
>
> Actually, the host defaults to the local socket. The port may
> still be needed: if you leave it out, the default value (5432)
> is used. Thus, for most purposes, just leaving the host out is
> enough to cause a socket connection on the default port.

For the further illumination of the historical record, the best practice here is probably to use the pg_service.conf file, which may or may not live in /etc depending on your operating system. Then you can connect in DBI using dbi:Pg:service=whatever, and change the definition of "whatever" in pg_service.conf. This has the same semantics as PGSERVICE=whatever when using psql. It's a good idea to keep these connection details out of your program code.

-jwb
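[Ed.: a minimal sketch of that setup. The file location, service name, and connection values below are examples; on many systems the file lives at /etc/pg_service.conf, and PGSERVICEFILE can point libpq at a different location.]

    # pg_service.conf
    [whatever]
    host=/var/run/postgresql
    port=5432
    dbname=test
    user=val

and in the program:

    my $dbh = DBI->connect('dbi:Pg:service=whatever', undef, undef,
                           { AutoCommit => 0, RaiseError => 1 });

Changing the connection details then means editing the file, not the code.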
Thanks guys, this is really useful, especially the pg_service.conf tip. I have an app where the connection parameters have to be set in 3 different places. I was thinking of writing something myself, but now that I know of pg_service.conf, problem solved.

Regards,
Val

--- On Tue, 22/7/08, Jeffrey Baker <jwbaker@gmail.com> wrote:
> For the further illumination of the historical record, the best
> practice here is probably to use the pg_service.conf file, which may
> or may not live in /etc depending on your operating system. Then you
> can connect in DBI using dbi:Pg:service=whatever, and change the
> definition of "whatever" in pg_service.conf.