Thread: DBD::Pg performance bites compared to DBD::mysql?

DBD::Pg performance bites compared to DBD::mysql?

From
Tom Lane
Date:
In my copious free time ;-), I've been poking at MySQL's "sql-bench"
benchmark and trying to understand why it makes us look so bad.
There are a number of things involved, but one item that I just realized
tonight is that a pretty substantial part of the problem is on the
client side.  The test driver is written in Perl and depends on DBI/DBD
to connect to the database.  Here is a Perl Devel::DProf trace for a
run of 300000 simple INSERT commands (plus a few CREATE TABLE and other
ilk):

PG:

Total Elapsed Time = 231.6619 Seconds User+System Time = 132.5019 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name104.   4.140 138.81 300007   0.0000 0.0005  DBI::db::do101.   27.09
134.67300007   0.0001 0.0004  DBD::Pg::db::do44.2   3.050 58.638 300008   0.0000 0.0002  DBI::db::prepare41.9   9.660
55.589300008   0.0000 0.0002  DBD::Pg::db::prepare29.9   8.270 39.629 300008   0.0000 0.0001  DBI::_new_sth26.5   35.16
35.160300008   0.0001 0.0001  DBI::st::execute23.6   14.17 31.359 300013   0.0000 0.0001  DBI::_new_handle11.0   14.66
14.660300013   0.0000 0.0000  DBI::_setup_handle4.75   6.300  6.300 300008   0.0000 0.0000  DBD::Pg::st::_prepare4.75
6.300 6.300 300004   0.0000 0.0000  DBI::st::rows4.04   5.350  5.350 600016   0.0000 0.0000  DBI::st::DESTROY1.91
2.530 2.530 300013   0.0000 0.0000  DBI::st::TIEHASH1.61   2.130  2.130 300011   0.0000 0.0000
DBD::_mem::common::DESTROY0.14  0.109  0.188      7   0.0156 0.0268  main::BEGIN0.02   0.020  0.030      1   0.0199
0.0297 DBI::install_driver
 

MySQL:

Total Elapsed Time = 115.0148 Seconds User+System Time = 31.19480 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name69.1   21.57 21.570 300006   0.0001 0.0001  DBI::db::do0.57   0.099
0.177     7   0.0142 0.0254  main::BEGIN0.06   0.020  0.020      6   0.0033 0.0033  DynaLoader::dl_load_file0.06
- 0.020      5        - 0.0040  Cwd::BEGIN0.06       -  0.020      3        - 0.0066  DynaLoader::bootstrap0.06   0.010
0.020      1   0.0100 0.0197  DBI::install_driver0.06   0.010  0.020     12   0.0008 0.0016  DBI::BEGIN0.06       -
0.020     2        - 0.0098  DBI::connect0.06       -  0.019      2        - 0.0097  db_MySQL::connect0.06       -
0.019     1        - 0.0194  db_MySQL::version0.03   0.010  0.010      2   0.0050 0.0050  AutoLoader::import0.03
- 0.010      2        - 0.0050  POSIX::BEGIN0.03       -  0.010      1        - 0.0100  POSIX::import0.03   0.010
0.010    10   0.0010 0.0010  vars::import0.03       -  0.010      1        - 0.0100  Benchmark::import
 

The server-side time isn't all that much different: about 100 sec for
us, 85 for them.  But there is something way wrong on the Perl side.

I'm not any kind of Perl module guru, but it looks to me like the
explanation for the discrepancy is that much of DBD::mysql is written
in C while the equivalent code in DBD::Pg is written in Perl.  This is
killing us for any app written in Perl :-( --- the above trace only
shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
other cases.  There is no way that a client-side driver should be taking
several times longer than the backend to process a SQL command :-(

Can anyone who knows more about Perl confirm or deny?
Anyone want to step up to fix it?
        regards, tom lane


Re: DBD::Pg performance bites compared to DBD::mysql?

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I'm not any kind of Perl module guru, but it looks to me like the
> explanation for the discrepancy is that much of DBD::mysql is written
> in C while the equivalent code in DBD::Pg is written in Perl.  This is
> killing us for any app written in Perl :-( --- the above trace only
> shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
> other cases.  There is no way that a client-side driver should be taking
> several times longer than the backend to process a SQL command :-(

What version of DBD::Pg did you test with? Development was pretty moribund for
a while but has picked up recently, just in time for 8.0 and the new binary
prepared query support.

Incidentally, the DBD::Pg mailing list is dbdpg-general@gborg.postgresql.org

-- 
greg



Re: DBD::Pg performance bites compared to DBD::mysql?

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> ... There is no way that a client-side driver should be taking
>> several times longer than the backend to process a SQL command :-(

> What version of DBD::Pg did you test with?

DBD::Pg 1.40, also perl-DBI-1.40 and perl-5.8.5.  Greg Mullane
promised he'd take a closer look after he gets done moving to
Florida... in the meantime, I have a self-contained test case
if someone else wants to dig into it.

(Greg indicated that he thought the coding style in the Perl
test script pretty well sucked, which I don't doubt, but the
question remains why we are so much slower than MySQL on the
same code.)
        regards, tom lane


Re: DBD::Pg performance bites compared to DBD::mysql?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

                                                                                
 
>> the above trace only shows a factor of 2 total penalty, but I've
>> seen up to a factor of 4 in other cases.  There is no way that a
>> client-side driver should be taking several times longer than the
>> backend to process a SQL command :-(

> What version of DBD::Pg did you test with?

This is a problem with the newer versions as well. Maybe even worse,
as we had to do some contortions when the PQprepare 7.4/8.0 mess
appeared. However, we can definitely speed up that part of the code.
I'll see about doing this soon.

I still maintain the original test is brain-dead, however. :) It was
testing for "insert speed" by basically doing this:

for (1..1000) {       $dbh->do("INSERT INTO testtable(a,b) VALUES (1,2)");
}

It's as if the people who wrote it *cough*mysql*cough* never heard
of prepare/execute and did not read the DBI docs.

It's very similar to the script on this page:

http://www.innodb.com/bench.php


- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200506121438
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFCrILHvJuQZxSWSsgRAkqnAKCoU1XAfE9nsQf11U0hM2P7Vt/uOACfSXmN
Dt9h3ACXGLEAXiIdjcXMLRg=
=8fiG
-----END PGP SIGNATURE-----




Re: DBD::Pg performance bites compared to DBD::mysql?

From
Greg Stark
Date:
"Greg Sabino Mullane" <greg@turnstep.com> writes:

> I still maintain the original test is brain-dead, however. :) It was
> testing for "insert speed" by basically doing this:
> 
> for (1..1000) {
>         $dbh->do("INSERT INTO testtable(a,b) VALUES (1,2)");
> }
> 
> It's as if the people who wrote it *cough*mysql*cough* never heard
> of prepare/execute and did not read the DBI docs.


Sure but wouldn't that show up as more time spent in the server in the parser
and planner? I thought Tom was complaining that it was spending too much time
in the client not waiting on server responses.

I wonder if Postgres is being bitten by doing lots of extra round trips. Maybe
the MySQL driver detects when there are no placeholders and no parameters and
uses a different protocol method that only requires a single round trip? Or
given your example maybe it even notices no result is needed and doesn't even
wait for a response?

-- 
greg



Re: DBD::Pg performance bites compared to DBD::mysql?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> I wonder if Postgres is being bitten by doing lots of extra round trips. Maybe
> the MySQL driver detects when there are no placeholders and no parameters and
> uses a different protocol method that only requires a single round trip?

We're not making any extra round trips per se, but we are not taking advantage of
the fact that we can know up front that the statement will not take any
parameters, and we can avoid parsing it for placeholders. My current idea is to treat
do(statement,args) as it currently is (sent to prepare/execute using PQexec (<7.4) or
PQexecParams (newer servers)), but to make a special fast path for simple do(statement)
that basically just throws the whole thing in a quick PQexec and returns. We've been
doing a lot of rewriting of stuff to support the new 7.4/8.0 v3 protocol stuff
such as PQprepare: now we can start the optimizing. :)

> Or given your example maybe it even notices no result is needed and doesn't even
> wait for a response?

That would be a neat trick, but they are not doing that. I don't even think it's
currently possible to derive that information via DBI the way things are now. Would
make a nice long-term micro-optimization though. :)

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200506142215
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iD8DBQFCr4+uvJuQZxSWSsgRApw5AJ9g3tQfihVWj6yOV0sHp5/Md15r4QCgps0Y
f1j8nPU6EqioSh20tckBsE0=
=X4WB
-----END PGP SIGNATURE-----




Re: DBD::Pg performance bites compared to DBD::mysql?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> I'm not any kind of Perl module guru, but it looks to me like the
> explanation for the discrepancy is that much of DBD::mysql is written
> in C while the equivalent code in DBD::Pg is written in Perl.  This is
> killing us for any app written in Perl :-( --- the above trace only
> shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
> other cases.  There is no way that a client-side driver should be taking
> several times longer than the backend to process a SQL command :-(
>
> Can anyone who knows more about Perl confirm or deny?
> Anyone want to step up to fix it?

I can't imagine that it was causing a factor of 2 difference, but I've
changed the code in DBD::Pg to make do() a lot smarter, for brain-dead
benchmarks such as MySQL's that use do() instead of prepare/execute
like they should. In short, DBD::Pg will short-circuit a lot of the usual
work and call PQexec directly if it can get away with it, which should
make the profile Tom posted look a lot better. It also avoids overhead of
going through Pg.pm, and is all C/XS code now. Hopefully, this should
make the benchmarking results a little more fair. :) This is now in
cvs, and will be part of the upcoming version 1.44 of DBD::Pg.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200508212016
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFDCRoBvJuQZxSWSsgRAlb7AJ9IfY1fmxRoib1Bct4HCHRi6WoCCwCfQjmk
oqhEPn8SzwcEe9SouwEUD2s=
=eKgZ
-----END PGP SIGNATURE-----




Re: DBD::Pg performance bites compared to DBD::mysql?

From
Tom Lane
Date:
"Greg Sabino Mullane" <greg@turnstep.com> writes:
> I can't imagine that it was causing a factor of 2 difference, but I've
> changed the code in DBD::Pg to make do() a lot smarter, for brain-dead
> benchmarks such as MySQL's that use do() instead of prepare/execute
> like they should. In short, DBD::Pg will short-circuit a lot of the usual
> work and call PQexec directly if it can get away with it, which should
> make the profile Tom posted look a lot better. It also avoids overhead of
> going through Pg.pm, and is all C/XS code now. Hopefully, this should
> make the benchmarking results a little more fair. :) This is now in
> cvs, and will be part of the upcoming version 1.44 of DBD::Pg.

Cool.  Thanks for doing that.  Do you have an idea when 1.44 will
be out?
        regards, tom lane


Re: DBD::Pg performance bites compared to DBD::mysql?

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
> Cool.  Thanks for doing that.  Do you have an idea when 1.44 will
> be out?

Not really, I'm afraid. I was tracking down a quoting issue this weekend
and have now gone down a twisty path that induced ripping out and
rewriting most of the "type" and quoting code, which is now lying in
pieces in my work directory. :) If I get the time to finish it soon, maybe
a release in two weeks or so. Otherwise, I might just back out the changes
and release the new do() as 1.44.

And if everyone reading this who uses DBD::Pg could test the code in cvs
right now, that would be greatly appreciated: there were some substantial
changes that I'd love to catch any bugs on before rather than after the
next release. I'll even roll up a tarball for your testing pleasure:

http://www.gtsm.com/DBD-Pg-1.43_1.tar.gz

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200508212142
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFDCS6KvJuQZxSWSsgRAkzKAKCvCzjpvpzVSdqCrL4VLy0vkDj5rwCgwYz6
TNfaqozZ4aUUuaY4YeuFGsI=
=YpRL
-----END PGP SIGNATURE-----