Thread: insert performance riddle

insert performance riddle

From
"Ed L."
Date:
I have two identical servers giving abysmal INSERT performance in
pgsql 7.3.4, 7.4.8, and 8.1devel under no load or I/O contention
at all (no dumps, no vacuums, no apps, etc).  Any suggested
investigations appreciated...

Metric:  I'm measuring average insert speed on the following
table with the following psuedo-loop via DBI 1.48 and DBD::Pg
1.43 (older versions of DBD::Pg/DBI are slow, too):

  create table foo (id serial, msg varchar);
  for i in 1..1000 do
    insert into foo (msg, "this is a test message");

Results:  The slow servers are doing 6...count'em, SIX...that's
right, I said S-I-X inserts per second (QPS) on average.
Measuring on 4 other systems, some of them heavily loaded, the
measured range is 60-1200 QPS.

Hardware:  The slow servers are identical HP DL740s running RHEL
AS 3.0 with kernel 2.4.21-4.0.2.ELsmp (RHLE AS 3.0), 12gb of
RAM, RAID5 via Compaq Smart Array 5i/532 RAID controller.

The systems I'm comparing to include:

4-cpu Itanium HP rx4650, HPUX 11.23, SAN (1000 QPS)...
4-cpu PA-RISC HP rp5470, HPUX 11.00, SAN (190 QPS)...
2-cpu Intel Dell 2650, Linux 2.4.20-30.7, RAID5 (350 QPS)...
1-cpu IBM T41 1.4ghz laptop (7200RPM IDE drives...I think they're
lying because they report 1100 QPS)...

What I've Tried:  I've measured the raw I/O via bonnie and dd,
and the speeds are comparable across these boxes, definitely not
10-100x differences.  Some of the faster QPS boxes are slower on
dd/bonnie times.  I tested multiple versions of pgsql with same
essential results.  I tested multiple versions of DBD and
DBD::Pg, same results.  I've tested on old clusters and tested
with newly initdb'd clusters, same results.

Our next/last guess is that there is some sort of major slowdown
in one of the system libraries, so we're thinking of wiping it
clean and trying a different OS/version.  Ideas?

Ed

Re: insert performance riddle

From
Michael Fuhr
Date:
On Wed, Aug 10, 2005 at 05:02:46PM -0600, Ed L. wrote:
> I have two identical servers giving abysmal INSERT performance in
> pgsql 7.3.4, 7.4.8, and 8.1devel under no load or I/O contention
> at all (no dumps, no vacuums, no apps, etc).  Any suggested
> investigations appreciated...
>
> Metric:  I'm measuring average insert speed on the following
> table with the following psuedo-loop via DBI 1.48 and DBD::Pg
> 1.43 (older versions of DBD::Pg/DBI are slow, too):
>
>   create table foo (id serial, msg varchar);
>   for i in 1..1000 do
>     insert into foo (msg, "this is a test message");
>
> Results:  The slow servers are doing 6...count'em, SIX...that's
> right, I said S-I-X inserts per second (QPS) on average.

Have you done any tests that eliminate the client code and client-server
communcation?  Something like the following:

CREATE TABLE foo (id serial, msg varchar);

CREATE FUNCTION insert_foo(integer) RETURNS void AS '
DECLARE
    i  integer;
BEGIN
    FOR i IN 1 .. $1 LOOP
        INSERT INTO foo (msg) VALUES (''this is a test message'');
    END LOOP;

    RETURN;
END;
' LANGUAGE plpgsql VOLATILE STRICT;

EXPLAIN ANALYZE SELECT insert_foo(1000);

--
Michael Fuhr

Re: insert performance riddle

From
"Ed L."
Date:
On Wednesday August 10 2005 6:03 pm, Michael Fuhr wrote:
> On Wed, Aug 10, 2005 at 05:02:46PM -0600, Ed L. wrote:
> > I have two identical servers giving abysmal INSERT
> > performance in pgsql 7.3.4, 7.4.8, and 8.1devel under no
> > load or I/O contention at all (no dumps, no vacuums, no
> > apps, etc).  Any suggested investigations appreciated...
> >
> > Results:  The slow servers are doing 6...count'em,
> > SIX...that's right, I said S-I-X inserts per second (QPS) on
> > average.
>
> Have you done any tests that eliminate the client code and
> client-server communcation?  Something like the following:
>
> CREATE FUNCTION insert_foo(integer) RETURNS void AS '
> DECLARE
>     i  integer;
> BEGIN
>     FOR i IN 1 .. $1 LOOP
>         INSERT INTO foo (msg) VALUES (''this is a test
> message''); END LOOP;
>
>     RETURN;
> END;
> ' LANGUAGE plpgsql VOLATILE STRICT;
>
> EXPLAIN ANALYZE SELECT insert_foo(1000);

Michael, you seem to have nailed it.  The local inserts (via Unix
domain sockets?) that were running at 6 QPS ran at 6800 to 41000
QPS in a PL/pgSQL function.

So, given the same DBI/DBD::Pg code is performing 2 orders of
magnitude better on other similar systems, where would you look
next?  The local socket system call code?  Where else?  My next
thought is to gprof the client code to identify the slowdown...

Thanks,
Ed



Re: insert performance riddle

From
"Ed L."
Date:
On Thursday August 11 2005 12:36 pm, Ed L. wrote:
> On Wednesday August 10 2005 6:03 pm, Michael Fuhr wrote:
> > On Wed, Aug 10, 2005 at 05:02:46PM -0600, Ed L. wrote:
> > > I have two identical servers giving abysmal INSERT
> > > performance in pgsql 7.3.4, 7.4.8, and 8.1devel under no
> > > load or I/O contention at all (no dumps, no vacuums, no
> > > apps, etc).  Any suggested investigations appreciated...
> >
> > Have you done any tests that eliminate the client code and
> > client-server communcation?  Something like the following:
> >
>
> Michael, you seem to have nailed it.  The local inserts (via
> Unix domain sockets?) that were running at 6 QPS ran at 6800
> to 41000 QPS in a PL/pgSQL function.

Here's another part of the riddle.  The query durations for the
individual inserts as logged in the server log are ranging
100ms-500ms per insert when connecting via DBI/DBD::Pg from
localhost.  Inside pl/pgsql, they're clearly averaging faster
than 1ms per insert.  I would have thought those times were
independent of the client communication layers...?

Ed



Re: insert performance riddle

From
Michael Fuhr
Date:
On Thu, Aug 11, 2005 at 12:59:32PM -0600, Ed L. wrote:
> > Michael, you seem to have nailed it.  The local inserts (via
> > Unix domain sockets?) that were running at 6 QPS ran at 6800
> > to 41000 QPS in a PL/pgSQL function.
>
> Here's another part of the riddle.  The query durations for the
> individual inserts as logged in the server log are ranging
> 100ms-500ms per insert when connecting via DBI/DBD::Pg from
> localhost.  Inside pl/pgsql, they're clearly averaging faster
> than 1ms per insert.  I would have thought those times were
> independent of the client communication layers...?

Have you done any client-side tests that eliminate Perl?  I'd suggest
writing a little C program so you can measure libpq's performance
without the extra layers of Perl and DBI/DBD::Pg.  Test both local
(Unix socket) and network (IPv4 or IPv6 socket) connections.

A couple of months ago, Tom Lane noticed that benchmarks using
DBD::Pg make PostgreSQL look worse than it deserves:

http://archives.postgresql.org/pgsql-interfaces/2005-06/msg00003.php

--
Michael Fuhr

Re: insert performance riddle

From
"Ed L."
Date:
On Thursday August 11 2005 1:37 pm, Michael Fuhr wrote:
> Have you done any client-side tests that eliminate Perl?  I'd
> suggest writing a little C program so you can measure libpq's
> performance without the extra layers of Perl and DBI/DBD::Pg.
>  Test both local (Unix socket) and network (IPv4 or IPv6
> socket) connections.

Michael, you nailed it again.  My libpq test C program delivered
between 2400 QPS and 5000 QPS vs ~10 QPS for DBI/DBD::Pg on this
box.

It remains unclear to me why the same DBI/DBD::Pg client code
would deliver performance 2-3 orders of magnitude better on
other roughly comparable or inferior boxes.

What would be the recommended replacement for DBI/DBD::Pg as a
perl interface?

Thanks,
Ed


Re: insert performance riddle

From
Steve Wormley
Date:
>
> Michael, you nailed it again.  My libpq test C program delivered
> between 2400 QPS and 5000 QPS vs ~10 QPS for DBI/DBD::Pg on this
> box.
>
> It remains unclear to me why the same DBI/DBD::Pg client code
> would deliver performance 2-3 orders of magnitude better on
> other roughly comparable or inferior boxes.
>

I'm using DBD::Pg for some large inserts(well, they were large) most of it
was pretty reasonable, there were some quirky things, like last_insert_id
is/was incredibly slow(it's very smart, but the way it does things makes the
performance horrible) doing the currval call yourself is much faster.
I found that using the Perl profiler identified a couple things like that
and let me get things to a reasonable speed.

-Steve



Re: insert performance riddle

From
Michael Fuhr
Date:
On Thu, Aug 11, 2005 at 03:29:29PM -0600, Ed L. wrote:
> Michael, you nailed it again.  My libpq test C program delivered
> between 2400 QPS and 5000 QPS vs ~10 QPS for DBI/DBD::Pg on this
> box.
>
> It remains unclear to me why the same DBI/DBD::Pg client code
> would deliver performance 2-3 orders of magnitude better on
> other roughly comparable or inferior boxes.

I'd suggest contacting the module's maintainer -- see the AUTHORS
section of the DBD::Pg manual page for contact info.

--
Michael Fuhr

Re: insert performance riddle

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> It remains unclear to me why the same DBI/DBD::Pg client code
> would deliver performance 2-3 orders of magnitude better on
> other roughly comparable or inferior boxes.

You need to see exactly what is going on to make things so slow.
You can try setting the trace level for very verbose detail
(e.g. $dbh->trace(10)) or you could look at using DBI::Profile
or DBI::ProfileDumper.

If it is the same code and same databases on different boxes, there
is definitely something weird going on. Make sure that Perl, DBI,
DBD::Pg, Postgres, and the Postgres libraries are all the same
version. Running "make test" for DBD::Pg will display the current
settings for all of the above. (or just copy the code from t/01connect.t).

If all versions are indeed the same, see if you can create a small test
script that illustrates the problem, then run the above mentioned
debugging aids and see if you can spot a difference.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200508120941
https://www.biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEARECAAYFAkL8p0oACgkQvJuQZxSWSsjlowCeOS6GkhISN4d6oWaRQQfD91QA
Eg8AoPJz55hEwmOcuuHC7GpLMf0cDstt
=ZZME
-----END PGP SIGNATURE-----



Re: insert performance riddle

From
"Ed L."
Date:
On Thursday August 11 2005 6:20 pm, Michael Fuhr wrote:
> On Thu, Aug 11, 2005 at 03:29:29PM -0600, Ed L. wrote:
> > Michael, you nailed it again.  My libpq test C program
> > delivered between 2400 QPS and 5000 QPS vs ~10 QPS for
> > DBI/DBD::Pg on this box.
> >
> > It remains unclear to me why the same DBI/DBD::Pg client
> > code would deliver performance 2-3 orders of magnitude
> > better on other roughly comparable or inferior boxes.
>
> I'd suggest contacting the module's maintainer -- see the
> AUTHORS section of the DBD::Pg manual page for contact info.

Well, just as I thought I had this one pinned, my test results
have become wildly inconsistent, eroding all confidence in my
prior conclusions about DBI slowness, etc.  I've seen at least
1000+ QPS performance via DBI/DBD::Pg, Pg, and C/libpq.  I'm now
investigating the possibility of simple old intermittent I/O
congestion...

Ed


Re: insert performance riddle

From
Michael Fuhr
Date:
On Fri, Aug 12, 2005 at 05:20:49PM -0600, Ed L. wrote:
> Well, just as I thought I had this one pinned, my test results
> have become wildly inconsistent, eroding all confidence in my
> prior conclusions about DBI slowness, etc.  I've seen at least
> 1000+ QPS performance via DBI/DBD::Pg, Pg, and C/libpq.  I'm now
> investigating the possibility of simple old intermittent I/O
> congestion...

How consistent were the results before?  I got the impression that
you saw consistently bad performance with DBD::Pg when other methods
performed well.

--
Michael Fuhr

Re: insert performance riddle

From
"Ed L."
Date:
On Friday August 12 2005 5:27 pm, Michael Fuhr wrote:
> On Fri, Aug 12, 2005 at 05:20:49PM -0600, Ed L. wrote:
> > Well, just as I thought I had this one pinned, my test
> > results have become wildly inconsistent, eroding all
> > confidence in my prior conclusions about DBI slowness, etc.
> > I've seen at least 1000+ QPS performance via DBI/DBD::Pg,
> > Pg, and C/libpq.  I'm now investigating the possibility of
> > simple old intermittent I/O congestion...
>
> How consistent were the results before?  I got the impression
> that you saw consistently bad performance with DBD::Pg when
> other methods performed well.

Results were very consistent until this morning.  Now dbi/dbd::pg
are intermittently showing performance 1-2 orders of magnitude
faster.   :/

Ed

Re: insert performance riddle

From
Michael Fuhr
Date:
On Fri, Aug 12, 2005 at 05:37:22PM -0600, Ed L. wrote:
> On Friday August 12 2005 5:27 pm, Michael Fuhr wrote:
> > How consistent were the results before?  I got the impression
> > that you saw consistently bad performance with DBD::Pg when
> > other methods performed well.
>
> Results were very consistent until this morning.  Now dbi/dbd::pg
> are intermittently showing performance 1-2 orders of magnitude
> faster.   :/

Has anything changed on the system since the results were consistently
slow?  New hardware, new versions of anything, reboot, etc.?  Did
you do any profiling when DBD::Pg was consistently slow to see where
the bottleneck was?

--
Michael Fuhr

Re: insert performance riddle

From
"Ed L."
Date:
On Friday August 12 2005 6:11 pm, Michael Fuhr wrote:
> Has anything changed on the system since the results were
> consistently slow?  New hardware, new versions of anything,
> reboot, etc.?  Did you do any profiling when DBD::Pg was
> consistently slow to see where the bottleneck was?

All good questions.  Obviously, something has changed, but I'm
hard-pressed to identify what it might have been.  No, I didn't
do any profiling, since I found the presumed dbi/libpq
difference first.  I will measure perf of dbi, Pg, and libpq
over time and see if I can combine with I/O measures to see
trends.

Ed

Re: insert performance riddle

From
Michael Fuhr
Date:
On Fri, Aug 12, 2005 at 06:20:27PM -0600, Ed L. wrote:
> On Friday August 12 2005 6:11 pm, Michael Fuhr wrote:
> > Has anything changed on the system since the results were
> > consistently slow? New hardware, new versions of anything,
> > reboot, etc.? Did you do any profiling when DBD::Pg was
> > consistently slow to see where the bottleneck was?
>
> All good questions.  Obviously, something has changed, but I'm
> hard-pressed to identify what it might have been.  No, I didn't
> do any profiling, since I found the presumed dbi/libpq
> difference first.  I will measure perf of dbi, Pg, and libpq
> over time and see if I can combine with I/O measures to see
> trends.

I meant profiling of DBD::Pg, as Greg Sabino Mullane suggested.
Here's his message in case you missed it:

http://archives.postgresql.org/pgsql-general/2005-08/msg00655.php

--
Michael Fuhr

Re: insert performance riddle

From
"Ed L."
Date:
On Friday August 12 2005 6:29 pm, Michael Fuhr wrote:
> I meant profiling of DBD::Pg, as Greg Sabino Mullane
> suggested. Here's his message in case you missed it:

No, I didn't miss that, and will do that as a next step.  Thanks,
Michael.

Ed