Thread: Problem: libpq, network traffic, memory usage
Dear NG,

I have a serious performance problem when using PostgreSQL 8.1 (other versions haven't been tested) with libpq. When executing a

    select * from "xyz" 1)

and "xyz" contains 300,000 records, it takes more than 60 seconds for the query just to complete. First I thought it was a performance problem on the server, but then we noticed an enormous number of TCP/IP packets being received during that waiting time. The memory usage of my process, while waiting for the query to complete, climbs to more than 300 MB, and the CPU usage goes up to 100%! It seems as if libpq receives the complete result set from the server _before_ returning from the query execution! (Even then, 300 MB seems too much to me, but that's not the point!) Just to be clear: I haven't performed ANY _fetch_ so far! It's just about executing the query.

Does anybody have any idea what's going on here?

When running the query in pgAdmin III, it also takes that awfully long time, and pgAdmin III's memory usage also goes up to over 300 MB. We told pgAdmin III to fetch only 1 row, so _filling_ the grid cannot be the reason there. The query there took 115 seconds just to complete. :-(

Can anybody help me out?!? I'd appreciate even just a hint or a reference to somewhere else. We already googled "memory and libpq", but the first 200 hits (sigh) didn't give any useful hint in that respect...

Thanks in advance,

Alexander

PS: 1) Remark: yes, this SELECT * query is necessary and cannot be further restricted by a WHERE clause. The query retrieves ALL records of a specific table (e.g. invoices) to be viewed on screen by a user. Of course, only the data of the records currently visible in the grid is actually fetched.
On Wed, 2005-12-07 at 11:01, Alexander Scholz wrote:
> and "xyz" contains 300'000 records, it takes more than 60 seconds for
> the query just to complete. [...] It seems as if libpq would be
> receiving the complete resultset from the server _before_ returning from
> the query execution! [...] Just to ensure: I didn't perform ANY _fetch_ so
> far! It's just about executing the query.

Yep, libpq is doing exactly what you asked it to do, no more, no less. If you'd like your result set to be in a cursor, you'll need to declare one. Try

    DECLARE xyz CURSOR FOR SELECT ...;
    FETCH 100 FROM xyz;

and see how that works.
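As a sketch of the full lifecycle (in PostgreSQL, a cursor only exists inside a transaction block, and the table name here is just the one from the question):

```sql
-- Cursors must live inside a transaction block.
BEGIN;
DECLARE xyz CURSOR FOR SELECT * FROM "xyz";
FETCH FORWARD 100 FROM xyz;   -- first 100 rows
FETCH FORWARD 100 FROM xyz;   -- next 100, and so on
CLOSE xyz;
COMMIT;
```

Each FETCH transfers only that batch over the wire, so client memory stays proportional to the batch size rather than to the table size.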
On 07.12.2005 at 18:01:51 +0100, Alexander Scholz wrote:
> I have a serious performance flaw, when using postgresql 8.1 (other
> versions haven't been tested) with libpq. [...]

The same question was asked yesterday on [pgsql-de-allgemein] ;-)

> [...] It seems as if libpq would be receiving the complete resultset
> from the server _before_ returning from the query execution! [...]
>
> Does anybody have any idea, what's going on here?

You have selected _ALL_ records, so you got all records. That's the point.

> Can anybody help me out?!? [...]

You can use a CURSOR to walk through the result.

HTH, Andreas
--
Andreas Kretschmer (contact: see header)
Heynitz: 035242/47212, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
=== Schollglas Unternehmensgruppe ===
On Dec 07 06:01, Alexander Scholz wrote:
> [...] It seems as if libpq would be receiving the complete resultset
> from the server _before_ returning from the query execution! [...]
> Just to ensure: I didn't perform ANY _fetch_ so far! It's just about
> executing the query.

libpq just works as expected. The query you sent returned 300 MB of data, and libpq read the returned data from the socket and stored it in memory. There is nothing special about this behaviour. What would you expect? That PostgreSQL hold 300 MB of data waiting for you in server-side memory?

As already suggested elsewhere in this thread, use CURSORs to get the behaviour you expected. If you're interested in queries returning incremental results, you can take a look at the mailing-list archives for related topics. Furthermore, there's a TODO item for this too.

Regards.
--
"We are the middle children of history, raised by television to believe that someday we'll be millionaires and movie stars and rock stars, but we won't. And we're just learning this fact," Tyler said. "So don't fuck with us."
Hi Scott,

> try
>
>     DECLARE xyz CURSOR FOR SELECT ...;
>     FETCH 100 FROM xyz;
>
> and see how that works.

sorry for probably asking such a stupid question, but we are using PQexec(). Where should I put that cursor declaration?

BTW: When executing

    BEGIN WORK;
    DECLARE "test" CURSOR FOR SELECT * FROM "TEST";
    FETCH FORWARD 100 FROM "test";
    CLOSE "test";
    COMMIT WORK;

in pgAdmin III, no data is displayed, and the following message (just a rough translation, I have a German frontend) appears in the message log: "Query result with 100 rows has been suppressed. Query successfully completed after 50 ms. No rows returned." How do I get the values?

Another question: Does the ODBC or OLEDB driver provided for Postgres already support this? Then we might switch from libpq over to ODBC or OLEDB.

Thanks in advance,

Alexander
On Dec 07 06:36, Alexander Scholz wrote:
> sorry for probably asking such a stupid question, but we are using a
> PQexec(). Where should I specify that cursor declaration?

Just like you're querying with pgAdmin:

    PQexec(conn, "BEGIN");
    PQexec(conn, "DECLARE \"test\" CURSOR FOR SELECT * FROM \"TEST\"");
    for (...)
    {
        PQexec(conn, "FETCH FORWARD 100 FROM \"test\"");
        /* Use the just-retrieved result; stop when it has no rows. */
        /* Don't forget to PQclear() each used PGresult. */
    }
    PQexec(conn, "CLOSE \"test\"");
    PQexec(conn, "COMMIT");
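Fleshed out into a minimal self-contained sketch (the connection string and table "TEST" are assumptions, and error handling is kept deliberately short; this needs a live server, so treat it as illustration):

```c
/* Sketch: page through a large table with a cursor via libpq,
 * keeping client memory bounded by the FETCH batch size. */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=mydb");  /* connection string is an assumption */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE \"test\" CURSOR FOR SELECT * FROM \"TEST\""));

    for (;;) {
        PGresult *res = PQexec(conn, "FETCH FORWARD 100 FROM \"test\"");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int rows = PQntuples(res);
        for (int i = 0; i < rows; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* first column only */
        PQclear(res);
        if (rows == 0)  /* cursor exhausted */
            break;
    }

    PQclear(PQexec(conn, "CLOSE \"test\""));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}
```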
Hi Andreas,

> The same question was yesterday on [pgsql-de-allgemein] ;-)

Yes - I just found it. :-) But I don't know whether it is really the same problem.

> You have selected _ALL_ records, you got all records. Thats the point.

I SELECTed them, that's right, but I didn't FETCH them. I thought that would make a difference. At least that's the way it works with MSSQL and ODBC/OLEDB (that's where I'm coming from...).

BTW: The ODBC driver for Postgres has the same <sorry> flaw </sorry>. The ODBC API provides a cursor model, but the driver doesn't seem to implement it, because the same libpq behaviour occurs. Is there ANY driver (ODBC, OLEDB) around that uses the cursor model?

> You can use a CURSOR to walk through the result.

Sorry, but I don't know exactly how to do that - can you help me out? We are working with a C++ library that uses PQexec(). I don't even get pgAdmin III to display data fetched by a cursor (please refer to my posting to Scott).

Thanks in advance,

Alexander
On Dec 07 06:56, Alexander Scholz wrote:
> I SELECTed them, that's right, but I didn't FETCH them. I thought that
> would make a difference. At least that's the way it works with MSSQL and
> ODBC/OLEDB (that's where I'm coming from...)

IMHO, you can get similar functionality by using the asynchronous query functions. For instance:

    PQsendQuery();   /* First of all, send the query; returns immediately. */
    for (...)        /* In an appropriate loop... */
    {
        /* ... gather results gradually from the socket
         * with PQconsumeInput() / PQisBusy(). */
    }
    PQgetResult();   /* Form the final PGresult object. */

But beware: the above method will not supply results gradually in a usable form. You'll collect the data read from the socket and form a PGresult struct from it; only after that can you use that PGresult object.

Regards.
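Spelled out with the actual libpq calls that drive the asynchronous path (connection string and table name are assumptions; as noted above, the full result set is still accumulated client-side before PQgetResult() hands it over):

```c
/* Sketch of the asynchronous query path in libpq. */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=mydb");  /* connection string is an assumption */
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    if (!PQsendQuery(conn, "SELECT * FROM \"TEST\""))  /* returns without blocking */
        return 1;

    while (PQisBusy(conn)) {          /* result not complete yet */
        /* The application could do other work here, or select() on
         * PQsocket(conn) to wait for data without spinning. */
        if (!PQconsumeInput(conn))    /* read whatever has arrived so far */
            return 1;
    }

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL) {  /* NULL marks end of results */
        printf("%d rows\n", PQntuples(res));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}
```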