Thread: problem with PQsendQuery/PQgetResult and COPY FROM statement

problem with PQsendQuery/PQgetResult and COPY FROM statement

From
max.poletto@gmail.com
Date:
I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
calls, PQgetResult returns tens of millions of (invalid?) non-null
PGresults.  This behavior seems incorrect, and sometimes causes my
application to exhaust memory and crash.

The postgres version is 8.1.3.

I can reproduce the problem in about 50 lines of C.  I include below
(1) the code, (2) a psql dump of the table in question, (3) the code's
output.

I'd appreciate any insight or suggestions you may have.

Thanks,
max poletto

======================================================================
(1) code

// compile with: g++ -Wall -O -o pgtest pgtest.cc -lpq

#include <cerrno>
#include <cstdio>
#include <ctime>
#include <vector>
using namespace std;
#include <libpq-fe.h>
#include <sys/poll.h>

void retrieve(PGconn *conn, time_t timeout)
{ vector<PGresult *> res; while (1) {   int r;   do {     struct pollfd pfds[1];     pfds[0].fd = PQsocket(conn);
pfds[0].events= POLLIN;     pfds[0].revents = 0;     r = poll(pfds, sizeof(pfds) / sizeof(struct pollfd), 1000 *
 
timeout);   } while (r < 0 && errno == EINTR);   if (r <= 0 || !PQconsumeInput(conn))     return;   int i = 0;
PGresult*oldr = 0;   while (!PQisBusy(conn)) {     PGresult *r = PQgetResult(conn);     if (r) {
res.push_back(r);      if (++i % 5000000 == 0) { printf("%d results\n", i); }       if (r == oldr) { printf("r==oldr
(%p)\n",r); }       oldr = r;     } else {       printf("PQgetResult return 0 after %d results\n", i);       return;
}   } }
 
}

int main()
{ PGconn *conn = PQconnectdb("dbname=testdb user=postgres"); if (!conn)   return -1; if (PQsendQuery(conn, "COPY test
FROMSTDIN") > 0) {   retrieve(conn, 20);   if (PQputCopyEnd(conn, 0) < 0)     return -1; } PQfinish(conn); return 0;
 
}


======================================================================
(2) psql session

root@tm01-5% psql testdb postgres
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms      \h for help with SQL commands      \? for help with psql commands      \g
orterminate with semicolon to execute query      \q to quit
 

testdb=# \d test    Table "public.test"Column |  Type   | Modifiers
--------+---------+-----------a      | integer |b      | integer |c      | integer |

testdb=# select * from test;a | b | c
---+---+---1 | 2 | 3
(1 row)

testdb=#


======================================================================
(3) output

root@tm01-5% ./pgtest
5000000 results
10000000 results
15000000 results
20000000 results
25000000 results
PQgetResult return 0 after 25649299 results
4.640u 4.696s 0:09.34 99.8%     0+0k 0+0io 0pf+0w

In this toy example, the process VM size exceeds 2GB before PQgetResult
finally returns 0.  The real application, which has already allocated
~1GB prior to the query, runs out of memory (3GB limit on 32-bit linux)
before PQgetResult ever returns 0.



Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
Volkan YAZICI
Date:
On May 19 11:51, max.poletto@gmail.com wrote:
>   if (PQsendQuery(conn, "COPY test FROM STDIN") > 0) {
>     retrieve(conn, 20);

Shouldn't you be send()'ing instead of retrieve()'ing? COPY tbl FROM
stdin, requests data from client to COPY FROM stdin TO tbl.


Regards.


Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
Tom Lane
Date:
max.poletto@gmail.com writes:
> I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
> Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
> calls, PQgetResult returns tens of millions of (invalid?) non-null
> PGresults.

You should fix your code to pay some attention to what those results
are.  I'm betting they are error results.

> This behavior seems incorrect, and sometimes causes my
> application to exhaust memory and crash.

Well, that's because you're not PQclear'ing a result when done with it.
        regards, tom lane


Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
Martijn van Oosterhout
Date:
On Fri, May 19, 2006 at 11:51:00AM -0700, max.poletto@gmail.com wrote:
> I attempt to execute a "COPY test FROM STDIN" using PQsendQuery.
> Subsequently, after the appropriate poll()/PQconsumeInput/PQisBusy
> calls, PQgetResult returns tens of millions of (invalid?) non-null
> PGresults.  This behavior seems incorrect, and sometimes causes my
> application to exhaust memory and crash.

In addition to the issues other people have pointed out, when using
COPY you transfer the data with PQputCopyData()/PQgetCopyData().

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
max.poletto@gmail.com
Date:
Thanks for all your replies, but I must clarify some things.

First, note that what I posted is just a small example that reproduces
behavior that appears incorrect.  The real code is a C++ wrapper
around libpq that supports non-blocking queries and reuses open
connections.

Volkan and Martijn: I know about PQ{put,get}CopyData, but my example
never gets to that point.  I must first determine whether the
(asynchronous) PQsendQuery() of "COPY test FROM STDIN" succeeded.
That's all that retrieve() tries to do in my example.

Tom: of course I should (and eventually do) use PQclear(), but I may
not want to right away, because I must return to the user a vector of
result objects (for example, all the result rows from a query).

I do not expect PQgetResult to return millions of non-null PGresult
objects after a PQsendQuery("COPY test FROM STDIN").  I expect exactly
one non-null result, with a result status of PGRES_COPY_IN.  Moreover,
the manual says:
  If a COPY command is issued via PQexec in a string that could
contain  additional commands, the application must continue fetching results  via PQgetResult after completing the COPY
sequence.Only when  PQgetResult returns NULL is it certain that the PQexec command
 
string  is done and it is safe to issue more commands.

I assumed this to be true for PQexec or "one of the equivalent
functions" mentioned in the manual, such as PQsendQuery.  However, if
I add the following switch statement to my example:
   while (!PQisBusy(conn)) {     PGresult *r = PQgetResult(conn);     if (r) {       switch (PQresultStatus(r)) {
casePGRES_COPY_IN:         break;       case PGRES_EMPTY_QUERY:         printf("PGRES_EMPTY_QUERY\n");         break;
   case PGRES_COMMAND_OK:         printf("PGRES_COMMAND_OK\n");         break;       case PGRES_TUPLES_OK:
printf("PGRES_TUPLES_OK\n");        break;       case PGRES_COPY_OUT:         printf("PGRES_COPY_OUT\n");
break;      case PGRES_BAD_RESPONSE:         printf("PGRES_BAD_RESPONSE\n");         break;       case
PGRES_NONFATAL_ERROR:        printf("PGRES_NONFATAL_ERROR\n");         break;       case PGRES_FATAL_ERROR:
printf("PGRES_FATAL_ERROR\n");        break;       }       res.push_back(r);       if (++i % 5000000 == 0) { printf("%d
results\n",i); }       if (r == oldr) { printf("r==oldr (%p)\n", r); }       oldr = r;     } else {
printf("PQgetResultreturn 0 after %d results\n", i);       return;     }   }
 

the code still prints only:

5000000 results
10000000 results
15000000 results
20000000 results
25000000 results
PQgetResult return 0 after 25649299 results

In other words, there are >25M distinct non-null results, and all of
them have status code PGRES_COPY_IN, and none of them have errors.

So it appears that I should check whether the first PGresult object
has a status code of PGRES_COPY_IN, and ignore subsequent PGresults
even if they are not NULL.  I don't object to this interface, but it
is not what I would conclude after RTFM.

max



Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
Tom Lane
Date:
max.poletto@gmail.com writes:
> I do not expect PQgetResult to return millions of non-null PGresult
> objects after a PQsendQuery("COPY test FROM STDIN").  I expect exactly
> one non-null result, with a result status of PGRES_COPY_IN.

If you call it exactly once, it'll say that exactly once.  If you keep
calling it "millions of times", it'll keep saying that.

> Moreover, the manual says:

>    If a COPY command is issued via PQexec in a string that could
> contain
>    additional commands, the application must continue fetching results
>    via PQgetResult after completing the COPY sequence. Only when
>    PQgetResult returns NULL is it certain that the PQexec command
> string
>    is done and it is safe to issue more commands.

Indeed.  You forgot to "complete the COPY sequence" before returning
to the PQgetResult loop.  As long as the thing is in COPY mode,
PQgetResult will return a result saying PGRES_COPY_IN.  The point
of this paragraph is that you might want to consider doing more
PQgetResults *after* you've ended COPY mode.
        regards, tom lane


Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
"Massimiliano Poletto"
Date:
> If you call it exactly once, it'll say that exactly once.  If you keep
> calling it "millions of times", it'll keep saying that.

OK.  I wonder, though, why at some point it does in fact return 0.

> Indeed.  You forgot to "complete the COPY sequence" before returning
> to the PQgetResult loop.  As long as the thing is in COPY mode,
> PQgetResult will return a result saying PGRES_COPY_IN.  The point

Got it.  Thanks.

max


Re: problem with PQsendQuery/PQgetResult and COPY FROM statement

From
Tom Lane
Date:
"Massimiliano Poletto" <max.poletto@gmail.com> writes:
>> If you call it exactly once, it'll say that exactly once.  If you keep
>> calling it "millions of times", it'll keep saying that.

> OK.  I wonder, though, why at some point it does in fact return 0.

Probably because you ran out of memory to create new PGresult structs.
        regards, tom lane