PGconn gets frozen ocassionally after select() timeout - Mailing list pgsql-general

From Marek Peca
Subject PGconn gets frozen ocassionally after select() timeout
Date
Msg-id Pine.LNX.4.64.0911131455440.3778@tesla
Whole thread Raw
Responses Re: PGconn gets frozen ocassionally after select() timeout
Re: PGconn gets frozen ocassionally after select() timeout
List pgsql-general
Dear programmers,

I am using the handy LISTEN/NOTIFY mechanism provided by PostgreSQL in my
client application, written in C with libpq. It is a simple
single-threaded single-process client (thanks to the NOTIFIcation!),
waiting for the notification using select() call with finite timeout.
After timeout, it performs some action, after notification something else.

The problem: most of time, everything works fine, hundreds of successful
or even timed-out selects() get handled without any problem. But time to
time (eg. after several hours), the select() call returns with a timeout
and then, a request to the opened PQconn (simple query) gets stuck, the
call hangs and never returns.

The simplified code snippet is:

s = PQsocket(pg);
for (;;) {
   FD_ZERO(&input_mask);
   FD_SET(s, &input_mask);
   rc = select(s + 1, &input_mask, NULL, NULL, select_timeout);

   if (rc == 0) {
     fprintf(stderr, "info: select timeout\n");
     pr = PQexec(pg, "SELECT now()");  /********* <= does not return ****/
     /* etc. */

   }
   /* etc. */
}

The client is connected to remote SQL server via TCP/IP, SSL may have been
involved in (not sure right now). I guess that the problem could be caused
by ocassional connection failure. However, I would expect, that PQexec()
or other functions should return an error in such a case. I tried to test
it using PQstatus(), but no success.

The workaround, which works somehow, is to call PQreset() *always* after
every timed-out select(). However, I am very unhappy with such a
"solution", because after the connection is reset, I need to LISTEN again
and I must expect, that DB state has changed during the reset. This is
natural in error recovery case, after a connection or other failure. BUT
since I need to reset the connection almost always, after _any_ timed-out
select(), it is really harmful to my application.


Please, can you tell me, what am I doing wrong?
Thank you very much.


Best regards,
Marek.

pgsql-general by date:

Previous
From: Gunnar Sønsteby
Date:
Subject: Cache lookup error
Next
From: Richard Huxton
Date:
Subject: Re: Cache lookup error