Dear programmers,
I am using the handy LISTEN/NOTIFY mechanism provided by PostgreSQL in my
client application, written in C with libpq. It is a simple
single-threaded single-process client (thanks to the NOTIFIcation!),
waiting for the notification using select() call with finite timeout.
After timeout, it performs some action, after notification something else.
The problem: most of time, everything works fine, hundreds of successful
or even timed-out selects() get handled without any problem. But time to
time (eg. after several hours), the select() call returns with a timeout
and then, a request to the opened PQconn (simple query) gets stuck, the
call hangs and never returns.
The simplified code snippet is:
s = PQsocket(pg);
for (;;) {
FD_ZERO(&input_mask);
FD_SET(s, &input_mask);
rc = select(s + 1, &input_mask, NULL, NULL, select_timeout);
if (rc == 0) {
fprintf(stderr, "info: select timeout\n");
pr = PQexec(pg, "SELECT now()"); /********* <= does not return ****/
/* etc. */
}
/* etc. */
}
The client is connected to remote SQL server via TCP/IP, SSL may have been
involved in (not sure right now). I guess that the problem could be caused
by ocassional connection failure. However, I would expect, that PQexec()
or other functions should return an error in such a case. I tried to test
it using PQstatus(), but no success.
The workaround, which works somehow, is to call PQreset() *always* after
every timed-out select(). However, I am very unhappy with such a
"solution", because after the connection is reset, I need to LISTEN again
and I must expect, that DB state has changed during the reset. This is
natural in error recovery case, after a connection or other failure. BUT
since I need to reset the connection almost always, after _any_ timed-out
select(), it is really harmful to my application.
Please, can you tell me, what am I doing wrong?
Thank you very much.
Best regards,
Marek.