Thread: Keep-alive support
Is there any keep-alive support in libpq? I'm not really using libpq directly; I'm using libpqxx, and there is no keep-alive support there, so I'm trying to use TCP's own keep-alive support. But I have a problem: libpq seems to reconnect the socket when the connection is lost.

What I do is this:

    /* Connect libpq as in Example 1 of the manual. */
    /* Before any query is sent (Linux 2.6.18) */
    int sd = PQsocket(conn);
    // Use TCP keep-alive feature
    set_sock_opt(sd, SOL_SOCKET, SO_KEEPALIVE, 1);
    // Maximum keep-alive probes before assuming the connection is lost
    set_sock_opt(sd, IPPROTO_TCP, TCP_KEEPCNT, 5);
    // Interval (in seconds) between keep-alive probes
    set_sock_opt(sd, IPPROTO_TCP, TCP_KEEPINTVL, 2);
    // Idle time (in seconds) before keep-alive probes start being sent
    set_sock_opt(sd, IPPROTO_TCP, TCP_KEEPIDLE, 10);

(See set_sock_opt() below; it is just a simple setsockopt() wrapper.)

Then I do a sleep(10) and continue with Example 1 of the manual (which makes a simple transaction query). During the sleep I unplug the network cable and monitor the TCP connection using netstat -pano. All the TCP keep-alive timers time out perfectly, closing the connection, but immediately I see a new connection (without the keep-alive parameters, so it takes forever to time out again). So I guess libpq is re-opening the socket. This is making my life a nightmare =)

Is there any way to avoid this behavior?
Please tell me there is =)

PS: This thread originated on libpqxx's mailing list, but I'm moving it here because it looks like a libpq issue. If you want to take a look at the original thread, you can find it here:
http://gborg.postgresql.org/pipermail/libpqxx-general/2006-November/001511.html

TIA

--------8<--------8<--------8<--------8<--------8<--------8<--------
#include <stdio.h>      /* perror() */
#include <stdlib.h>     /* abort() */
#include <sys/socket.h> /* setsockopt() */

/* Set an integer socket option; abort the program on failure. */
void set_sock_opt(int sd, int level, int name, int val)
{
    if (setsockopt(sd, level, name, &val, sizeof(val)) == -1)
    {
        perror("setsockopt");
        abort();
    }
}
-------->8-------->8-------->8-------->8-------->8-------->8--------

--
Leandro Lucarella
Integratech S.A.
4571-5252
Leandro Lucarella <llucarella@integratech.com.ar> writes:
> libpq seems to reconnect the socket when the connection is lost.

libpq does no such thing. Better recheck your own code (look for calls of PQreset(), perhaps).

regards, tom lane
Leandro Lucarella wrote on 2006-11-29 20:49:
> Is there any keep-alive support in libpq? I'm not really using libpq
> directly; I'm using libpqxx, and there is no keep-alive support there,
> so I'm trying to use TCP's own keep-alive support. But I have a
> problem: libpq seems to reconnect the socket when the connection is lost.
<cut>
> During the sleep I unplug the network cable and monitor the TCP
> connection using netstat -pano. All the TCP keep-alive timers time
> out perfectly, closing the connection, but immediately I see a new
> connection (without the keep-alive parameters, so it takes forever
> to time out again). So I guess libpq is re-opening the socket.
> This is making my life a nightmare =)

I used keep-alive the same way as you (reconfiguring the socket directly) and I don't remember libpq trying to reconnect by itself. I think it's libpqxx's behaviour. I haven't used it, but it looks like this is what it calls "reactivation".

Regards,
Tomasz Myrta
On Thu, November 30, 2006 03:31, Tomasz Myrta wrote:
> I used keep-alive the same way as you (reconfiguring the socket directly)
> and I don't remember libpq trying to reconnect by itself. I think it's
> libpqxx's behaviour. I haven't used it, but it looks like this is what
> it calls "reactivation".

That's right. It's libpqxx, not libpq, that restores the connection. (It couldn't really be any other way, because libpq doesn't have enough information to know it's safe--you could be in the middle of a transaction, or you could be losing a temp table.) Automatic reactivation can also be disabled explicitly if you don't want it (or just *when* you don't want it--e.g. when you're working with temp tables).

I do think that the long TCP timeouts are something that should be handled at the lower levels. We can't really do real keep-alives, I guess, simply because libpq is synchronous to the application. But perhaps we could demand that the server at least acknowledge a request in some way within a particular time limit? It'd have to be at the lowest level possible and as "cheap" as possible, so it doesn't break when the server is merely very busy.

Jeroen
Jeroen T. Vermeulen wrote:
> On Thu, November 30, 2006 03:31, Tomasz Myrta wrote:
>
>> I used keep-alive the same way as you (reconfiguring the socket directly)
>> and I don't remember libpq trying to reconnect by itself. I think it's
>> libpqxx's behaviour. I haven't used it, but it looks like this is what
>> it calls "reactivation".
>
> That's right. It's libpqxx, not libpq, that restores the connection. (It
> couldn't really be any other way, because libpq doesn't have enough
> information to know it's safe--you could be in the middle of a
> transaction, or you could be losing a temp table.) Automatic reactivation
> can also be disabled explicitly if you don't want it (or just *when* you
> don't want it--e.g. when you're working with temp tables).
>
> I do think that the long TCP timeouts are something that should be handled
> at the lower levels. We can't really do real keep-alives, I guess, simply
> because libpq is synchronous to the application. But perhaps we could
> demand that the server at least acknowledge a request in some way within a
> particular time limit? It'd have to be at the lowest level possible and
> as "cheap" as possible, so it doesn't break when the server is merely very
> busy.

Thanks all for your responses, but this is *not* a libpqxx issue, simply because I'm doing the test using plain libpq. Anyway, I have a little more information about my problem, and it's not libpq either =)

The problem shows up when the time between unplugging the wire and the next use of the connection is too short for the keep-alive to kill the connection. The connection then becomes active and the TCP timers seem to go back to the defaults, because there is data in the socket queue to send. So it's an OS/TCP issue.

I don't see any way to control this without using an application-level keep-alive, so I appreciate any ideas and suggestions =)

--
Leandro Lucarella
Integratech S.A.
4571-5252
For pgsql-hackers, here is the original thread (I think this mail is appropriate for this list; correct me if I'm wrong):
http://archives.postgresql.org/pgsql-interfaces/2006-11/msg00014.php

Leandro Lucarella wrote:
> Thanks all for your responses, but this is *not* a libpqxx issue, simply
> because I'm doing the test using plain libpq. Anyway, I have a little
> more information about my problem, and it's not libpq either =)
>
> The problem shows up when the time between unplugging the wire and the
> next use of the connection is too short for the keep-alive to kill the
> connection. The connection then becomes active and the TCP timers seem
> to go back to the defaults, because there is data in the socket queue
> to send. So it's an OS/TCP issue.
>
> I don't see any way to control this without using an application-level
> keep-alive, so I appreciate any ideas and suggestions =)

Hi! It's me again =)

I was thinking about solutions for my problem, and I've come up with (mainly) these 3 ideas:

1) Add TIPC[1] support to PostgreSQL. This is the cleanest solution, I think, but also the hardest, and it could take a lot of time. If I use one of the other hacks in the meantime, and if there is interest in adding this to PostgreSQL officially, I can evaluate working on it seriously. What I'm sure of is that I don't want to maintain my own PostgreSQL fork. So, what do you think about this? Or where should I ask?

2) Use a dummy "monitor" connection to Postgres, do the TCP keep-alive tuning on it, and select() on the socket waiting for a disconnection. Since this socket will never be active (is that right? Or does PostgreSQL send any kind of control information on an idle connection?), the TCP keep-alive will be enough to determine within a short period of time whether the connection is lost. If there is no problem with this, I think it could be a quick and not-so-nasty solution =)

3) Use Heartbeat[2] or build some other specific solution like it (probably using TIPC too). I don't like this at all, since I'm looking for a more self-contained solution, but it's another option.

I really appreciate any thoughts on this, and any suggestions.

TIA.

[1] http://tipc.sourceforge.net/
[2] http://www.linux-ha.org/HeartbeatProgram

--
Leandro Lucarella
Integratech S.A.
4571-5252
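Idea 2 above (an idle monitor connection with tuned keep-alive) could look roughly like the sketch below. It is an illustration under assumptions rather than a tested design: enable_keepalive() and wait_for_disconnect() are hypothetical names, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux option names used earlier in the thread, and poll() stands in for select(). The descriptor would be the one returned by PQsocket() for the monitor connection. Whether the server ever sends data on an idle connection is left open in the message, so the sketch reports that case separately instead of treating it as a disconnect.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Turn on TCP keep-alive with the given timings (Linux option names).
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
int enable_keepalive(int sd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    assert(sd >= 0);
    if (setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) == -1)
        return -1;
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) == -1)
        return -1;
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == -1)
        return -1;
    return 0;
}

/* Block until something happens on the otherwise idle monitor socket.
 * Returns  1 if the peer is gone (EOF, reset or keep-alive timeout),
 *          0 if the connection still looks up (unexpected data arrived),
 *         -1 if poll() failed. */
int wait_for_disconnect(int sd)
{
    struct pollfd pfd = { .fd = sd, .events = POLLIN, .revents = 0 };
    char byte;
    ssize_t n;
    int rc;

    do {
        rc = poll(&pfd, 1, -1);             /* -1: no time limit */
    } while (rc == -1 && errno == EINTR);
    if (rc == -1)
        return -1;

    /* Peek so that any real data stays queued for the caller. */
    n = recv(sd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n > 0)
        return 0;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;                           /* spurious wake-up */
    return 1;                               /* EOF or socket error */
}
```

With the values from the first message, the call would be enable_keepalive(sd, 10, 2, 5). When wait_for_disconnect() returns 0, the caller would need to read the pending data and wait again.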