Thread: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
BUG #5465: dblink TCP connection hangs blocking translation from being terminated
From
Robert Voinea
Date:
Hi,

In our setup we make extensive use of dblink. Because some queries take a while to complete and the link runs over the internet, the server process (the transaction that runs the dblink queries) sometimes hangs when the link goes down. It keeps holding locks on several records plus some advisory locks, and thus freezes most of the database.

I have found this bug report, which is remarkably similar (if not identical) to what we are experiencing:

http://postgresql.1045698.n5.nabble.com/BUG-5465-dblink-TCP-connection-hangs-blocking-translation-from-being-terminated-td2132419.html#a2132420

The bug dates from May 2010 and has had no update since. One of the comments states that some work was done for version 9.0, but I haven't seen anything related to this in the changelog of any version, starting with the one we are using (9.1.12).

<quote>
I believe this is a known issue in dblink, where it's not possible to cancel it when it's waiting in the TCP layer in the kernel. Unfortunately, there is no fix ATM - there was some work towards it for 9.0 at one point, but I think this is actually the first real bug-report on the issue...
</quote>

Has there been any progress in this direction?

Thank you!

-- 
Robert Voinea
Software Engineer
+4 0740 467 262
Don't take life too seriously. You'll never get out of it alive. (Elbert Hubbard)
Re: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
From
Tom Lane
Date:
Robert Voinea <rvoinea@gmail.com> writes:
> In our setup, we make use extensively of dblink.
> Due to the fact that some queries take some time to complete and that the link
> is over the internet, sometime the server process (the transaction that runs
> the dblink queries) hangs when the link goes down, keeping locks on several
> records plus some advisory locks and thus freezing the whole (most of) the
> database.

> What I have found is this bug, that is remarkably similar (if not identical)
> with what we are experiencing.
> http://postgresql.1045698.n5.nabble.com/BUG-5465-dblink-TCP-connection-hangs-blocking-translation-from-being-terminated-td2132419.html#a2132420

That does not sound like a Postgres bug to me. What you are unhappy about is that the kernel isn't timing out a lost TCP connection more quickly. The default timeout is long (>1 hour probably), but that's required by Internet standards.

The appropriate fix for this is to use aggressive keepalive parameters on the connection. You can set libpq's keepalive parameters in the connection string given to dblink.

regards, tom lane
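To make the suggestion above concrete, here is a minimal Python sketch that appends libpq's keepalive parameters (keepalives, keepalives_idle, keepalives_interval, keepalives_count) to a connection string and wraps it in a dblink_connect call. The host, database, and connection names, the helper function names, and the timing values (30/10/3) are illustrative assumptions, not values recommended anywhere in this thread.

```python
def keepalive_conninfo(base, idle=30, interval=10, count=3):
    """Append libpq TCP keepalive parameters to a conninfo string.

    The defaults are illustrative only; tune them for your own link.
    """
    params = {
        "keepalives": 1,                  # enable TCP keepalives on the client side
        "keepalives_idle": idle,          # seconds of inactivity before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # lost probes before the connection is dropped
    }
    extra = " ".join(f"{key}={value}" for key, value in params.items())
    return f"{base} {extra}"


def sql_literal(text):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def dblink_connect_sql(connname, conninfo):
    """Build the SQL statement that opens a named dblink connection."""
    return f"SELECT dblink_connect({sql_literal(connname)}, {sql_literal(conninfo)});"


if __name__ == "__main__":
    # Hypothetical remote host and database names.
    conninfo = keepalive_conninfo("host=remote.example.com dbname=app user=app")
    print(dblink_connect_sql("remote", conninfo))
```

Whether these settings take effect depends on the operating system supporting the corresponding socket options; on platforms without them, libpq ignores the idle, interval, and count values.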
Re: BUG #5465: dblink TCP connection hangs blocking translation from being terminated
From
Robert Voinea
Date:
Hi,

On Friday 14 March 2014 09:45:22 Tom Lane wrote:
> Robert Voinea <rvoinea@gmail.com> writes:
> > In our setup, we make use extensively of dblink.
> > Due to the fact that some queries take some time to complete and that the
> > link is over the internet, sometime the server process (the transaction
> > that runs the dblink queries) hangs when the link goes down, keeping
> > locks on several records plus some advisory locks and thus freezing the
> > whole (most of) the database.
> >
> > What I have found is this bug, that is remarkably similar (if not
> > identical) with what we are experiencing.
> > http://postgresql.1045698.n5.nabble.com/BUG-5465-dblink-TCP-connection-hangs-blocking-translation-from-being-terminated-td2132419.html#a2132420
>
> That does not sound like a Postgres bug to me. What you are unhappy about
> is that the kernel isn't timing out a lost TCP connection more quickly.
> The default timeout is long (>1 hour probably), but that's required by
> Internet standards. The appropriate fix for this is to use aggressive
> keepalive parameters on the connection. You can set libpq's keepalive
> parameters in the connection string given to dblink.
>
> regards, tom lane

I seem to have missed those parameters... and the fact that you actually need keep-alive on both client AND server, not only on the server.

Thank you!

-- 
Robert Voinea
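For the server half of that observation, PostgreSQL exposes the settings tcp_keepalives_idle, tcp_keepalives_interval, and tcp_keepalives_count. The Python sketch below renders them as postgresql.conf lines and estimates how long a dead peer may go unnoticed. The values and function names are illustrative assumptions, and the estimate (idle time plus interval times probe count) is a rough approximation of typical TCP keepalive behaviour, not a guarantee.

```python
def server_keepalive_settings(idle=30, interval=10, count=3):
    """Render server-side keepalive settings as postgresql.conf lines.

    The defaults are illustrative only.
    """
    return "\n".join([
        f"tcp_keepalives_idle = {idle}",          # seconds idle before the first probe
        f"tcp_keepalives_interval = {interval}",  # seconds between probes
        f"tcp_keepalives_count = {count}",        # lost probes before giving up
    ])


def approx_detection_seconds(idle, interval, count):
    """Rough upper bound on the time needed to notice a dead peer."""
    return idle + interval * count


if __name__ == "__main__":
    print(server_keepalive_settings())
    print("approximate detection time (s):", approx_detection_seconds(30, 10, 3))
```

With keepalives configured on both ends, the local backend running dblink and the remote backend serving it should each notice the broken link and release their locks, instead of waiting on the kernel's default timeout.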