Thread: Blocking connection and timeout problem
Hi, I managed to reproduce a locking problem, which is about blocking transactions and dead connections. How can this be avoided? What I did was: 1. Start a transaction on Client A 2. Execute a fcn doing an UPDATE on a certain record on client A 3. Execute a fcn doing an UPDATE on the same record on client B / as expected SQL on client B stalls untl client A's transaction ends/ 4. disconnect client A's network cable 5. kill the psql on client A / C.A's transaction still runs since two hours, C.B's SQL lasts for equal that time/ 6. wait on client B's statement to finish... 7. kill -TERM client A's postmaster / client B's statement succeeds Why is that? Why isn't client A's transaction detected as dead ? Yes, because the socked wasn't closed due to a network error. But this can happen all the time. What can I do?
=?iso-8859-1?Q?K=D6PFERL_Robert?= <robert.koepferl@sonorys.at> writes: > Why is that? Why isn't client A's transaction detected as dead ? It will be eventually, when the TCP connection times out. The standard timeout is generally an hour or two :-( regards, tom lane
|-----Original Message----- |From: Tom Lane [mailto:tgl@sss.pgh.pa.us] |Sent: Mittwoch, 03. August 2005 16:38 |To: KÖPFERL Robert |Cc: pgsql-admin@postgresql.org |Subject: Re: [ADMIN] Blocking connection and timeout problem | | |=?iso-8859-1?Q?K=D6PFERL_Robert?= <robert.koepferl@sonorys.at> writes: |> Why is that? Why isn't client A's transaction detected as dead ? | |It will be eventually, when the TCP connection times out. The standard |timeout is generally an hour or two :-( Seems not like so. I waited for two hours and it still existed. But non-the-less. Can this timeout be configured to be less? | | regards, tom lane |
Used to have this problem also with pgadmin connections, that didn't close
some sessions remain there, for a period, and you must kill them to continue
without a restart of the service.
scenery:
pgsql >= 7.4.3 , not tested on >=8
pgadmin <= 1.0.2
On other layer of this issue, this should be also related with some in deep tcp/ip settings.
ie, I know that in order to keep an <idle> ssh session open for a longer time, you shall touch some values somewhere *not* in the main ssh_config file. (a global parameter for the OS) And this will keep your session open for a longer time.
If this is particular for every OS, that could explain a longer time on your case, and 1 or 2 hours in Tom's case.
Anyway, I would like to hear a little bit about this, cause it's a little bit difficult to find such cases,
we always end on the problem lacking, without a previous warn. (and that is bad, for a DBA at least).
Regards,
Guido
--
"Adopting the position that you are smarter than an automatic
optimization algorithm is generally a good way to achieve less
performance, not more" - Tom Lane.
some sessions remain there, for a period, and you must kill them to continue
without a restart of the service.
scenery:
pgsql >= 7.4.3 , not tested on >=8
pgadmin <= 1.0.2
On other layer of this issue, this should be also related with some in deep tcp/ip settings.
ie, I know that in order to keep an <idle> ssh session open for a longer time, you shall touch some values somewhere *not* in the main ssh_config file. (a global parameter for the OS) And this will keep your session open for a longer time.
If this is particular for every OS, that could explain a longer time on your case, and 1 or 2 hours in Tom's case.
Anyway, I would like to hear a little bit about this, cause it's a little bit difficult to find such cases,
we always end on the problem lacking, without a previous warn. (and that is bad, for a DBA at least).
Regards,
Guido
On 8/3/05, KÖPFERL Robert <robert.koepferl@sonorys.at> wrote:
|-----Original Message-----
|From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
|Sent: Mittwoch, 03. August 2005 16:38
|To: KÖPFERL Robert
|Cc: pgsql-admin@postgresql.org
|Subject: Re: [ADMIN] Blocking connection and timeout problem
|
|
|KÖPFERL Robert <robert.koepferl@sonorys.at > writes:
|> Why is that? Why isn't client A's transaction detected as dead ?
|
|It will be eventually, when the TCP connection times out. The standard
|timeout is generally an hour or two :-(
Seems not like so. I waited for two hours and it still existed.
But non-the-less. Can this timeout be configured to be less?
|
| regards, tom lane
|
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
--
"Adopting the position that you are smarter than an automatic
optimization algorithm is generally a good way to achieve less
performance, not more" - Tom Lane.
Assuming you are on a linux box (Not sure if it's stated in your mail, sorry if I missed that) you could set /proc/sys/net/ipv4/tcp_keepalive_time to a lower value. That should do the trick. Hope this helps Greetings Dario On Wed, 2005-08-03 at 16:48 +0200, KÖPFERL Robert wrote: > > > > |-----Original Message----- > |From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > |Sent: Mittwoch, 03. August 2005 16:38 > |To: KÖPFERL Robert > |Cc: pgsql-admin@postgresql.org > |Subject: Re: [ADMIN] Blocking connection and timeout problem > | > | > |=?iso-8859-1?Q?K=D6PFERL_Robert?= <robert.koepferl@sonorys.at> writes: > |> Why is that? Why isn't client A's transaction detected as dead ? > | > |It will be eventually, when the TCP connection times out. The standard > |timeout is generally an hour or two :-( > > Seems not like so. I waited for two hours and it still existed. > > But non-the-less. Can this timeout be configured to be less? > > | > | regards, tom lane > | > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Assuming you are on a linux box (Not sure if it's stated in your mail, sorry if I miss that) you could set /proc/sys/net/ipv4/tcp_keepalive_time to a lower value. That should do the trick :-) Hope it helps. Greetings Dario. On Wed, 2005-08-03 at 16:48 +0200, KÖPFERL Robert wrote: > > > > |-----Original Message----- > |From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > |Sent: Mittwoch, 03. August 2005 16:38 > |To: KÖPFERL Robert > |Cc: pgsql-admin@postgresql.org > |Subject: Re: [ADMIN] Blocking connection and timeout problem > | > | > |=?iso-8859-1?Q?K=D6PFERL_Robert?= <robert.koepferl@sonorys.at> writes: > |> Why is that? Why isn't client A's transaction detected as dead ? > | > |It will be eventually, when the TCP connection times out. The standard > |timeout is generally an hour or two :-( > > Seems not like so. I waited for two hours and it still existed. > > But non-the-less. Can this timeout be configured to be less? > > | > | regards, tom lane > | > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > > -- Atte: Dario Brignardello Planificacion y Desarrollo Tecnologia UOL Argentina S.A Florida 537 Piso 6, Buenos Aires, Argentina +54-11-4321-9110 ext 2533 PGP public key: http://webs.uolsinectis.com.ar/dbrignar/pgp.html
Attachment
Erm, no, it is a Solaris box. And as I could experience, there are rather long timeouts in solaris' tcp implementation. Thanks anyway. What about Postgres. Wouldn't it be better to have postgres sending keep alive or heart beat packages during that time? |-----Original Message----- |From: Dario Brignardello [mailto:dbrignar@uolsinectis.com] |Sent: Mittwoch, 03. August 2005 19:33 |To: KÖPFERL Robert |Cc: pgsql-admin@postgresql.org |Subject: Re: [ADMIN] Blocking connection and timeout problem | | |Assuming you are on a linux box (Not sure if it's stated in your mail, |sorry if I miss that) you could set | |/proc/sys/net/ipv4/tcp_keepalive_time | |to a lower value. That should do the trick :-) | | |Hope it helps. | |Greetings |Dario. | | | |On Wed, 2005-08-03 at 16:48 +0200, KÖPFERL Robert wrote: |> |> |> |> |-----Original Message----- |> |From: Tom Lane [mailto:tgl@sss.pgh.pa.us] |> |Sent: Mittwoch, 03. August 2005 16:38 |> |To: KÖPFERL Robert |> |Cc: pgsql-admin@postgresql.org |> |Subject: Re: [ADMIN] Blocking connection and timeout problem |> | |> | |> |=?iso-8859-1?Q?K=D6PFERL_Robert?= |<robert.koepferl@sonorys.at> writes: |> |> Why is that? Why isn't client A's transaction detected as dead ? |> | |> |It will be eventually, when the TCP connection times out. |The standard |> |timeout is generally an hour or two :-( |> |> Seems not like so. I waited for two hours and it still existed. |> |> But non-the-less. Can this timeout be configured to be less? |> |> | |> | regards, tom lane |> | |> |> ---------------------------(end of |broadcast)--------------------------- |> TIP 6: explain analyze is your friend |> |> |-- |Atte: |Dario Brignardello |Planificacion y Desarrollo |Tecnologia |UOL Argentina S.A | |Florida 537 Piso 6, Buenos Aires, Argentina |+54-11-4321-9110 ext 2533 |PGP public key: http://webs.uolsinectis.com.ar/dbrignar/pgp.html |
Taking the risk of running offtopic: That value is defined in Solaris as well. Use: ndd -get /dev/tcp tcp_keepalive_interval to retrieve it, and /usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 3600000 to set it to one hour. Check http://sunsolve.sun.com/search/document.do?assetkey=1-30-2876-1 for more information. Regarding the possibility of heartbeats in postgres ... have no idea how wise would it be ... there's people more qualified here that could answer that ;-D Hope this helps, anyway. Regards. Dario On Fri, 2005-08-05 at 09:43 +0200, KÖPFERL Robert wrote: > > Erm, no, it is a Solaris box. > And as I could experience, there are rather long timeouts in solaris' tcp > implementation. > > Thanks anyway. > > What about Postgres. Wouldn't it be better to have postgres sending keep > alive or heart beat packages during that time? > > |-----Original Message----- > |From: Dario Brignardello [mailto:dbrignar@uolsinectis.com] > |Sent: Mittwoch, 03. August 2005 19:33 > |To: KÖPFERL Robert > |Cc: pgsql-admin@postgresql.org > |Subject: Re: [ADMIN] Blocking connection and timeout problem > | > | > |Assuming you are on a linux box (Not sure if it's stated in your mail, > |sorry if I miss that) you could set > | > |/proc/sys/net/ipv4/tcp_keepalive_time > | > |to a lower value. That should do the trick :-) > | > | > |Hope it helps. > | > |Greetings > |Dario. > | > | > | > |On Wed, 2005-08-03 at 16:48 +0200, KÖPFERL Robert wrote: > |> > |> > |> > |> |-----Original Message----- > |> |From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > |> |Sent: Mittwoch, 03. August 2005 16:38 > |> |To: KÖPFERL Robert > |> |Cc: pgsql-admin@postgresql.org > |> |Subject: Re: [ADMIN] Blocking connection and timeout problem > |> | > |> | > |> |=?iso-8859-1?Q?K=D6PFERL_Robert?= > |<robert.koepferl@sonorys.at> writes: > |> |> Why is that? Why isn't client A's transaction detected as dead ? > |> | > |> |It will be eventually, when the TCP connection times out. > |The standard > |> |timeout is generally an hour or two :-( > |> > |> Seems not like so. I waited for two hours and it still existed. > |> > |> But non-the-less. Can this timeout be configured to be less? > |> > |> | > |> | regards, tom lane > |> | > |> > |> ---------------------------(end of > |broadcast)--------------------------- > |> TIP 6: explain analyze is your friend > |> > |> > |-- > |Atte: > |Dario Brignardello > |Planificacion y Desarrollo > |Tecnologia > |UOL Argentina S.A > | > |Florida 537 Piso 6, Buenos Aires, Argentina > |+54-11-4321-9110 ext 2533 > |PGP public key: http://webs.uolsinectis.com.ar/dbrignar/pgp.html > | > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >