Re: psql connection reset failed - Mailing list pgsql-hackers

From amul sul
Subject Re: psql connection reset failed
Msg-id 1368409916.74959.YahooMailNeo@web193501.mail.sg3.yahoo.com
In response to psql connection reset failed  (amul sul <sul_amul@yahoo.co.in>)
List pgsql-hackers
Hi,

I tested this issue with 9.3beta1, and the same thing happens there too.

With the changes made in the patch above, it works as expected.

I am not sure whether this fix is required.

If it is, what should the minimum wait time be? I set 5000 microseconds for testing; is that fine or too big?
Is there some other way to provide maxWaitTime and minWaitTime, for example through configuration parameters?
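
As one possible answer to the configuration question, the wait bounds could be read from environment variables with a compiled-in fallback. The following is only a sketch: the variable names PSQL_RESET_MIN_WAIT_US and PSQL_RESET_MAX_WAIT_US are hypothetical and do not exist in psql, and parse_wait_us() and wait_from_env() are illustrative helpers, not part of the attached patch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a non-negative wait time in microseconds.
 * Returns fallback when text is NULL, is not a plain decimal number,
 * or does not fit in an unsigned int.
 */
static unsigned int
parse_wait_us(const char *text, unsigned int fallback)
{
    char          *end;
    unsigned long  value;

    /* Reject NULL, empty strings, signs and leading whitespace up front. */
    if (text == NULL || !isdigit((unsigned char) text[0]))
        return fallback;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || *end != '\0' || value > UINT_MAX)
        return fallback;

    return (unsigned int) value;
}

/*
 * Look up a wait time in the environment.
 * The variable names passed in are hypothetical; psql defines no such
 * settings today.
 */
static unsigned int
wait_from_env(const char *name, unsigned int fallback)
{
    assert(name != NULL);
    return parse_wait_us(getenv(name), fallback);
}
```

A caller could then use wait_from_env("PSQL_RESET_MIN_WAIT_US", 5000) to keep the 5000 microsecond test value as the default while still allowing an override.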

Thanks and regards,
Amul Sul

From: amul sul <sul_amul@yahoo.co.in>
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Sent: Friday, 10 May 2013 5:14 PM
Subject: [HACKERS] psql connection reset failed

I have observed the following situation a few times now with 8.4.5.
 Multiple psql clients are connected to the server; some of them are running a transaction and some of them are idle.
 
 
 One of the backends is then killed or crashes (using kill -9 <backend-pid>).
 The connection reset attempt from the active clients (that is, those that were running a transaction when the crash happened) fails, because they make the attempt immediately, while the server is still in its startup phase.
 
 As you can see from the following:
 
 
 -----------------------
 ACTIVE CLIENT
 -----------------------
 [amul@localhost ~]$ psql -p 5432 postgres
 psql (8.4.5)
 Type "help" for help.
 
 postgres=# create table emp( id int,name varchar(20));
 CREATE TABLE
 postgres=# insert into emp values(generate_series(1,999999999),'XYZ');
 WARNING:  terminating connection because of crash of another server process
 DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
 HINT:  In a moment you should be able to reconnect to the database and repeat your command.
 server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
 The connection to the server was lost. Attempting reset: Failed.
 !
 
 -----------------------
 IDLE CLIENT
 -----------------------
 
 
 [amul@localhost ~]$ psql -p 5432 postgres
 psql (8.4.5)
 Type "help" for help.
 
 postgres=# select pg_backend_pid();
 server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
 The connection to the server was lost. Attempting reset: Succeeded.
 postgres=#
 
 
 I looked into this and found the following:
 
 1. When a backend crashes, the server goes into recovery mode, and it takes a little time before it returns to the normal state and accepts connections again.
 2. A busy client (one that was running a transaction before the crash) immediately tries to reconnect while the server is still in its startup phase, so it gets a negative reply and fails to reconnect.
 
 So I thought that, before sending the reconnect request, the client needs to wait for the server to reach a state in which it can accept connections. That wait should have some timeout.
 
 I am not sure whether this is the correct way to modify the code, or whether it has any other impact.
 I tried making the client wait before it sends the reconnect request to the server.
 For that, I added some sleep time for the client in src/bin/psql/common.c (that is, it changes things only for psql clients).
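
The wait-before-reconnect idea described above can be sketched roughly as follows. This is not the attached patch: reset_with_backoff(), sleep_us() and the fake_server stand-in are hypothetical names used only for illustration, and the doubling delay is an assumption (the post only mentions a minimum and maximum wait). In psql, the attempt callback would presumably wrap PQreset() and check that PQstatus() returns CONNECTION_OK.

```c
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true once the server accepts the connection. */
typedef bool (*reset_attempt_fn) (void *ctx);

/* Sleep for the given number of microseconds. */
static void
sleep_us(unsigned int us)
{
    struct timespec ts;

    ts.tv_sec = us / 1000000;
    ts.tv_nsec = (long) (us % 1000000) * 1000L;
    nanosleep(&ts, NULL);
}

/*
 * Try to reset the connection, sleeping between failed attempts.
 * The delay starts at min_wait_us and doubles after every failure;
 * the total time slept never exceeds max_wait_us.
 * Returns the number of attempts made on success, or 0 on failure.
 */
static int
reset_with_backoff(reset_attempt_fn attempt, void *ctx,
                   unsigned int min_wait_us, unsigned int max_wait_us)
{
    unsigned int delay = (min_wait_us > 0) ? min_wait_us : 1;
    unsigned int waited = 0;
    int          attempts = 0;

    assert(attempt != NULL);

    for (;;)
    {
        attempts++;
        if (attempt(ctx))
            return attempts;        /* server is accepting connections */

        if (waited >= max_wait_us)
            return 0;               /* timeout budget used up */

        /* Never sleep past the overall timeout. */
        if (delay > max_wait_us - waited)
            delay = max_wait_us - waited;

        sleep_us(delay);
        waited += delay;

        /* Double the delay, guarding against unsigned overflow. */
        delay = (delay <= max_wait_us / 2) ? delay * 2 : max_wait_us;
    }
}

/* Stand-in for a server that is still starting up: fails N times, then succeeds. */
struct fake_server
{
    int failures_left;
};

static bool
fake_attempt(void *ctx)
{
    struct fake_server *server = ctx;

    if (server->failures_left > 0)
    {
        server->failures_left--;
        return false;
    }
    return true;
}
```

With a fake_server that fails twice, reset_with_backoff(fake_attempt, &server, 1000, 50000) succeeds on the third attempt after sleeping 1000 and then 2000 microseconds. Bounding the total wait keeps a client from hanging indefinitely if the server never comes back.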
 
 Please check the attached patch for the modification.
 
 
 



