Re: connection timeout with psycopg2 - Mailing list pgsql-general

From Adrian Klaver
Subject Re: connection timeout with psycopg2
Date
Msg-id 98404060-71d7-f4c6-9059-9d7dd4800a26@aklaver.com
Whole thread Raw
In response to connection timeout with psycopg2  (Vicente Juan Tomas Monserrat <vicens.tomas@bluekiri.com>)
List pgsql-general
On 10/16/19 2:29 AM, Vicente Juan Tomas Monserrat wrote:
> Hi there,
> 
> I have been testing out the following architecture for PostgreSQL HA.
> 
> |+---------+ +-----+ VIP +----+ | +---------+ | | | +------v-------+ 
> +------v-------+ | pgBouncer | | pgBouncer | | + | | + | | keepalived | 
> | keepalived | +------+-------+ +------+-------+ | | | | | | 
> +------v-------+ +------v-------+ | | | | | HAProxy | | HAProxy | | | | 
> | +------+-------+ +------+-------+ | | +--------------------+ | | | | 
> +----v----+ +----v----+ | | | | | | | | | PG01 | | PG02 | | | | | 
> |(patroni)| |(patroni)| | | | | +---------+ +---------+ |
> 
> I'm using this python script for checking the failover events in 
> pgBouncer, HAProxy and Patroni (PostgreSQL HA solution).
> 
> |#! /usr/bin/env python # -*- coding: utf-8 -*- # vim:fenc=utf-8 import 
> psycopg2 ISOLEVEL = psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT 
> import time from datetime import datetime user = 'postgres' password = 
> 'secretpassword' host = 'localhost' port = '6432' database = 'test' 
> LIMIT_RETRIES = 10 class DB(): def __init__(self, user, password, host, 
> port, database, reconnect): self.user = user self.password = password 
> self.host = host self.port = port self.database = database 
> self._connection = None self._cursor = None self.reconnect = reconnect 
> self.init() def connect(self,retry_counter=0): if not self._connection: 
> try: self._connection = psycopg2.connect(user = self.user, password = 
> self.password, host = self.host, port = self.port, database = 
> self.database, connect_timeout = 3) retry_counter = 0 
> self._connection.autocommit = True return self._connection except 
> psycopg2.OperationalError as error: if not self.reconnect or 
> retry_counter >= LIMIT_RETRIES: raise error else: retry_counter += 1 
> print("got error {}. reconnecting {}".format(str(error).strip(), 
> retry_counter)) time.sleep(5) self.connect(retry_counter) except 
> (Exception, psycopg2.Error) as error: raise error def cursor(self): if 
> not self._cursor or self._cursor.closed: if not self._connection: 
> self.connect() self._cursor = self._connection.cursor() return 
> self._cursor def execute(self, query, retry_counter=0): try: 
> self._cursor.execute(query) retry_counter = 0 except 
> (psycopg2.DatabaseError, psycopg2.OperationalError) as error: if 
> retry_counter >= LIMIT_RETRIES: raise error else: retry_counter += 1 
> print("got error {}. retrying {}".format(str(error).strip(), 
> retry_counter)) time.sleep(1) self.reset() self.execute(query, 
> retry_counter) except (Exception, psycopg2.Error) as error: raise error 
> def reset(self): self.close() self.connect() self.cursor() def 
> close(self): if self._connection: if self._cursor: self._cursor.close() 
> self._connection.close() print("PostgreSQL connection is closed") 
> self._connection = None self._cursor = None def init(self): 
> self.connect() self.cursor() db = DB(user=user, password=password, 
> host=host, port=port, database=database, reconnect=True) 
> db.execute("create table if not exists t1 (id integer);") i = 0 while 
> True: db.execute("insert into t1(id) values(1);") if i % 100 == 0: 
> print("%s: %d" % (datetime.now(), i)) i = i+1 |
> 
> When running this python script against the pgBouncer VIP it keeps 
> inserting data into the database. Then I stop one of the HAProxy 
> service (where the VIP lives) the connection it hangs and never goes on. 
> The VIP is on the other node but the client/app it doesn't notice and it 
> keeps waiting for 5 minutes and finally continues. I've been looking for 
> some default value of 5min with no luck.

Observations and comments:
1) I would point out there is a Psycopg list:

https://www.postgresql.org/list/psycopg/

2) I am not sure where the 5 minutes comes in. I see LIMIT_RETRIES = 10 
and a sleep of 5 sec between retries.

3) Where did this:

"got error server conn crashed?  "

come from? I don't see that in the code.




> 
> |$ python insert.py 2019-10-15 10:01:51.817585: 0 2019-10-15 
> 10:01:51.901091: 100 2019-10-15 10:01:52.031583: 200 2019-10-15 
> 10:01:52.126565: 300 2019-10-15 10:01:52.216502: 400 2019-10-15 
> 10:01:52.307157: 500 2019-10-15 10:01:52.400867: 600 2019-10-15 
> 10:01:52.497239: 700 2019-10-15 10:01:52.655689: 800 2019-10-15 
> 10:01:52.777883: 900 got error server conn crashed? <<<<<<<<<<<<<<<<< 
> HAProxy stopped manually to force the VIP to move to the other node 
> server closed the connection unexpectedly This probably means the server 
> terminated abnormally before or while processing the request.. retrying 
> 1 PostgreSQL connection is closed ^C^C^C^C^C <<<<<<<<<<<<<<<<<< The 
> connection gets stuck (kill PID) |
> 
> I've tried exactly the same code logic in Java (using PostgreSQL JDBC) 
> and dotnet core (using Npgsql) works fine with specifying this 
> parameters socketTimeout (Java) and 'Command Timeout' (dotnet) respectively.
> 
> |$ dotnet run connection initialized 2019-10-15T08:27:28.843 0 
> 2019-10-15T08:27:30.205 100 2019-10-15T08:27:31.566 200 got error: 
> Exception while reading from stream. Retrying 1 connection closed 
> connection initialized connection reset 2019-10-15T08:27:42.076 300 
> <<<<<<<<<<<< the app notices something wrong with the existing 
> connection and does a reset 2019-10-15T08:27:43.461 400 
> 2019-10-15T08:27:44.843 500 2019-10-15T08:27:46.244 600 
> 2019-10-15T08:27:47.637 700 2019-10-15T08:27:49.031 800 ^C |
> 
> In python and using psycopg2 (v2.8.3) I've not been able to inform in 
> any way the application to reset the connection, retry and keep going on.
> 
> As the psycopg2 it's a wrapper for libpq, I've seen that in libpq (for 
> postgresql 12) there is a new option called |tcp_user_timeout|. Note 
> this options doesn't exist in previous versions of libpq.
> So I've tried to install psycopg2 (instead of psycopg2-binary) with the 
> libpq (v12) installed, so the psycopg2 is build against libpq (v12). 
> Even so, it's not working as expected.
> 
> Environment:
> Ubuntu 16.04 LTS
> PostgreSQL 9.6.15
> Patroni 1.6.0
> pgBouncer 1.11.0
> keepalived 2.0.18
> HAProxy 1.6.3
> 
> I've also tried to catch different types of exceptions with no luck.
> 
> I would appreciate any guidance on this matter. I can give you more info 
> if needed.
> 
> Thank!
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



pgsql-general by date:

Previous
From: Lizeth Solis Aramayo
Date:
Subject: RE: CVE-2018-1058
Next
From: Matthias Apitz
Date:
Subject: Re: DBD::Pg exorts char columns with trailing blanks