Thread: psycopg3: a first report

psycopg3: a first report

From
Daniele Varrazzo
Date:
Hello,

I have started working on psycopg3. Database communication is up and
running, sync and asyncio, and now I'm busy with the definition of the
adaptation layer.

You can read more at
https://www.varrazzo.com/blog/2020/03/26/psycopg3-first-report/

Comments are welcome. Cheers!

-- Daniele



Re: psycopg3: a first report

From
Stefan Knecht
Date:
Ciao Daniele

Glad to see progress being made! Well written blog!

If I may, one thing that has been a constant troublemaker for us - running primarily on AWS - is timeouts. I would love to see a fundamental implementation of timeouts at the connection level, regardless of activity - e.g. in the middle of a query, between queries, etc.. If the connection hangs (because the server disappeared or failed over) or does not respond (particularly in the weird cases like when the destination IP just vanishes and the tcp timeout kicks in) - I'd like to know about it within a reasonable time and be able to react to that.

Stefan



On Fri, Mar 27, 2020 at 5:46 PM Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
Hello,

I have started working on psycopg3. Database communication is up and
running, sync and asyncio, and now I'm busy with the definition of the
adaptation layer.

You can read more at
https://www.varrazzo.com/blog/2020/03/26/psycopg3-first-report/

Comments are welcome. Cheers!

-- Daniele




--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net | @zztat_oracle | fb.me/zztat | zztat.net/blog/

Re: psycopg3: a first report

From
Rory Campbell-Lange
Date:
On 30/03/20, Stefan Knecht (knecht.stefan@gmail.com) wrote:
> If I may, one thing that has been a constant troublemaker for us - running
> primarily on AWS - is timeouts. 

If the issue is disappearing hosts or general connectivity problems,
isn't that problem best dealt with by pg_bouncer?

https://www.pgbouncer.org/config.html#connection-sanity-checks-timeouts

Specifically you may wish to refer to server_connect_timeout.

Rory




Re: psycopg3: a first report

From
Stefan Knecht
Date:
Rory, this is about established connections, not new connections - psycopg2 already offers a connection timeout, but that is a different thing. I don't want to drift too far off topic - but we are already using pgbouncer, and the problem isn't detected by it, either. I'm not a developer but I believe the problem is the generic nature of some blocking socket calls, which may hang under some odd circumstances, and they remain hanging until some odd ssl timeout is reached (15 minutes+ which is a very long time for any application to be hanging in limbo, but more so for our own monitoring tools which are written in Python).

Having the client provide general handling for in-flight timeouts would be the overall best solution imho.

On Mon, Mar 30, 2020 at 2:48 PM Rory Campbell-Lange <rory@campbell-lange.net> wrote:
On 30/03/20, Stefan Knecht (knecht.stefan@gmail.com) wrote:
> If I may, one thing that has been a constant troublemaker for us - running
> primarily on AWS - is timeouts.

If the issue is disappearing hosts or general connectivity problems,
isn't that problem best dealt with by pg_bouncer?

https://www.pgbouncer.org/config.html#connection-sanity-checks-timeouts

Specifically you may wish to refer to server_connect_timeout.

Rory



--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net | @zztat_oracle | fb.me/zztat | zztat.net/blog/

Re: psycopg3: a first report

From
Daniele Varrazzo
Date:
Hi Stefan,

On Mon, 30 Mar 2020, 17:27 Stefan Knecht, <knecht.stefan@gmail.com> wrote:

If I may, one thing that has been a constant troublemaker for us - running primarily on AWS - is timeouts. I would love to see a fundamental implementation of timeouts at the connection level, regardless of activity - e.g. in the middle of a query, between queries, etc.. If the connection hangs (because the server disappeared or failed over) or does not respond (particularly in the weird cases like when the destination IP just vanishes and the tcp timeout kicks in) - I'd like to know about it within a reasonable time and be able to react to that.

It would be interesting to set up integration tests wirh connections being disrupted in various ways and at different moments during connection and query processes. I'd rather do them in a test framework rather than one-off.

Mitmproxy could be a nice tool to build them.


-- Daniele

Re: psycopg3: a first report

From
Daniele Varrazzo
Date:
Having the client provide general handling for in-flight timeouts would be the overall best solution imho. 

I agree on this: the client should be aware of timeouts, both on connection and querying.

If we keep the waiting loops in Python and only use non-blocking C functions I think making sure every operation has a timeout is achievable.

-- Daniele 

Re: psycopg3: a first report

From
Rory Campbell-Lange
Date:
On 30/03/20, Stefan Knecht (knecht.stefan@gmail.com) wrote:
> Rory, this is about established connections, not new connections - psycopg2
> already offers a connection timeout, but that is a different thing. I don't
> want to drift too far off topic - but we are already using pgbouncer, and
> the problem isn't detected by it, either. I'm not a developer but I believe
> the problem is the generic nature of some blocking socket calls, which may
> hang under some odd circumstances, and they remain hanging until some odd
> ssl timeout is reached (15 minutes+ which is a very long time for any
> application to be hanging in limbo, but more so for our own monitoring
> tools which are written in Python).

My apologies for the misunderstanding.

This sounds like the hanging tcp/ip problem one can get with
disappearing web server clients. We've had that problem with mobiles
dropping out presumably because they go out of range, and the provider
not dropping the connection because they might come back.

We deal with this in apache by having a fairly aggressive
KeepAliveTimeout parameter.

Having this problem in one's own stack sounds scary.

I'm very intrigued by Daniele's suggestion that having a per-operation
timeout on the client is achievable.

Rory





Re: psycopg3: a first report

From
Daniele Varrazzo
Date:
On Mon, 30 Mar 2020 at 17:27, Stefan Knecht <knecht.stefan@gmail.com> wrote:

> I would love to see a fundamental implementation of timeouts at the connection level, regardless of activity

By the way, did you try playing with the libpq socket options, such as
tcp_user_timeout?


-- Daniele