Re: libpq: Which functions may hang due to network issues? - Mailing list pgsql-general

From Daniel Frey
Subject Re: libpq: Which functions may hang due to network issues?
Msg-id AAD962D1-48E1-4CA9-9AFD-E6288EFB4A4A@gmx.de
In response to Re: libpq: Which functions may hang due to network issues?  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: libpq: Which functions may hang due to network issues?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
> On 4. Dec 2021, at 22:43, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Fri, 2021-12-03 at 21:33 +0100, Daniel Frey wrote:
>> But the real issue, at least for me, is PQfinish(). Considering that my application is not
>> allowed to hang (or crash, leak, ...), what should I do in case of a timeout?
>
> I am tempted to say that you shouldn't use TCP with the requirement that it should not hang.

We actually use UDP in a lot of places, specifically Radius. But the DB connection is supposed to be TCP, no?

>> I have existing
>> connections and at some point the network connections stop working (e.g. due to a firewall
>> issue/reboot), etc. If I don't want a resource leak, I *must* call PQfinish(), correct?
>> But I have no idea whether it might hang. If you don't want to guarantee that PQfinish()
>> will not hang, then please advise how to use libpq properly in this situation. Is there
>> some asynchronous version of PQfinish()? Or should I handle hanging connections differently?
>
> You could start a separate process that has your PostgreSQL connection and kill it if it
> times out.  But then you'd have a similar problem communicating with that process.

Shifting the problem somewhere else (and adding even more complexity to the system) doesn't solve it.

> A normal thing to do when your database call times out or misbehaves in other ways is
> to give up, report an error and die (after some retries perhaps).

Our software is expected to run 24/7 without dying just because some other system has a (temporary) outage. When
database connections die, we issue an alarm and regularly check whether we can open new ones in a rate-limited manner,
so we don't flood the network and the DB with connection requests. We then clear the alarm once DB connectivity comes
up again. Our software includes fallback logic to minimize customer impact while DB connectivity is down or when
another system is temporarily unavailable; this is a defined and controlled scenario. If we were to simply crash, what
would the next system up the chain do? See that we are not responding, so it would also crash? (BTW, I'm working for a
big telco company in Germany, just to give some idea/perspective of what kind of systems we are talking about.)
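
To make the alarm-and-retry behaviour described above concrete, here is a minimal sketch in C. It is not our production code. The struct, the function names, the 1 s / 60 s limits and the doubling of the delay are all assumptions made for illustration; the only requirement stated above is that reconnect attempts are rate limited and that the alarm is cleared once connectivity returns.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for one database link; none of these names come from libpq. */
typedef struct {
    bool   alarm_raised;  /* true while DB connectivity is considered down */
    time_t next_attempt;  /* earliest time another connect may be tried */
    int    delay_s;       /* current wait between attempts, in seconds */
} reconnect_state;

/* Assumed limits, chosen only for this example. */
enum { MIN_DELAY_S = 1, MAX_DELAY_S = 60 };

void reconnect_init(reconnect_state *s)
{
    s->alarm_raised = false;
    s->next_attempt = 0;
    s->delay_s = MIN_DELAY_S;
}

/* True if a new connection attempt is allowed at time `now`. */
bool reconnect_may_try(const reconnect_state *s, time_t now)
{
    return now >= s->next_attempt;
}

/* Record a failed attempt: raise the alarm and push the next attempt out.
   Doubling the delay is an assumption; a fixed interval would also be rate limiting. */
void reconnect_on_failure(reconnect_state *s, time_t now)
{
    s->alarm_raised = true;
    s->next_attempt = now + s->delay_s;
    s->delay_s *= 2;
    if (s->delay_s > MAX_DELAY_S)
        s->delay_s = MAX_DELAY_S;
}

/* Record a successful connect: clear the alarm and reset the delay. */
void reconnect_on_success(reconnect_state *s)
{
    s->alarm_raised = false;
    s->next_attempt = 0;
    s->delay_s = MIN_DELAY_S;
}
```

The caller would check reconnect_may_try() from its regular housekeeping loop and only then start a connection attempt, so a long outage never turns into a flood of connection requests.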

With all that said, I think that PostgreSQL/libpq should have a clear, documented way to get rid of a connection that
is guaranteed to not hang. It has something similar for almost all other methods like opening connections, sending
requests, retrieving results. Why stop there?
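
For context, the non-blocking calls referred to here (PQconnectStart()/PQconnectPoll(), PQsendQuery(), PQconsumeInput()/PQisBusy()/PQgetResult()) are normally combined with a bounded wait on the descriptor returned by PQsocket(). The sketch below shows only that bounded wait, on a plain file descriptor, so it does not depend on libpq. The helper name wait_readable is hypothetical, and nothing here says anything about how PQfinish() behaves.

```c
#include <errno.h>
#include <poll.h>

/* Wait until fd becomes readable or timeout_ms milliseconds pass.
   Returns 1 if there is something to read (data, EOF or a pending error),
   0 on timeout, -1 if poll() fails or the descriptor is invalid.
   Simplification: after EINTR the full timeout starts over. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };

    for (;;) {
        int rc = poll(&p, 1, timeout_ms);
        if (rc > 0)
            return (p.revents & POLLNVAL) ? -1 : 1;
        if (rc == 0)
            return 0;               /* timed out: the caller decides what to do */
        if (errno != EINTR)
            return -1;              /* real error */
        /* interrupted by a signal: try again */
    }
}
```

With libpq, the caller would pass PQsocket(conn) as fd and, on a return value of 0, treat the connection as timed out instead of blocking in a synchronous call.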
