Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds
Date
Msg-id 20240209004005.xu3lxvackgccpglg@awork3.anarazel.de
In response to Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi,

On 2024-02-08 19:20:35 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I might be missing something here, but leaving the concrete crash aside, why
> > is it ok for pgfdw_get_cleanup_result() etc to block during abort processing?
>
> It's not pretty, for sure.  I thought briefly about postponing the
> cleanup until we next try to use the connection, but I fear the
> semantic side-effects of that would be catastrophic.  We can't leave
> the remote's query sitting open long after the local transaction has
> been canceled --- that risks undetected deadlocks, at the least.
> I think all we can do is try to reduce the risk of failure during
> transaction cleanup.

I agree that we can't just delay cleanup till, potentially, much later, but I
don't think that means that we have to wait 30s for each connection,
one-by-one.


One way we could fix the issue at hand would be to make postgres_fdw reserve
one FD, for all connections, and release it before the WaitLatchOrSocket() and
reacquire it after. That way we can make sure that there's an FD available.
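
To illustrate the reserve-one-FD idea, here is a minimal standalone sketch using plain POSIX calls rather than PostgreSQL internals. All names in it (reserved_fd, reserve_fd, release_reserved_fd, wait_readable_with_reserved_fd) are hypothetical, poll() stands in for WaitLatchOrSocket(), and the dup() call only mimics the descriptor that an epoll- or kqueue-based wait would allocate internally. It is not the postgres_fdw implementation.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical placeholder descriptor, held whenever no wait is in progress. */
static int reserved_fd = -1;

/* Acquire the placeholder so that one descriptor slot stays set aside. */
static int
reserve_fd(void)
{
    if (reserved_fd < 0)
        reserved_fd = open("/dev/null", O_RDONLY);
    return reserved_fd >= 0 ? 0 : -1;
}

/* Give the slot back so the wait can allocate its own descriptor. */
static void
release_reserved_fd(void)
{
    if (reserved_fd >= 0)
    {
        close(reserved_fd);
        reserved_fd = -1;
    }
}

/*
 * Wait for sock to become readable, releasing the placeholder before the
 * wait and reacquiring it afterwards.  Returns poll()'s result (1 if ready,
 * 0 on timeout), or -1 on failure.
 */
static int
wait_readable_with_reserved_fd(int sock, int timeout_ms)
{
    struct pollfd pfd;
    int         scratch;
    int         rc;

    assert(sock >= 0);

    release_reserved_fd();

    /* Assumption: mimics the fd an epoll/kqueue-based wait would create. */
    scratch = dup(sock);
    if (scratch < 0)
    {
        (void) reserve_fd();
        return -1;
    }

    pfd.fd = sock;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);

    close(scratch);

    /* Take the slot back so the next wait is covered as well. */
    if (reserve_fd() < 0)
        return -1;
    return rc;
}
```

The point of the pattern is that the slot is only ever empty for the duration of the wait itself, so the wait cannot fail merely because other code has used up the remaining descriptors in the meantime.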

OTOH, as waiting for connections one-by-one isn't good, perhaps we should just
rewrite the code to create one WES for all connections and wait in parallel,
processing cancels/aborts as they complete.  That'd make the abort
less slow and it'd make the reserve-one-fd-for-postgres-fdw approach a bit
less ugly.  But unfortunately that's a bit big for a bugfix...

Greetings,

Andres Freund
