Re: Use-after-free issue in postgres_fdw - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Use-after-free issue in postgres_fdw
Date
Msg-id CAPmGK16-L3UhkrMGUhL1DO3tFRrFaZ_8QSHE62ToJWQU_Gu0UQ@mail.gmail.com
Whole thread Raw
In response to Re: Use-after-free issue in postgres_fdw  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: Use-after-free issue in postgres_fdw
List pgsql-hackers
Hi Chao,

On Fri, Mar 20, 2026 at 5:27 PM Chao Li <li.evan.chao@gmail.com> wrote:
> I can reproduce the server crash following your procedure, and I traced the problem.
>
> The issue is that, during select * from ft1, pgfdw_reject_incomplete_xact_state_change() calls
disconnect_pg_server(),which destroys conn and sets ConnCacheEntry->conn = NULL, but does not update
PgFdwScanState->conn.As a result, when "close c1" is executed later, PgFdwScanState->conn points to stale memory with
randomcontents. 

That is right.

> I am not sure we should still allow further commands to run after select * from ft1, given that it has already
raised:"ERROR:  connection to server "loopback" was lost”. Maybe we should not keep going as if the connection were
stillthere. 

The ERROR is thrown within a subtransaction, not within the main
transaction, so we *should* allow commands (that don't use the
connection) to run *after* rolling back to the subtransaction.

(A transaction, like the example one, whose subtransactions failed
abort cleanup due to a connection issue needs to abort in the end, as
it can't commit the remote transaction anymore.  Considering this
limitation, the above behavior might be surprising to users.  We might
need to do something about this, but I think that that is another
issue and thus should be discussed separately.)

> I am not very familiar with the FDW code, so I am not ready to suggest a concrete fix. But it seems wrong to let
laterpaths keep using PgFdwScanState->conn after select * from ft1 has already failed with connection loss. 

Why?  While we can't use the connection any further, with the patch we
can still use the cashed PGconn, as pointed out by Matheus downthread.

> My guess is that we either need to invalidate all dependent state when disconnect_pg_server() runs, or otherwise
preventlater cleanup paths from touching the cached PGconn *. 

I don't think that that is a good idea, because that 1) would require
invasive changes to the existing code, which isn't good especially for
back-patching, as also mentioned by Matheus downthread, and 2) would
result in something that is essentially the same as what I proposed;
in other words, that would be a lot of effort with little result.

Thanks for looking at this!

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [19] CREATE SUBSCRIPTION ... SERVER
Next
From: Etsuro Fujita
Date:
Subject: Re: Use-after-free issue in postgres_fdw