Home > mailing lists

Re: Use-after-free issue in postgres_fdw - Mailing list pgsql-hackers

From	Chao Li
Subject	Re: Use-after-free issue in postgres_fdw
Date	March 23 05:48:25
Msg-id	CDA8E688-A4F6-415C-9165-D899480EBE95@gmail.com Whole thread Raw
In response to	Re: Use-after-free issue in postgres_fdw (Etsuro Fujita <etsuro.fujita@gmail.com>)
List	pgsql-hackers

Tree view


> On Mar 21, 2026, at 19:40, Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>
> Hi Chao,
>
> On Fri, Mar 20, 2026 at 5:27 PM Chao Li <li.evan.chao@gmail.com> wrote:
>> I can reproduce the server crash following your procedure, and I traced the problem.
>>
>> The issue is that, during select * from ft1, pgfdw_reject_incomplete_xact_state_change() calls
disconnect_pg_server(),which destroys conn and sets ConnCacheEntry->conn = NULL, but does not update
PgFdwScanState->conn.As a result, when "close c1" is executed later, PgFdwScanState->conn points to stale memory with
randomcontents. 
>
> That is right.
>
>> I am not sure we should still allow further commands to run after select * from ft1, given that it has already
raised:"ERROR:  connection to server "loopback" was lost”. Maybe we should not keep going as if the connection were
stillthere. 
>
> The ERROR is thrown within a subtransaction, not within the main
> transaction, so we *should* allow commands (that don't use the
> connection) to run *after* rolling back to the subtransaction.
>
> (A transaction, like the example one, whose subtransactions failed
> abort cleanup due to a connection issue needs to abort in the end, as
> it can't commit the remote transaction anymore.  Considering this
> limitation, the above behavior might be surprising to users.  We might
> need to do something about this, but I think that that is another
> issue and thus should be discussed separately.)

Thanks for the explanation. Makes sense to me.

>
>> I am not very familiar with the FDW code, so I am not ready to suggest a concrete fix. But it seems wrong to let
laterpaths keep using PgFdwScanState->conn after select * from ft1 has already failed with connection loss. 
>
> Why?  While we can't use the connection any further, with the patch we
> can still use the cashed PGconn, as pointed out by Matheus downthread.
>
>> My guess is that we either need to invalidate all dependent state when disconnect_pg_server() runs, or otherwise
preventlater cleanup paths from touching the cached PGconn *. 
>
> I don't think that that is a good idea, because that 1) would require
> invasive changes to the existing code, which isn't good especially for
> back-patching, as also mentioned by Matheus downthread, and 2) would
> result in something that is essentially the same as what I proposed;
> in other words, that would be a lot of effort with little result.
>

Makes sense. Then the patch looks good to me.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

pgsql-hackers by date:

From: Chao Li
Date: 23 March, 05:22:53
Subject: Re: Check some unchecked fclose() results

From: Tom Lane
Date: 23 March, 06:02:20
Subject: Re: pg_waldump: support decoding of WAL inside tarfile

Re: Use-after-free issue in postgres_fdw - Mailing list pgsql-hackers

Previous

Next