Thread: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Etsuro Fujita
Date:
postgres_fdw: Fix handling of pending asynchronous requests.

A pending asynchronous request is handled by process_pending_request(),
which previously not only processed an in-progress remote query but
performed ExecForeignScan() to produce a tuple to return to the local
server asynchronously from the result of the remote query.  But that led
to a server crash when executing a query or led to an "InstrStartNode
called twice in a row" or "InstrEndLoop called on running node" failure
when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it
contained multiple async-capable nodes accessing the same
initplan/subplan that contained multiple async-capable nodes scanning
the same foreign tables as for the parent async-capable nodes, as
reported by Andrey Lepikhov.  The reason is that the second step in
process_pending_request() invoked when executing the initplan/subplan
for one of the parent async-capable nodes caused recursive execution of
the initplan/subplan for another of the parent async-capable nodes.

To fix, split process_pending_request() into the two steps and postpone
the second step until ForeignAsyncConfigureWait() is called for each of
the pending asynchronous requests.  Also, in ExecAppendAsyncEventWait()
we assumed that FDWs would register at least one wait event in a
WaitEventSet created there when they were called from
ForeignAsyncConfigureWait() in that function, but allow FDWs to register
zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait()
to just return in that case.

Oversight in commit 27e1f1456.  Back-patch to v14 where that commit went
in.

Andrey Lepikhov and Etsuro Fujita

Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1ec7fca8592178281cd5cdada0f27a340fb813fc

Modified Files
--------------
contrib/postgres_fdw/expected/postgres_fdw.out | 55 +++++++++++++++++++++-
contrib/postgres_fdw/postgres_fdw.c            | 65 ++++++++++++++++++++------
contrib/postgres_fdw/sql/postgres_fdw.sql      | 13 ++++--
src/backend/executor/nodeAppend.c              | 11 +++++
4 files changed, 126 insertions(+), 18 deletions(-)


Re: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Michael Paquier
Date:
Hi Fujita-san,

On Fri, Jul 30, 2021 at 08:08:07AM +0000, Etsuro Fujita wrote:
> postgres_fdw: Fix handling of pending asynchronous requests.

You have angered many members of the buildfarm here, like:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mule&dt=2021-07-30%2011%3A30%3A04

This is crashing.
--
Michael

Attachment

Re: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Etsuro Fujita
Date:
Hi Michael-san,

On Fri, Jul 30, 2021 at 9:45 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Jul 30, 2021 at 08:08:07AM +0000, Etsuro Fujita wrote:
> > postgres_fdw: Fix handling of pending asynchronous requests.
>
> You have angered many members of the buildfarm here, like:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mule&dt=2021-07-30%2011%3A30%3A04
>
> This is crashing.

I’ll look into this.Thanks!

Best regards,
Etsuro Fujita



Re: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Etsuro Fujita
Date:
On Fri, Jul 30, 2021 at 10:00 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Fri, Jul 30, 2021 at 9:45 PM Michael Paquier <michael@paquier.xyz> wrote:
> > On Fri, Jul 30, 2021 at 08:08:07AM +0000, Etsuro Fujita wrote:
> > > postgres_fdw: Fix handling of pending asynchronous requests.
> >
> > You have angered many members of the buildfarm here, like:

> I’ll look into this.

Buildfarm members are causing this assertion failure:

TRAP: FailedAssertion("fsstate->conn_state->pendingAreq == areq",
File: "/home/pgbf/buildroot/HEAD/pgsql.build/../pgsql/contrib/postgres_fdw/postgres_fdw.c",
Line: 6900, PID: 744110)

I couldn’t reproduce this in my environment, but I noticed this, which
didn’t happen in my environment: the case of delivering notifications
to both of the parent async-capable nodes in ExecAppendAsyncEventWait
when processing the added test case query.  In that case, doing
postgresForeignAsyncNotify for the first parent async-capable node
would invoke process_pending_request on the second one when executing
its initplan, which would lead to the assertion failure when doing
postgresForeignAsyncNotify for the second one.  :-(  I think
postgresForeignAsyncNotify would need the same treatment as for
postgresForeignAsyncConfigureWait, like the attached.

Best regards,
Etsuro Fujita

Attachment
Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> I couldn’t reproduce this in my environment, but I noticed this, which
> didn’t happen in my environment: the case of delivering notifications
> to both of the parent async-capable nodes in ExecAppendAsyncEventWait
> when processing the added test case query.  In that case, doing
> postgresForeignAsyncNotify for the first parent async-capable node
> would invoke process_pending_request on the second one when executing
> its initplan, which would lead to the assertion failure when doing
> postgresForeignAsyncNotify for the second one.  :-(  I think
> postgresForeignAsyncNotify would need the same treatment as for
> postgresForeignAsyncConfigureWait, like the attached.

I tried this patch on a spare machine that seems prone to showing
the failure: on HEAD, postgres_fdw's installcheck failed 3 times
in 4 attempts.  With the patch, it's gotten through 23 consecutive
tries without a failure.  So this is at least way better ...

            regards, tom lane



Re: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Etsuro Fujita
Date:
On Sun, Aug 1, 2021 at 1:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I tried this patch on a spare machine that seems prone to showing
> the failure: on HEAD, postgres_fdw's installcheck failed 3 times
> in 4 attempts.  With the patch, it's gotten through 23 consecutive
> tries without a failure.  So this is at least way better ...

I retried to reproduce the failure in my environment but I couldn’t.
But I plan on applying the patch tomorrow.

Thanks for the testing!

Best regards,
Etsuro Fujita



Re: pgsql: postgres_fdw: Fix handling of pending asynchronous requests.

From
Etsuro Fujita
Date:
On Sun, Aug 1, 2021 at 4:59 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> But I plan on applying the patch tomorrow.

Done, in hopes of making the buildfarm back to green.

Best regards,
Etsuro Fujita