Home > mailing lists

Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers

From	Etsuro Fujita
Subject	Re: Asynchronous Append on postgres_fdw nodes.
Date	May 11, 2021 07:24:53
Msg-id	CAPmGK15CyEmstJyzkGsT_1OmArJwD6vQguBEwiz8oYx4zz6ssA@mail.gmail.com Whole thread
In response to	Re: Asynchronous Append on postgres_fdw nodes. (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Responses	Re: Asynchronous Append on postgres_fdw nodes.
List	pgsql-hackers

Tree view

On Tue, May 11, 2021 at 11:58 AM Andrey Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> Your patch fixes the problem. But I found two more problems:
>
> EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
> SELECT * FROM (
>         (SELECT * FROM f1)
>                 UNION ALL
>         (SELECT * FROM f2)
>                 UNION ALL
>         (SELECT * FROM l3)
> ) q1 LIMIT 6709;
>                            QUERY PLAN
> --------------------------------------------------------------
>   Limit (actual rows=6709 loops=1)
>     ->  Append (actual rows=6709 loops=1)
>           ->  Async Foreign Scan on f1 (actual rows=1 loops=1)
>           ->  Async Foreign Scan on f2 (actual rows=1 loops=1)
>           ->  Seq Scan on l3 (actual rows=6708 loops=1)
>
> Here we scan 6710 tuples at low level but appended only 6709. Where did
> we lose one tuple?

The extra tuple, which is from f1 or f2, would have been kept in the
Append node's as_asyncresults, not returned from the Append node to
the Limit node.  The async Foreign Scan nodes would fetch tuples
before the Append node ask the tuples, so the fetched tuples may or
may not be used.

> 2.
> SELECT * FROM (
>         (SELECT * FROM f1)
>                 UNION ALL
>         (SELECT * FROM f2)
>                 UNION ALL
>         (SELECT * FROM f3 WHERE a > 0)
> ) q1 LIMIT 3000;
>                            QUERY PLAN
> --------------------------------------------------------------
>   Limit (actual rows=3000 loops=1)
>     ->  Append (actual rows=3000 loops=1)
>           ->  Async Foreign Scan on f1 (actual rows=0 loops=1)
>           ->  Async Foreign Scan on f2 (actual rows=0 loops=1)
>           ->  Foreign Scan on f3 (actual rows=3000 loops=1)
>
> Here we give preference to the synchronous scan. Why?

This would be expected behavior, and the reason is avoid performance
degradation; you might think it would be better to execute the async
Foreign Scan nodes more aggressively, but it would require
waiting/polling for file descriptor events many times, which is
expensive and might cause performance degradation.  I think there is
room for improvement, though.

Thanks!

Best regards,
Etsuro Fujita

pgsql-hackers by date:

From: Andrey Lepikhov
Date: 11 May 2021, 07:06:05
Subject: Re: Defer selection of asynchronous subplans until the executor initialization stage

From: Michael Paquier
Date: 11 May 2021, 07:29:28
Subject: Re: Multiple hosts in connection string failed to failover in non-hot standby mode

Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers

Previous

Next