Re: Asynchronous Append on postgres_fdw nodes. - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Asynchronous Append on postgres_fdw nodes. |
Date | |
Msg-id | CAPmGK14tsj-kYYb4z4CUVkiTRzWH0uAGW7a6STbYt9VMvdUKvQ@mail.gmail.com Whole thread Raw |
In response to | Re: Asynchronous Append on postgres_fdw nodes. (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
List | pgsql-hackers |
On Fri, Feb 12, 2021 at 5:30 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > It seems too specific to async Append so I look it as a PoC of the > mechanism. Are you saying that the patch only reconsiders async ForeignScans? > It creates a hash table that keyed by connection umid to record > planids run on the connection, triggerd by core planner via a dedicate > API function. It seems to me that ConnCacheEntry.state can hold that > and the hash is not needed at all. I think a good thing about the hash table is that it can be used by other FDWs that support async execution in a similar way to postgres_fdw, so they don’t need to create their own hash tables. But I’d like to know about the idea of using ConnCacheEntry. Could you elaborate a bit more about that? > | postgresReconsiderAsyncForeignScan(ForeignScanState *node, AsyncContext *acxt) > | { > | ... > | /* > | * If the connection used for the ForeignScan node is used in other parts > | * of the query plan tree except async subplans of the parent Append node, > | * disable async execution of the ForeignScan node. > | */ > | if (!bms_is_subset(fsplanids, asyncplanids)) > | return false; > > This would be a reasonable restriction. Cool! > | /* > | * If the subplans of the Append node are all async-capable, and use the > | * same connection, then we won't execute them asynchronously. > | */ > | if (requestor->as_nasyncplans == requestor->as_nplans && > | !bms_nonempty_difference(asyncplanids, fsplanids)) > | return false; > > It is the correct restiction? I understand that the currently > intending restriction is one connection accepts at most one FDW-scan > node. This looks somethig different... People put multiple partitions in a remote PostgreSQL server in sharding, so the patch allows multiple postgres_fdw ForeignScans beneath an Append that use the same connection to be executed asynchronously like this: postgres=# create table t1 (a int, b int, c text); postgres=# create table t2 (a int, b int, c text); postgres=# create table t3 (a int, b int, c text); postgres=# create foreign table p1 (a int, b int, c text) server server1 options (table_name 't1'); postgres=# create foreign table p2 (a int, b int, c text) server server2 options (table_name 't2'); postgres=# create foreign table p3 (a int, b int, c text) server server2 options (table_name 't3'); postgres=# create table pt (a int, b int, c text) partition by range (a); postgres=# alter table pt attach partition p1 for values from (10) to (20); postgres=# alter table pt attach partition p2 for values from (20) to (30); postgres=# alter table pt attach partition p3 for values from (30) to (40); postgres=# insert into p1 select 10 + i % 10, i, to_char(i, 'FM0000') from generate_series(0, 99) i; postgres=# insert into p2 select 20 + i % 10, i, to_char(i, 'FM0000') from generate_series(0, 99) i; postgres=# insert into p3 select 30 + i % 10, i, to_char(i, 'FM0000') from generate_series(0, 99) i; postgres=# analyze pt; postgres=# explain verbose select count(*) from pt; QUERY PLAN ------------------------------------------------------------------------------------------ Aggregate (cost=314.25..314.26 rows=1 width=8) Output: count(*) -> Append (cost=100.00..313.50 rows=300 width=0) -> Async Foreign Scan on public.p1 pt_1 (cost=100.00..104.00 rows=100 width=0) Remote SQL: SELECT NULL FROM public.t1 -> Async Foreign Scan on public.p2 pt_2 (cost=100.00..104.00 rows=100 width=0) Remote SQL: SELECT NULL FROM public.t2 -> Async Foreign Scan on public.p3 pt_3 (cost=100.00..104.00 rows=100 width=0) Remote SQL: SELECT NULL FROM public.t3 (9 rows) For this query, p2 and p3, which use the same connection, are scanned asynchronously! But if all the subplans of an Append are async postgres_fdw ForeignScans that use the same connection, they won’t be parallelized at all, and the overhead of async execution may cause a performance degradation. So the patch disables async execution of them in that case using the above code bit. Thanks for the review! Best regards, Etsuro Fujita
pgsql-hackers by date: