Thread: Is it possible for postgres_fdw to push down queries on co-located tables?

Is it possible for postgres_fdw to push down queries on co-located tables?

From

Jinhua Luo

Date:

18 September 2018, 13:55:38

That is, if table `foo` and table `bar` are both tables on the same
remote server, then when I do `select * from foo, bar`, can it
delegate the whole query on the remote side, rather than fetching rows
from both servers one by one and do merging on the local side?

For example:

```
foo=> explain select * from foreign_test2, foreign_test where m = id;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Merge Join  (cost=444.06..590.63 rows=9316 width=72)
   Merge Cond: (foreign_test2.m = foreign_test.id)
   ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
         Sort Key: foreign_test2.m
         ->  Foreign Scan on foreign_test2  (cost=100.00..150.95
rows=1365 width=36)
   ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
         Sort Key: foreign_test.id
         ->  Foreign Scan on foreign_test  (cost=100.00..150.95
rows=1365 width=36)
```

If the planning involving remote tables, could the planner push down
the queries once to gain the best efficiency?

Re: Is it possible for postgres_fdw to push down queries on co-located tables?

From

Tom Lane

Date:

18 September 2018, 16:43:28

Jinhua Luo <luajit.io@gmail.com> writes:
> That is, if table `foo` and table `bar` are both tables on the same
> remote server, then when I do `select * from foo, bar`, can it
> delegate the whole query on the remote side, rather than fetching rows
> from both servers one by one and do merging on the local side?

Reasonably recent releases can do that.  What version are you testing?

> foo=> explain select * from foreign_test2, foreign_test where m = id;
>                                      QUERY PLAN
> -------------------------------------------------------------------------------------
>  Merge Join  (cost=444.06..590.63 rows=9316 width=72)
>    Merge Cond: (foreign_test2.m = foreign_test.id)
>    ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
>          Sort Key: foreign_test2.m
>          ->  Foreign Scan on foreign_test2  (cost=100.00..150.95
> rows=1365 width=36)
>    ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
>          Sort Key: foreign_test.id
>          ->  Foreign Scan on foreign_test  (cost=100.00..150.95
> rows=1365 width=36)
> ```

I don't find this particular example to be very compelling.  Taking
the amount of data pulled from the foreign server as the main cost
factor, the plan as given requires pulling 1365*2 rows, whereas if
it were to push down the join, it'd have to retrieve 9316 rows
(or so the planner estimates, anyway).  So it's quite possible that
the planner just rejected the remote join as a net loss.  If you
think it isn't a net loss, you might want to twiddle the cost
parameters for this foreign server.

            regards, tom lane

Re: Is it possible for postgres_fdw to push down queries onco-located tables?

From

Jinhua Luo

Date:

18 September 2018, 17:20:56

I was testing PG10.

Sorry, the example is not so proper. I just think even if it's a
simple example, e.g. join two co-located tables, the planner should
work out to push down it. Can you confirm the postgresql could detect
co-located tables on the same foreign server and push down queries on
them? Could you give an actual example or point out the relevant
source code paths for reference?

(Let me clarify the context of this question, if the planner supports
co-located push-down, then it's meaningful for manual sharding via
partitioning to remote tables, where it's mostly necessary to join two
or more co-located parent tables in complex queries. If not, the
postgresql instance on which the parent tables are placed (let's say
it's a coordinator node) would be likely the bottleneck.)

Tom Lane <tgl@sss.pgh.pa.us> 于2018年9月18日周二 下午9:43写道：
>
> Jinhua Luo <luajit.io@gmail.com> writes:
> > That is, if table `foo` and table `bar` are both tables on the same
> > remote server, then when I do `select * from foo, bar`, can it
> > delegate the whole query on the remote side, rather than fetching rows
> > from both servers one by one and do merging on the local side?
>
> Reasonably recent releases can do that.  What version are you testing?
>
> > foo=> explain select * from foreign_test2, foreign_test where m = id;
> >                                      QUERY PLAN
> > -------------------------------------------------------------------------------------
> >  Merge Join  (cost=444.06..590.63 rows=9316 width=72)
> >    Merge Cond: (foreign_test2.m = foreign_test.id)
> >    ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
> >          Sort Key: foreign_test2.m
> >          ->  Foreign Scan on foreign_test2  (cost=100.00..150.95
> > rows=1365 width=36)
> >    ->  Sort  (cost=222.03..225.44 rows=1365 width=36)
> >          Sort Key: foreign_test.id
> >          ->  Foreign Scan on foreign_test  (cost=100.00..150.95
> > rows=1365 width=36)
> > ```
>
> I don't find this particular example to be very compelling.  Taking
> the amount of data pulled from the foreign server as the main cost
> factor, the plan as given requires pulling 1365*2 rows, whereas if
> it were to push down the join, it'd have to retrieve 9316 rows
> (or so the planner estimates, anyway).  So it's quite possible that
> the planner just rejected the remote join as a net loss.  If you
> think it isn't a net loss, you might want to twiddle the cost
> parameters for this foreign server.
>
>                         regards, tom lane

Re: Is it possible for postgres_fdw to push down queries onco-located tables?

From

Dilip Kumar

Date:

18 September 2018, 19:10:16

On Tue, Sep 18, 2018 at 7:50 PM, Jinhua Luo <luajit.io@gmail.com> wrote:
> I was testing PG10.
>
> Sorry, the example is not so proper. I just think even if it's a
> simple example, e.g. join two co-located tables, the planner should
> work out to push down it. Can you confirm the postgresql could detect
> co-located tables on the same foreign server and push down queries on
> them? Could you give an actual example or point out the relevant
> source code paths for reference?
>

You can check "postgresGetForeignJoinPaths" function or you can refer
commit "e4106b2528727c4b48639c0e12bf2f70a766b910".

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Is it possible for postgres_fdw to push down queries on co-locatedtables?

From

Etsuro Fujita

Date:

20 September 2018, 12:16:30

(2018/09/18 23:20), Jinhua Luo wrote:
> Sorry, the example is not so proper. I just think even if it's a
> simple example, e.g. join two co-located tables, the planner should
> work out to push down it. Can you confirm the postgresql could detect
> co-located tables on the same foreign server and push down queries on
> them? Could you give an actual example or point out the relevant
> source code paths for reference?
>
> (Let me clarify the context of this question, if the planner supports
> co-located push-down, then it's meaningful for manual sharding via
> partitioning to remote tables, where it's mostly necessary to join two
> or more co-located parent tables in complex queries. If not, the
> postgresql instance on which the parent tables are placed (let's say
> it's a coordinator node) would be likely the bottleneck.)

You might want to check partitionwise join functionality as well, which 
we have in PG11 [1].

Best regards,
Etsuro Fujita

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f49842d1ee31b976c681322f76025d7732e860f3