Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins - Mailing list pgsql-general
| From | Adam Zegelin |
|---|---|
| Subject | Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins |
| Date | |
| Msg-id | F3D61D18-DFC7-44ED-A370-92A8730FBAD3@relational.io Whole thread Raw |
| In response to | Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins
|
| List | pgsql-general |
Tom,
Thank you for your prompt reply. Your advice has pointed me in the right direction.
I now have the wrapper identifying columns that are inputs to the web service, and thus parameterisable. The
ec_classes,left_join_clauses and right_join_clauses trees are scanned for Var exprs that match these attributes. If
theyare present, the relid is added to the required list of outer rels for the path -- this is done as an extension to
thelogic I posted previously.
In all cases this seems to work, except one. A join between 3 tables. The foreign table has 2 parameterised columns,
eachgiven a restriction based on one of the other two tables:
adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Merge Join (cost=5000704.96..5001278.44 rows=37822 width=168)
Merge Cond: (l2.a = foreign1.b)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: l2.a
-> Seq Scan on l2 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=5000619.54..5000634.91 rows=6150 width=132)
Sort Key: foreign1.b
-> Merge Join (cost=5000135.26..5000232.51 rows=6150 width=132)
Merge Cond: (foreign1.a = l1.a)
-> Sort (cost=5000049.83..5000052.33 rows=1000 width=96)
Sort Key: foreign1.a
-> Foreign Scan on foreign1 (cost=5000000.00..5000000.00 rows=1000 width=96)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: l1.a
-> Seq Scan on l1 (cost=0.00..22.30 rows=1230 width=36)
My path generation logic seems to work:
baserel->cheapest_parameterized_paths = (
{FOREIGNPATH
:pathtype 120
:parent_relids (b 3)
:required_outer (b 1 2)
:rows 500
:startup_cost 0.00
:total_cost 0.00
:pathkeys <>
:fdw_private <>
}
{FOREIGNPATH
:pathtype 120
:parent_relids (b 3)
:required_outer (b)
:rows 1000
:startup_cost 5000000.00
:total_cost 5000000.00
:pathkeys <>
:fdw_private <>
}
)
Yet the planner picks the non-parameterised path:
ForeignPath* best_path = {FOREIGNPATH
:pathtype 120
:parent_relids (b 3)
:required_outer (b)
:rows 1000
:startup_cost 5000000.00
:total_cost 5000000.00
:pathkeys <>
:fdw_private <>
}
I’ve tried adjusting planner tuneables to disable all join types except nested loop, and setting `join_collapse_limit`
to1 with no desirable outcome.
Yet, adding a restriction clause between the other two tables forces them to be scanned first:
adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a and l1.b > l2.b;
QUERY PLAN
-------------------------------------------------------------------------
Nested Loop (cost=0.00..2544241.17 rows=12608 width=168)
-> Nested Loop (cost=0.00..22741.17 rows=504300 width=72)
Join Filter: (l1.b > l2.b)
-> Seq Scan on l1 (cost=0.00..22.30 rows=1230 width=36)
-> Materialize (cost=0.00..28.45 rows=1230 width=36)
-> Seq Scan on l2 (cost=0.00..22.30 rows=1230 width=36)
-> Foreign Scan on foreign1 (cost=0.00..0.00 rows=500 width=96)
Filter: ((a = l1.a) AND (b = l2.a))
ForeignPath* best_path = {FOREIGNPATH
:pathtype 120
:parent_relids (b 3)
:required_outer (b 1 2)
:rows 500
:startup_cost 0.00
:total_cost 0.00
:pathkeys <>
:fdw_private <>
}
On 18/03/2013, at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Adam Zegelin <adam@relational.io> writes:
>> Some service endpoints have no concept of unqualified queries. In the example above, a ‘sequence scan’ of Bing is a
notpossible.
>
> In that case, you shouldn't be generating such a path. But keep in mind
> that this may lead to failure to produce any plan at all for some
> queries. If the foreign data source is really so broken that it can't
> do that, then you have little choice ... but you shouldn't be thinking
> of that as anything but a broken design decision on their part.
I tried adding a condition that would prevent the non-parameterised path from being generated if the service only
supportedparameterised scans. Postgres refuses to generate a plan: "ERROR: could not devise a query plan for the given
query".I did a bit of digging and this error is generated by pathnode.c:set_cheapest . As there is no non-parameterised
`cheapest_total_path`the error is raised (line 253).
For now, I just add an expensive non-pramerterised path and let the FDW throw an error if no qual is found involving
therequired columns.
Regards,
Adam
Attachment
pgsql-general by date: