Thread: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

From
Adam Zegelin
Date:
Hello,

I’m in the process of writing a Postgres FDW that can interface with web service endpoints. Certain FDW columns would
actas web service parameters, while others would be the output. 

For example:

    adam=# select * from bing where query = 'xbox';
     query |               url               |     description
    -------+---------------------------------+---------------------
     xbox  | http://www.xbox.com/en-AU/index | Xbox Australia is
       ⋮   |                ⋮                |           ⋮

For the simple cases, extracting the quals (such as var [query] = const “xbox”) works perfectly.

I’d like to join FDW tables with other tables, possibly local or foreign.
ex: `select * from search_terms, bing where bing.query = search_terms.term`, where `search_terms` is a local table.

Without any parameterised paths, Postgres, as expected, will attempt to perform unfiltered foreign sequence scans
followedby a hash join of the two tables. Some service endpoints have no concept of unqualified queries. In the example
above,a ‘sequence scan’ of Bing is a not possible. 

I generate parameterised paths inside the FDW handler function `GetForeignPaths`. I call `create_foreignscan_path` with
aset of req_outer relids found by scanning PlannerInfo’s eq_classes, left_join_clauses and right_join_clauses. 

    Bitmapset* outer_relids = NULL;

        foreach(lc, root->eq_classes) {
            EquivalenceClass* ec = (EquivalenceClass *) lfirst(lc);
            if (ec->ec_members->length > 1)
                outer_relids = bms_union(outer_relids, ec->ec_relids);
        }

        foreach(lc, list_union(root->left_join_clauses, root->right_join_clauses)) {
            RestrictInfo *ri = (RestrictInfo *) lfirst(lc);
            outer_relids = bms_union(outer_relids, ri->outer_relids);
        }

        Bitmapset* req_outer = bms_difference(outer_relids,
                                              bms_make_singleton(baserel->relid));

    foreignPath = create_foreignscan_path(root, baserel, nbrows, 0, 0, NIL, req_outer, NULL);


For simple joins this works. `BeginForeignScan` can access a list of quals with params, and the planner generates an
appropriatenested loop join over the foreign table. 

    adam=# explain select * from search_terms, bing where bing.query = search_terms.term;
                                  QUERY PLAN
    -----------------------------------------------------------------------
     Nested Loop  (cost=0.00..49.30 rows=6550 width=96)
       ->  Seq Scan on search_terms  (cost=0.00..23.10 rows=1310 width=32)
       ->  Foreign Scan on bing  (cost=0.00..0.00 rows=2 width=64)
             Filter: (search_terms.term = query)

Even a query over two foreign tables works correctly:

    adam=# create foreign table ft1 (inp text, out text) server test;
    adam=# create foreign table ft2 (inp text, out text) server test;

    adam=# explain select * from ft1, ft2 where ft1.inp = ‘hello’ and ft2.inp = ft1.out;
                                  QUERY PLAN
    ----------------------------------------------------------------------
     Nested Loop  (cost=0.00..500020.00 rows=5000 width=128)
       ->  Foreign Scan on ft1  (cost=0.00..500000.00 rows=1000 width=64)
             Filter: (inp = 'hello'::text)
       ->  Foreign Scan on ft2  (cost=0.00..0.00 rows=2 width=64)
             Filter: (ft1."out" = inp)


But, on a more complex query consisting of multiple foreign tables the planner generates something a little less
desirable:

    adam=# create foreign table ft3 (inp text, out text) server test;

    adam=# explain select * from ft1, ft2, ft3 where ft1.inp = 'hello' and ft2.inp = ft1.out and ft3.inp = ft2.out;
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Nested Loop  (cost=500012.50..1000290.00 rows=25000 width=192)
       ->  Hash Join  (cost=500012.50..1000190.00 rows=5000 width=128)
             Hash Cond: (ft1."out" = ft2.inp)
             ->  Foreign Scan on ft1  (cost=0.00..500000.00 rows=1000 width=64)
                   Filter: (inp = 'hello'::text)
             ->  Hash  (cost=500000.00..500000.00 rows=1000 width=64)
                   ->  Foreign Scan on ft2  (cost=0.00..500000.00 rows=1000 width=64)
       ->  Foreign Scan on ft3  (cost=0.00..0.00 rows=2 width=64)
             Filter: (ft2."out" = inp)

The high total costs are the result of my attempts to coerce the planner to select the parameterised paths and generate
filteredforeign scans rather than preferring unfiltered foreign scans. 

I’ve tried adjusting the query planner tuneables (enable_hashjoin, et al) and the path costs with some degree of
success,but often the generated plans will filter the tables in the wrong order -- the output column of table 1 will be
filteredby the input column of table 2 -- which is technically correct as the operation should be associative and
transitive,but in this case, table 2 must be filtered by the output of table 1, not vice-versa. 

Is there a way to convince Postgres to always generate nested loop joins without sub merges, hashes or materialisations
andin the correct order? 

I’m thinking that the current method I’m using to generate the foreign scan path doesn’t take into account the required
orderand which fields can and cannot be parameterised. I’m not sure how to proceed with this. 

Thanks,
Adam
Attachment
Adam Zegelin <adam@relational.io> writes:
> Some service endpoints have no concept of unqualified queries. In the example above, a ‘sequence scan’ of Bing is a
notpossible. 

In that case, you shouldn't be generating such a path.  But keep in mind
that this may lead to failure to produce any plan at all for some
queries.  If the foreign data source is really so broken that it can't
do that, then you have little choice ... but you shouldn't be thinking
of that as anything but a broken design decision on their part.

> I generate parameterised paths inside the FDW handler function `GetForeignPaths`. I call `create_foreignscan_path`
witha set of req_outer relids found by scanning PlannerInfo’s eq_classes, left_join_clauses and right_join_clauses. 

It looks to me like this logic will produce only one parameterized path
that demands the maximal set of outer relations.  You need to be a bit
more flexible than that.

>     adam=# explain select * from ft1, ft2, ft3 where ft1.inp = 'hello' and ft2.inp = ft1.out and ft3.inp = ft2.out;

In this example, your only parameterized path for ft2 will require both
ft1 and ft3 as inputs, since it sees both of the quals mentioning ft2
as potential join quals.  So there's no way to generate the plan you're
hoping for.  You need to have produced a path that requires only ft1.
Really, given this input, you should be producing three parameterized
paths for ft2 (one using only ft1, one using only ft3, one using both)
and then let the planner logic sort out which one to use.

            regards, tom lane


Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

From
Adam Zegelin
Date:
Tom,

Thank you for your prompt reply. Your advice has pointed me in the right direction.

I now have the wrapper identifying columns that are inputs to the web service, and thus parameterisable. The
ec_classes,left_join_clauses and right_join_clauses trees are scanned for Var exprs that match these attributes. If
theyare present, the relid is added to the required list of outer rels for the path -- this is done as an extension to
thelogic I posted previously. 

In all cases this seems to work, except one. A join between 3 tables. The foreign table has 2 parameterised columns,
eachgiven a restriction based on one of the other two tables: 

    adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a;
                                                 QUERY PLAN
    ----------------------------------------------------------------------------------------------------
     Merge Join  (cost=5000704.96..5001278.44 rows=37822 width=168)
       Merge Cond: (l2.a = foreign1.b)
       ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
             Sort Key: l2.a
             ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Sort  (cost=5000619.54..5000634.91 rows=6150 width=132)
             Sort Key: foreign1.b
             ->  Merge Join  (cost=5000135.26..5000232.51 rows=6150 width=132)
                   Merge Cond: (foreign1.a = l1.a)
                   ->  Sort  (cost=5000049.83..5000052.33 rows=1000 width=96)
                         Sort Key: foreign1.a
                         ->  Foreign Scan on foreign1  (cost=5000000.00..5000000.00 rows=1000 width=96)
                   ->  Sort  (cost=85.43..88.50 rows=1230 width=36)
                         Sort Key: l1.a
                         ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 width=36)

My path generation logic seems to work:

baserel->cheapest_parameterized_paths = (
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }
   {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 5000000.00
   :total_cost 5000000.00
   :pathkeys <>
   :fdw_private <>
   }
)

Yet the planner picks the non-parameterised path:

ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b)
   :rows 1000
   :startup_cost 5000000.00
   :total_cost 5000000.00
   :pathkeys <>
   :fdw_private <>
   }

I’ve tried adjusting planner tuneables to disable all join types except nested loop, and setting `join_collapse_limit`
to1 with no desirable outcome. 

Yet, adding a restriction clause between the other two tables forces them to be scanned first:

    adam=# explain select * from l1, l2, foreign1 where foreign1.a = l1.a and foreign1.b = l2.a and l1.b > l2.b;
                                   QUERY PLAN
    -------------------------------------------------------------------------
     Nested Loop  (cost=0.00..2544241.17 rows=12608 width=168)
       ->  Nested Loop  (cost=0.00..22741.17 rows=504300 width=72)
             Join Filter: (l1.b > l2.b)
             ->  Seq Scan on l1  (cost=0.00..22.30 rows=1230 width=36)
             ->  Materialize  (cost=0.00..28.45 rows=1230 width=36)
                   ->  Seq Scan on l2  (cost=0.00..22.30 rows=1230 width=36)
       ->  Foreign Scan on foreign1  (cost=0.00..0.00 rows=500 width=96)
             Filter: ((a = l1.a) AND (b = l2.a))


ForeignPath* best_path = {FOREIGNPATH
   :pathtype 120
   :parent_relids (b 3)
   :required_outer (b 1 2)
   :rows 500
   :startup_cost 0.00
   :total_cost 0.00
   :pathkeys <>
   :fdw_private <>
   }


On 18/03/2013, at 4:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Adam Zegelin <adam@relational.io> writes:
>> Some service endpoints have no concept of unqualified queries. In the example above, a ‘sequence scan’ of Bing is a
notpossible. 
>
> In that case, you shouldn't be generating such a path.  But keep in mind
> that this may lead to failure to produce any plan at all for some
> queries.  If the foreign data source is really so broken that it can't
> do that, then you have little choice ... but you shouldn't be thinking
> of that as anything but a broken design decision on their part.

I tried adding a condition that would prevent the non-parameterised path from being generated if the service only
supportedparameterised scans. Postgres refuses to generate a plan: "ERROR:  could not devise a query plan for the given
query".I did a bit of digging and this error is generated by pathnode.c:set_cheapest . As there is no non-parameterised
`cheapest_total_path`the error is raised (line 253). 

For now, I just add an expensive non-pramerterised path and let the FDW throw an error if no qual is found involving
therequired columns. 

Regards,
Adam

Attachment
Adam Zegelin <adam@relational.io> writes:
> My path generation logic seems to work:

> baserel->cheapest_parameterized_paths = (
>    {FOREIGNPATH
>    :pathtype 120
>    :parent_relids (b 3)
>    :required_outer (b 1 2)
>    :rows 500
>    :startup_cost 0.00
>    :total_cost 0.00
>    :pathkeys <>
>    :fdw_private <>
>    }
>    {FOREIGNPATH
>    :pathtype 120
>    :parent_relids (b 3)
>    :required_outer (b)
>    :rows 1000
>    :startup_cost 5000000.00
>    :total_cost 5000000.00
>    :pathkeys <>
>    :fdw_private <>
>    }
> )

I think you missed my point: you should not be insisting on a maximal
set of required outer rels.

In this particular case, it won't generate a cross-product join of l1
and l2 because there's a heuristic that says that's unlikely to be a
good idea.  But in related cases, there could be join order restrictions
that *require* us not to do the joins in that order; so even if you
could talk us out of applying that heuristic, this code is still subject
to undesirable failures.  You really need to provide three paths using
the three possible combinations of outer rels.

            regards, tom lane


Re: Enforcing Parameterised Nested Loop Join Order for Foreign Table Joins

From
Adam Zegelin
Date:
Thanks for your assistance Tom.

On 19/03/2013, at 12:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I think you missed my point: you should not be insisting on a maximal
> set of required outer rels.

I’ve started generating multiple paths for the web services that support it, which works great.

> In this particular case, it won't generate a cross-product join of l1
> and l2 because there's a heuristic that says that's unlikely to be a
> good idea.  But in related cases, there could be join order restrictions
> that *require* us not to do the joins in that order; so even if you
> could talk us out of applying that heuristic, this code is still subject
> to undesirable failures.  You really need to provide three paths using
> the three possible combinations of outer rels.


Certain web services I’m connecting to explicitly require parameters, often multiple ones.
I check the quals and if any required values are not present I simply raise an error.
So really, it is a “good idea” in my case.
Queries that require different join orders will still fail because all required quals are not present.

Is this heuristic a tuneable parameter, or something that would require a logic change inside the planner itself?

Regards,
Adam



Attachment
Adam Zegelin <adam@relational.io> writes:
> On 19/03/2013, at 12:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In this particular case, it won't generate a cross-product join of l1
>> and l2 because there's a heuristic that says that's unlikely to be a
>> good idea.

> Is this heuristic a tuneable parameter, or something that would require a logic change inside the planner itself?

It's not tuneable, not so much because it would be hard to turn off as
because disabling it would lead to an exponential explosion in planning
time in an uncomfortably large fraction of cases.  (If you don't mind
running a locally-hacked version of PG, whacking around the logic in
joinrels.c should get you what you want.)

Rather than shutting off that heuristic as such, what I'd be inclined to
think about is exploiting the "join order restriction" logic so that
an FDW with the kind of issue you describe could mark its relation as
subject to a pseudo join-order restriction.  That could cause the
planner to explore join pathways it otherwise wouldn't, but only in
cases where it was really necessary to do so.  I'm handwaving a bit here
but I think something like that could be made to work without creating
an across-the-board planning penalty.

Anyway, changes like that will be material for 9.4 or 9.5.  I think in a
year or so we'll have a much clearer idea of what kinds of planner knobs
FDWs require than we do today.

            regards, tom lane