Re: Add semi-join pushdown to postgres_fdw - Mailing list pgsql-hackers

From Alexander Pyhalov
Subject Re: Add semi-join pushdown to postgres_fdw
Date
Msg-id 3dd9320e01232eb931b747ed75a26cfb@postgrespro.ru
Whole thread Raw
In response to Re: Add semi-join pushdown to postgres_fdw  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Add semi-join pushdown to postgres_fdw
List pgsql-hackers
Alexander Korotkov писал 2023-10-30 19:05:
> Hi, Alexander!
> 
> Thank you for working on this.  I believe this is a very interesting
> patch, which significantly improves our FDW-based distributed
> facilities.  This is why I decided to review this.
> 

Hi. Thanks for reviewing.

> +   /*
> +    * We can't push down join if its reltarget is not safe
> +    */
> +   if (!joinrel_target_ok(root, joinrel, jointype, outerrel,
> innerrel))
>         return false;
> 
> As I get joinrel_target_ok() function do meaningful checks only for
> semi join and always return false for all other kinds of joins.  I
> think we should call this only for semi join and name the function
> accordingly.

Done.

> 
> +   fpinfo->unknown_subquery_rels =
> bms_union(fpinfo_o->unknown_subquery_rels,
> +
> fpinfo_i->unknown_subquery_rels);
> 
> Should the comment before this code block be revised?

Updated comment.

> 
> +       case JOIN_SEMI:
> +           fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
> +                                             fpinfo_i->remote_conds);
> +           fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
> +                                              fpinfo->remote_conds);
> +           fpinfo->remote_conds = list_copy(fpinfo_o->remote_conds);
> +           fpinfo->unknown_subquery_rels =
> bms_union(fpinfo->unknown_subquery_rels,
> +                                           innerrel->relids);
> +           break;
> 
> I think that comment before switch() should be definitely revised.
> 
> + Relids hidden_subquery_rels; /* relids, which can't be referred to
> + * from upper relations */
> 
> Could this definition contain the positive part?  Can't be referred to
> from upper relations, but used internally for semi joins (or something
> like that)?

Made comment a bit more verbose.

> 
> Also, I think the machinery around the append_conds could be somewhat
> simpler if we turn them into a list (list of strings).  I think that
> should make code clearer and also save us some memory allocations.
> 

I've tried to rewrite it as managing lists.. to find out that these are 
not lists.
I mean, in deparseFromExprForRel() we replace lists from both side with 
one condition.
This allows us to preserve conditions hierarchy. We should merge these 
conditions
in the end of IS_JOIN_REL(foreignrel) branch, or we'll push them too 
high. And if we
deparse them in this place as StringInfo, I see no benefit to convert 
them to lists.


> In [1] you've referenced the cases, when your patch can't push down
> semi-joins.  It doesn't seem impossible to handle these cases, but
> that would make the patch much more complicated.  I'm OK to continue
> with a simpler patch to handle the majority of cases.  Could you
> please add the cases, which can't be pushed down with the current
> patch, to the test suite?
> 

There are several cases when we can't push down semi-join in current 
patch.

1) When target list has attributes from inner relation, which are 
equivalent to some attributes of outer
relation, we fail to notice this.

2) When we examine A join B and decide that we can't push it down, this 
decision is final - we state it in fdw_private of joinrel,
and so if we consider joining these relations in another order, we don't 
reconsider.
This means that if later examine B join A, we don't try to push it down. 
As semi-join can be executed as JOIN_UNIQUE_INNER or JOIN_UNIQUE_OUTER,
this can be a problem - we look at some of these paths and remember that 
we can't push down such join.





-- 
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Amit Kapila
Date:
Subject: Re: A recent message added to pg_upgade