Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
Date
Msg-id CAFjFpReoTXaAGU5J5eo0tY46v+-MnDem3HhrXM3iLvWG_mYPNg@mail.gmail.com
Whole thread Raw
In response to Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers


On Thu, Apr 14, 2016 at 8:42 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/03/29 23:20, Ashutosh Bapat wrote:
    I think the reason for that is in foreign_join_ok.  This in that
    function:

    wrongly pulls up remote_conds from joining relations in the FULL
    JOIN case.  I think we should not pull up such conditions in the
    FULL JOIN case.

Right. For a full outer join, since each joining relation acts as outer
for the other, we can not pull up the quals to either join clauses or
other clauses. So, in such a case, we will need to encapsulate the
joining relation with conditions into a subquery. Unfortunately, the
current deparse logic does not handle this encapsulation. Adding that
functionality so close to the feature freeze might be risky given the
amount of code changes required.

PFA patch with a quick fix. A full outer join with either of the joining
relations having WHERE conditions (or other clauses) is not pushed down.
In the particular case that was reported, the bug triggered because of
the way conditions are handled for an inner join. For an inner join, all
the conditions in ON as well as WHERE clause are treated like they are
part of WHERE clause. This allows pushing down a join even if there are
unpushable join clauses. But the pushable conditions can be put back
into the ON clause. This avoids using subqueries while deparsing.

I'm not sure that is a good idea.  As you mentioned, we could support FULL JOIN fully, by encapsulating a joining relation with conditions into a subquery.  And ISTM that it is relatively easy to do that, by borrowing some ideas from Hanada-san's original join pushdown patch.  If it's OK, I'll create a patch for that in a few days.

In his patch the deparsing targetlist and conditions required that the FROM clause entries were ready with the columns from base relations and joins realiased. That's no more true. We deparse every Var node as <relation alias>.<column name> where relation alias is nothing but rN; N being index of RangeTblEntry. So, Hanada-san's method to deparse recursively can not be applied as such now.

Here's what needs to be done:
When we identify that certain relation (base or join) needs a subquery to be deparsed (because the join relation above it could not pull the quals up), we remember it in the upper join relation. Before deparsing 1. we walk the join tree and collect targetlists of all such relations, 2. associate column aliases with those targetlists (save the column alises in resname?) and craft a relation alias  3. associate the relations alias, column aliases and targetlists with the base relations involved in such relations (may be creating a list similar to rtable). While deparsing a Var node, we check if corresponding base relation is itself or part of a relation deparsed as a subquery. If it is then we lookup that Var in the targetlist associated with the base relation and use corresponding relation and column alias for deparsing it. Otherwise, deparse it as <relation alias>.<column name> usually.

That looks like a big code change to go after feature freeze. So, we will leave it for 9.7, unless RMT allows us to introduce that change.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Detrimental performance impact of ringbuffers on performance
Next
From: Andres Freund
Date:
Subject: Re: [COMMITTERS] pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.