Re: Postgres_fdw join pushdown - wrong results with whole-row reference - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Date
Msg-id a1fa1c4c-bf96-8ea5-cff5-85b927298e73@lab.ntt.co.jp
Whole thread Raw
In response to Re: Postgres_fdw join pushdown - wrong results with whole-row reference  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: Postgres_fdw join pushdown - wrong results with whole-row reference  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On 2016/06/22 17:11, Amit Langote wrote:
> I wonder whether such a whole-row-var would arise from the nullable side
> of a join? I guess not.  Not that I'm saying we shouldn't account for that
> case at all since any and every whole-row-var in the targetlist currently
> gets that treatment, even those that are known non-nullable. Couldn't we
> have prevented the latter somehow?  IOW, only generate the CASE WHEN when
> a Var being deparsed is known nullable as the comment there says:
>
> deparse.c:
>
> 1639 /*
> 1640  * In case the whole-row reference is under an outer join then it has
> 1641  * to go NULL whenver the rest of the row goes NULL. Deparsing a join
> 1642  * query would always involve multiple relations, thus qualify_col
> 1643  * would be true.
> 1644  */
> 1645 if (qualify_col)
> 1646 {
> 1647     appendStringInfoString(buf, "CASE WHEN");
> 1648     ADD_REL_QUALIFIER(buf, varno);
> 1649     appendStringInfo(buf, "* IS NOT NULL THEN ");
> 1650 }

I think we could address this in another way once we support deparsing 
subqueries; rewrite the remote query into something that wouldn't need 
the CASE WHEN conversion.  For example, we currently have:

postgres=# explain verbose select ft2 from ft1 left join ft2 on ft1.a = 
ft2.a;                                                                QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
ForeignScan  (cost=100.00..110.04 rows=1 width=32)   Output: ft2.*   Relations: (public.ft1) LEFT JOIN (public.ft2)
RemoteSQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, r2.b) 
 
END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a))))
(4 rows)

However, if we support deparsing subqueries, the remote query in the 
above example could be rewritten into something like this:

SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2) ss(c1, 
c2) ON (t1.a = ss.c1);

So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a, 
r2.b) END" in the target list in the remote query.

For the CASE WHEN conversion for a system column other than ctid, we 
could also address this by replacing the whole-row reference in the IS 
NOT NULL condition in that conversion with the system column reference.

Best regards,
Etsuro Fujita





pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Next
From: Amit Kapila
Date:
Subject: Re: Hash Indexes