Re: Postgres_fdw join pushdown - wrong results with whole-row reference - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Date
Msg-id CAFjFpRcA6+g+F4R_khmdMKjV9beNFrj1ocmn=mZsucbmQ3BuUw@mail.gmail.com
Whole thread Raw
In response to Re: Postgres_fdw join pushdown - wrong results with whole-row reference  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Postgres_fdw join pushdown - wrong results with whole-row reference  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers


On Wed, Jun 22, 2016 at 3:57 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
On 2016/06/22 18:16, Ashutosh Bapat wrote:
On Wed, Jun 22, 2016 at 2:26 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

    I think we could address this in another way once we support
    deparsing subqueries; rewrite the remote query into something that
    wouldn't need the CASE WHEN conversion.  For example, we currently have:

    postgres=# explain verbose select ft2 from ft1 left join ft2 on
    ft1.a = ft2.a;

    QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------------
     Foreign Scan  (cost=100.00..110.04 rows=1 width=32)
       Output: ft2.*
       Relations: (public.ft1) LEFT JOIN (public.ft2)
       Remote SQL: SELECT CASE WHEN r2.* IS NOT NULL THEN ROW(r2.a,
    r2.b) END FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a =
    r2.a))))
    (4 rows)

    However, if we support deparsing subqueries, the remote query in the
    above example could be rewritten into something like this:

    SELECT ss.c2 FROM t1 LEFT JOIN (SELECT t2.a, ROW(a, b) FROM t2)
    ss(c1, c2) ON (t1.a = ss.c1);

    So we would no longer need "CASE WHEN r2.* IS NOT NULL THEN
    ROW(r2.a, r2.b) END" in the target list in the remote query.

Right. Although, it means that the query processor at the other end has
to do extra work for pulling up the subqueries.

Yeah, that's right.  But this approach seems not so ugly...

    For the CASE WHEN conversion for a system column other than ctid, we
    could also address this by replacing the whole-row reference in the
    IS NOT NULL condition in that conversion with the system column
    reference.

That would not work again as the system column reference would make
sense locally but may not be available at the foreign server e.g.
foreign table targeting a view a tableoid is requested.

Maybe I'm confused, but I think that in the system-column case it's the user's responsibility to specify system columns for foreign tables in a local query only when that makes sense on the remote end, as shown in the below counter example:

postgres=# create foreign table fv1 (a int, b int) server myserver options (table_name 'v1');
CREATE FOREIGN TABLE
postgres=# select ctid, * from fv1;
ERROR:  column "ctid" does not exist
CONTEXT:  Remote SQL command: SELECT a, b, ctid FROM public.v1

But a ctid, when available, would rightly get nullified when referenced as a column of table. What happens with the other system columns is we 0 them out locally, whether they are available at the foreign server or not. We never try to check whether they are available at the foreign server or not. If we use such column's column name to decide whether to report 0 or null and if that column is not available at the foreign table, we will get error. I think we should avoid that. Those column anyway do not make any sense.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Next
From: Robert Haas
Date:
Subject: Re: Hash Indexes