Re: Join push-down support for foreign tables - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Join push-down support for foreign tables
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F8010BA850@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: Join push-down support for foreign tables  (Thom Brown <thom@linux.com>)
List pgsql-hackers
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Thom Brown
> Sent: Monday, March 02, 2015 11:36 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Shigeru Hanada; Robert Haas; PostgreSQL-development
> Subject: Re: [HACKERS] Join push-down support for foreign tables
> 
> On 2 March 2015 at 14:07, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> 
> 
>     > I seem to be getting a problem with whole-row references:
>     >
>     > # SELECT p.name, c.country, e.pet_name, p FROM pets e INNER JOIN people
> p on
>     > e.person_id = p.id inner join countries c on p.country_id = c.id;
>     > ERROR:  table "r" has 3 columns available but 4 columns specified
>     > CONTEXT:  Remote SQL command: SELECT r.a_0, r.a_1, r.a_2, l.a_1 FROM
> (SELECT id,
>     > country FROM public.countries) l (a_0, a_1) INNER JOIN (SELECT id, name,
>     > country_id FROM public.people) r (a_0, a_1, a_2, a_3)  ON ((r.a_3 =
> l.a_0))
>     >
>     In this case, the 4th target-entry should be "l", not l.a_1.
> 
> This will no doubt be my naivety talking, but if we know we need the whole row,
> can we not request the row without additionally requesting individual columns?
>
I doubt whether local construction of whole-row reference is more efficient
than redundant copy. Tuple deform/projection is not a work we can ignore its
cost from my experience. Even if redundant column is copied over the network,
local CPU can skip tuple modification to fit remote query results for local
expectation. (NOTE: FDW driver is responsible to return a tuple according to
the tts_tupleDescriptor, so we don't need to transform it if target-list of
remote query is identical.)

>     > And the error message could be somewhat confusing.  This mentions table
> "r", but
>     > there's no such table or alias in my actual query.
>     >
>     However, do we have a mechanical/simple way to distinguish the cases when
>     we need relation alias from the case when we don't need it?
>     Like a self-join cases, we has to construct a remote query even if same
>     table is referenced multiple times in a query. Do you have a good idea?
> 
> Then perhaps all that's really needed here is to clarify that the error pertains
> to the remote execution plan rather than the query crafted by the user.  Or maybe
> I'm nitpicking.
>
I could understand your concern about this version. However, it is a cosmetic
feature that we can fix up later, and what we should focus on at this moment
is to ensure the design concept; that run foreign/custom-scan instead of built-
in join node and they performs as like a usual scan on materialized relations.
So, it is not a good idea to make Hanada-san improve this feature _at this moment_.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why are json <=> jsonb casts marked as explicit-only?
Next
From: Alvaro Herrera
Date:
Subject: Re: alter user/role CURRENT_USER