Re: pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers

From Shigeru Hanada
Subject Re: pgsql_fdw, FDW for PostgreSQL server
Date
Msg-id 4EE8AC74.3020200@gmail.com
Whole thread Raw
In response to Re: pgsql_fdw, FDW for PostgreSQL server  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgsql_fdw, FDW for PostgreSQL server
Re: pgsql_fdw, FDW for PostgreSQL server
List pgsql-hackers
(2011/12/13 14:46), Tom Lane wrote:
> Shigeru Hanada<shigeru.hanada@gmail.com>  writes:
>> Agreed.  How about to add a per-column boolean FDW option, say
>> "pushdown", to pgsql_fdw?  Users can tell pgsql_fdw that the column can
>> be pushed down safely by setting this option to true.
>
> [ itch... ] That doesn't seem like the right level of granularity.
> ISTM the problem is with whether specific operators have the same
> meaning at the far end as they do locally.  If you try to attach the
> flag to columns, you have to promise that *every* operator on that
> column means what it does locally, which is likely to not be the
> case ever if you look hard enough.  Plus, having to set the flag on
> each individual column of the same datatype seems pretty tedious.

Indeed, I too think that labeling on each columns is not the best way,
but at that time I thought that it's a practical way, in a way.  IOW, I
chose per-column FDW options as a compromise between never-push-down and
indiscriminate-push-down.

Anyway, ISTM that we should consider various mapping for
functions, operators and collations to support push-down in general
way, but it would be hard to accomplish in this CF.

Here I'd like to propose three incremental patches:

1) fdw_helper_funcs_v3.patch:  This is not specific to pgsql_fdw, but
probably useful for every FDWs which use FDW options.  This patch
provides some functions which help retrieving FDW options from catalogs.
 This patch also enhances document about existing FDW helper functions.

2) pgsql_fdw_v5.patch:  This patch provides simple pgsql_fdw
which does *NOT* support any push-down.  All data in remote table are
retrieved for each foreign scan, and conditions are always evaluated on
local side.  This is safe about semantics difference between local and
remote, but very inefficient especially for large remote tables.

3) pgsql_fdw_pushdown_v1.patch:  This patch adds limited push-down
capability to pgsql_fdw which is implemented by previous patch.  The
criteria for pushing down is little complex.  I modified pgsql_fdw to
*NOT* push down conditions which contain any of:

  a) expression whose result collation is valid
  b) expression whose input collation is valid
  c) expression whose result type is user-defined
  d) expression which uses user-defined function
  e) array expression whose elements has user-defined type
  f) expression which uses user-defined operator
  g) expression which uses mutable function

As the result, pgsql_fdw can push down very limited conditions such as
numeric comparisons, but it would be still useful.  I hope that these
restriction are enough to avoid problems about semantics difference
between remote and local.

To implement d), I added exprFunction to nodefuncs.c which returns Oid
of function which is used in the expression node, but I'm not sure that
it should be there.  Should we have it inside pgsql_fdw?

I'd like to thank everyone who commented on this topic!

Regards,
--
Shigeru Hanada

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: COUNT(*) and index-only scans
Next
From: Alvaro Herrera
Date:
Subject: Re: Command Triggers