Re: pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers

From Shigeru HANADA
Subject Re: pgsql_fdw, FDW for PostgreSQL server
Date
Msg-id 4F703AEC.5000100@gmail.com
Whole thread Raw
In response to Re: pgsql_fdw, FDW for PostgreSQL server  (Shigeru HANADA <shigeru.hanada@gmail.com>)
Responses Re: pgsql_fdw, FDW for PostgreSQL server  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Re: pgsql_fdw, FDW for PostgreSQL server  (Thom Brown <thom@linux.com>)
List pgsql-hackers
(2012/03/15 23:06), Shigeru HANADA wrote:
> Although the patches are still WIP, especially in WHERE push-down part,
> but I'd like to post them so that I can get feedback of the design as
> soon as possible.

I've implemented pgsql_fdw's own deparser and enhanced some features
since last post.  Please apply attached patches in the order below:

* pgsql_fdw_v17.patch
    - Adds pgsql_fdw as contrib module
* pgsql_fdw_pushdown_v10.patch
    - Adds WHERE push down capability to pgsql_fdw
* pgsql_fdw_analyze_v1.patch
    - Adds pgsql_fdw_analyze function for updating local stats

Changes from previous version
=============================

1) Don't use remote EXPLAIN for cost/rows estimation, so now planner
estimates result rows and costs on the basis of local statistics such as
pg_class and pg_statistic.  To update local statistics, I added
pgsql_fdw_analyze() SQL function which updates local statistics of a
foreign table by retrieving remote statistics, such as pg_class and
pg_statistic, via libpq.  This would make the planning of pgsql_fdw
simple and fast.  This function can be easily modified to handle ANALYZE
command invoked for a foreign table (Fujita-san is proposing this as
common feature in another thread).

2) Defer planning stuffs as long as possible to clarify the role of each
function.  Currently GetRelSize just estimates result rows from local
statistics, and GetPaths adds only one path which represents SeqScan on
remote side.  As result of this change, PgsqlFdwPlanState struct is
obsolete.

3) Implement pgsql_fdw's own deparser which pushes down collation-free
and immutable expressions in local WHERE clause.  This means that most
of numeric conditions can be pushed down, but conditions using character
types are not.

Most of nodes are deparsed in straightforward way, but OpExpr is not.
OpExpr is deparsed with OPERATOR() notation to specify operator's
schema explicitly.  This would prevent us from possible search_path problem.

    [local query]
    WHERE -col = -1
    [remote query]
    WHERE ((OPERATOR(pg_class.-) col) OPERATOR(pg_class.=) 1)

4) Pushed down quals are not evaluated on local side again.  When
creating ForeignScan node for chosen best path, pushed down expressions
are removed from "qpqual" parameter of make_foreignscan, so a qualifier
is evaluated only once at local or remote.

5) EXPLAIN on pgsql_fdw foreign tables show simple SELECT statement.
DECLARE statement including cursor name is still available in VERBOSE
mode.  (I feel that showing DECLARE always is little noisy...)

Regards,
--
Shigeru HANADA

Attachment

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Speed dblink using alternate libpq tuple storage
Next
From: Robert Haas
Date:
Subject: Re: Command Triggers, v16