FDW for PostgreSQL - Mailing list pgsql-hackers

From Shigeru HANADA
Subject FDW for PostgreSQL
Date
Msg-id 50533E40.5090304@gmail.com
Whole thread Raw
Responses Re: FDW for PostgreSQL
Re: FDW for PostgreSQL
List pgsql-hackers
Hi all,

I'd like to propose FDW for PostgreSQL as a contrib module again.
Attached patch is updated version of the patch proposed in 9.2
development cycle.

For ease of review, I summarized what the patch tries to achieve.

Abstract
========
This patch provides FDW for PostgreSQL which allows users to access
external data stored in remote PostgreSQL via foreign tables.  Of course
external instance can be beyond network.  And I think that this FDW
could be an example of other RDBMS-based FDW, and it would be useful for
proof-of-concept of FDW-related features.

Note that the name has been changed from "pgsql_fdw" which was used in
last proposal, since I got a comment which says that most of existing
FDWs have name "${PRODUCT_NAME}_fdw" so "postgresql_fdw" or
"postgres_fdw" would be better.  For this issue, I posted another patch
which moves existing postgresql_fdw_validator into contrib/dblink with
renaming in order to reserve the name "postgresql_fdw" for this FDW.
Please note that the attached patch requires dblink_fdw_validator.patch
to be applied first.
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00454.php

Query deparser
==============
Now postgresql_fdw has its own SQL query deparser inside, so it's free
from backend's ruleutils module.

This deparser maps object names when generic options below were set.

  nspname of foreign table: used as namespace (schema) of relation
  relname of foreign table: used as relation name
  colname of foreign column: used as column name

This mapping allows flexible schema design.

SELECT optimization
===================
postgresql_fdw always retrieves as much columns as foreign table from
remote to avoid overhead of column mapping.  However, often some of them
(or sometimes all of them) are not used on local side, so postgresql_fdw
uses NULL literal as such unused columns in SELECT clause of remote
query.  For example, let's assume one of pgbench workloads:

    SELECT abalance FROM pgbench_accounts WHERE aid = 1;

This query generates a remote query below.  In addition to bid and
filler, aid is replaced with NULL because it's already evaluated on
remote side.

    SELECT NULL, NULL, abalance, NULL FROM pgbench_accounts
     WHERE (aid OPERATOR(pg_catalog.=) 1);

This trick would improve performance notably by reducing amount of data
to be transferred.

One more example.  Let's assume counting rows.

    SELCT count(*) FROM pgbench_accounts;

This query requires only existence of row, so no actual column reference
is in SELECT clause.

    SELECT NULL, NULL, NULL, NULL FROM pgbench_accounts;

WHERE push down
===============
postgresql_fdw pushes down some of restrictions (IOW, top level elements
in WHERE clause which are connected with AND) which can be evaluated on
remote side safely.  Currently the criteria "safe" is declared as
whether an expression contains only:
  - column reference
  - constant of bult-in type (scalar and array)
  - external parameter of EXECUTE statement
  - built-in operator which uses built-in immutable function
    (operator cannot be collative unless it's "=" or "<>")
  - built-in immutable function

Some other elements might be also safe to be pushed down, but criteria
above seems enough for basic use cases.

Although it might seem odd, but operators are deparsed into OPERATOR
notation to avoid search_path problem.
  E.g.
    local query : WHERE col = 1
    remote query: WHERE (col OPERATOR(pg_catalog.=) 1)

Connection management
=====================
postgresql_fdw has its own connection manager.  Connection is
established when first foreign scan on a server is planned, and it's
pooled in the backend.  If another foreign scan on same server is
invoked, same connection will be used.  Connection pool is per-backend.
 This means that different backends never share connection.

postgresql_fdw_connections view shows active connections, and
postgresql_fdw_disconnect() allows users to discard particular
connection at arbitrary timing.

Transaction management
======================
If multiple foreign tables on same foreign server is used in a local
query, postgresql_fdw uses same connection to retrieve results in a
transaction to make results consistent.  Currently remote transaction is
closed at the end of local query, so following local query might produce
inconsistent result.

Costs estimation
================
To estimate costs and result rows of a foreign scan, postgresql_fdw
executes EXPLAIN statement on remote side, and retrieves costs and rows
values from the result.  For cost estimation, cost of connection
establishment and data transfer are added to the base costs.  Currently
these two factors is hard-coded, but making them configurable is not so
difficult.

Executing EXPLAIN is not cheap, but remote query itself is usually very
expensive, so such additional cost would be acceptable.

ANALYZE support
===============
postgresql_fdw supports ANALYZE to improve selectivity estimation of
filtering done on local side (WHERE clauses which could not been pushed
down.  The sampler function retrieves all rows from remote table and
skip some of them so that result fits requested size.  As same as
file_fdw, postgresql_fdw doesn't care order of result, because it's
important for only correlation, and correlation is important for only
index scans, which is not supported for this FDW.

Fetching Data
=============
postgresql_fdw uses single-row mode of libpq so that memory usage is
kept in low level even if the result is huge.

To cope with difference of encoding, postgresql_fdw automatically sets
client_encoding to server encoding of local database.

Future improvement
==================
I have some ideas for improvement:
  - Provide sorted result path (requires index information?)
  - Provide parameterized path
  - Transaction mapping between local and remotes (2PC)
  - binary transfer (only against servers with same PG major version?)
  - JOIN push-down (requires support by core)

Any comments and questions are welcome.
--
Shigeru HANADA

Attachment

pgsql-hackers by date:

Previous
From: "Devrim GUNDUZ"
Date:
Subject: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.
Next
From: Tom Lane
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.