pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers
From | Shigeru Hanada |
---|---|
Subject | pgsql_fdw, FDW for PostgreSQL server |
Date | |
Msg-id | 4EA67D24.2030707@gmail.com Whole thread Raw |
Responses |
Re: pgsql_fdw, FDW for PostgreSQL server
Re: pgsql_fdw, FDW for PostgreSQL server |
List | pgsql-hackers |
I'd like to propose pgsql_fdw, FDW for external PostgreSQL server, as a contrib module. I think that this module would be the basis of further SQL/MED development for core, e.g. join-push-down and ANALYZE support. I attached three patches for this new FDW. They should be applied in the order below. I separated these patches so that first (or first two) can be committed separately. * fdw_helper_doc.patch provides documents for FDW developers about helper functions existing in 9.1, so this can be back-patched. * fdw_helper_funcs.patch provides additional helper functions which would make manipulation of FDW options easier. * pgsql_fdw.patch provides new FDW for external PG server. Here are details of pgsql_fdw. Name of the wrapper =================== I used the name "pgsql_fdw" for the wrapper and its derivatives. I think it would be better to leave contrib/dblink and built-in postgresql_fdw_validator for backward compatibility, and use new name for new wrapper. Or, it might be OK to rename postgresql_fdw_validator to dblink_validator or something, or fix dblink to use validator of new wrapper. I'm not sure that dblink should be alone or integrated with pgsql_fdw... Connection management ===================== The pgsql_fdw establishes a new connection when a foreign server is accessed first for the local session. Established connection is shared between all foreign scans in the local query, and shared between even scans in following queries. Connections are discarded when the current transaction aborts so that unexpected failure won't cause connection leak. This is implemented with resource owner mechanism. User can see active connections via pgsql_fdw_connections view, and discard arbitrary connection via pgsql_fdw_disconnect() function. These can be done from only same local session. If local role has changed via SET ROLE or SET SESSION AUTHENTICATION, pgsql_fdw ignores old role's connections and looks up appropriate connection for the new role from the pool. If there wasn't suitable one, pgsql_fdw establishes new connection. When local role has changed to old role again, pooled connection will be used again. Unlike contrib/dblink, one foreign server can have only one connection at a time for one local role. This is because pgsql_fdw doesn't support named connections. Cost estimation =============== The pgsql_fdw executes an EXPLAIN command on remote side for each PlanForeignScan call. Returned costs and rows are used as local estimation for the Path with adding connection costs and data transfer costs. SELECT optimization =================== To reduce amount of data transferred from remote server, references to unnecessary columns are replaced with NULL literal in remote query. WHERE clause push-down ====================== Some kind of qualifiers in WHERE clause are pushed down to remote server so that the query result can be reduced. Currently qualifiers which include any volatile or stable element can't be pushed down. Even with these limitations, most qualifiers would be pushed down in usual cases. Cursor mode =========== The pgsql_fdw switches the way to retrieve result records according to estimated result rows; use simple SELECT for small result, and use cursor with DECLARE/FETCH statements for large result. The threshold is default to 1000, and configurable with FDW option "min_cursor_rows". In cursor mode, number of rows fetched at once can be controlled by FDW option "fetch_count". EXPLAIN output ============== The pgsql_fdw shows a remote query used for each foreign scan node in the output of EXPLAIN command with title "Remote SQL". If pgsql_fdw decided to use cursor for the scan, DECLARE statement is shown. Regards, -- Shigeru Hanada
Attachment
pgsql-hackers by date: