Re: review: FDW API - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: review: FDW API
Date
Msg-id 4D3D79BB.6070603@enterprisedb.com
Whole thread Raw
In response to Re: review: FDW API  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: review: FDW API  (Robert Haas <robertmhaas@gmail.com>)
Re: review: FDW API  (Robert Haas <robertmhaas@gmail.com>)
Re: review: FDW API  (Shigeru HANADA <hanada@metrosystems.co.jp>)
List pgsql-hackers
On 21.01.2011 17:57, Robert Haas wrote:
> On Fri, Jan 21, 2011 at 10:17 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> On 18.01.2011 17:26, Shigeru HANADA wrote:
>>>
>>> 1) 20110118-no_fdw_perm_check.patch - This patch is not included in
>>> last post.  This had been proposed on 2011-01-05 first, but maybe has
>>> not been reviewd yet.  I re-propose this patch for SQL standard
>>> conformance.  This patch removes permission check that requires USAGE
>>> on the foreign-data wrapper at CREATE FOREIGN TABLE.
>>> Please see original post for details.
>>>
>>> http://archives.postgresql.org/message-id/20110105145206.30FD.6989961C@metrosystems.co.jp
>>
>> Committed this part.
>
> How much review have you done of parts (3) and (4)?  The key issue for
> all of the FDW work in progress seems to be what the handler API is
> going to look like, and so once we get that committed it will unblock
> a lot of other things.

I've gone through the code in a bit more detail now. I did a bunch of
cosmetic changes along the way, patch attached. I also added a few
paragraphs in the docs. We need more extensive documentation, but this
at least marks the places where I think the docs need to go.

Comments:

* How can a FDW fetch only the columns required by the scan? The file
FDW has to read the whole file anyhow, but it could perhaps skip calling
the input function for unnecessary columns. But more importantly, with
something like the postgresql_fdw you don't want to fetch any extra
columns across the wire. I gather the way to do it is to copy
RelOptInfo->attr_needed to private storage at plan stage, and fill the
not-needed attributes with NULLs in Iterate. That gets a bit awkward,
you need to transform attr_needed to something that can be copied for
starters. Or is that information somehow available at execution phase
otherwise?

* I think we need something in RelOptInfo to mark foreign tables. At the
moment, you need to call IsForeignTable() which does a catalog lookup.
Maybe a new RTEKind, or a boolean flag.

* Can/should we make ReScan optional? Could the executor just do
EndScan+BeginScan if there's no ReScan function?

* Is there any point in allowing a FDW without a handler? It's totally
useless, isn't it? We had the CREATE FOREIGN DATA WRAPPER syntax in
previous versions, and it allowed it, but it has always been totally
useless so I don't think we need to worry much about
backwards-compatibility here.

* Is there any use case for changing the handler or validator function
of an existign FDW with ALTER? To me it just seems like an unnecessary
complication.

* IMHO the "FDW-info" should always be displayed, without VERBOSE. In my
experience with another DBMS that had this feature, the SQL being sent
to the remote server was almost always the key piece of information that
I was looking for in the query plans.

* this check in expand_inherited_rtentry seems misplaced:

>         /*
>          * SELECT FOR UPDATE/SHARE is not allowed on foreign tables because
>          * they are read-only.
>          */
>         if (newrelation->rd_rel->relkind == RELKIND_FOREIGN_TABLE &&
>             lockmode != AccessShareLock)
>             ereport(ERROR,
>                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                      errmsg("SELECT FOR UPDATE/SHARE is not allowed with foreign tables")));

I don't understand why we'd need to do that for inherited tables in
particular. And it's not working for regular non-inherited foreign tables:

postgres=# SELECT * FROM filetbl2 FOR UPDATE;
ERROR:  could not open file "base/11933/16397": No such file or directory

* Need to document how the FDW interacts with transaction
commit/rollback. In particular, I believe EndScan is never called if the
transaction is aborted. That needs to be noted explicitly, and need to
suggest how to clean up any external resources in that case. For
example, postgresql_fdw will need to close any open cursors or result sets.

In general, I think the design is sound. What we need is more
documentation. It'd also be nice to see the postgresql_fdw brought back
to shape so that it works against this latest version of the api patch.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Vangelis Katsikaros
Date:
Subject: gist README
Next
From: Robert Haas
Date:
Subject: Re: review: FDW API