Re: [v9.3] writable foreign tables - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: [v9.3] writable foreign tables
Date
Msg-id CADyhKSVPY8imSVo9frV8wm2v5gh+e_Ddq-zWpNd7Y1dPezH1LA@mail.gmail.com
Whole thread Raw
In response to Re: [v9.3] writable foreign tables  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Responses Re: [v9.3] writable foreign tables  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
List pgsql-hackers
2012/11/16 Albe Laurenz <laurenz.albe@wien.gv.at>:
> Kohei KaiGai wrote:
>> The attached patch is just a refreshed version for clean applying to
>> the latest tree.
>>
>> As previous version doing, it makes pseudo enhancement on file_fdw
>> to print something about the supplied tuple on INSERT, UPDATE and
>> DELETE statement.
>
> Basics:
> -------
>
> The patch applies cleanly, compiles without warnings and passes
> regression tests.
>
> I think that the functionality is highly desirable; judging from
> the number of talks at pgConf EU about SQL/MED this is a hot
> topic, and further development is welcome.
>
> Testing the functionality:
> --------------------------
>
> I ran a few queries with the file_fdw and found this:
>
> $ cat flatfile
> 1,Laurenz,1968-10-20
> 2,Renée,1975-09-03
> 3,Caroline,2009-01-26
> 4,Ray,2011-03-09
> 5,Stephan,2011-03-09
>
> CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;
>
> CREATE FOREIGN TABLE flat(
>    id integer not null,
>    name varchar(20) not null,
>    birthday date not null
> ) SERVER file
>   OPTIONS (filename 'flatfile', format 'csv');
>
> UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
> ERROR:  bitmapset is empty
>
Hmm... I'll try to investigate the behavior.

> About the code:
> ---------------
>
> I am not so happy with GetForeignRelInfo:
> - The name seems ill-chosen from the FDW API side.
>   I guess that you chose the name because the function
>   is called from get_relation_info, but I think the name
>   should be more descriptive for the FDW API.
>   Something like PlanForeignRelRowid.
>
Indeed, GetForeignRelInfo might give misleading impression
as if this routine collects widespread information about
target relation. So, how about GetForeignRelWidth() instead?

> - I guess that every FDW that only needs "rowid" will
>   do exactly the same as your fileGetForeignRelInfo.
>   Why can't that be done in core?
>   The function could pass an AttrNumber for the rowid
>   to the FDW, and will receive a boolean return code
>   depending on whether the FDW plans to use rowid or not.
>   That would be more convenient for FDW authors.
>
This design tries to kill two-birds with one-stone.
It enables to add multiple number of pseudo-columns,
not only "rowid", and makes possible to push-down
complex calculation of target list into external computing
resource.

For example, when user gives the following query:
 SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

it contains a complex calculation in the target-list,
thus, it also takes CPU cycles of local process.

If we can replace the "((c1 - c2) * (c2 - c3))^2" by
a reference to a pseudo-column that also references
the calculation result on external node, it effectively
off-load CPU cycles.

In this case, all we need to do is (1) acquire a slot
for pseudo-column at GetForeignRelInfo (2) replace
TargetEntry::expr by Var node that reference this
pseudo-column.

It makes sense for performance optimization, so I don't
want to restrict this handler for "rowid" only.

> - I guess the order is dictated by planner steps, but
>   it would be "nice to have" if GetForeignRelInfo were
>   not the first function to be called during planning.
>   That would make it easier for a FDW to support both
>   9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
>   will probably have to be created in the first API
>   function.
>
The baserel->fdw_private should be initialized to NULL,
so it can perform as a mark whether private data is already
constructed, or not.

> I also wonder why you changed the signature of functions in
> trigger.c.  It changes an exposed API unnecessarily, unless
> you plan to have trigger support for foreign tables.
> That is currently not supported, and I think that such
> changes should be part of the present patch.
>
You are right. It might be unneeded change right now.
I'll revert it.

> Other than that, I like the patch.
> I cannot give an educated opinion if the changes to planner
> and executor are well done or not.
>
> Other comments:
> ---------------
>
> I hope I find time to implement the API in oracle_fdw to
> be able to test it more thoroughly.
>
> There is an interdependence with the "FDW for PostgreSQL" patch
> in the commitfest.  If both these patches get committed, the
> FDW should be extended to support the new API.  If nothing else,
> this would greatly improve the ability to test the new API
> and find out if it is well designed.
>
It is the reason why I'd like to volunteer to review Hanada-san's patch.
I'll spent my time for reviewing his patch on this weekend.

>> Here is one other idea. My GPU acceleration module (PG-Strom)
>> implements column-oriented data store underlying foreign table.
>> It might make sense to cut out this portion for proof-of-concept of
>> writable foreign tables.
>>
>> Any ideas?
>
> The best would be to have a patch on top of the PostgreSQL FDW
> to be able to test.  It need not be perfect.
>
OK,

In addition, I noticed my patch didn't update documentation stuff.
I also add mention about new handlers.

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>



pgsql-hackers by date:

Previous
From: Atri Sharma
Date:
Subject: Re: WIP patch for hint bit i/o mitigation
Next
From: Antonin Houska
Date:
Subject: Re: Materialized views WIP patch