Re: [v9.3] writable foreign tables - Mailing list pgsql-hackers

From Albe Laurenz
Subject Re: [v9.3] writable foreign tables
Date
Msg-id D960CB61B694CF459DCFB4B0128514C208B87FFC@exadv11.host.magwien.gv.at
Whole thread Raw
In response to Re: [v9.3] writable foreign tables  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: [v9.3] writable foreign tables  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
List pgsql-hackers
Kohei KaiGai wrote:
>> I am not so happy with GetForeignRelInfo:
>> - The name seems ill-chosen from the FDW API side.
>>   I guess that you chose the name because the function
>>   is called from get_relation_info, but I think the name
>>   should be more descriptive for the FDW API.
>>   Something like PlanForeignRelRowid.
>
> Indeed, GetForeignRelInfo might give misleading impression
> as if this routine collects widespread information about
> target relation. So, how about GetForeignRelWidth() instead?

That would be better for the function as it is now.

>> - I guess that every FDW that only needs "rowid" will
>>   do exactly the same as your fileGetForeignRelInfo.
>>   Why can't that be done in core?
>>   The function could pass an AttrNumber for the rowid
>>   to the FDW, and will receive a boolean return code
>>   depending on whether the FDW plans to use rowid or not.
>>   That would be more convenient for FDW authors.
>
> This design tries to kill two-birds with one-stone.
> It enables to add multiple number of pseudo-columns,
> not only "rowid", and makes possible to push-down
> complex calculation of target list into external computing
> resource.
>
> For example, when user gives the following query:
>
>   SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>
> it contains a complex calculation in the target-list,
> thus, it also takes CPU cycles of local process.
>
> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
> a reference to a pseudo-column that also references
> the calculation result on external node, it effectively
> off-load CPU cycles.
>
> In this case, all we need to do is (1) acquire a slot
> for pseudo-column at GetForeignRelInfo (2) replace
> TargetEntry::expr by Var node that reference this
> pseudo-column.
>
> It makes sense for performance optimization, so I don't
> want to restrict this handler for "rowid" only.

I understand.

But I think that you still can do that with the change that
I suggest.  I suggest that GetForeignRelInfo (or whatever the
name ends up being) gets the AttrNumber of the proposed "rowid"
column in addition to the parameters you need for what
you want to do.

Then nothing would keep you from defining those
pseudo-columns.  But all the setup necessary for the "rowid"
column could be moved out of the FDW.  So for the 99% of all
FDW which are only interested in the rowid, things would
get much simpler and they don't all have to implement the
same code.

Did I make clear what I mean?
Would that be difficult?

>> - I guess the order is dictated by planner steps, but
>>   it would be "nice to have" if GetForeignRelInfo were
>>   not the first function to be called during planning.
>>   That would make it easier for a FDW to support both
>>   9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
>>   will probably have to be created in the first API
>>   function.
>
> The baserel->fdw_private should be initialized to NULL,
> so it can perform as a mark whether private data is already
> constructed, or not.

Right, if that pointer is pre-initialized to NULL, that
should work.  Forget my quibble.

> In addition, I noticed my patch didn't update documentation stuff.
> I also add mention about new handlers.

I didn't get into documentation, comment and spelling issues since
the patch was still called POC, but yes, eventually that would
be necessary.

Yours,
Laurenz Albe



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: gset updated patch
Next
From: Amit kapila
Date:
Subject: Re: [WIP PATCH] for Performance Improvement in Buffer Management