Re: Custom Scan APIs (Re: Custom Plan node) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Custom Scan APIs (Re: Custom Plan node)
Date
Msg-id 17716.1394047350@sss.pgh.pa.us
Whole thread Raw
In response to Re: Custom Scan APIs (Re: Custom Plan node)  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: Custom Scan APIs (Re: Custom Plan node)
List pgsql-hackers
Kouhei Kaigai <kaigai@ak.jp.nec.com> writes:
>> * Please drop the whole register_custom_provider/get_custom_provider API.

> One thing I was worrying about is how copyObject() and nodeToString() support
> set of function pointer tables around custom-scan node, however, you suggested
> to change the assumption here. So, I think these functions become unnecessary.

If we allow the extension to control copyObject behavior, it can do what
it likes with the function-struct pointer.  I think the typical case would
be that it's a simple pointer to a never-copied static constant.  But you
could imagine that it's a pointer to a struct embedded further down in the
custom path or plan node, if the extension wants different functions for
different plans.

If we had to support stringToNode() for paths or plans, things would get
much more complicated, but we don't (and there are already lots of other
things that would be difficult for that).

> The reason why CustomScan is derived from Scan is, some of backend code
> wants to know rtindex of the relation to be referenced by this CustomScan.
> The scanrelid of Scan is used in three points: nodeCustom.c, setrefs.c and
> explain.c. The usage in nodeCustom.c is just for service routines, and the
> usage in setrefs.c can be moved to the extension according to your suggestion.
> We need to investigate the usage in explain.c; ExplainPreScanNode() walks
> around the nodes to collect relids referenced in this query. If we don't
> want to put a callback for this specific usage, it is a reasonable choice
> to show the backend the associated scanrelid of CustomScan.

I think we have to add another callback for that :-(.  It's a pain since
it's such a trivial point; but the existing code cannot support a custom
node referencing more than one RTE, which seems possible for join or
append type cases.

> Probably, the hunk around show_customscan_info() call can be entirely moved
> to the extension side. If so, I want ExplainNode() being an extern function,
> because it allows extensions to print underlying plan-nodes.

I haven't looked at what explain.c would have to expose to make this
workable, but yeah, we will probably have to export a few things.

> Are you saying the hard-wired portion in setrefs.c can be moved to the
> extension side? If fix_scan_expr() become extern function, I think it
> is feasible.

My recollection is that fix_scan_expr() is a bit specialized.  Not sure
exactly what we'd have to export there --- but we'd have to do it anyway.
What you had in the patch was a hook that could be called, but no way
for it to do what it would likely need to do.

>> * Get rid of the CUSTOM_VAR business, too (including custom_tupdesc).

> In case of Var-node that references joined-relations, it shall be replaced to
> either INNER_VAR or OUTER_VAR according the location of underlying
> relations. It eventually makes ExecEvalScalarVar() to reference either
> ecxt_innertuple or ecxt_outertuple, however, it is problematic if we already
> consolidated tuples come from the both side into one.

So?  If you did that, then you wouldn't have renumbered the Vars as
INNER/OUTER.  I don't believe that CUSTOM_VAR is necessary at all;
if it is needed, then there would also be a need for an additional
tuple slot in executor contexts, which you haven't provided.

> For example, the enhanced postgres_fdw fetches the result set of remote
> join query, thus a tuple contains the fields come from both side.
> In this case, what varno shall be suitable to put?

That would be a scan situation, and the vars could reference the scan
tuple slot.  Which in fact was the implementation you were using, so
how is CUSTOM_VAR adding anything?

>> * Why is there more than one call site for add_scan_path_hook?  I don't
>> see the need for the calling macro with the randomly inconsistent name,
>> either.

> Where is the best place to do? Even though I cannot imagine the situation
> to run sub-query or cte by extensions, its path is constructed during
> set_rel_size(), so I had to put the hook for each set_xxxx_pathlist()
> functions.

Hm.  We could still call the hook in set_rel_pathlist, if we were to
get rid of the individual calls to set_cheapest and do that in one
spot at the bottom of set_rel_pathlist (after the hook call).  Calling
set_cheapest in one place seems more consistent anyway.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [BUGS] BUG #9223: plperlu result memory leak
Next
From: Alvaro Herrera
Date:
Subject: Re: [BUG FIX] Compare returned value by socket() against PGINVALID_SOCKET instead of < 0