Re: [v9.5] Custom Plan API - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: [v9.5] Custom Plan API
Date
Msg-id CADyhKSWD0d583rbbjZGZR4XkwLF3tzgEwkaZ_3DVgCaj2n4Zew@mail.gmail.com
Whole thread Raw
In response to [v9.5] Custom Plan API  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: [v9.5] Custom Plan API  (Shigeru Hanada <shigeru.hanada@gmail.com>)
List pgsql-hackers
According to the discussion upthread, I revised the custom-plan patch
to focus on regular relation scan but no join support right now, and to
support DDL command to define custom-plan providers.

Planner integration with custom logic to scan a particular relation is
enough simple, unlike various join cases. It's almost similar to what
built-in logic are doing now - custom-plan provider adds a path node
with its cost estimation if it can offer alternative way to scan referenced
relation. (in case of no idea, it does not need to add any paths)

A new DDL syntax I'd like to propose is below:

  CREATE CUSTOM PLAN <name> FOR <class> PROVIDER <function_name>;

<name> is as literal, put a unique identifier.
<class> is workload type to be offered by this custom-plan provider.
"scan" is the only option right now, that means base relation scan.
<function_name> is also as literal; it shall perform custom-plan provider.

A custom-plan provider function is assumed to take an argument of
"internal" type to deliver a set of planner information that is needed to
construct custom-plan pathnode.
In case of "scan" class, pointer towards an customScanArg object
shall be delivered on invocation of custom-plan provider.

    typedef struct {
        uint32            custom_class;
        PlannerInfo    *root;
        RelOptInfo     *baserel;
        RangeTblEntry  *rte;
    } customScanArg;

In case when the custom-plan provider function being invoked thought
it can offer an alternative scan path on the relation of "baserel", things
to do is (1) construct a CustomPath (or its inherited data type) object
with a table of callback function pointers (2) put its own cost estimation,
and (3) call add_path() to register this path as an alternative one.

Once the custom-path was chosen by query planner, its CreateCustomPlan
callback is called to populate CustomPlan node based on the pathnode.
It also has a table of callback function pointers to handle various planner's
job in setrefs.c and so on.

Similarly, its CreateCustomPlanState callback is called to populate
CustomPlanState node based on the plannode. It also has a table of
callback function pointers to handle various executor's job during quey
execution.

Most of callback designs are not changed from the prior proposition in
v9.4 development cycle, however, here is a few changes.

* CustomPlan became to inherit Scan, and CustomPlanState became to
  inherit ScanState. Because some useful routines to implement scan-
  logic, like ExecScan, expects state-node has ScanState as its base
  type, it's more kindness for extension side. (I'd like to avoid each
  extension reinvent ExecScan by copy & paste!)
  I'm not sure whether it should be a union of Join in the future, however,
  it is a reasonable choice to have compatible layout with Scan/ScanState
  to implement alternative "scan" logic.

* Exporting static functions - I still don't have a graceful answer here.
  However, it is quite natural that extensions to follow up interface updates
  on the future version up of PostgreSQL.
  Probably, it shall become clear what class of functions shall be
  exported and what class of functions shall be re-implemented within
  extension side in the later discussion.
  Right now, I exported minimum ones that are needed to implement
  alternative scan method - contrib/ctidscan module.

Items to be discussed later:
* planner integration for relations join - probably, we may define new
  custom-plan classes as alternative of hash-join, merge-join and
  nest-loop. If core can know this custom-plan is alternative of hash-
  join, we can utilize core code to check legality of join.
* generic key-value style options in custom-plan definition - Hanada
  san proposed me off-list - like foreign data wrapper. It may enable
  to configure multiple behavior on a binary.
* ownership and access control of custom-plan. right now, only
  superuser can create/drop custom-plan provider definition, thus,
  it has no explicit ownership and access control. It seems to me
  a reasonable assumption, however, we may have a usecase that
  needs custom-plan by unprivileged users.

Thanks,

2014-05-12 10:09 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> On 8 May 2014 22:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> >> We're past the prototyping stage and into productionising what we
>> >> know works, AFAIK. If that point is not clear, then we need to
>> >> discuss that first.
>> >
>> > OK, I'll bite: what here do we know works?  Not a damn thing AFAICS;
>> > it's all speculation that certain hooks might be useful, and
>> > speculation that's not supported by a lot of evidence.  If you think
>> > this isn't prototyping, I wonder what you think *is* prototyping.
>>
>> My research contacts advise me of this recent work
>>   http://www.ntu.edu.sg/home/bshe/hashjoinonapu_vldb13.pdf
>> and also that they expect a prototype to be ready by October, which I have
>> been told will be open source.
>>
>> So there are at least two groups looking at this as a serious option for
>> Postgres (not including the above paper's authors).
>>
>> That isn't *now*, but it is at least a time scale that fits with acting
>> on this in the next release, if we can separate out the various ideas and
>> agree we wish to proceed.
>>
>> I'll submerge again...
>>
> Through the discussion last week, our minimum consensus are:
> 1. Deregulated enhancement of FDW is not a way to go
> 2. Custom-path that can replace built-in scan makes sense as a first step
>    towards the future enhancement. Its planner integration is enough simple
>    to do.
> 3. Custom-path that can replace built-in join takes investigation how to
>    integrate existing planner structure, to avoid (3a) reinvention of
>    whole of join handling in extension side, and (3b) unnecessary extension
>    calls towards the case obviously unsupported.
>
> So, I'd like to start the (2) portion towards the upcoming 1st commit-fest
> on the v9.5 development cycle. Also, we will be able to have discussion
> for the (3) portion concurrently, probably, towards 2nd commit-fest.
>
> Unfortunately, I cannot participate PGcon/Ottawa this year. Please share
> us the face-to-face discussion here.
>
> Thanks,
> --
> NEC OSS Promotion Center / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
--
KaiGai Kohei <kaigai@kaigai.gr.jp>

Attachment

pgsql-hackers by date:

Previous
From: Petr Jelinek
Date:
Subject: Set new system identifier using pg_resetxlog
Next
From: Heikki Linnakangas
Date:
Subject: Re: WAL format and API changes (9.5)