> On Thu, Sep 17, 2015 at 9:01 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > I entirely agree with the idea of plan-node identifier, however,
> > uncertain whether the node-id shall represent physical location on
> > the dynamic shared memory segment, because
> > (1) Relatively smaller number of node type needs shared state,
> > thus most of array items are empty.
> > (2) Extension that tries to modify plan-tree using planner_hook
> > may need to adjust node-id also.
> >
> > Even though shm_toc_lookup() has to walk on the toc entries to find
> > out the node-id, it happens at once on beginning of the executor at
> > background worker side. I don't think it makes a significant problem.
>
> Yes, I was thinking that what would make sense is to have each
> parallel-aware node call shm_toc_insert() using its ID as the key.
> Then, we also need Instrumentation nodes. For those, I thought we
> could use some fixed, high-numbered key, and Tom's idea.
>
Hmm, indeed, run-time statistics are needed for every node.
If the array indexed by node-id were a hash table instead, we could
handle non-contiguous node-ids without trouble.
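To illustrate what I mean by a hash slot, here is a minimal standalone
sketch. It is not PostgreSQL code: NodeStatSlot, NodeStatTable,
node_stat_init() and node_stat_slot() are hypothetical names, the
counter is a stand-in for the real Instrumentation data, and
SLOT_COUNT is an arbitrary size. It assumes slots are claimed by a
single process during setup, so it does no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT     16      /* hypothetical size; must be a power of two */
#define EMPTY_NODE_ID  (-1)    /* marks a slot that no node has claimed */

/* Hypothetical per-node statistics; a stand-in for Instrumentation. */
typedef struct NodeStatSlot
{
    int         node_id;       /* plan node id, or EMPTY_NODE_ID */
    uint64_t    ntuples;       /* example counter */
} NodeStatSlot;

typedef struct NodeStatTable
{
    NodeStatSlot slots[SLOT_COUNT];
} NodeStatTable;

/* Mark every slot as unused. */
static void
node_stat_init(NodeStatTable *tab)
{
    for (size_t i = 0; i < SLOT_COUNT; i++)
    {
        tab->slots[i].node_id = EMPTY_NODE_ID;
        tab->slots[i].ntuples = 0;
    }
}

/*
 * Find the slot for node_id, claiming an empty one on first use.
 * Collisions are resolved by linear probing. Returns NULL if the
 * table is full.
 */
static NodeStatSlot *
node_stat_slot(NodeStatTable *tab, int node_id)
{
    size_t      start;

    assert(node_id >= 0);
    start = (size_t) node_id & (SLOT_COUNT - 1);

    for (size_t i = 0; i < SLOT_COUNT; i++)
    {
        NodeStatSlot *slot = &tab->slots[(start + i) & (SLOT_COUNT - 1)];

        if (slot->node_id == node_id)
            return slot;
        if (slot->node_id == EMPTY_NODE_ID)
        {
            slot->node_id = node_id;
            return slot;
        }
    }
    return NULL;
}
```

With this layout, node-ids such as 3, 19 and 1000 can coexist in a
16-slot table, so gaps left by nodes that need no shared state (or ids
reassigned by an extension) cost nothing.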
> Are there extensions that use planner_hook to do surgery on the plan
> tree? What do they do, exactly?
>
(Even though it will not work under Funnel,) PG-Strom often injects
a preprocessor node under the Agg node to perform partial aggregation,
which reduces the number of rows the CPU has to process.
Also, I have seen a paper published by Fujitsu folks. Their module
modifies the plan tree to replace the built-in scan node with their own
columnar storage scan node. http://db-event.jpn.org/deim2015/paper/195.pdf
The paper is written in Japanese; however, Figure 3 on page 4 shows
what I explained above.
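For reference, the kind of surgery I have in mind looks roughly like
the toy sketch below. It is not PG-Strom or PostgreSQL code: ToyPlan,
ToyNodeTag and toy_inject_preagg() are hypothetical, and the tree is
reduced to a single lefttree chain. The point is only that the injected
node needs a node-id of its own, which the extension has to assign.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node tags for a toy plan tree. */
typedef enum ToyNodeTag
{
    TOY_SEQSCAN,
    TOY_AGG,
    TOY_PREAGG                  /* partial-aggregation node added by an extension */
} ToyNodeTag;

typedef struct ToyPlan
{
    ToyNodeTag  tag;
    int         plan_node_id;   /* identifier discussed in this thread */
    struct ToyPlan *lefttree;   /* single child, for simplicity */
} ToyPlan;

/*
 * Insert "preagg" (caller-provided storage) between the first Agg node
 * found and its child, giving it the id "next_id". Returns the next
 * unused id, which is unchanged if no Agg node with a child was found.
 */
static int
toy_inject_preagg(ToyPlan *root, ToyPlan *preagg, int next_id)
{
    for (ToyPlan *node = root; node != NULL; node = node->lefttree)
    {
        if (node->tag == TOY_AGG && node->lefttree != NULL)
        {
            preagg->tag = TOY_PREAGG;
            preagg->plan_node_id = next_id;
            preagg->lefttree = node->lefttree;
            node->lefttree = preagg;
            return next_id + 1;
        }
    }
    return next_id;
}
```

Starting from Agg(id 0) over SeqScan(id 1), calling
toy_inject_preagg(&agg, &preagg, 2) yields Agg(0) -> PreAgg(2) ->
SeqScan(1), so the ids no longer follow tree order.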
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>