Re: numbering plan nodes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: numbering plan nodes
Date
Msg-id CA+Tgmoa7NwARTAtm=7vVthp=zNbY5m-jBNttfvu_qcm1Hg3nZA@mail.gmail.com
Whole thread Raw
In response to Re: numbering plan nodes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: numbering plan nodes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Thanks much for the quick reply.

On Thu, Sep 17, 2015 at 2:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It would be nice if we could assign the integer IDs with no gaps, and
>> with the IDs of a node's children immediately following that of their
>> parent.
>
> I do not think this should be a goal, either explicitly or implicitly.
> For one thing, it will be very much in the eye of individual beholders
> which nodes are children-that-should-be-consecutive.  What about
> initplans, subplans, grandchild nodes?  Different use cases could easily
> have different opinions about what to do with those.

I was aiming for: the IDs of the descendents of any given node are
consecutive and have IDs which follow that's node ID, in any old order
you like.

>> The advantage of this is that if you want to have a data
>> structure for every node in the tree passed to some worker - like a
>> struct Instrumentation in dynamic shared memory - you can just create
>> an array and index it by (node ID) - (node ID of uppermost child
>> passed to worker), and every slot will be in use, so no memory is
>> wasted and no lookup table is needed.
>
> I think you could just waste the memory, or more likely you could cope
> with one level of indirection so that you only waste a pointer per
> node (ie, array of pointers indexed by node ID, only fill the ones that
> are relevant).

Hmm, I guess that wouldn't be too bad.  A straight join between N
relations will have N base relations and N - 1 joins and maybe a few
sorts or similar.  You could get a plan tree with a lot more nodes if
your query involves inheritance hierarchies - e.g. join 4 tables each
of which has 1000 children.  But even in that case you're only talking
about 32kB of memory for the indirect table, and that's not going to
be very high on your list of problems in that case.

>> My main concern with this design is how future-proof it is.  I know
>> Tom is working on ripping the planner apart and putting it back
>> together again to allow the upper levels of the planner to use paths,
>> and the discussion of the SS_finalize_plan() changes made some
>> reference to possibly wanting to mess with set_plan_references().
>
> I can't imagine I'd be doing anything that would break the simple case
> of "give every node a distinct ID".  If you are building in weird
> assumptions about traversal order, that might indeed be a problem.

Good to know, thanks.  With the design you proposal above, we can make
this insensitive to traversal order.  The only thing that would cause
trouble is if you somehow ended up with massive gaps in the numbering
sequence - e.g. if the final plan had 6 nodes, but the IDs were all 5
digit numbers, you'd waste a silly amount of memory relative to the
size of the plan.  But a few gaps (e.g. for removed SubqueryScan
nodes) won't be a problem.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: numbering plan nodes
Next
From: Pavel Stehule
Date:
Subject: Re: [patch] Proposal for \rotate in psql