On 2017-03-15 17:33:46 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-03-15 16:07:14 -0400, Tom Lane wrote:
> >> As for ExecHashGetHashValue, it's most likely going to be working from
> >> virtual tuples passed up to the join, which won't benefit from
> >> predetermination of the last column to be accessed. The
> >> tuple-deconstruction would have happened while projecting in the scan
> >> node below.
>
> > I think the physical tuple stuff commonly thwarts that argument? On
> > master for tpch's Q5 you can e.g. see the following profile (master):
>
> Hmmm ... I think you're mistaken in fingering the physical-tuple
> optimization per se, but maybe skipping ExecProject at the scan level
> would cause this result?

I think those are often related (i.e. we replace a smaller targetlist
with a "physical" one, which then allows us to skip ExecProject()).

> I've thought for some time that it was dumb to have the executor
> reverse-engineering this info at plan startup anyway.

Yeah, it'd be good if this (and some similar tasks like building interim
tuple descriptors) could be moved to the planner. But:

> We could make the planner mark each table scan node with the highest
> column number that the plan will access, and use that to drive a
> slot_getsomeattrs call in advance of any access to tuple contents.

probably isn't sufficient - we build non-virtual tuples in a good number
of places (sorts, tuplestore-using nodes like nodeMaterial, nodeHash.c
output, ...). I suspect it'd have measurable negative consequences if
we removed the deforming logic for all expressions/projections above
such nodes. I guess we could just apply that kind of logic to every
Plan node?

- Andres