Re: Review: UNNEST (and other functions) WITH ORDINALITY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Review: UNNEST (and other functions) WITH ORDINALITY
Date
Msg-id 1408.1374629895@sss.pgh.pa.us
Whole thread Raw
In response to Re: Review: UNNEST (and other functions) WITH ORDINALITY  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Review: UNNEST (and other functions) WITH ORDINALITY
List pgsql-hackers
Stephen Frost <sfrost@snowman.net> writes:
> * Andrew Gierth (andrew@tao11.riddles.org.uk) wrote:
>> If you have legitimate technical concerns then let's hear them, now.

> Fine- I'd have been as happy leaving this be and letting Greg commit it,
> but if you'd really like to hear my concerns, I'd start with pointing
> out that it's pretty horrid to have to copy every record around like
> this and that the hack of CreateTupleDescCopyExtend is pretty terrible
> and likely to catch people by surprise.  Having FunctionNext() operate
> very differently depending on WITH ORDINALITY is ugly and the cause of
> the bug that was found.  All-in-all, I'm not convinced that this is
> really a good approach and feel it'd be better off implemented at a
> different level, eg a node type instead of a hack on top of the existing
> SRF code.

I took the time to read through this patch, finally.  i think the $64
question is pretty much what Stephen says above: do we like tying this
behavior to FunctionScan, as opposed to having it be some kind of
expression node?  You could certainly imagine a WithOrdinality
expression node that takes in values of a set-returning expression,
and returns them with an extra column tacked on.  This would resolve
the problem that was mentioned awhile back that the current approach
can't support SRFs in targetlists.

If it weren't that we've been speculating for years about deprecating
SRFs-in-tlists once we had LATERAL, I would personally consider this
patch DOA in this form.  If we do think we'll probably deprecate that
feature, then not extending WITH ORDINALITY to such cases is at least
defensible.  On the other hand, considering that we've yet to ship a
release supporting LATERAL, it's probably premature to commit to such
deprecation --- we don't really know whether people will find LATERAL
to be a convenient and performant substitute.

As far as performance goes, despite Stephen's gripe above, I think this
way is likely better than any alternative.  The reason is that once
we've disassembled the function result tuple and tacked on the counter,
we have a reasonable shot at things staying like that (in the form of
a virtual TupleTableSlot).  The next higher level of evaluation can
probably use the column Datums as-is.  A WithOrdinality expression node
would have to disassemble the input tuple and then make a new tuple,
which the next higher expression level would likely pull apart again :-(.
Also, any other approach would lead to needing to store the ordinality
values inside the FunctionScan's tuplestore, bloating that storage with
rather-low-value data.

The other big technical issue I see is representation of the rowtype of
the result.  If we did it with a WithOrdinality expression node, the
result would always be of type RECORD, and we'd have to use blessed tuple
descriptors to keep track of exactly which record type a particular
instance emits.  This is certainly do-able (see RowExpr for precedent).
Attaching the functionality to FunctionScan reduces the need for that
because the system mostly avoids trying to associate any type OID with
the rowtype of a FROM item.  Instead though, we've got a lot of ad-hoc
code that deals with RangeTblEntry type information.  A big part of the
patch is dealing with extending that code, and frankly I've got about
zero confidence that the patch has found everything that needs to be
found in that line.  A patch using an expression node would probably
need to touch only a much more localized set of places to handle the
type description issue.

Anyway, on balance I'm satisfied with this overall approach, though it's
not without room for debate.

I am fairly dissatisfied with the patch at a stylistic level, though.
It seems way too short on comments.  People griped about the code in
nodeFunctionscan in particular, but I think all the changes in ad-hoc
type-management code elsewhere are even more deserving of comments,
and they mostly didn't get that.  Likewise, there's no documentation
anywhere that I can see of the new interrelationships of struct fields,
such as that if a RangeTblEntry has funcordinality set, then its various
column-related fields such as eref->colnames include a trailing INT8
column for the ordinality.  Also, maybe I'm misreading the patch (have
I mentioned lately that large patches in -u format are practically
illegible?), but it sure looks like it flat out removed several existing
regression-test cases and a few existing comments as well.  How is that
considered acceptable?

FWIW, I concur with the gripe I remember seeing upthread that the
default name of the added column ought not be "?column?".
        regards, tom lane



pgsql-hackers by date:

Previous
From: Quan Zongliang
Date:
Subject: Re: improve Chinese locale performance
Next
From: Noah Misch
Date:
Subject: Re: Preventing tuple-table leakage in plpgsql