Thread: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
pythonesque@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      9227
Logged by:          Joshua Yanovski
Email address:      pythonesque@gmail.com
PostgreSQL version: 9.3.2
Operating system:   Ubuntu 12.0.4
Description:

# SELECT ROW (1) OVERLAPS ROW (2);
ERROR:  XX000: unrecognized node type: 656
LOCATION:  transformExprRecurse, parse_expr.c:359

I'm not sure why this even parses, as it's not documented to do so.  I do
notice, however, that in makeOverlaps in gram.y creates a recursive list
when largs or args has only one argument, which seems wrong to me.

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
This becomes clearer when I perform this in a PREPARE:

# PREPARE foo AS SELECT ROW(1) OVERLAPS ROW(2);
ERROR:  54001: stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth"
(currently 2048kB), after ensuring the platform's stack depth limit is
adequate.
LOCATION:  check_stack_depth, postgres.c:3090

On Fri, Feb 14, 2014 at 7:55 PM,  <pythonesque@gmail.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      9227
> Logged by:          Joshua Yanovski
> Email address:      pythonesque@gmail.com
> PostgreSQL version: 9.3.2
> Operating system:   Ubuntu 12.0.4
> Description:
>
> # SELECT ROW (1) OVERLAPS ROW (2);
> ERROR:  XX000: unrecognized node type: 656
> LOCATION:  transformExprRecurse, parse_expr.c:359
>
> I'm not sure why this even parses, as it's not documented to do so.  I do
> notice, however, that in makeOverlaps in gram.y creates a recursive list
> when largs or args has only one argument, which seems wrong to me.
>
>
>



--
Josh

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Tom Lane
Date:
pythonesque@gmail.com writes:
> # SELECT ROW (1) OVERLAPS ROW (2);
> ERROR:  XX000: unrecognized node type: 656
> LOCATION:  transformExprRecurse, parse_expr.c:359

> I'm not sure why this even parses, as it's not documented to do so.  I do
> notice, however, that in makeOverlaps in gram.y creates a recursive list
> when largs or args has only one argument, which seems wrong to me.

Huh, yeah:

    if (list_length(largs) == 1)
        largs = lappend(largs, largs);
    else if (list_length(largs) != 2)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("wrong number of parameters on left side of OVERLAPS expression"),
                 parser_errposition(location)));

(and similarly for rargs).  What this is evidently trying to do is
convert

    (foo) overlaps ...

to

    (foo, foo) overlaps ...

which is perhaps sensible on the grounds that a point in time can be
considered as a zero-width interval.  However, as you say, this behavior is
undocumented, and what's more I can find no support for it in any version
of the SQL standard.

My guess is that this code might've originally worked as intended, but it
got broken when Neil Conway rewrote our List implementation, which was a
darn long time ago now.

A quick browse in our git history shows that OVERLAPS support
was introduced in commit 64568100787a5d03d036e70b32147385a35245e2 of
2000-03-14, and the self-append-the-same-list behavior was in that
version though it looked rather different.  Neil's work dates to 2004.

[ thinks some more ... ]  Actually, even back when Lists didn't have
headers, it's not immediately clear to me that this code could've worked.
In the current code we end up with a List that includes itself, which
surely ain't gonna work.

While it would be a reasonably simple change to make this work as
originally intended, I'm strongly tempted to just rip it out instead,
and only support the SQL-mandated syntax.  If anyone was using this
undocumented feature, you'd think they'd have complained sometime in
the last ten years.  And making it work would really entail adding
documentation and a regression test case, so it'd be substantially
more effort than just killing the "list_length == 1" case.

            regards, tom lane

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
On Mon, Feb 17, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While it would be a reasonably simple change to make this work as
> originally intended, I'm strongly tempted to just rip it out instead,
> and only support the SQL-mandated syntax.  If anyone was using this
> undocumented feature, you'd think they'd have complained sometime in
> the last ten years.  And making it work would really entail adding
> documentation and a regression test case, so it'd be substantially
> more effort than just killing the "list_length == 1" case.
>
>                         regards, tom lane

For what it's worth, that seems reasonable to me as well.  I would
never have found this if I weren't actually looking through gram.y
anyway.

--
Josh

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
On Mon, Feb 17, 2014 at 5:07 PM, Joshua Yanovski <pythonesque@gmail.com> wrote:
> On Mon, Feb 17, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> While it would be a reasonably simple change to make this work as
>> originally intended, I'm strongly tempted to just rip it out instead,
>> and only support the SQL-mandated syntax.  If anyone was using this
>> undocumented feature, you'd think they'd have complained sometime in
>> the last ten years.  And making it work would really entail adding
>> documentation and a regression test case, so it'd be substantially
>> more effort than just killing the "list_length == 1" case.
>>
>>                         regards, tom lane
>
> For what it's worth, that seems reasonable to me as well.  I would
> never have found this if I weren't actually looking through gram.y
> anyway.
>
> --
> Josh

Attaching a patch to just remove the n == 1 case.

--
Josh

Attachment

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
Sorry, the MIME type seems to have gotten screwed up, hopefully this
will work better.

On Mon, Feb 17, 2014 at 9:33 PM, Joshua Yanovski <pythonesque@gmail.com> wrote:
> On Mon, Feb 17, 2014 at 5:07 PM, Joshua Yanovski <pythonesque@gmail.com> wrote:
>> On Mon, Feb 17, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> While it would be a reasonably simple change to make this work as
>>> originally intended, I'm strongly tempted to just rip it out instead,
>>> and only support the SQL-mandated syntax.  If anyone was using this
>>> undocumented feature, you'd think they'd have complained sometime in
>>> the last ten years.  And making it work would really entail adding
>>> documentation and a regression test case, so it'd be substantially
>>> more effort than just killing the "list_length == 1" case.
>>>
>>>                         regards, tom lane
>>
>> For what it's worth, that seems reasonable to me as well.  I would
>> never have found this if I weren't actually looking through gram.y
>> anyway.
>>
>> --
>> Josh
>
> Attaching a patch to just remove the n == 1 case.
>
> --
> Josh



--
Josh

Attachment

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Tom Lane
Date:
Joshua Yanovski <pythonesque@gmail.com> writes:
> On Mon, Feb 17, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> While it would be a reasonably simple change to make this work as
>> originally intended, I'm strongly tempted to just rip it out instead,
>> and only support the SQL-mandated syntax.  If anyone was using this
>> undocumented feature, you'd think they'd have complained sometime in
>> the last ten years.  And making it work would really entail adding
>> documentation and a regression test case, so it'd be substantially
>> more effort than just killing the "list_length == 1" case.

> Attaching a patch to just remove the n == 1 case.

It occurred to me that there's an additional reason not to make the
code work like Lockhart seems to have envisioned: if it did duplicate
the argument like that, that'd lead to double evaluation of any volatile
function that's present in the argument.  So I've gone ahead and committed
this, along with some further fooling around to make the error messages
nicer.

            regards, tom lane

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
Great, thanks.  Yeah, I was thinking about that too--I am not sure if
there are any other examples of a time where Postgres deliberately
duplicates an argument like that (maybe there could be a check for it
to be a constexpr or something?  But that information isn't available
at this point in the analysis process).

On Tue, Feb 18, 2014 at 9:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Joshua Yanovski <pythonesque@gmail.com> writes:
>> On Mon, Feb 17, 2014 at 4:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> While it would be a reasonably simple change to make this work as
>>> originally intended, I'm strongly tempted to just rip it out instead,
>>> and only support the SQL-mandated syntax.  If anyone was using this
>>> undocumented feature, you'd think they'd have complained sometime in
>>> the last ten years.  And making it work would really entail adding
>>> documentation and a regression test case, so it'd be substantially
>>> more effort than just killing the "list_length == 1" case.
>
>> Attaching a patch to just remove the n == 1 case.
>
> It occurred to me that there's an additional reason not to make the
> code work like Lockhart seems to have envisioned: if it did duplicate
> the argument like that, that'd lead to double evaluation of any volatile
> function that's present in the argument.  So I've gone ahead and committed
> this, along with some further fooling around to make the error messages
> nicer.
>
>                         regards, tom lane



--
Josh

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Tom Lane
Date:
Joshua Yanovski <pythonesque@gmail.com> writes:
> Great, thanks.  Yeah, I was thinking about that too--I am not sure if
> there are any other examples of a time where Postgres deliberately
> duplicates an argument like that (maybe there could be a check for it
> to be a constexpr or something?  But that information isn't available
> at this point in the analysis process).

Yeah, BETWEEN is like that.  I'd like to fix it sometime, but it's
kind of problematic because of the risk of losing index optimizations
(which only understand x >= y and x <= z, not a hypothetical combined
node).

I think there are no other examples (CASE used to be one, but we fixed
that long ago), but haven't trawled the grammar to be sure.

            regards, tom lane

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Tom Lane
Date:
I wrote:
> Joshua Yanovski <pythonesque@gmail.com> writes:
>> Great, thanks.  Yeah, I was thinking about that too--I am not sure if
>> there are any other examples of a time where Postgres deliberately
>> duplicates an argument like that (maybe there could be a check for it
>> to be a constexpr or something?  But that information isn't available
>> at this point in the analysis process).

> Yeah, BETWEEN is like that.  I'd like to fix it sometime, but it's
> kind of problematic because of the risk of losing index optimizations
> (which only understand x >= y and x <= z, not a hypothetical combined
> node).

Actually, it suddenly strikes me that there's a pretty simple answer to
that.  Have the parser generate a node representing BETWEEN, with three
arguments.  In the planner, *if* the first argument is non-volatile,
replace the BETWEEN with "x >= y AND x <= z"; otherwise, leave it alone,
and execute it as-is.  This transformation is semantically correct and
will still expose index-optimizable comparisons in all cases of interest
(since a volatile expression isn't indexable).  Moreover we get rid of the
double evaluation risk for volatile first arguments, as well as the
incredible inefficiency of the BETWEEN SYMMETRIC cases.

There are some other issues still to be thought about, since the
parser is currently willing to cast "x" differently in the two comparisons
--- but frankly I think any case where that matters is probably erroneous
SQL code in the first place.  (See the thread referenced in the comment in
the grammar for more info.)

            regards, tom lane

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
That sounds like it should work.  I was also wondering whether it
might be possible to do this more generally with some kind of "let"
internal node:

(let var = x in (var >= y and var <= z)

I don't know enough about the planner or the SQL standard to know
whether this would work, but it does seem a little nicer than
special-casing BETWEEN to me.  Is the issue there that the extra layer
of indirection would cause problems or create artificial optimization
boundaries?

On Tue, Feb 18, 2014 at 10:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Joshua Yanovski <pythonesque@gmail.com> writes:
>>> Great, thanks.  Yeah, I was thinking about that too--I am not sure if
>>> there are any other examples of a time where Postgres deliberately
>>> duplicates an argument like that (maybe there could be a check for it
>>> to be a constexpr or something?  But that information isn't available
>>> at this point in the analysis process).
>
>> Yeah, BETWEEN is like that.  I'd like to fix it sometime, but it's
>> kind of problematic because of the risk of losing index optimizations
>> (which only understand x >= y and x <= z, not a hypothetical combined
>> node).
>
> Actually, it suddenly strikes me that there's a pretty simple answer to
> that.  Have the parser generate a node representing BETWEEN, with three
> arguments.  In the planner, *if* the first argument is non-volatile,
> replace the BETWEEN with "x >= y AND x <= z"; otherwise, leave it alone,
> and execute it as-is.  This transformation is semantically correct and
> will still expose index-optimizable comparisons in all cases of interest
> (since a volatile expression isn't indexable).  Moreover we get rid of the
> double evaluation risk for volatile first arguments, as well as the
> incredible inefficiency of the BETWEEN SYMMETRIC cases.
>
> There are some other issues still to be thought about, since the
> parser is currently willing to cast "x" differently in the two comparisons
> --- but frankly I think any case where that matters is probably erroneous
> SQL code in the first place.  (See the thread referenced in the comment in
> the grammar for more info.)
>
>                         regards, tom lane



--
Josh

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Tom Lane
Date:
Joshua Yanovski <pythonesque@gmail.com> writes:
> That sounds like it should work.  I was also wondering whether it
> might be possible to do this more generally with some kind of "let"
> internal node:

> (let var = x in (var >= y and var <= z)

> I don't know enough about the planner or the SQL standard to know
> whether this would work, but it does seem a little nicer than
> special-casing BETWEEN to me.

Hmm ... interesting idea, but I don't think it answers any of the
fundamental issues about comparison semantics (ie, how much do we
care if "var" is promoted differently in the two clauses).

Also, as a grammar translation target for BETWEEN it would be a pretty
bad choice, because now views that had been defined in a perfectly
SQL-compliant manner would print out with very-PG-specific syntax.
The existing translation is nasty because of the double eval issue,
but at least what we print is standard SQL.

            regards, tom lane

Re: BUG #9227: Error on SELECT ROW OVERLAPS ROW with single ROW argument

From
Joshua Yanovski
Date:
> Hmm ... interesting idea, but I don't think it answers any of the
> fundamental issues about comparison semantics (ie, how much do we
> care if "var" is promoted differently in the two clauses).
My first thought was that it essentially would work as you describe
above--if the expression is volatile, evaluate the argument first and
then evaluate the expression, otherwise replace the parameter within
the subexpression.  But then I realized that you didn't actually say
that is what the BETWEEN node would be doing :)  I suppose any extra
logic would have to be present in some translation function to deal
with the let node, so effectively it would be special-cased anyway.

>
> Also, as a grammar translation target for BETWEEN it would be a pretty
> bad choice, because now views that had been defined in a perfectly
> SQL-compliant manner would print out with very-PG-specific syntax.
> The existing translation is nasty because of the double eval issue,
> but at least what we print is standard SQL.
>
>                         regards, tom lane
That's a good point, I suppose I'll just have to wait for this feature
to be added to the SQL standard :)

--
Josh