Thread: Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

Attaching patch... :-/

On Tue, Oct 31, 2017 at 4:27 PM, Rob McColl <rob@robmccoll.com> wrote:
Between 9.6.5 and 10, the handling of parenthesized single-column UPDATE statements changed. In 9.6.5, they were treated identically to unparenthesized single-column UPDATES. In 10, they are treated as multiple-column updates.  This results in this being valid in Postgres 9.6.5, but an error in Postgres 10:

CREATE TABLE test (id INT PRIMARY KEY, data INT);
INSERT INTO test VALUES (1, 1);
UPDATE test SET (data) = (2) WHERE id = 1;

In 10 and the current master, this produces the error:

errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or ROW() expression")

I believe that this is not an intended change or behavior, but is instead an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd  Improve handling of "UPDATE ... SET (column_list) = row_constructor". (https://github.com/postgres/postgres/commit/906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd).

This is a small patch to the grammar that restores the previous behavior by adding a rule to the set_clause rule and modifying the final rule of the set_clause rule to only match lists of more then one element.  I'm not sure if there are more elegant or preferred ways to address this.

Compiled and tested on Ubuntu 17.04 Linux 4.10.0-33-generic x86_64.

Regression test added under the update test to cover the parenthesized single-column case.

I see no reason this would affect performance.

Thanks,
-rob

--
Rob McColl
@robmccoll

Attachment
Rob McColl <rob@robmccoll.com> writes:
> Attaching patch... :-/

The reason why hacking your way to a backwards-compatible solution is
a bad idea is that it breaks the SQL standard compliance we're trying to
achieve here.  According to the spec, the elements of a parenthesized
SET list should be assigned from the fields of a composite RHS.  If
there's just one element of the SET list, the RHS should be a single-field
composite value, and this syntax should result in extracting its field.
This patch doesn't do that; it preserves our previous broken behavior,
and thus just puts off the day of reckoning that inevitably will come.

As a concrete example, the spec says this should work:

create table t (f1 int);
update t set (f1) = row(4);

and it does in v10, but (without having tested) your patch will break it.
This is not so exciting for simple row constructors, where you could just
leave off the word ROW; but it is a critical distinction if we're ever to
get to the point of allowing other composite-returning constructs here,
for example composite-returning functions.
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates canproduce errors

From
"David G. Johnston"
Date:
On Tue, Oct 31, 2017 at 3:14 PM, Rob McColl <rob@robmccoll.com> wrote:

I believe that this is not an intended change or behavior, but is instead an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd  Improve handling of "UPDATE ... SET (column_list) = row_constructor". (https://github.com/postgres/postgres/commit/906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd).

​At this point the intent of 906bfcad doesn't really matter - and given the number of complaints in the month since ​v10 went live I'm tending to lean toward bringing the pre-10 behavior back. There is no ambiguity involved here and the breakage of existing applications seems considerably worse than the technical oddity of allowing (val) to be interpreted as a row_constructor in this situation.  From a standards perspective we are strictly more permissive so no new problem there.

On a related note, the 10.0 syntax guide is wrong, it needs to break out the parenthesized single-column and multi-column variants separately:

Presently: ( column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT } [, ...] )

Basically one cannot specify only a single column_name AND omit ROW

It would have been nice if the syntax for that variant would have been:

( column_name, column_name [, ...] ) = [ ROW ] ( { expression | DEFAULT }, { expression | DEFAULT } [, ...] )
If the v10 behavior remains the above change should be made as well as adding:

( column_name ) = ROW ( expression | DEFAULT )

If we revert 10 to the pre-10 behavior the existing syntax will work.

David J.


Re: [HACKERS] PostgreSQL 10 parenthesized single-column updates canproduce errors

From
"David G. Johnston"
Date:
On Tue, Oct 31, 2017 at 3:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
According to the spec, the elements of a parenthesized
SET list should be assigned from the fields of a composite RHS.  If
there's just one element of the SET list, the RHS should be a single-field
composite value, and this syntax should result in extracting its field.
This patch doesn't do that; it preserves our previous broken behavior,
and thus just puts off the day of reckoning that inevitably will come.

​Definitely moderates my opinion in my concurrent emai​l...though postponement is not strictly bad given the seeming frequency of the existing problematic syntax in the wild already.

David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> ​Definitely moderates my opinion in my concurrent emai​l...though
> postponement is not strictly bad given the seeming frequency of the
> existing problematic syntax in the wild already.

Yeah, I'd hoped to get some capability extensions done here before v10
shipped, in line with the theory I've expressed in the past that it's
better if you can point to actual new features justifying a compatibility
break.  However, that didn't happen in time.

I'm disinclined to revert the change though; if there are people making
use of this oddity now, then the longer we leave it in place the more
people are going to be hurt when we do break it.

If I had a time machine, I'd go back and fix the original multi-column
SET patch so that it required the word ROW in all cases --- then at
least it'd have accepted a conformant subset of the standard syntax.
Oh well.
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers