Thread: Is this a bug ?
We noticed this bug in Redshift. It then occurred to me that this may very well be a PG bug since Redshift is based on PG. Reproduced it in Version 11.5 create table bugtest (fld1 char(1)) ; insert into bugtest values('a'); insert into bugtest values('b'); insert into bugtest values('c'); insert into bugtest values('d'); insert into bugtest values('e'); select count(*) from bugtest where fld1 in ('a','b','c', 'd','e'); -> produces 5 as output. So far so good. Now check this: select count(*) from bugtest where fld1 in ('a','b','c' 'd','e'); Note the missing comma after 'c'. PG takes it a syntactically right SQL and gives 3 as output. In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax near 'd'. Can't believe this bug was never found before. We ended up wasting lot of time to figure this out.
On 10/23/19 10:42 AM, Ravi Krishna wrote: > We noticed this bug in Redshift. It then occurred to me that this may > very well be a PG bug since Redshift is based on PG. > > Reproduced it in Version 11.5 > > create table bugtest (fld1 char(1)) ; > > insert into bugtest values('a'); > insert into bugtest values('b'); > insert into bugtest values('c'); > insert into bugtest values('d'); > insert into bugtest values('e'); > > select count(*) from bugtest where fld1 in ('a','b','c', > 'd','e'); -> produces 5 as output. > So far so good. > > Now check this: > > select count(*) from bugtest where fld1 in ('a','b','c' > 'd','e'); > > Note the missing comma after 'c'. > > PG takes it a syntactically right SQL and gives 3 as output. > > In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax near 'd'. > > Can't believe this bug was never found before. We ended up wasting lot of time to figure this out. Confirmed in 9.6. postgres@haggis:~$ psql psql (9.6.15) Type "help" for help. postgres=# create table bugtest (fld1 char(1)) ; CREATE TABLE postgres=# postgres=# insert into bugtest values('a'); INSERT 0 1 postgres=# insert into bugtest values('b'); INSERT 0 1 postgres=# insert into bugtest values('c'); INSERT 0 1 postgres=# insert into bugtest values('d'); INSERT 0 1 postgres=# insert into bugtest values('e'); INSERT 0 1 postgres=# select count(*) from bugtest where fld1 in ('a','b','c', postgres(# 'd','e'); count ------- 5 (1 row) postgres=# select count(*) from bugtest where fld1 in ('a','b','c' postgres(# 'd','e'); count ------- 3 (1 row) -- Angular momentum makes the world go 'round.
Weird, but apparently not a bug. From https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS > Two string constants that are only separated by whitespace with at > least one newline are concatenated and effectively treated as if the > string had been written as one constant. For example: > > SELECT 'foo' > 'bar'; > > is equivalent to: > > SELECT 'foobar'; > > but: > > SELECT 'foo' 'bar'; > > is not valid syntax. (This slightly bizarre behavior is specified by > SQL; PostgreSQL is following the standard.) -- Dakkar - <Mobilis in mobile> GPG public key fingerprint = A071 E618 DD2C 5901 9574 6FE2 40EA 9883 7519 3F88 key id = 0x75193F88
On Wed, 23 Oct 2019 at 16:42, Ravi Krishna <sr_krishna@aol.com> wrote: > select count(*) from bugtest where fld1 in ('a','b','c' > 'd','e'); > > Note the missing comma after 'c'. > > PG takes it a syntactically right SQL and gives 3 as output. > > In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax near 'd'. > > Can't believe this bug was never found before. We ended up wasting lot of time to figure this out. Simplify: select 'a' db-# 'b'; ?column? ---------- ab (1 row) This is not a bug. https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html Two string constants that are only separated by whitespace with at least one newline are concatenated and effectively treated as if the string had been written as one constant. Geoff
> > Simplify: > > select 'a' > db-# 'b'; > ?column? > ---------- > ab > (1 row) > > This is not a bug. > > https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html > > Two string constants that are only separated by whitespace with at > least one newline are concatenated and effectively treated as if the > string had been written as one constant. > > Geoff Shouldn't the output be 4 in that case since it is looking for('a','b','cd','e') and it should find all but cd.
On 10/23/19 10:51 AM, Geoff Winkless wrote: > On Wed, 23 Oct 2019 at 16:42, Ravi Krishna <sr_krishna@aol.com> wrote: > >> select count(*) from bugtest where fld1 in ('a','b','c' >> 'd','e'); >> >> Note the missing comma after 'c'. >> >> PG takes it a syntactically right SQL and gives 3 as output. >> >> In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax near 'd'. >> >> Can't believe this bug was never found before. We ended up wasting lot of time to figure this out. > Simplify: > > select 'a' > db-# 'b'; > ?column? > ---------- > ab > (1 row) > > This is not a bug. > > https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html > > Two string constants that are only separated by whitespace with at > least one newline are concatenated and effectively treated as if the > string had been written as one constant. Then -- since the 'e' is separated from 'd' by a comma, the result should be "4", not "3". No doubt: it's a bug, no matter what the Pg devs say. -- Angular momentum makes the world go 'round.
Ron <ronljohnsonjr@gmail.com> writes: > On 10/23/19 10:51 AM, Geoff Winkless wrote: >> Two string constants that are only separated by whitespace with at >> least one newline are concatenated and effectively treated as if the >> string had been written as one constant. > Then -- since the 'e' is separated from 'd' by a comma, the result should be > "4", not "3". No, because your command is equivalent to select count(*) from bugtest where fld1 in ('a','b','cd','e'); 'cd' does not match any of the table rows, so 3 is the correct answer. > No doubt: it's a bug, no matter what the Pg devs say. Complain to the SQL standards committee, not us. regards, tom lane
On 23/10/2019 16:55, Ron wrote: > On 10/23/19 10:51 AM, Geoff Winkless wrote: >> On Wed, 23 Oct 2019 at 16:42, Ravi Krishna <sr_krishna@aol.com> wrote: >> >>> select count(*) from bugtest where fld1 in ('a','b','c' >>> 'd','e'); >>> >>> Note the missing comma after 'c'. >>> >>> PG takes it a syntactically right SQL and gives 3 as output. >>> >>> In SQLServer it errors out SQL Error [102] [S0001]: Incorrect syntax >>> near 'd'. >>> >>> Can't believe this bug was never found before. We ended up wasting >>> lot of time to figure this out. >> Simplify: >> >> select 'a' >> db-# 'b'; >> ?column? >> ---------- >> ab >> (1 row) >> >> This is not a bug. >> >> https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html >> >> Two string constants that are only separated by whitespace with at >> least one newline are concatenated and effectively treated as if the >> string had been written as one constant. > > Then -- since the 'e' is separated from 'd' by a comma, the result > should be "4", not "3". > > No doubt: it's a bug, no matter what the Pg devs say. > Does any record in your test table contain the concatenated 'cd' in "fld1"? If not, it's working as per specification & documentation. I can agree that this sounds like one of those idiosyncratic parts of the ANSI specification that exist only for compatibility with some ancient practice, but this behaviour is documented. BTW, can parser behavior be affected by settings this early? A GUC setting disabling this madness would make sense IMHO... -- Regards Fabio Ugo Venchiarutti OSPCFC Network Engineering Dpt. Ocado Technology -- Notice: This email is confidential and may contain copyright material of members of the Ocado Group. Opinions and views expressed in this message may not necessarily reflect the opinions and views of the members of the Ocado Group. If you are not the intended recipient, please notify us immediately and delete all copies of this message. Please note that it is your responsibility to scan this message for viruses. References to the "Ocado Group" are to Ocado Group plc (registered in England and Wales with number 7098618) and its subsidiary undertakings (as that expression is defined in the Companies Act 2006) from time to time. The registered office of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way, Hatfield, Hertfordshire, AL10 9UL.
On Wed, Oct 23, 2019 at 8:56 AM Ravi Krishna <sr_krishna@aol.com> wrote:
>
> Simplify:
>
> select 'a'
> db-# 'b';
> ?column?
> ----------
> ab
> (1 row)
>
> This is not a bug.
>
> https://www.postgresql.org/docs/9.2/sql-syntax-lexical.html
>
> Two string constants that are only separated by whitespace with at
> least one newline are concatenated and effectively treated as if the
> string had been written as one constant.
>
> Geoff
Shouldn't the output be 4 in that case since it is looking for('a','b','cd','e') and it should find all but cd.
And that means there are 3 options on the table = a, b, and e - where is the 4th item?
cd is not in the table so it cannot be found.
John
On Wed, 23 Oct 2019 at 16:55, Ron <ronljohnsonjr@gmail.com> wrote: > Then -- since the 'e' is separated from 'd' by a comma, the result should be > "4", not "3". > > No doubt: it's a bug, no matter what the Pg devs say. I'm confused why you consider that to be true. The count is checking for columns containing the strings 'a', 'b', 'cd' or 'e'. There is not one with 'cd', so the correct result is 3. Geoff
On 10/23/19 11:03 AM, Geoff Winkless wrote: > On Wed, 23 Oct 2019 at 16:55, Ron <ronljohnsonjr@gmail.com> wrote: >> Then -- since the 'e' is separated from 'd' by a comma, the result should be >> "4", not "3". >> >> No doubt: it's a bug, no matter what the Pg devs say. > I'm confused why you consider that to be true. The count is checking > for columns containing the strings 'a', 'b', 'cd' or 'e'. There is not > one with 'cd', so the correct result is 3. That's my mistake. -- Angular momentum makes the world go 'round.
On 10/23/19 11:00 AM, Tom Lane wrote: > Ron <ronljohnsonjr@gmail.com> writes: >> On 10/23/19 10:51 AM, Geoff Winkless wrote: >>> Two string constants that are only separated by whitespace with at >>> least one newline are concatenated and effectively treated as if the >>> string had been written as one constant. >> Then -- since the 'e' is separated from 'd' by a comma, the result should be >> "4", not "3". > No, because your command is equivalent to > > select count(*) from bugtest where fld1 in ('a','b','cd','e'); > > 'cd' does not match any of the table rows, so 3 is the correct answer. Yes, that's my mistake. >> No doubt: it's a bug, no matter what the Pg devs say. > Complain to the SQL standards committee, not us. As much as I hate to say it, MSFT was right to ignore this bug in the standard. -- Angular momentum makes the world go 'round.
On Wed, 23 Oct 2019 at 17:09, Ron <ronljohnsonjr@gmail.com> wrote: > As much as I hate to say it, MSFT was right to ignore this bug in the standard. Standards are standards for a reason. It is almost never correct to deliberately ignore them. If you don't like them, then revise the standard. Historically Microsoft ignored standards either because they misunderstood them or because they wanted to lock in their customers, not for any reasons of altruism. For what it's worth, I can see a value to having SELECT 'this is quite a long string' 'which I've joined together ' 'across multiple lines'; although the advantage of it vs using a concat operator is slim. Geoff
On 10/23/19 11:20 AM, Geoff Winkless wrote: > On Wed, 23 Oct 2019 at 17:09, Ron <ronljohnsonjr@gmail.com> wrote: > >> As much as I hate to say it, MSFT was right to ignore this bug in the standard. > Standards are standards for a reason. It is almost never correct to > deliberately ignore them. If you don't like them, then revise the > standard. > > Historically Microsoft ignored standards either because they > misunderstood them or because they wanted to lock in their customers, > not for any reasons of altruism. > > For what it's worth, I can see a value to having > > SELECT 'this is quite a long string' > 'which I've joined together ' > 'across multiple lines'; > > although the advantage of it vs using a concat operator is slim. There is no advantage to using it vs using a concat operator, and all disadvantage. -- Angular momentum makes the world go 'round.
On Wed, 23 Oct 2019 at 17:20, Geoff Winkless <pgsqladmin@geoff.dj> wrote: > For what it's worth, I can see a value to having > > SELECT 'this is quite a long string' > 'which I've joined together ' > 'across multiple lines'; > > although the advantage of it vs using a concat operator is slim. As an aside, Postgres isn't the only DB to follow the standard here. mysql> select 'hello' -> ' there'; +-------------+ | hello | +-------------+ | hello there | +-------------+ Geoff
On 10/23/19 11:27 AM, Geoff Winkless wrote: > On Wed, 23 Oct 2019 at 17:20, Geoff Winkless <pgsqladmin@geoff.dj> wrote: >> For what it's worth, I can see a value to having >> >> SELECT 'this is quite a long string' >> 'which I've joined together ' >> 'across multiple lines'; >> >> although the advantage of it vs using a concat operator is slim. > As an aside, Postgres isn't the only DB to follow the standard here. > > mysql> select 'hello' > -> ' there'; > +-------------+ > | hello | > +-------------+ > | hello there | > +-------------+ This is the kind of weird stuff that we mocked mysql for. This too would have been roundly mocked if discovered in mysql: postgres=# select to_date('2018150X','YYYYMMDD'); to_date ------------ 2019-03-03 (1 row) -- Angular momentum makes the world go 'round.
On 10/23/19 9:30 AM, Ron wrote: > On 10/23/19 11:27 AM, Geoff Winkless wrote: >> On Wed, 23 Oct 2019 at 17:20, Geoff Winkless <pgsqladmin@geoff.dj> wrote: >>> For what it's worth, I can see a value to having >>> >>> SELECT 'this is quite a long string' >>> 'which I've joined together ' >>> 'across multiple lines'; >>> >>> although the advantage of it vs using a concat operator is slim. >> As an aside, Postgres isn't the only DB to follow the standard here. >> >> mysql> select 'hello' >> -> ' there'; >> +-------------+ >> | hello | >> +-------------+ >> | hello there | >> +-------------+ > > This is the kind of weird stuff that we mocked mysql for. > > This too would have been roundly mocked if discovered in mysql: > > postgres=# select to_date('2018150X','YYYYMMDD'); > to_date > ------------ > 2019-03-03 > (1 row) > > As of v10 it does not work in Postgres either: test_(postgres)# select version(); version ---------------------------------------------------------------------------------------------------------------------------- PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], 64-bit (1 row) test_(postgres)# select to_date('2018150X','YYYYMMDD'); ERROR: date/time field value out of range: "2018150X" -- Adrian Klaver adrian.klaver@aklaver.com
On 23/10/2019 17:30, Ron wrote: > On 10/23/19 11:27 AM, Geoff Winkless wrote: >> On Wed, 23 Oct 2019 at 17:20, Geoff Winkless <pgsqladmin@geoff.dj> wrote: >>> For what it's worth, I can see a value to having >>> >>> SELECT 'this is quite a long string' >>> 'which I've joined together ' >>> 'across multiple lines'; >>> >>> although the advantage of it vs using a concat operator is slim. >> As an aside, Postgres isn't the only DB to follow the standard here. >> >> mysql> select 'hello' >> -> ' there'; >> +-------------+ >> | hello | >> +-------------+ >> | hello there | >> +-------------+ > > This is the kind of weird stuff that we mocked mysql for. > > This too would have been roundly mocked if discovered in mysql: > > postgres=# select to_date('2018150X','YYYYMMDD'); > to_date > ------------ > 2019-03-03 > (1 row) > > The main difference being that we mocked MySQL because it BROKE the standard requirements in bespoke, arguably lazy/stupid/dangerous ways (and often undocumented as a result). You ran into a gotcha of the SQL specification; in your shoes I'd instead appreciate the fact that the PG maintainers went out of their way and documented this spec weirdness as such in their own project when they could have just went "look into the specification for oddities". MySQL silently truncated your overflowing string for years without as much as a warning, by default. In my book that broke specification, rule of least surprise, data safety and a lot more. -- Regards Fabio Ugo Venchiarutti OSPCFC Network Engineering Dpt. Ocado Technology -- Notice: This email is confidential and may contain copyright material of members of the Ocado Group. Opinions and views expressed in this message may not necessarily reflect the opinions and views of the members of the Ocado Group. If you are not the intended recipient, please notify us immediately and delete all copies of this message. Please note that it is your responsibility to scan this message for viruses. References to the "Ocado Group" are to Ocado Group plc (registered in England and Wales with number 7098618) and its subsidiary undertakings (as that expression is defined in the Companies Act 2006) from time to time. The registered office of Ocado Group plc is Buildings One & Two, Trident Place, Mosquito Way, Hatfield, Hertfordshire, AL10 9UL.
On 10/23/19 11:20 AM, Geoff Winkless wrote: > On Wed, 23 Oct 2019 at 17:09, Ron <ronljohnsonjr@gmail.com> wrote: > >> As much as I hate to say it, MSFT was right to ignore this bug in the standard. > Standards are standards for a reason. It is almost never correct to > deliberately ignore them. If you don't like them, then revise the > standard. > > Historically Microsoft ignored standards either because they > misunderstood them or because they wanted to lock in their customers, > not for any reasons of altruism. > > For what it's worth, I can see a value to having > > SELECT 'this is quite a long string' > 'which I've joined together ' > 'across multiple lines'; > > although the advantage of it vs using a concat operator is slim. It just occurred to me that this is probably something that COBOL does (it's been 30 years, so I might be mistaken) and so that's how it slipped into the SQL standard. -- Angular momentum makes the world go 'round.
On 2019-10-23 17:20:07 +0100, Geoff Winkless wrote: > For what it's worth, I can see a value to having > > SELECT 'this is quite a long string' > 'which I've joined together ' > 'across multiple lines'; > > although the advantage of it vs using a concat operator is slim. True. However, concatenation of string literals by juxtaposition isn't specific to SQL. Two other well known languages where this works (even without a newline) are C and Python. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
On Wed, Oct 23, 2019 at 5:55 PM Ron <ronljohnsonjr@gmail.com> wrote: > No doubt: it's a bug, no matter what the Pg devs say. select 'a' 'b' = 'ab' this is what SQL mandates You should submit a complain to SQL guys -- cpp-today.blogspot.com
On Wed, Oct 23, 2019 at 9:03 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote: > > On 2019-10-23 17:20:07 +0100, Geoff Winkless wrote: > > For what it's worth, I can see a value to having > > > > SELECT 'this is quite a long string' > > 'which I've joined together ' > > 'across multiple lines'; > > > > although the advantage of it vs using a concat operator is slim. > > True. However, concatenation of string literals by juxtaposition isn't > specific to SQL. Two other well known languages where this works (even > without a newline) are C and Python. And C++ as well
So reading responses from all, here is a SQL test I did on few RDBMS: select 'John' 'Doe' as rrr from information_schema.tables limit 1; PG output rrr --- JohnDoe Oracle and mysql gave same output as PG with no error SQLServer: Syntax error near 'as'. DB2 gave same error as SQLServer.
On Wed, 23 Oct 2019 11:27:12 -0500, Ron <ronljohnsonjr@gmail.com> wrote: >On 10/23/19 11:20 AM, Geoff Winkless wrote: >> On Wed, 23 Oct 2019 at 17:09, Ron <ronljohnsonjr@gmail.com> wrote: >> >>> As much as I hate to say it, MSFT was right to ignore this bug in the standard. >> Standards are standards for a reason. It is almost never correct to >> deliberately ignore them. If you don't like them, then revise the >> standard. >> >> Historically Microsoft ignored standards either because they >> misunderstood them or because they wanted to lock in their customers, >> not for any reasons of altruism. >> >> For what it's worth, I can see a value to having >> >> SELECT 'this is quite a long string' >> 'which I've joined together ' >> 'across multiple lines'; >> >> although the advantage of it vs using a concat operator is slim. > >There is no advantage to using it vs using a concat operator, and all >disadvantage. It WAS an advantage querying interactively on 80 character text mode screens, and when SQL was embedded (directly) into programs written in other languages. Regardless of how recent the latest standard - SQL still has many ... warts really, but I will be charitable and call them "vestiges" ... of its roots as a 1970s language. YMMV, George