Thread: trailing junk in numeric literals

trailing junk in numeric literals

From

Peter Eisentraut

Date:

28 December 2020, 20:22:58

I was surprised to find that this doesn't error:

=> select 100a;
   a
-----
  100

I suspect this and similar cases used to error before aliases without AS 
were introduced.  But now this seems possibly problematic.  Should we 
try to handle this better?

Re: trailing junk in numeric literals

From

Tom Lane

Date:

28 December 2020, 20:54:59

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> I was surprised to find that this doesn't error:
> => select 100a;
>    a
> -----
>   100

> I suspect this and similar cases used to error before aliases without AS 
> were introduced.  But now this seems possibly problematic.  Should we 
> try to handle this better?

Meh.  I think you'd get more brickbats than kudos if you start insisting
on a space there.

I'm too lazy to try to decipher the SQL spec right now, but ISTR that
it insists on whitespace between a numeric literal and an identifier.
So strictly speaking this SQL code is nonstandard anyway.  But our
lexer has always been forgiving about not requiring space if it's
not logically necessary to separate tokens.  I doubt trying to
change that would improve matters.

            regards, tom lane

Re: trailing junk in numeric literals

From

Peter Eisentraut

Date:

29 December 2020, 09:18:27

On 2020-12-28 21:54, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> I was surprised to find that this doesn't error:
>> => select 100a;
>>     a
>> -----
>>    100
> 
>> I suspect this and similar cases used to error before aliases without AS
>> were introduced.  But now this seems possibly problematic.  Should we
>> try to handle this better?
> 
> Meh.  I think you'd get more brickbats than kudos if you start insisting
> on a space there.
> 
> I'm too lazy to try to decipher the SQL spec right now, but ISTR that
> it insists on whitespace between a numeric literal and an identifier.

Yeah, non-delimiter tokens are supposed to be separated by delimiter tokens.

> So strictly speaking this SQL code is nonstandard anyway.  But our
> lexer has always been forgiving about not requiring space if it's
> not logically necessary to separate tokens.  I doubt trying to
> change that would improve matters.

Well, the idea is to diagnose potential typos better.  But if there is 
no interest, then that's fine.

Re: trailing junk in numeric literals

From

Fabien COELHO

Date:

29 December 2020, 10:24:13

Hello Peter,

My 0.02€:

>> So strictly speaking this SQL code is nonstandard anyway.  But our
>> lexer has always been forgiving about not requiring space if it's
>> not logically necessary to separate tokens.  I doubt trying to
>> change that would improve matters.
>
> Well, the idea is to diagnose potential typos better.  But if there is no 
> interest, then that's fine.

ISTM that silently accepting bogus syntax hides bugs rather than helps 
users. I'm personaly all for fixing these, especially when I'm said user.

My latest catch was:

   SELECT TIMESTAMP '2020-12-29Z06:16:18'; # 2020-12-29 00:00:00

But:

   SELECT TIMESTAMPTZ '2020-12-29Z06:16:18'; # 2020-12-29 07:16:18+01
   SELECT TIMESTAMP '2020-12-29T06:16:18'; # 2020-12-29 06:16:18

I happen to type a O which is close to 0 for which the shift key is also 
needed on the French keyboard. This makes the unhelpful:

   SELECT 12O; # 12 as O

I think that the policy should be to help user by detecting mistyped 
entries, not trying to interpret them out of the norm and expectations.

-- 
Fabien.

Re: trailing junk in numeric literals

From

Vik Fearing

Date:

16 January 2021, 13:02:56

On 12/29/20 10:18 AM, Peter Eisentraut wrote:
> On 2020-12-28 21:54, Tom Lane wrote:
>> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>>> I was surprised to find that this doesn't error:
>>> => select 100a;
>>>     a
>>> -----
>>>    100
>>
>>> I suspect this and similar cases used to error before aliases without AS
>>> were introduced.  But now this seems possibly problematic.  Should we
>>> try to handle this better?
>>
>> Meh.  I think you'd get more brickbats than kudos if you start insisting
>> on a space there.
>>
>> I'm too lazy to try to decipher the SQL spec right now, but ISTR that
>> it insists on whitespace between a numeric literal and an identifier.
> 
> Yeah, non-delimiter tokens are supposed to be separated by delimiter
> tokens.
> 
>> So strictly speaking this SQL code is nonstandard anyway.  But our
>> lexer has always been forgiving about not requiring space if it's
>> not logically necessary to separate tokens.  I doubt trying to
>> change that would improve matters.
> 
> Well, the idea is to diagnose potential typos better.  But if there is
> no interest, then that's fine.


I am in favor of such a change so that we can also accept 1_000_000
which currently parses as "1 AS _000_000" (which also isn't compliant
because identifiers cannot start with an underscore, but I don't want to
take it that far).

It would also allow us to have 0xdead_beef, 0o_777, and 0b1010_0000_1110
without most of it being interpreted as an alias.
-- 
Vik Fearing

Re: trailing junk in numeric literals

From

Andreas Karlsson

Date:

16 January 2021, 15:32:10

On 1/16/21 2:02 PM, Vik Fearing wrote:
> I am in favor of such a change so that we can also accept 1_000_000
> which currently parses as "1 AS _000_000" (which also isn't compliant
> because identifiers cannot start with an underscore, but I don't want to
> take it that far).
> 
> It would also allow us to have 0xdead_beef, 0o_777, and 0b1010_0000_1110
> without most of it being interpreted as an alias.

That would be a nice feature. Is it part of the SQL standard?

Andreas

Re: trailing junk in numeric literals

From

Vik Fearing

Date:

16 January 2021, 16:54:41

On 1/16/21 4:32 PM, Andreas Karlsson wrote:
> On 1/16/21 2:02 PM, Vik Fearing wrote:
>> I am in favor of such a change so that we can also accept 1_000_000
>> which currently parses as "1 AS _000_000" (which also isn't compliant
>> because identifiers cannot start with an underscore, but I don't want to
>> take it that far).
>>
>> It would also allow us to have 0xdead_beef, 0o_777, and 0b1010_0000_1110
>> without most of it being interpreted as an alias.
> 
> That would be a nice feature. Is it part of the SQL standard?

Yes, all of that is in the standard.
-- 
Vik Fearing

Re: trailing junk in numeric literals

From

Tom Lane

Date:

16 January 2021, 17:10:46

Vik Fearing <vik@postgresfriends.org> writes:
> On 1/16/21 4:32 PM, Andreas Karlsson wrote:
>> On 1/16/21 2:02 PM, Vik Fearing wrote:
>>> I am in favor of such a change so that we can also accept 1_000_000
>>> which currently parses as "1 AS _000_000" (which also isn't compliant
>>> because identifiers cannot start with an underscore, but I don't want to
>>> take it that far).
>>> It would also allow us to have 0xdead_beef, 0o_777, and 0b1010_0000_1110
>>> without most of it being interpreted as an alias.

>> That would be a nice feature. Is it part of the SQL standard?

> Yes, all of that is in the standard.

Really?  Please cite chapter and verse.  AFAICS in SQL:2011 5.3 <literal>,
a numeric literal can't contain any extraneous characters, just sign,
digits, optional decimal point, and optional exponent.  Hex and octal
literals are certainly not there either.

            regards, tom lane

Re: trailing junk in numeric literals

From

Vik Fearing

Date:

16 January 2021, 17:19:08

On 1/16/21 6:10 PM, Tom Lane wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>> On 1/16/21 4:32 PM, Andreas Karlsson wrote:
>>> On 1/16/21 2:02 PM, Vik Fearing wrote:
>>>> I am in favor of such a change so that we can also accept 1_000_000
>>>> which currently parses as "1 AS _000_000" (which also isn't compliant
>>>> because identifiers cannot start with an underscore, but I don't want to
>>>> take it that far).
>>>> It would also allow us to have 0xdead_beef, 0o_777, and 0b1010_0000_1110
>>>> without most of it being interpreted as an alias.
> 
>>> That would be a nice feature. Is it part of the SQL standard?
> 
>> Yes, all of that is in the standard.
> 
> Really?  Please cite chapter and verse.  AFAICS in SQL:2011 5.3 <literal>,
> a numeric literal can't contain any extraneous characters, just sign,
> digits, optional decimal point, and optional exponent.  Hex and octal
> literals are certainly not there either.

With respect, you are looking at a 10-year-old document and I am not.

5.3 <literal> has since been modified.
-- 
Vik Fearing

Re: trailing junk in numeric literals

From

Tom Lane

Date:

16 January 2021, 20:18:46

Vik Fearing <vik@postgresfriends.org> writes:
> With respect, you are looking at a 10-year-old document and I am not.
> 5.3 <literal> has since been modified.

Is a newer version of the spec available online?

            regards, tom lane