Thread: BUG #1517: SQL interval syntax is accepted by the parser, but the interpretation is bogus
BUG #1517: SQL interval syntax is accepted by the parser, but the interpretation is bogus
From
"Roy Badami"
Date:
The following bug has been logged online: Bug reference: 1517 Logged by: Roy Badami Email address: roy@gnomon.org.uk PostgreSQL version: 8.0.1 Operating system: Solaris 9 Description: SQL interval syntax is accepted by the parser, but the interpretation is bogus Details: The parser accepts SQL interval syntax, but then silently ignores it, treating it as a zero interval. radius=# select date '2005-01-01' + interval '1' month; ?column? --------------------- 2005-01-01 00:00:00 (1 row) radius=# select timestamp '2005-01-1 00:00:00' + interval '1' minute; ?column? --------------------- 2005-01-01 00:00:00 (1 row) radius=#
Roy Badami wrote: > > The following bug has been logged online: > > Bug reference: 1517 > Logged by: Roy Badami > Email address: roy@gnomon.org.uk > PostgreSQL version: 8.0.1 > Operating system: Solaris 9 > Description: SQL interval syntax is accepted by the parser, but the > interpretation is bogus > Details: > > The parser accepts SQL interval syntax, but then silently ignores it, > treating it as a zero interval. > > radius=# select date '2005-01-01' + interval '1' month; > ?column? > --------------------- > 2005-01-01 00:00:00 > (1 row) > > radius=# select timestamp '2005-01-1 00:00:00' + interval '1' minute; > ?column? > --------------------- > 2005-01-01 00:00:00 > (1 row) Well, that certainly belongs in the 'bizarre' category. It should not accept that syntax. It should require the 'month' or 'minute' to be in single quotes. This is wrong: test=> select date '2005-01-01' + interval '1' month; ?column? --------------------- 2005-01-01 00:00:00 (1 row) This is right: test=> select date '2005-01-01' + interval '1 month'; ?column? --------------------- 2005-02-01 00:00:00 (1 row) In fact when the 'month' is outside the quotes, it modifies the 'interval', like this: test=> select date '2005-01-01' + interval '1 year' year to month; ?column? --------------------- 2006-01-01 00:00:00 (1 row) and in fact the '1' is taken to be 1 second: test=> select date '2005-01-01' + interval '1'; ?column? --------------------- 2005-01-01 00:00:01 (1 row) So, in fact these work just fine: test=> select date '2005-01-01' + interval '1' second; ?column? --------------------- 2005-01-01 00:00:01 (1 row) test=> select date '2005-01-01' + interval '1' hour to second; ?column? --------------------- 2005-01-01 00:00:01 (1 row) Do we need help in this area? Yes. Where? I don't know. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Well, that certainly belongs in the 'bizarre' category. It should not > accept that syntax. It should require the 'month' or 'minute' to be in > single quotes. No, it shouldn't; read the SQL spec. AFAICS the syntax select interval '1' month is perfectly spec-compliant. The variant select interval '1 month' is *not* per-spec, it is a Postgres-ism. Tom Lockhart was working on this stuff shortly before he decided that raising horses was a more interesting use of his spare time. It doesn't look like he ever quite finished. I tried several back versions of Postgres to see if it had ever operated correctly and the answer seems to be "no" :-( ... although we have managed to fail in more than one way over the years ... regards, tom lane
Tom> No, it shouldn't; read the SQL spec. AFAICS the syntax Tom> select interval '1' month is perfectly spec-compliant. The Tom> variant select interval '1 month' is *not* per-spec, it is a Tom> Postgres-ism. That is my understanding, though I don't have a copy of the spec (my reference is Date & Darwen's "A guide to the SQL standard") However, it may be better if the PostgreSQL parser rejected the syntax. The current behaviour would seem to be a smoking gun for people porting ANSI-compliant SQL applications (assuming such things exist :) to PostgreSQL. -roy
Tom> AFAICS the syntax Tom> select interval '1' month Tom> is perfectly spec-compliant. Well, it's not _perfectly_ spec compliant, because AIUI SELECTs without FROM clauses are a postgres-ism, too. But I'm just nitpicking... -roy
Roy Badami wrote: > > Tom> No, it shouldn't; read the SQL spec. AFAICS the syntax > Tom> select interval '1' month is perfectly spec-compliant. The > Tom> variant select interval '1 month' is *not* per-spec, it is a > Tom> Postgres-ism. > > That is my understanding, though I don't have a copy of the spec (my > reference is Date & Darwen's "A guide to the SQL standard") We have links to the spec in the developer's FAQ. > However, it may be better if the PostgreSQL parser rejected the > syntax. The current behaviour would seem to be a smoking gun for > people porting ANSI-compliant SQL applications (assuming such things > exist :) to PostgreSQL. So, we have a few major problems with intervals. Let me think a little and I will summarize. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> So, we have a few major problems with intervals. Let me Bruce> think a little and I will summarize. FWIW, AFAICT the problems I reported in bug 1517 and 1518 all relate to undocumented features of PostgreSQL. All the documented interval functionality works fine. The undocumented support for ANSI SQL interval data types and litereals doesn't :-/ -roy
Roy Badami <roy@gnomon.org.uk> writes: > All the documented interval functionality works fine. The > undocumented support for ANSI SQL interval data types and litereals > doesn't :-/ I think the reason it's not documented is precisely that Tom never finished it. It may not be very far away though --- seeing that the grammar support exists, I suspect the only missing piece is that interval_in isn't paying attention to the typmod info, as it should do to disambiguate input like '1'. Or maybe that support is partially there but doesn't quite work. Feel like hacking the code? regards, tom lane
Tom> Feel like hacking the code? Hmm, in principle I might take a look some time; in reality it's unlikely I'll have time any time soon... There are some design issues involved, though. If you have the type modifier, do you isnist on SQL syntax in the string? ie do you accept interval '1 day 1 hour' day to second Personally I think it would be a bad idea to allow hybrid SQL/postgres syntax like this. IMHO, you should either write interval '1 day 1 hour' (postgres style), or interval '1 1:00:00' day to second (SQL style.) Hmm, except writing the above has just raised another question. Is that what the postgres-ism really means (I think it does) or does it mean interval '1 1' day to hour Once you start distinguishing your interval types, does this become important? Actually, I can't immediately see a case where it would matter, but that doesn't mean there isn't one... -roy
Roy Badami <roy@gnomon.org.uk> writes: > ie do you accept > interval '1 day 1 hour' day to second I think we have to, and the reason is that this isn't different under the hood from reading the external value '1 day 1 hour' and storing it into a column that has the DAY TO SECOND typmod. If we reject the above we'd be breaking existing dump files. Furthermore this would imply that dump output from a constrained interval column would *have to* not have any decoration; ie we could only output '1 1' and not '1 day 1 hour'. Regardless of what the spec says, I find the former dangerously ambiguous. I'm happy to see our code upgraded to accept the spec's syntax. I won't be happy to see it changed to reject input that we used to accept, especially when the only argument for doing so is a narrow-minded insistence that we can't accept anything beyond what the spec says. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> ie do you accept interval '1 day 1 hour' day to second Tom> I think we have to, and the reason is that this isn't Tom> different under the hood from reading the external value '1 Tom> day 1 hour' and storing it into a column that has the DAY TO Tom> SECOND typmod. I don't know anything about the postgres internals, but I don't see it has to be this way. INTERVAL '1 day 1 hour' DAY TO SECOND won't occur in any existing dump file. But if it's important to treat this the same as casting the string '1 day 1 hour' to type INTERVAL DAY TO SECOND then yes, you'll have to accept it. But this is just syntax; I don't see why you have to interpret it that way... But on refelction if you want to treat INTERVAL 'postgres-interval' ansi-interval-type as equivalent to CAST (INTERVAL 'postgres-interval' AS INTERVAL ansi-interval-type) that's probably not unreasonable. Though it creates an inconsistency with the current (undocumented) postgresism of treating INTERVAL '1' as INTERVAL '1 second' since clearly you can't treat the ANSI interval INTERVAL '1' HOUR as CAST (INTERVAL '1 second' AS INTERVAL HOUR) -roy
Similary the undocumented postgresism of interpreting INTERVAL '1:02' as 1 hour 2 minutes is consistent with the ANSI INTERVAL '1:02' HOUR TO MINUTE but not with the ANSI INTERVAL '1:02' MINUTE TO SECOND which of course means 1 minute 2 seconds. The fact is that ANSI interval syntax is very different from postgres interval syntax. In ANSI interval syntax the literal string can only be interpreted in the context of the interval type; in postgres interval syntax the literal string has a well defined meaning in and of itself, and no interval type is explicitly declared. So I think I'm back to where I started. Attempting to define semantics for a hybrid format, where you have an ANSI interval type but the literal string formatted in postgres interval format is unnecessarity confusing and complicated. -roy
Roy Badami <roy@gnomon.org.uk> writes: > Similary the undocumented postgresism of interpreting > INTERVAL '1:02' > as 1 hour 2 minutes is consistent with the ANSI > INTERVAL '1:02' HOUR TO MINUTE > but not with the ANSI > INTERVAL '1:02' MINUTE TO SECOND > which of course means 1 minute 2 seconds. Well, that's an annoying case but I don't think it means we should throw up our hands and reject cases that are (a) perfectly unambiguous and (b) accepted by the present and past code. We have to be able to support casts from undecorated INTERVAL to INTERVALs with typmods, so most of these issues *have* to be dealt with anyway; we can't arbitrarily reject them. What I am thinking is that (a) if the input string is undecorated or ambiguous, use the typmod to help resolve it --- in particular this should cover all of the spec-mandated cases. (b) if it is unambiguous Postgres-style syntax, read it that way and then perform a cast to the restricted interval type. regards, tom lane
Roy Badami wrote: > Tom> Feel like hacking the code? > > Hmm, in principle I might take a look some time; in reality it's > unlikely I'll have time any time soon... > > There are some design issues involved, though. If you have the type > modifier, do you isnist on SQL syntax in the string? > > ie do you accept > > select interval '1 day 1 hour' day to second > > Personally I think it would be a bad idea to allow hybrid SQL/postgres > syntax like this. I am wondering why we allow the 'interval' data type specification to be after the string. It seems this should be written as: select interval day to second '1 day 1 hour' However, we don't support that syntax, only the one with the specification after. Timestamp does it logically with 'time zone': test=> select timestamp with time zone '2004-01-01'; timestamptz ------------------------ 2004-01-01 00:00:00-05 (1 row) test=> select timestamp '2004-01-01' with time zone; ERROR: syntax error at or near "with" at character 31 So, I am thinking we should allow something like this: interval day to second '1' hour where the 'day to second' is the interval data type specification and 'hour' is the data value specification. I realize this might break backward compatibility but we never documented that syntax and we need to use it to be consistent with the SQL standard. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> select interval day to second '1 day 1 hour' Bruce> However, we don't support that syntax, only the one with Bruce> the specification after. Is that valid ANSI SQL? -roy
Roy Badami wrote: > Bruce> select interval day to second '1 day 1 hour' > > Bruce> However, we don't support that syntax, only the one with > Bruce> the specification after. > > Is that valid ANSI SQL? No idea. It just seemed like the data type specification and the data value specification has to be split apart somehow. Right now we use the clause after the string as the date type specification, and I see you saying that the data value specification has to after the string. Is that correct? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> somehow. Right now we use the clause after the string as Bruce> the date type specification, and I see you saying that the Bruce> data value specification has to after the string. Is that Bruce> correct? Well, that's what 'A guide to the SQL standard' gives as the syntax for interval literals. Note too that this isn't just some optional qualifier, it's presence is mandatory in ANSI SQL, since without it you can't interpret the string. I had a brief look at the standard, but I don't know my way around it and couldn't immeidately find where this is specified... -roy
Bruce> test=> select timestamp with time zone '2004-01-01'; Also, FWIW, according to the postgres doc this is a postgresism. The 'with time zone' clause never occurs in an ANSI timestamp literal; whether it is a timestamp or a timestamp with time zone depends on whether a time zone specification is included in the literal string. -roy
Roy Badami wrote: > Bruce> select interval day to second '1 day 1 hour' > > Bruce> However, we don't support that syntax, only the one with > Bruce> the specification after. > > Is that valid ANSI SQL? I guess my point is that we should allow: select interval '1' day '1' hour as SQL standard and equavalent to: select interval '1 day 1 hour' and if we need to specify the data type we would add it before the data value: select interval day to second '1' day '1' hour I see no way to support the SQL syntax and allow the data type specification after the data value, and it doesn't make any sense to do so because it isn't logical and probably not used by anyone, though we might be able to support it as a specialized case. We could accept: select interval year to month '1' month select interval year to month '1 month' and as a special case: select interval '1 month' year to month but not: select interval '1' year to month select interval year to month '1' -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> I guess my point is that we should allow: Bruce> select interval '1' day '1' hour Bruce> as SQL standard and equavalent to: Ah, I think you're misunderstanding what the SQL standard interval literal syntax looks like. It would be INTERVAL '1 1' DAY TO HOUR Essentially the full syntax for a day-time interval is INTERVAL '1 2:03:04' DAY TO SECOND and the full syntax of a year-month interval is INTERVA: '1-2' YEAR TO MONTH but if you use a more restricted interval type you omit the fields that aren't present in your interval type.
Roy> It would be Roy> INTERVAL '1 1' DAY TO HOUR Actually, it would be any one of the following: INTERVAL '1 1' DAY TO HOUR INTERVAL '1 1:00' DAY TO MINUTE INTERVAL '1 1:00:00' DAY TO SECOND INTERVAL '25' HOUR INTERVAL '25:00' HOUR TO MINUTE INTERVAL '25:00:00' HOUR TO SECOND INTERVAL '1500' MINUTE INTERVAL '1500:00' MINUTE TO SECOND INTERVAL '90000' SECOND all of which are equivalent to the postgres interval INTERVAL '1 day 1 hour' but all of them have (in ANSI SQL) disitinct data types. At this point you may decide that ANSI intervals are (to put it politely) somewhat overcomplex... :-) -roy
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I am wondering why we allow the 'interval' data type specification to be > after the string. Because that's what the standard demands. Please don't muddy the waters by introducing yet more nonstandard syntax into the discussion. regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I guess my point is that we should allow: > select interval '1' day '1' hour > as SQL standard Where do you get that that's in the SQL standard? What is in the standard is <interval literal> ::= INTERVAL [ <sign> ] <interval string> <interval qualifier> <interval string> ::= <quote> { <year-month literal> | <day-time literal> } <quote> <interval qualifier> ::= <start field> TO <end field> | <single datetime field> <interval qualifier> is defined in SQL92 10.1 --- I won't repeat the whole BNF for it because it's tedious, but the point here is that you get *one* string and *one* qualifier, in that order. (I notice that we don't have the <sign> accounted for yet, btw, but the rest of the syntax seems to be up to speed.) regards, tom lane
Roy Badami <roy@gnomon.org.uk> writes: > ... but all of them have (in ANSI SQL) disitinct data types. They are distinct types, or at least different typmods, in Postgres as well. regards, tom lane
Added to TODO also: * Interpret INTERVAL '1:30' MINUTE TO SECOND as '1 minute 30 seconds' --------------------------------------------------------------------------- Tom Lane wrote: > Roy Badami <roy@gnomon.org.uk> writes: > > Similary the undocumented postgresism of interpreting > > INTERVAL '1:02' > > as 1 hour 2 minutes is consistent with the ANSI > > INTERVAL '1:02' HOUR TO MINUTE > > but not with the ANSI > > INTERVAL '1:02' MINUTE TO SECOND > > which of course means 1 minute 2 seconds. > > Well, that's an annoying case but I don't think it means we should throw > up our hands and reject cases that are (a) perfectly unambiguous and > (b) accepted by the present and past code. > > We have to be able to support casts from undecorated INTERVAL to > INTERVALs with typmods, so most of these issues *have* to be dealt with > anyway; we can't arbitrarily reject them. What I am thinking is that > (a) if the input string is undecorated or ambiguous, use the typmod > to help resolve it --- in particular this should cover all of the > spec-mandated cases. > (b) if it is unambiguous Postgres-style syntax, read it that way and > then perform a cast to the restricted interval type. > > regards, tom lane > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> * Interpret INTERVAL '1:30' MINUTE TO SECOND as '1 Bruce> minute 30 seconds' This seems redundant; it's just another example of the ANSI day-time interval syntax... -roy
Roy Badami wrote: > > Bruce> * Interpret INTERVAL '1:30' MINUTE TO SECOND as '1 > Bruce> minute 30 seconds' > > This seems redundant; it's just another example of the ANSI day-time > interval syntax... The reason I added it is that usually people think of the PG syntax as '1 hour 30 seconds'. The '1:30' is a kind of subtle because both PG and ANSI support that syntax and we need to handle that. The tricky part is that we can't say by looking at '1:30' whether it is PG or ANSI, and that will affect how we deal with the clause after it. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> The reason I added it is that usually people think of the Bruce> PG syntax as '1 hour 30 seconds'. The '1:30' is a kind of Bruce> subtle because both PG and ANSI support that syntax and we Bruce> need to handle that. The tricky part is that we can't say Bruce> by looking at '1:30' whether it is PG or ANSI, and that Bruce> will affect how we deal with the clause after it. It could be either in ANSI: INTERVAL '1:30' MINUTE TO SECOND INTERVAL '1:30' HOUR TO MINUTE Similarly, pg interprets INTERVAL '1' as INTERVAL '1 second' In ANSI, it could be any one of INTERVAL '1' SECOND INTERVAL '1' MINUTE INTERVAL '1' HOUR INTERVAL '1' DAY INTERVAL '1' MONTH INTERVAL '1' YEAR In ANSI you can only parse the string by reference to the interval type. -roy
Roy Badami wrote: > > Bruce> The reason I added it is that usually people think of the > Bruce> PG syntax as '1 hour 30 seconds'. The '1:30' is a kind of > Bruce> subtle because both PG and ANSI support that syntax and we > Bruce> need to handle that. The tricky part is that we can't say > Bruce> by looking at '1:30' whether it is PG or ANSI, and that > Bruce> will affect how we deal with the clause after it. > > It could be either in ANSI: > > INTERVAL '1:30' MINUTE TO SECOND > INTERVAL '1:30' HOUR TO MINUTE > > Similarly, pg interprets > > INTERVAL '1' > > as > > INTERVAL '1 second' > > In ANSI, it could be any one of > > INTERVAL '1' SECOND > INTERVAL '1' MINUTE > INTERVAL '1' HOUR > INTERVAL '1' DAY > INTERVAL '1' MONTH > INTERVAL '1' YEAR > > In ANSI you can only parse the string by reference to the interval type. This is a good point. I am thinking we are going to have some backwards compatibility problems here. Consider this TODO item: o Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1 year' AS INTERVAL MONTH), and this should return '12 months' In this, we know that '1 year' is PG syntax, and can do the cast cleanly. However, in the case of '01:02' or '1', we don't know if that is PG stynax or ANSI syntax. I think we are going to have to interpret those as ANSI. Here is a clearer TODO item. I changed this: o Interpret INTERVAL '1:30' MINUTE TO SECOND as '1 minute 30 seconds' to this: o Interpret syntax that isn't uniquely ANSI or PG, like '1:30' or '1' as ANSI syntax, e.g. interpret '1:30' MINUTE TO SECOND as '1 minute 30 seconds' This is going to be a backward compatibility problem, but to support ANSI syntax we are going to need to do this. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce> This is going to be a backward compatibility problem, but Bruce> to support ANSI syntax we are going to need to do this. Given the existence of the INTERVAL '1' MONTH etc syntax is completely undocumented, I doubt it's a major problem. (Actually, INTERVAL '1' and INTERVAL '1:30' are undocumented, too, but it's probably more likely that people are relying on that). -roy
Roy Badami wrote: > > Bruce> This is going to be a backward compatibility problem, but > Bruce> to support ANSI syntax we are going to need to do this. > > Given the existence of the INTERVAL '1' MONTH etc syntax is completely > undocumented, I doubt it's a major problem. > > (Actually, INTERVAL '1' and INTERVAL '1:30' are undocumented, too, but > it's probably more likely that people are relying on that). Yes, I am thinking that interval strings that contain letters are going to be PG, and anything else is ANSI. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > o Interpret syntax that isn't uniquely ANSI or PG, like '1:30' or > '1' as ANSI syntax, e.g. interpret '1:30' MINUTE TO SECOND as > '1 minute 30 seconds' > This is going to be a backward compatibility problem, but to support > ANSI syntax we are going to need to do this. We should do that *only* when an ISO (not ANSI) interval constraint is present. So plain INTERVAL '1' would still be read under PG rules. That would eliminate the backward-compatibility problem pretty well (since the constraints aren't documented and hence aren't being used now), while not posing a big problem for ISO cases (since if there's no constraint there are no ambiguous cases, I believe --- the ISO syntax would require all fields to be present). regards, tom lane
Tom> That would eliminate the backward-compatibility problem Tom> pretty well (since the constraints aren't documented and Tom> hence aren't being used now), while not posing a big problem Tom> for ISO cases (since if there's no constraint there are no Tom> ambiguous cases, I believe --- the ISO syntax would require Tom> all fields to be present). The 'constraint' (interval type descriptor or whatever it's really called) is mandatory in standard SQL, I think, so there's no ambiguity anyway, unless anyone is using this undocumented syntax at the moment...
Roy> The 'constraint' (interval type descriptor or whatever it's Roy> really called) is mandatory in standard SQL, I think, so Roy> there's no ambiguity anyway, unless anyone is using this Roy> undocumented syntax at the moment... Incidentally, this was the ratinale behind my earlier suggestion, that: * if the interval type descriptor is absent, parse the interval as a traditional postgres interval * if the interval type descriptor is present, parse the interval according to the standard syntax I have no objection to allowing things like '1 hour 10 minutes' DAY TO SECOND but I'm just wondering whether the hybrid syntax is an unnecessary complication. -roy
Roy> I have no objection to allowing things like Roy> '1 hour 10 minutes' DAY TO SECOND Roy> but I'm just wondering whether the hybrid syntax is an Roy> unnecessary complication. And furthermore, if you really want to allow constrained postgres syntax interval literals (and I can't at the moment see how constraining a literal is useful) then why *not* use the syntax Brian suggested: INTERVAL DAY TO SECOND '1 hour 10 minutes' This keeps the non-standard postgres syntax and the SQL standard syntax nicely separate... -roy
Tom> We should do that *only* when an ISO (not ANSI) Just to clarify, is that a distinction or just a clarification? (ie are ANSI and ISO SQL different?) -roy
Roy Badami <roy@gnomon.org.uk> writes: > Roy> The 'constraint' (interval type descriptor or whatever it's > Roy> really called) is mandatory in standard SQL, True. (<interval qualifier> is what SQL99 calls it.) > I have no objection to allowing things like > '1 hour 10 minutes' DAY TO SECOND > but I'm just wondering whether the hybrid syntax is an unnecessary > complication. In the context of interval literals it's probably unnecessary, but that's not the only thing to worry about. In particular we have to consider the behavior of the input and output routines for cases like COPY. I think it would be really bad to reject '1 hour 10 minutes' as data input into an interval field just because it has an ISO qualifier. Also, I would personally prefer to see the output from an interval field remain in the Postgres format, precisely because the ISO format is ambiguous without knowledge of the qualifier. (Possibly we should create a DateStyle-like GUC to determine that, but so far no one's requested one.) regards, tom lane
Roy Badami <roy@gnomon.org.uk> writes: > Tom> We should do that *only* when an ISO (not ANSI) > Just to clarify, is that a distinction or just a clarification? (ie > are ANSI and ISO SQL different?) There is no "ANSI SQL". Only ISO SQL. ANSI never issued their own spec to my knowledge ... unlike, say, C, where I believe ANSI C was in fact issued by ANSI. regards, tom lane
Tom> In the context of interval literals it's probably Tom> unnecessary, but that's not the only thing to worry about. Tom> In particular we have to consider the behavior of the input Tom> and output routines for cases like COPY. I think it would be Tom> really bad to reject '1 hour 10 minutes' as data input into Tom> an interval field just because it has an ISO qualifier. Hmm, but COPY is non-standard, so I'd be happy that it insisted on postgres interval syntax. ANSI interval syntax is confusing in this context, precisely because there is nowhere to actually put an 'interval qualifier' in the literals. Otherwise the fact that ALTERing a table to add a constraint will completely change the semantices of the COPYing data into the table worries me. I don't think that conceptually the qualifier on an interval type is really the same thing as the qualifier on an interval literal. Tom> Also, I would personally prefer to see the output from an Tom> interval field remain in the Postgres format Seconded. But that's fine, too, I think. Pretty much everything about an interactive SQL session is implementation defined. If people are using embedded SQL, or one of the libraries, than isn't it a non-issue? -roy
Tom Lane wrote: > There is no "ANSI SQL". Only ISO SQL. ANSI never issued their own > spec to my knowledge ... unlike, say, C, where I believe ANSI C was > in fact issued by ANSI. There is an ANSI SQL standard ANSI X3.135. At least my copy of SQL 1999 carries a combined ANSI/ISO label. Nevertheless, referring to ISO seems more appropriate. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Roy Badami <roy@gnomon.org.uk> writes: > Tom> In particular we have to consider the behavior of the input > Tom> and output routines for cases like COPY. > Hmm, but COPY is non-standard, so I'd be happy that it insisted on > postgres interval syntax. It's not different from INSERT INTO foo VALUES('1 year 1 month'); Nothing nonstandard about that that I can see. > ANSI interval syntax is confusing in this > context, precisely because there is nowhere to actually put an > 'interval qualifier' in the literals. Yes. The ISO design for the datatype is pretty brain-dead if you ask me --- the basic meaning of a data literal shouldn't be so dependent on context. Still, it's there, and we should make some effort towards supporting all but the really awfulest parts of it ;-) regards, tom lane
Tom> It's not different from Tom> INSERT INTO foo VALUES('1 year 1 month'); Tom> Nothing nonstandard about that that I can see. Oh, does ISO allow a cast from a string to an interval? Yes, I think it probably does. And with the interpretation of the string is dependent on the ISO interval type, just like for literals? Hmm, yes, I think it probably does... That convinces me that you have to figure out what to do with CAST ('1 year 1 month' AS INTERVAL MONTH) It _still_ doesn't completely convince me that you need to (or even want to) support hybrid interval syntax. Though given you have to support both pg and ISO syntax in the cast, I guess it becomes more consistent (and less implementation work) to support both in the literal, too... I'm concinced, I think. (For the moment :) -roy
Tom> Yes. The ISO design for the datatype is pretty brain-dead if Tom> you ask me --- the basic meaning of a data literal shouldn't Tom> be so dependent on context. Arguably it's not, because the interval qualifier is an intrinsic (and mandatory) part of the literal syntax, so the full ISO interval is completely unambigous. Where you run into problems is where you are casting strings to intervals, in which case the way the string is parsed is (rather unncessarily) dependent on the exact type of the interval. This area certainly seems (over) complex, although you can see the logic behind the design... -roy
Roy Badami <roy@gnomon.org.uk> writes: > That convinces me that you have to figure out what to do with > CAST ('1 year 1 month' AS INTERVAL MONTH) > It _still_ doesn't completely convince me that you need to (or even > want to) support hybrid interval syntax. > Though given you have to support both pg and ISO syntax in the cast, I > guess it becomes more consistent (and less implementation work) to > support both in the literal, too... Right. It's the same underlying routines; it would actually be pretty hard to make the above cast behave differently from input into a column of INTERVAL MONTH type, or from the interval-literal case. In all these cases we are taking a string and cramming it into the input conversion function for the INTERVAL datatype. The conversion function has access to the typmod (ie the encoded <interval qualifier>) but it doesn't know much more about the context than that. regards, tom lane
Roy Badami <roy@gnomon.org.uk> writes: > Tom> Yes. The ISO design for the datatype is pretty brain-dead if > Tom> you ask me --- the basic meaning of a data literal shouldn't > Tom> be so dependent on context. > Arguably it's not, because the interval qualifier is an intrinsic (and > mandatory) part of the literal syntax, so the full ISO interval is > completely unambigous. Well, it's okay *in the context of an interval literal*, where everything you need to know is right there, even if some of it is inside the quotes and some isn't. The problem comes as soon as you consider external data representations --- if you consider the bit inside the quotes to be the preferred textual representation, then you have a situation where you can't interpret the value without context information that's not directly attached to the data. This is why I say it's ambiguous. As you point out, the ISO spec doesn't actually say that that string literal has to be the external textual representation ... but it's surely pretty bizarre that it's not a good candidate to be the textual representation. regards, tom lane
Tom> As you point out, the ISO spec doesn't actually say that that Tom> string literal has to be the external textual representation Tom> ... but it's surely pretty bizarre that it's not a good Tom> candidate to be the textual representation. Yes, I'm convicned. Particularly after considering casts of strings to intervals... -roy
Tom Lane wrote: > Roy Badami <roy@gnomon.org.uk> writes: > > Tom> We should do that *only* when an ISO (not ANSI) > > Just to clarify, is that a distinction or just a clarification? (ie > > are ANSI and ISO SQL different?) > > There is no "ANSI SQL". Only ISO SQL. ANSI never issued their own > spec to my knowledge ... unlike, say, C, where I believe ANSI C was > in fact issued by ANSI. TODO updated. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Roy Badami wrote: > Roy> The 'constraint' (interval type descriptor or whatever it's > Roy> really called) is mandatory in standard SQL, I think, so > Roy> there's no ambiguity anyway, unless anyone is using this > Roy> undocumented syntax at the moment... > > Incidentally, this was the ratinale behind my earlier suggestion, that: > > * if the interval type descriptor is absent, parse the interval as a > traditional postgres interval > > * if the interval type descriptor is present, parse the interval according > to the standard syntax > > I have no objection to allowing things like > > '1 hour 10 minutes' DAY TO SECOND > > but I'm just wondering whether the hybrid syntax is an unnecessary > complication. Added to TODO: o For syntax that isn't uniquely ISO or PG syntax, like '1:30' or '1', treat as ISO if there is a range specification clause, and as PG if there no clause is present, e.g. interpret '1:30' MINUTE TO SECOND as '1 minute 30 seconds', and interpret '1:30' as '1 hour, 30 minutes' This brings up the issue of how we are to output interval values. Currently we do: test=> select interval '1 hour 1 minute'; interval ---------- 01:01:00 (1 row) and test=> select interval '1 hour 1 second' hour to minute; interval ---------- 01:00:00 (1 row) so I think we are OK because we don't output ambiguous syntax. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
test=> select interval '1 hour 1 minute'; interval ---------- 01:01:00 (1 row) Hmm, I don't think I really like having a seconds field in the output, given that the column is by definition only storing data to a precision of a minute. -roy
Roy Badami <roy@gnomon.org.uk> writes: > test=> select interval '1 hour 1 minute'; > interval > ---------- > 01:01:00 > (1 row) > Hmm, I don't think I really like having a seconds field in the output, > given that the column is by definition only storing data to a > precision of a minute. Leaving out the seconds would make the display ambiguous. regards, tom lane
Tom Lane wrote: > Roy Badami <roy@gnomon.org.uk> writes: > > test=> select interval '1 hour 1 minute'; > > interval > > ---------- > > 01:01:00 > > (1 row) > > > Hmm, I don't think I really like having a seconds field in the output, > > given that the column is by definition only storing data to a > > precision of a minute. > > Leaving out the seconds would make the display ambiguous. Agreed. What is really weird is that the time is always displayed for a zero value: test=> select interval '0 years' year; interval ---------- 00:00:00 (1 row) but a non-zero shows the proper units: test=> select interval '1 years' year; interval ---------- 1 year (1 row) Is that OK? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Agreed. What is really weird is that the time is always displayed for a > zero value: > test=> select interval '0 years' year; > interval > ---------- > 00:00:00 > (1 row) > but a non-zero shows the proper units: > test=> select interval '1 years' year; > interval > ---------- > 1 year > (1 row) > Is that OK? Well, it's a bit arbitrary, but I suppose it was done because otherwise a zero interval would show nothing at all ... regards, tom lane
Bruce Momjian wrote: > This is going to be a backward compatibility problem You say that as if it's a bad thing. In a previous thread, I think Bruce and Tom both commented on ripping out some of the weird undocumented interval behavior: http://archives.postgresql.org/pgsql-patches/2003-09/msg00134.php http://archives.postgresql.org/pgsql-patches/2003-09/msg00123.php There be a lot of dragons in the PostgreSQL interval syntax. Some examples from that old thread: Why is '0.001 years'::interval less than '0.001 months'::interval or While does PostgreSQL think the interval '1Y1M'::interval' means "1 year and one minute, which is confusing because the very similar ISO 8601 time interval 'P1Y1M' means "1 year and one month" to the ISO-8601 spec? At some point I think breaking backward computability for some of the weird undocumented behavior of PostgreSQL's interval syntax would be a good thing. Or perhaps a GUC variable for IntervalStyle kinda like how DateStyle lets you pick ISO or SQL or Postgres styles - but that's ugly in different ways.... Ron Personally I avoid these problems by still using this hack (a rejected patch http://archives.postgresql.org/pgsql-patches/2003-12/msg00196.php ) that allows ISO 8601 "Time Intervals With Time Unit Designators" in either ISO-8601's "basic format" or "extended format". If anyone's wants the patch for ISO-8601 ( ftp://ftp.qsl.net/pub/g1smd/154N362_.PDF) intervals, let me know and I can send a version ported to 8.X. In my mind ISO-8601 intervals actually make sense while the PostgreSQL intervals and the ISO-SQL intervals are confusing as heck.