Thread: Re: Problems with avg on interval data type
> We have recently upgraded from 7.0.3 to 7.1 and a query which used > to work is no longer working. > The query does an avg on an interval column and now gets the error: > ERROR: Bad interval external representation '0' OK, there is one case of interval constant which is not handled correctly in the 7.1.x release -- the simplest interval specification having only an unadorned integer. That is a bug, for which I have a patch (or patches) available. Before I post the patch (which should go into the 7.1.2 release as a bug fix) I need feedback on a conventions dilemma, which led to the code modifications which introduced the bug. Here it is: Intervals usually indicate a time span, and can be specified with either "# units" strings (e.g. '5 hours') or (as of 7.1) as "hh:mm:ss" (e.g. '05:00'). A new construct, "a_expr AT TIME ZONE c_expr" is supported in 7.1, per SQL99 spec. One of the possible arguments is a_expr AT TIME ZONE 'PST' and a_expr AT TIME ZONE INTERVAL '-08:00' It is this last style which leads to the problem of how to interpret signed or unsigned integers as interval types. For example, in this context INTERVAL '-8' must be interpreted as having units of "hours", while our historical usage has INTERVAL '8' being interpreted as "seconds" (even with signed values). Currently, we interpret various forms as follows: Value Units +8 hours -8 hours 8.0 seconds 8 ?? seconds ?? I would propose that the last example should be interpreted in units of seconds, but that could be perilously close to the conventions required for the signed examples. Comments? - Thomas
Thomas Lockhart <lockhart@alumni.caltech.edu> writes: >> We have recently upgraded from 7.0.3 to 7.1 and a query which used >> to work is no longer working. >> The query does an avg on an interval column and now gets the error: >> ERROR: Bad interval external representation '0' > OK, there is one case of interval constant which is not handled > correctly in the 7.1.x release -- the simplest interval specification > having only an unadorned integer. That is a bug, for which I have a > patch (or patches) available. I have modified the declaration of avg(interval) to initialize its accumulator as '0 second' instead of just '0', so as far as the aggregate goes it's not necessary to consider this a bug. > Currently, we interpret various forms as follows: > Value Units > +8 hours > -8 hours > 8.0 seconds > 8 ?? seconds ?? > I would propose that the last example should be interpreted in units of > seconds, but that could be perilously close to the conventions required > for the signed examples. Comments? Yipes. I do not like the idea that '+8' and '8' yield radically different results. That's definitely going to create unhappiness. I suggest that the current code is more correct than you think ;-). ISTM it is a good idea to require a units field, or at least some punctuation giving a clue about units --- for example I do not object to '08:00' being interpreted as hours and minutes. But I would be inclined to reject all four of the forms '+8', '-8', '8.0', and '8' as ambiguous. Is there something in the SQL spec that requires us to accept them? regards, tom lane
> > Value Units > > +8 hours > > -8 hours > > 8.0 seconds > > 8 ?? seconds ?? > > I would propose that the last example should be interpreted in units of > > seconds, but that could be perilously close to the conventions required > > for the signed examples. Comments? > Yipes. I do not like the idea that '+8' and '8' yield radically > different results. That's definitely going to create unhappiness. Yeah, I agree. The ugliness is that an unsigned integer has been accepted in the past as "seconds", and would seem to be a reasonable assumption. > I suggest that the current code is more correct than you think ;-). > ISTM it is a good idea to require a units field, or at least some > punctuation giving a clue about units --- for example I do not object to > '08:00' being interpreted as hours and minutes. But I would be inclined > to reject all four of the forms '+8', '-8', '8.0', and '8' as ambiguous. > Is there something in the SQL spec that requires us to accept them? Single-field signed integers (and unsigned integers?) must be acceptable for a time zone specification (pretty sure this is covered in the SQL spec). Remember that SQL is woefully inadequate for date, time, and time zone support, but afaicr a signed integer is one of the few things they do specify ;) - Thomas
Tom Lane writes: > I suggest that the current code is more correct than you think ;-). > ISTM it is a good idea to require a units field, or at least some > punctuation giving a clue about units --- for example I do not object to > '08:00' being interpreted as hours and minutes. But I would be inclined > to reject all four of the forms '+8', '-8', '8.0', and '8' as ambiguous. > Is there something in the SQL spec that requires us to accept them? Our interval is quite a bit different from the SQL version. In SQL, an interval value looks like this: INTERVAL -'5 12:30:15.3' DAY TO SECOND The unit qualifier is required. Consequentially, I would reject anything without units, except '0' maybe. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Thomas Lockhart <lockhart@alumni.caltech.edu> writes: >> I suggest that the current code is more correct than you think ;-). >> ISTM it is a good idea to require a units field, or at least some >> punctuation giving a clue about units --- for example I do not object to >> '08:00' being interpreted as hours and minutes. But I would be inclined >> to reject all four of the forms '+8', '-8', '8.0', and '8' as ambiguous. >> Is there something in the SQL spec that requires us to accept them? > Single-field signed integers (and unsigned integers?) must be acceptable > for a time zone specification (pretty sure this is covered in the SQL > spec). But surely there is other context cuing you that the number is a timezone? In any case, you weren't proposing that interval_in should accept '8' as a timezone ... regards, tom lane
> > Single-field signed integers (and unsigned integers?) must be acceptable > > for a time zone specification (pretty sure this is covered in the SQL > > spec). > But surely there is other context cuing you that the number is a > timezone? In any case, you weren't proposing that interval_in > should accept '8' as a timezone ... In the particular case I mentioned, any context is gone by the time the parser gets beyond gram.y (and that is well before the constant is evaluated). The general point is that in this case (and perhaps some other cases) there is a need to convert an interval into a time zone, so the specification of either or both had better be self consistant. We do not have an explicit timezone type which could make different assumptions about units on unadorned integers (and I'd like to avoid defining a new type for this purpose since SQL9x seems to think that "interval" should be enough). I'd also like to avoid *requiring* the full brain damage of SQL9x interval specifications such as Peter mentioned; we may support it, but should not require it since it is truely horrid. - Thomas