Thread: Interval input: usec, msec
neilc=# select '1 day 1 millisecond'::interval; interval -------------------- 1 day 00:00:00.001 (1 row) neilc=# select '1 millisecond'::interval; ERROR: invalid input syntax for type interval: "1 millisecond" neilc=# select '0.001 seconds'::interval; interval -------------- 00:00:00.001 (1 row) Similarly for "microsecond". I think allowing the first input but not the second is pretty inconsistent. Attached is a patch that fixes this for both milliseconds and microseconds. The previous coding allowed "x seconds y milliseconds" to be specified, so this patch preserves that behavior. It *also* allows "x seconds y seconds" as valid input, although you could make a consistency argument for that anyway: (with CVS HEAD, unpatched) neilc=# select '5 days 10 days'::interval; interval ---------- 15 days neilc=# select '1 week 2 week 3 weeks'::interval; interval ---------- 42 days neilc=# select '1 second 2 second'::interval; ERROR: invalid input syntax for type interval: "1 second 2 second" Is there any reason to why DecodeInterval() is willing to accept multiple specifications for some time units but not others? -Neil
Attachment
Neil Conway <neilc@samurai.com> writes: > Is there any reason to why DecodeInterval() is willing to accept > multiple specifications for some time units but not others? I'd argue that it's an oversight. I don't have a problem with adding up the values of units that really translate to the same thing (eg, '1 week 1 day' -> '8 days'), but I think '1 second 2 second' should be rejected because it's almost certainly user error. Does your patch allow '1 millisec 2 microsec', which should be allowed by this argument? I suspect that to make it work unsurprisingly, we'd need to allocate a distinct tmask bit to each logically distinct unit. (Maybe shortage of tmask bits is why the code is like it is?) regards, tom lane
On Mon, 2007-28-05 at 10:50 -0400, Tom Lane wrote: > I'd argue that it's an oversight. I don't have a problem with adding up > the values of units that really translate to the same thing (eg, > '1 week 1 day' -> '8 days'), but I think '1 second 2 second' should > be rejected because it's almost certainly user error. I don't see why "1 week 1 week" is any less likely to be user error than "1 second 1 second". > Does your patch allow '1 millisec 2 microsec', which should be allowed > by this argument? Yes: in order to allow the above input, the straightforward coding also allows "1 second 2 second". > I suspect that to make it work unsurprisingly, we'd > need to allocate a distinct tmask bit to each logically distinct unit. Okay, attached is a patch that does this. To summarize, the changes are: * add tmask bits for msec, usec (I reordered the #defines to keep them logically contiguous, but AFAICS this isn't necessary) * if the user specifies multiple instances of usec, msec, or sec, reject as invalid input * if the user specifies a fractional second ("5.5 seconds"), also consider that to be millisecond and microsecond input from the POV of rejecting duplicate units. So "5.5 seconds 1 millisecond" will be rejected. Docs and regression tests updated. If people are happy with the above behavior, I'll commit this to HEAD shortly (today/tomorrow), and perhaps backport it: not accepting "1 msec" and similar inputs is a clear bug, IMHO. BTW, does anyone know why a few of the regression tests (tinterval, point, geometry, etc.) explicitly disable and then re-enable GEQO? AFAICS the regression tests are just doing fairly routine two-table joins, so GEQO will not be invoked with a sane configuration anyway. -Neil
Attachment
Neil Conway <neilc@samurai.com> writes: > On Mon, 2007-28-05 at 10:50 -0400, Tom Lane wrote: >> I'd argue that it's an oversight. I don't have a problem with adding up >> the values of units that really translate to the same thing (eg, >> '1 week 1 day' -> '8 days'), but I think '1 second 2 second' should >> be rejected because it's almost certainly user error. > I don't see why "1 week 1 week" is any less likely to be user error than > "1 second 1 second". Right. I guess you misunderstood me: I was arguing for rejecting double occurrences of the same unit name, but not occurrences of different unit names that we happen to map into the same interval field internally. IOW the behavior ought to be predictable without knowing which unit names map to the same field. > * add tmask bits for msec, usec (I reordered the #defines to keep > them logically contiguous, but AFAICS this isn't necessary) I forget --- are the tmask bits used in stored typmod values for intervals? It'd probably be best not to change the meanings of typmod bits, since those are visible to client code if it wants to look. > BTW, does anyone know why a few of the regression tests (tinterval, > point, geometry, etc.) explicitly disable and then re-enable GEQO? Hmmm ... if you check the cvs history for those tests you might find some evidence. regards, tom lane
On Mon, 2007-28-05 at 15:05 -0400, Tom Lane wrote: > Right. I guess you misunderstood me: I was arguing for rejecting double > occurrences of the same unit name, but not occurrences of different unit > names that we happen to map into the same interval field internally. Makes sense to me. I'll send a patch for this shortly (but I'll commit the patch for the usec/msec issue first -- this second change shouldn't be backpatched). > I forget --- are the tmask bits used in stored typmod values for > intervals? It'd probably be best not to change the meanings of typmod > bits, since those are visible to client code if it wants to look. According to datetime.h, only some of the tmask bits are used in stored typmods: YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND, and the patch doesn't modify any of those. So I think we're okay. > Hmmm ... if you check the cvs history for those tests you might find > some evidence. "Modify regression tests to allow GEQ optimizer (order results).", according to the 1997 CVS commit from Thomas Lockhart that added the lines in question. Presumably that is no longer relevant. Nothing unexpected happens if the disabling of GEQO is removed, so I'll apply the attached patch shortly barring any objections. -Neil
Attachment
Neil Conway <neilc@samurai.com> writes: > On Mon, 2007-28-05 at 15:05 -0400, Tom Lane wrote: >> Hmmm ... if you check the cvs history for those tests you might find >> some evidence. > "Modify regression tests to allow GEQ optimizer (order results).", > according to the 1997 CVS commit from Thomas Lockhart that added the > lines in question. Presumably that is no longer relevant. Some digging in the CVS history shows that the parameter now known as geqo_threshold didn't exist until here: 1997-05-31 22:55 momjian * src/: backend/optimizer/path/allpaths.c, include/optimizer/geqo.h: Enable GEQO for more than six tables, from Martin. so at the time Tom was griping about this, GEQO would indeed have randomized the results for even two-table joins. > Nothing > unexpected happens if the disabling of GEQO is removed, so I'll apply > the attached patch shortly barring any objections. No objection here. regards, tom lane
On Mon, 2007-28-05 at 13:54 -0400, Neil Conway wrote: > Okay, attached is a patch that does this. To summarize, the changes are: > > * add tmask bits for msec, usec (I reordered the #defines to keep > them logically contiguous, but AFAICS this isn't necessary) > * if the user specifies multiple instances of usec, msec, or sec, > reject as invalid input > * if the user specifies a fractional second ("5.5 seconds"), also > consider that to be millisecond and microsecond input from the > POV of rejecting duplicate units. So "5.5 seconds 1 millisecond" > will be rejected. Applied to HEAD, backported to 8.2 and 8.1 After I applied the patches, it occurred to me that this patch actually slightly changes previous behavior: the previous coding accepted inputs like "1 microsecond 1 microsecond 1 second", whereas the patch rejects this input. I don't think this is a major problem (input of this kind is likely a client bug), but I figured I'd mention it... -Neil
On May 29, 2007, at 0:06 , Neil Conway wrote: > Applied to HEAD, backported to 8.2 and 8.1 One thing I noticed when looking over the patch is that there are a few bare numbers in datetime.c such as 100000, 1000, 1e-3, and 1e-6. In timestamp.[hc] we've defined macros for conversions such as #define #define USECS_PER_SEC INT64CONST(1000000) I'd like to work up a patch that would add similar macros for datetime.c, in particular using the INT64CONST construction where appropriate. Thoughts? Michael Glaesemann grzm seespotcode net