Thread: Bug in date arithmetic
Folks, While debugging an error with Aziz (postgres_newbie) Sharief in the #postgresql IRC channel, I found a major POLA violation: $ psql Welcome to psql 8.3.7, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g orterminate with semicolon to execute query \q to quit shackle@shackle:5432=# SELECT now() + '900000'; ?column? -------------------------------2009-09-03 19:03:43.195514-07 (1 row) shackle@shackle:5432=# SELECT now() - '900000'; ERROR: date/time field value out of range: "900000" HINT: Perhaps you need a different "datestyle" setting. I'd be tempted to call it a flat-out bug in the first case. Adding a string literal to a timestamptz should just result in an error, IMHO. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > While debugging an error with Aziz (postgres_newbie) Sharief in the > #postgresql IRC channel, I found a major POLA violation: I see no bug here. There is only one '+' operator with timestamptz as left input, and it is timestamptz plus interval, so the system takes the unknown literal as an interval. Possibly it should throw error instead of assuming that the unmarked value is in seconds, but I'll bet money that people are depending on that longstanding behavior. As for the other case, there are two possible interpretations: regression=# select oid::regoperator from pg_operator where oprname = '-' and oprleft = 'timestamptz'::regtype; oid -------------------------------------------------------(timestamp with time zone,timestamp with time zone)-(timestamp withtime zone,interval) (2 rows) and the first one is preferred due to an ancient and generally correct heuristic. I'm not sure why it's complaining about field overflow rather than syntax error when the literal is taken as a timestamp, but that's a pretty minor issue. regards, tom lane
On Mon, Aug 24, 2009 at 09:12:07AM -0700, David Fetter wrote: > shackle@shackle:5432=# SELECT now() + '900000'; > ?column? > ------------------------------- > 2009-09-03 19:03:43.195514-07 > (1 row) > > shackle@shackle:5432=# SELECT now() - '900000'; > ERROR: date/time field value out of range: "900000" > HINT: Perhaps you need a different "datestyle" setting. > > I'd be tempted to call it a flat-out bug in the first case. Adding a > string literal to a timestamptz should just result in an error, IMHO. But you're not adding a string literal, you're adding an interval of 900000 seconds, or 250 hours! This is the weirdness resulting from the lack of difference in syntax between string literals and other literals. i.e. SELECT timestamptz '2000-01-01'; => 2000-01-01 00:00:00+00 SELECT timestamptz '2000-01-01' + '900000'; => 2000-01-1110:00:00+00 SELECT timestamptz '2000-01-01' + interval '900000'; => 2000-01-11 10:00:00+00 SELECT timestamptz'2000-01-01' + 900000; => no operator matches (rhs is of type INT) SELECT timestamptz '2000-01-01' + text '900000'; => no operator matches (rhs is of type TEXT) It seems to be deciding the unknown type is of type date with subtraction for some reason: SELECT timestamptz '2000-01-01' + '900000'; => date/time field value out of range SELECT timestamptz '2000-01-01' - date'900000'; => date/time field value out of range Whereas you were expecting it to be using an interval as before: SELECT timestamptz '2000-01-01' - interval '900000'; => 1999-12-21 14:00:00+00 A "timestamptz + date" operator doesn't exist though, so it wouldn't wouldn't be able to pick it and is why you were seeing this odd behavior. Not sure what good fixes would be; a couple of simple (and bad) ones would be: 1) assume literals are of type text, unless otherwise specified. 2) require the type of the literal to be specified if there is any ambiguity The second option is nicer, but defining "any ambiguity" doesn't seem possible in general; worse, it would seem to change over time as operators/types were added/removed and would cause things that used to work to start breaking. Then again they would now, so I'm not sure why this would be worse. -- Sam http://samason.me.uk/
I wrote: > ... I'm not sure why it's complaining about field overflow > rather than syntax error when the literal is taken as a timestamp, > but that's a pretty minor issue. Oh, of course, it's because we allow this shorthand: regression=# select '900102'::timestamptz; timestamptz ------------------------1990-01-02 00:00:00-05 (1 row) so '900000'::timestamptz is seen as year (19)90, month 00, day 00, and "field out of range" is entirely sensible for that. Just out of curiosity, what were you *expecting* this to do? You obviously weren't expecting the literal to be taken as interval, but its contents are not very sane for any other likely interpretation. regards, tom lane
On Mon, Aug 24, 2009 at 01:18:46PM -0400, Tom Lane wrote: > I wrote: > > ... I'm not sure why it's complaining about field overflow > > rather than syntax error when the literal is taken as a timestamp, > > but that's a pretty minor issue. > > Oh, of course, it's because we allow this shorthand: > > regression=# select '900102'::timestamptz; > timestamptz > ------------------------ > 1990-01-02 00:00:00-05 > (1 row) > > so '900000'::timestamptz is seen as year (19)90, month 00, day 00, > and "field out of range" is entirely sensible for that. > > Just out of curiosity, what were you *expecting* this to do? > You obviously weren't expecting the literal to be taken as > interval, but its contents are not very sane for any other > likely interpretation. The gentleman in IRC was the one who was using the construct. I spell out my date arithmetic. :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Mon, Aug 24, 2009 at 01:18:46PM -0400, Tom Lane wrote: >> Just out of curiosity, what were you *expecting* this to do? > The gentleman in IRC was the one who was using the construct. Well, what did he think it would do? If it's a date it's invalid, and if it's not a date I'm not sure what he expected... regards, tom lane
On Mon, Aug 24, 2009 at 02:05:38PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Mon, Aug 24, 2009 at 01:18:46PM -0400, Tom Lane wrote: > >> Just out of curiosity, what were you *expecting* this to do? > > > The gentleman in IRC was the one who was using the construct. > > Well, what did he think it would do? If it's a date it's invalid, > and if it's not a date I'm not sure what he expected... I apologize for being obtuse. He expected '-' to subtract seconds, just as '+' added them. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > He expected '-' to subtract seconds, just as '+' added them. Ah. Well, what it boils down to is that in a scenario like known_type_expr operator unknown_literal we preferentially consider unknown_literal to be of the same type as the other operand. But there's no timestamp plus timestamp operator (doesn't seem like a sensible operation to me, does it to you?), so there's only one possible typing of the unknown literal, and it's not that. Plus and minus just aren't all that symmetrical in this situation. regards, tom lane
On Mon, Aug 24, 2009 at 02:31:35PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > He expected '-' to subtract seconds, just as '+' added them. > > Ah. Well, what it boils down to is that in a scenario like > > known_type_expr operator unknown_literal > > we preferentially consider unknown_literal to be of the same type as > the other operand. But there's no timestamp plus timestamp operator > (doesn't seem like a sensible operation to me, does it to you?), Doesn't seem sensible to me, either. > so there's only one possible typing of the unknown literal, and it's > not that. Plus and minus just aren't all that symmetrical in this > situation. I'm thinking that the unknown literal here should just cause an error in the case of '+'. Same with '-', for what it's worth. Cheers, David (There's something happening here. What is ain't exactly clear.) -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Aug 24, 2009 at 8:27 PM, David Fetter<david@fetter.org> wrote: > I'm thinking that the unknown literal here should just cause an error > in the case of '+'. Same with '-', for what it's worth. That would make a lot of people very unhappy. They expect things like select date + '1 year' to just work. The problem is that they would also like select date - '1year' and select date - '2000-01-01' to work. I wonder if we could get around this by inventing a new type date_or_interval which looks at the input and decides which it is using fairly strict rules. date_sub would take that type and do the appropriate operation based on what the constant had in it. Of course we still want to be able to do date1 - date2 or date1+intervalcolumn so we would need implicit casts from date and interval to this new type. The question is whether we could arrange things so this implicit cast doesn't cause ambiguity elsewhere. -- greg http://mit.edu/~gsstark/resume.pdf
Greg Stark <gsstark@mit.edu> writes: > I wonder if we could get around this by inventing a new type > date_or_interval which looks at the input and decides which it is > using fairly strict rules. date_sub would take that type and do the > appropriate operation based on what the constant had in it. Ick. This would be a pretty enormous blot on the type system to solve one special case, in a manner that would only be helpful to newbies who don't know that they ought to cast to eliminate the ambiguity. In fact it sounds quite a bit like the implicit-cast-to-text situations we just finished getting rid of ... regards, tom lane
On Mon, Aug 24, 2009 at 09:13:09PM +0100, Greg Stark wrote: > On Mon, Aug 24, 2009 at 8:27 PM, David Fetter<david@fetter.org> wrote: > > I'm thinking that the unknown literal here should just cause an > > error in the case of '+'. Same with '-', for what it's worth. > > That would make a lot of people very unhappy. They expect things > like > > select date + '1 year' > > to just work. We broke a lot more things than this when we got rid of implicit casts to TEXT. :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Greg Stark <gsstark@mit.edu> wrote: > David Fetter<david@fetter.org> wrote: >> I'm thinking that the unknown literal here should just cause an >> error in the case of '+'. Same with '-', for what it's worth. > > That would make a lot of people very unhappy. They expect things > like > > select date + '1 year' > > to just work. I realize I'm in a minority on this, but I would also prefer an error. I expect things like SELECT "date" + (INTERVAL '1' YEAR) to just work. This whole business of taking a perfectly explicit character string literal and treating it as an unknown literal until something in the surrounding context causes it to automagically resolve to any type kinda makes my skin crawl. I'd love to have a GUC to resolve character string literals to text when they're not in a context which makes them part of some other standard literal declaration. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > I realize I'm in a minority on this, but I would also prefer an error. > I expect things like > SELECT "date" + (INTERVAL '1' YEAR) > to just work. Uh, I think you're confused. That certainly works, and I didn't hear anyone proposing to change it. The issue is about undecorated literals. If we start throwing errors for those, the fallout will make the 8.3 implicit-cast changes look like a day at the beach. I believe that it would also violate the SQL spec in numerous places --- whether you like it or not, the concept of context-dependent type resolution is built into the standard. regards, tom lane
On Mon, Aug 24, 2009 at 07:48:06PM -0400, Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > > I realize I'm in a minority on this, but I would also prefer an > > error. I expect things like > > > > SELECT "date" + (INTERVAL '1' YEAR) > > > > to just work. > > Uh, I think you're confused. That certainly works, and I didn't > hear anyone proposing to change it. The issue is about undecorated > literals. If we start throwing errors for those, the fallout will > make the 8.3 implicit-cast changes look like a day at the beach. I > believe that it would also violate the SQL spec in numerous places > --- whether you like it or not, the concept of context-dependent > type resolution is built into the standard. As far as you can tell, does the standard speak to adding an untyped literal to a time format? The draft standard I have here lists, "Valid operators involving datetimes and intervals" as, Operand 1 Operator Operand 2 Result Type Datetime – Datetime Interval Datetime + or – Interval Datetime Interval + Datetime Datetime Interval + or – Interval Interval Interval * or / Numeric Interval Numeric * Interval Interval It's not crystal clear to me whether any type coercion behavior is mandated here, or which kind, if there is some. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Tom, > we preferentially consider unknown_literal to be of the same type > as the other operand. I can't really think of what other assumption we would make. Any time a user doesn't specify a type, they're taking pot luck. Me, I always use some_timestamp + INTERVAL 'value' .... -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
David Fetter <david@fetter.org> wrote: > On Mon, Aug 24, 2009 at 07:48:06PM -0400, Tom Lane wrote: >> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> > I realize I'm in a minority on this, but I would also prefer an >> > error. I expect things like >> > >> > SELECT "date" + (INTERVAL '1' YEAR) >> > >> > to just work. >> That certainly works, and I didn't hear anyone proposing to change >> it. Agreed. I never meant to suggest otherwise. >> The issue is about undecorated literals. At least for my part, I was talking about literals decorated with apostrophes -- what the spec calls "character string literals". >> If we start throwing errors for those, the fallout will >> make the 8.3 implicit-cast changes look like a day at the beach. Yeah. I'm sort of resigned to that behavior. I'd be pushing a GUC to make it optional if I thought there was any chance of that being acceptable to the community. I wouldn't consider suggesting a change to the default behavior because of the backwards compatibility issues. >> I believe that it would also violate the SQL spec in numerous >> places --- whether you like it or not, the concept of >> context-dependent type resolution is built into the standard. > > As far as you can tell, does the standard speak to adding an untyped > literal to a time format? The draft standard I have here lists, > "Valid operators involving datetimes and intervals" as, > > Operand 1 Operator Operand 2 Result Type > Datetime * Datetime Interval > Datetime + or * Interval Datetime > Interval + Datetime Datetime > Interval + or * Interval Interval > Interval * or / Numeric Interval > Numeric * Interval Interval > > It's not crystal clear to me whether any type coercion behavior is > mandated here, or which kind, if there is some. Unless there's been some change in recent versions of the spec which I haven't picked up on, the PostgreSQL treatment of character string literals is novel. I don't remember any concept of an "undecorated" character string literal being of type "unknown" and therefore easily taken to be non-character types, outside of the explicit literal declarations for other types where the character string literal is preceded by a keyword to cause special treatment. B'00101010' -- bit string literal X'01FE' -- hexadecimal literal DATE '2009-12-31' -- date literal etc. If there's something I've missed, I'd love for someone to tell me what section of which version of the spec to read. This one is not something I lose any sleep over, though. Our main body of production software uses a framework where we parse the SQL and emit query classes. Our parser is more strict. It's just the ad hoc queries where we see people getting surprised by such issues as the post which started this thread. When they do something like that, once I've figured out their problem, I just point out that I've always recommended the ANSI form for literals. -Kevin