Thread: to_timestamp() and timestamp without time zone
Hernán J. González
On 06/23/2011 09:01 AM, hernan gonzalez wrote: > to_timestamp() returns a TIMESTAMP WITH TIME ZONE > > Perhaps an alternative that returns a TIMESTAMP WITHOUT TIME ZONE > (which, BTW, is the default TIMESTAMP) > should be provided. Elsewhere, there is no direct-robust way of > parsing a TIMESTAMP WITHOUT TIME ZONE (which > represesents a "local date-time" which behaviour should be totally > independent of the timezone set in the server or > session). > > Of course, doing a simple cast like this will work ... "almost" always: > db=# select to_timestamp('2011-12-30 00:30:00','YYYY-MM-DD > HH24:MI:SS')::timestamp without time zone; > to_timestamp > --------------------- > 2011-12-30 00:30:00 > > Here the string is assumed to be the textual representation of a > "local date time" (no timezone specified or assumed, > just "the date and the hour that tell the wall calendar and the wall > clock"), which is parsed/converted to the proper > type (TIMESTAMP WITHOUT TIME ZONE). But what really happens here is > that the string is parsed as a physical > time using an implicit timezone (that of the session), and then, when > casted to a plain timezone, the calendar info > is recomputed (with the same TIMEZONE) and then the timezone info > discarded. This almost always works as expected, > regardless of the session timezone, because the same timezone is used > twice and the dependecy is cancelled... > but not always: > > db=# set TIMEZONE='America/Argentina/Buenos_Aires'; > db=# select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD > HH24:MI:SS')::timestamp without time zone; > to_timestamp > --------------------- > 2007-12-30 01:30:00 > > This is not, then, a fiable way of parsing a TIMESTAMP [WITHOUT TIME > ZONE] , and I think it's potentially dangerous. > Rather than being not viable, I'd argue that is is not correct. Rather, a simple direct cast will suffice: '2011-12-30 00:30:00'::timestamp without time zone Every feature and function in PostgreSQL is "potentially dangerous" - understanding them and using them correctly is the responsibility of the programmer. Time handling has lots of subtleties that take time to digest. It appears that you would like a timestamp of 2011-12-30 00:30:00 which you can get. But even so, there are places in the world where that time exists and other places in the world that it does not. If you try to force that timestamp into a zone where it doesn't exist, PostgreSQL makes a reasonable interpretation of the intended point in time. Cheers, Steve
Rather than being not viable, I'd argue that is is not correct. Rather, a simple direct cast will suffice:
'2011-12-30 00:30:00'::timestamp without time zone
Every feature and function in PostgreSQL is "potentially dangerous" - understanding them and using them correctly is the responsibility of the programmer. Time handling has lots of subtleties that take time to digest
. It appears that you would like a timestamp of 2011-12-30 00:30:00 which you can get. But even so, there are places in the world where that time exists and other places in the world that it does not.
If you try to force that timestamp into a zone where it doesn't exist, PostgreSQL makes a reasonable interpretation of the intended point in time.
On 06/23/2011 11:40 AM, hernan gonzalez wrote: > Rather than being not viable, I'd argue that is is not correct. > Rather, a simple direct cast will suffice: > '2011-12-30 00:30:00'::timestamp without time zone > > > That works only for that particular format. The point is that, for > example, if I have some local date time > stored as a string in other format ('30/12/2011 00:30:00') I cannot > reliably parse it as a TIMESTAMP. Which I should. Works here. I am in US PDT: select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ')::timestamp with time zone; to_timestamp ------------------------ 2011-12-30 00:30:00-08 > > Every feature and function in PostgreSQL is "potentially dangerous" > - understanding them and using them correctly is the responsibility > of the programmer. Time handling has lots of subtleties that take > time to digest > > > Thanks for the advice. But it's precisely in the role of a programmer > who has digested a good deal about date-time data and its subtleties, > and who is trying to use in a consistent an robust way date-time data > that I'm asking this question. Or rather, reporting this issue. > > . It appears that you would like a timestamp of 2011-12-30 00:30:00 > which you can get. But even so, there are places in the world where > that time exists and other places in the world that it does not. > > If you try to force that timestamp into a zone where it doesn't > exist, PostgreSQL makes a reasonable interpretation of the intended > point in time. > > > I strongly disagree. I'm not trying "to force that timestamp into a > zone" at all. I'm just telling postgresl to parse the string '30/12/2011 > 00:30:00' as a TIMESTAMP (without time zone), that is, to > parse/understand/store it as the abstract/civil (wall calendar+clock) > local datetime "30 dec 2011, 00 30 00 am" with NO association with a > timezone. Again works here: test=> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ')::timestamp without time zone; to_timestamp --------------------- 2011-12-30 00:30:00 Postgreql does not need to interpret anything here, and indeed > it works pefectly with this datetime if I store it in a TIMESTAMP > WITHOUT TIMEZONE (it stores/manipulates it internally as UTC, but the > programmer doesn't care about it, that is internal). Actually that is how timestamp with timezone are stored:) If you don't want to deal with time zones keep tz out of the loop. Store the values in timestamp without time zone. If you at any point store it in a timestamp with timezone or cast it to same you will change the value based on whatever offset is in effect at that time. That is what is supposed to happen. > IT's only this particular function TO_TIMESTAMP() that have this > problem, because it insists in "interpret" the local date time as a > datetime with timezone (and can't even tell it to use UTC). This is just > wrong. Yes you can: test=> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ') at time zone 'UTC'; timezone --------------------- 2011-12-30 08:30:00 > Hernán -- Adrian Klaver adrian.klaver@gmail.com
On 06/23/2011 11:40 AM, hernan gonzalez wrote:Works here. I am in US PDT:Rather than being not viable, I'd argue that is is not correct.
Rather, a simple direct cast will suffice:
'2011-12-30 00:30:00'::timestamp without time zone
That works only for that particular format. The point is that, for
example, if I have some local date time
stored as a string in other format ('30/12/2011 00:30:00') I cannot
reliably parse it as a TIMESTAMP. Which I should.
select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ')::timestamp with time zone;
to_timestamp
------------------------
2011-12-30 00:30:00-08
Again works here:
test=> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ')::timestamp without time zone;
to_timestamp
---------------------
2011-12-30 00:30:00
Yes you can:IT's only this particular function TO_TIMESTAMP() that have this
problem, because it insists in "interpret" the local date time as a
datetime with timezone (and can't even tell it to use UTC). This is just
wrong.
test=> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ') at time zone 'UTC';
timezone
---------------------
2011-12-30 08:30:00
...snip...On Thu, Jun 23, 2011 at 4:15 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:On 06/23/2011 11:40 AM, hernan gonzalez wrote:Works here. I am in US PDT:Rather than being not viable, I'd argue that is is not correct.
Rather, a simple direct cast will suffice:
'2011-12-30 00:30:00'::timestamp without time zone
That works only for that particular format. The point is that, for
example, if I have some local date time
stored as a string in other format ('30/12/2011 00:30:00') I cannot
reliably parse it as a TIMESTAMP. Which I should.
select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS ')::timestamp with time zone;
to_timestamp
------------------------
2011-12-30 00:30:00-08My point is to parse a TIMESTAMP WITHOUT TIME ZONE - and that that should NOT depend on the server/session TIMEZONE.Try this:# set TIMEZONE='XXX8';# select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD HH24:MI:SS')::timestamp;2007-12-30 00:30:00# set TIMEZONE='America/Argentina/Buenos_Aires';select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD HH24:MI:SS')::timestamp;2007-12-30 01:30:00
Every example here starts, at its core, with to_timestamp. That function returns a timestamp *with* time zone so of-course the current timezone setting will influence it. Stop using it - it doesn't do what you want.
If you cast directly to a timestamp *without* time zone you can take advantage of the many formats PostgreSQL supports.
See: http://www.postgresql.org/docs/9.0/interactive/datatype-datetime.html#DATATYPE-DATETIME-DATE-TABLE for supported formats. Note also that you can use "set datestyle" to match your MDY or DMY date formatting.
If the format you require is so obscure that PostgreSQL can't handle it out-of-the-box (and the one you have presented is completely vanilla), use the many string-handling functions to alter your input as necessary.
Cheers,
Steve
On 06/23/2011 01:07 PM, Steve Crawford wrote: > On 06/23/2011 12:30 PM, hernan gonzalez wrote: >> >> >> On Thu, Jun 23, 2011 at 4:15 PM, Adrian Klaver >> <adrian.klaver@gmail.com <mailto:adrian.klaver@gmail.com>> wrote: >> >> On 06/23/2011 11:40 AM, hernan gonzalez wrote: >> >> Rather than being not viable, I'd argue that is is not correct. >> Rather, a simple direct cast will suffice: >> '2011-12-30 00:30:00'::timestamp without time zone >> >> >> That works only for that particular format. The point is that, for >> example, if I have some local date time >> stored as a string in other format ('30/12/2011 00:30:00') I >> cannot >> reliably parse it as a TIMESTAMP. Which I should. >> >> >> Works here. I am in US PDT: >> >> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS >> ')::timestamp with time zone; >> >> to_timestamp >> ------------------------ >> 2011-12-30 00:30:00-08 >> >> >> My point is to parse a TIMESTAMP WITHOUT TIME ZONE - and that that >> should NOT depend on the server/session TIMEZONE. >> >> Try this: >> >> # set TIMEZONE='XXX8'; >> # select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD >> HH24:MI:SS')::timestamp; >> 2007-12-30 00:30:00 >> # set TIMEZONE='America/Argentina/Buenos_Aires'; >> select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD >> HH24:MI:SS')::timestamp; >> 2007-12-30 01:30:00 > ...snip... > > Every example here starts, at its core, with to_timestamp. That function > returns a timestamp *with* time zone so of-course the current timezone > setting will influence it. Stop using it - it doesn't do what you want. > > If you cast directly to a timestamp *without* time zone you can take > advantage of the many formats PostgreSQL supports. > > See: > http://www.postgresql.org/docs/9.0/interactive/datatype-datetime.html#DATATYPE-DATETIME-DATE-TABLE > for supported formats. Note also that you can use "set datestyle" to > match your MDY or DMY date formatting. > > If the format you require is so obscure that PostgreSQL can't handle it > out-of-the-box (and the one you have presented is completely vanilla), > use the many string-handling functions to alter your input as necessary. Possibly: test=> select (to_date('30/12/2007','DD/MM/YYYY') + '00:30'::time)::timestamp; timestamp --------------------- 2007-12-30 00:30:00 (1 row) > > Cheers, > Steve > -- Adrian Klaver adrian.klaver@gmail.com
On 06/23/2011 01:07 PM, Steve Crawford wrote: > On 06/23/2011 12:30 PM, hernan gonzalez wrote: >> >> >> On Thu, Jun 23, 2011 at 4:15 PM, Adrian Klaver >> <adrian.klaver@gmail.com <mailto:adrian.klaver@gmail.com>> wrote: >> >> On 06/23/2011 11:40 AM, hernan gonzalez wrote: >> >> Rather than being not viable, I'd argue that is is not correct. >> Rather, a simple direct cast will suffice: >> '2011-12-30 00:30:00'::timestamp without time zone >> >> >> That works only for that particular format. The point is that, for >> example, if I have some local date time >> stored as a string in other format ('30/12/2011 00:30:00') I >> cannot >> reliably parse it as a TIMESTAMP. Which I should. >> >> >> Works here. I am in US PDT: >> >> select to_timestamp('30/12/2011 00:30:00', 'DD/MM/YYYY HH24:MI:SS >> ')::timestamp with time zone; >> >> to_timestamp >> ------------------------ >> 2011-12-30 00:30:00-08 >> >> >> My point is to parse a TIMESTAMP WITHOUT TIME ZONE - and that that >> should NOT depend on the server/session TIMEZONE. >> >> Try this: >> >> # set TIMEZONE='XXX8'; >> # select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD >> HH24:MI:SS')::timestamp; >> 2007-12-30 00:30:00 >> # set TIMEZONE='America/Argentina/Buenos_Aires'; >> select to_timestamp('2007-12-30 00:30:00','YYYY-MM-DD >> HH24:MI:SS')::timestamp; >> 2007-12-30 01:30:00 > ...snip... > > Every example here starts, at its core, with to_timestamp. That function > returns a timestamp *with* time zone so of-course the current timezone > setting will influence it. Stop using it - it doesn't do what you want. > > If you cast directly to a timestamp *without* time zone you can take > advantage of the many formats PostgreSQL supports. > > See: > http://www.postgresql.org/docs/9.0/interactive/datatype-datetime.html#DATATYPE-DATETIME-DATE-TABLE > for supported formats. Note also that you can use "set datestyle" to > match your MDY or DMY date formatting. > > If the format you require is so obscure that PostgreSQL can't handle it > out-of-the-box (and the one you have presented is completely vanilla), > use the many string-handling functions to alter your input as necessary. Possibly: test=> select (to_date('30/12/2007','DD/MM/YYYY') + '00:30'::time)::timestamp; timestamp --------------------- 2007-12-30 00:30:00 (1 row) > > Cheers, > Steve > -- Adrian Klaver adrian.klaver@gmail.com
test=# SET datestyle to DMY;
SET
test=# select '30/12/2011 00:30:00'::timestamp;
timestamp
---------------------
2011-12-30 00:30:00
(1 row)
test=#
Every example here starts, at its core, with to_timestamp. That function returns a timestamp *with* time zone so of-course the current timezone setting will influence it. Stop using it - it doesn't do what you want.
If you cast directly to a timestamp *without* time zone you can take advantage of the many formats PostgreSQL supports.
See: http://www.postgresql.org/docs/9.0/interactive/datatype-datetime.html#DATATYPE-DATETIME-DATE-TABLE for supported formats. Note also that you can use "set datestyle" to match your MDY or DMY date formatting.
> -----Original Message----- > > > > Every feature and function in PostgreSQL is "potentially dangerous" > > - understanding them and using them correctly is the responsibility > > of the programmer. Time handling has lots of subtleties that take > > time to digest > > > > > > Thanks for the advice. But it's precisely in the role of a programmer > > who has digested a good deal about date-time data and its subtleties, > > and who is trying to use in a consistent an robust way date-time data > > that I'm asking this question. Or rather, reporting this issue. > > > > . It appears that you would like a timestamp of 2011-12-30 00:30:00 > > which you can get. But even so, there are places in the world where > > that time exists and other places in the world that it does not. > > > > If you try to force that timestamp into a zone where it doesn't > > exist, PostgreSQL makes a reasonable interpretation of the intended > > point in time. > > > > > > I strongly disagree. I'm not trying "to force that timestamp into a > > zone" at all. I'm just telling postgresl to parse the string > > '30/12/2011 00:30:00' as a TIMESTAMP (without time zone), that is, to > > parse/understand/store it as the abstract/civil (wall calendar+clock) > > local datetime "30 dec 2011, 00 30 00 am" with NO association with a > > timezone. OK, let us try a different approach (summary located at the end for convenience) If you ignore TimeZone then you are saying "EVERY (well-formed) TIME STRING IS VALID". It is only within a TimeZone that specific times can be considered "Invalid" according to the rules for that TimeZone. Or rather, in the absence of TimeZone rules all times are valid. However, to_timestamp(text,text) is a "TimeZone" dependent function and so you cannot pass in a time that is invalid in the "current" TimeZone. The alternative to using "to_timestamp(text,text)" is to "SET datestyle = ..." and using casting; which while valid, is yet another hoop to jump through by the programmer. Since the to_timestamp(text, text) function exists (which could also be emulated by using "SET datestyle") apparently the practice of requiring "SET datestyle" was determined undue hardship and thus the function was written. Fine. If you DO pass in a time that is invalid you can either throw an error or coerce the value into a valid time. Neither decision is best and it would be nice to give the user the ability to choose the desired behavior. Also, the option to "warn" instead of error would be nice. But, given the existing decision to coerce (without warning) instead of throwing an exception the question comes down to: Is there a TimeZone in which ALL TIMES ARE VALID? If there is not then any input into "to_timestamp(text, text)" is potentially invalid and thus at risk for SILENT COERCION to an invalid value. If that is the case; what is the proper and fool-proof method to get the "invalid" timestamp back? If there is you still require the user to both know which TimeZone that is and to issue a "SET TIMEZONE" prior to any call of "to_timestamp(text, text)" if they do indeed want to treat all "well-formed time strings" as being valid. In the example, "December 30, 2007 12:30:00 AM" is not a valid DateTime in "Buenos_Aires" and so the coercion occurs and "1:30:00 AM" is returned instead. The same time in 2006 and 2008 ARE valid and so the issue does not appear if you use those data points. This itself seems strange but as I am not from Argentina whether this is correct or buggy behavior I cannot say. Mind you I am running 9.0.4 on Windows 7 64bit. PostgreSQL currently does not offer a to_timestamp(text, text) like-function that evaluates/returns a simple "Timestamp" as opposed to a "TimestampTZ" - using the supplied function always evaluates the input time relative to the in-session TimeZone and thus could cause the time to be coerced if the input time is invalid in that TimeZone. Even if you are afraid to change the behavior of the existing to_timestamp(text, text) function having TimeZone agnostic functions, that always evaluate relative to the "Safe" TimeZone, and output a "Timestamp Without Time Zone" seems reasonable to avoid having people code a custom function just so they can "SET TIMEZONE = 'UTC'" prior to calling the existing to_timestamp(text, text) function. I do think throwing a warning during coercion would be nice so at least those programs relying on to_timestamp(text, text) will know WHEN it happens and can react accordingly. As for "Time handling has lots of subtleties that take time to digest"; a good programmer and API do their best to minimize the number of hidden subtleties to be learned. Even if the SQL standard doesn't properly address the issue doesn't mean the PostgreSQL implementation shouldn't strive to do things better and make life easier for the programmer. Reading paragraphs of text to learn how something works (and how to work around its limitations) is not as good as seeing multiple groups of functions that are all similar but in which each group provides unique abilities and restrictions. In this case seeing "to_timestamp_local(text, text)" and "to_timestamp(text, text)" would make it much more obvious to the user that special considerations are present when dealing with "timestamptz" (not all times are valid) compared to "timestamp" (all times are considered valid) and that the necessary code to implement the specific behavior is hidden behind each function - whatever that code is. I would maybe even add a "to_timestamp_strict(text, text)" function which handles timestamptz values but errors instead of coercing. The "to_timestamp(text, text)" would coerce but would RAISE NOTICE when it happens so those who care (or who are oblivious) will know about it. I cannot imagine that so many invalid timestamptz values are input as to "overflow the logs" even if no-one is looking - and since it is a runtime data issue there really isn't an external tool that can look at the static database and evaluate things "out-of-transaction". I know exploding the API can be just as bad as having too minimal of one but just from the fact that the defenders of the status-quo feel that "Time handling has lots of subtleties..." means that efforts to make it less subtle are warranted. Yes, I do sit in the Peanut Gallery but, like the OP, am not a novice when it comes to programming; and for much of that last paragraph I am liberally regurgitating from reading/learning. Workarounds: 1) SET datestyle = 'DMY'; + use "CAST ( '30/12/2007 00:30:00' AS timestamp )"; at this moment this is the most direct (and possibly only) way to do this [not positive all formats can be represented however...]; Casting is fairly smart (I think) so you really just need to get the date ORDER correct as opposed to explicitly specifying the format. Time should be fairly straight forward as order does not normally come into play. 2) SET TIMEZONE = 'Some Safe TimeZone Where All Times Are Valid' + to_timestamp(text, text)::timestamp; if you are going to use "SET" you might as well just change "datestyle" if your desired format can be represented in that form 2a) If no "Safe" TimeZone exists then either a "correction algorithm" is required or the only option is to "SET datestyle" and CAST 3) Consider other "SET" possibilities (lc_time, intervalstyle) for more precision [not tested] Solution: 1) Add "to_timestamp_local(text, text) -> timestamp" function which does the same thing as "to_timestamp(text, text) -> timestamptz" but in a TimeZone agnostic way Modifications: 1) Add "to_timestamp_strict(text, text) -> timestamptz" function that errors instead of coercing 2) Modify "to_timestamp(text, text) -> timestamptz" to emit a warning whenever a coercion occurs David J.
On Thursday, June 23, 2011 1:42:42 pm hernan gonzalez wrote: > Fair enough: to_timestamp doesn't do what I want, I must resort to casting. > > But it's rather unfortunate that a to_timestamp() function doesnt actually > parse a "timestamp" (which, is a alias to "timestamp without timezone", at > least since PG 7.3) . To split hairs to_timestamp() parses a string and returns a timestamp with time zone. What you want is a function that parses a string and does not editorialize the result. FYI the code that handles this is located in the source at: src/backend/utils/adt as formatting.c > > BTW, the "set datestyle" workaround can be umpractical in some scenarios > (case in point: I have a view that returns some column as "timestamp" by > doing that parsing from a text colum ) I even can't put a "SET DATESTYLE" > (or a SET TIMEZONE=UTC)" in a function, unless I make it VOLATILE, which I > certainly would not like. Might want to look at one of the non-pgsql languages that have their own timezone/date handling code. Basically do an end run around Postgres. > > Regards > > Hernán J. González -- Adrian Klaver adrian.klaver@gmail.com
> > Possibly: > test=> select (to_date('30/12/2007','DD/MM/YYYY') + > '00:30'::time)::timestamp; > timestamp > --------------------- > 2007-12-30 00:30:00 > (1 row) > > Great, so now I have to capture the date and time portion of the string separately AND I need to use two parameters instead of one. Contemplating and enumerating alternative solutions to the problem doesn't change the fact that the "to_timestamp(text, text)" function exists even though all of these alternatives work equally well for timestamptz. A naïve (or busy) user likely will attempt to use "to_timestamp(text, text)" when their input can be in either MDY or DMY format - even if they ultimately want to only deal with "timestamp". It looks like it does the right thing 100% of the time (yes, they should use boundary values in test regressions but...) and thus they give it no further thought; they most certainly will not go looking for "SET dateformat" or contemplate breaking their tidy String into two parts and dealing with each part separately. They also haven't fully contemplated how PostgreSQL handles TimeZones or that the input will be SILENTLY COERCED if it is invalid for the current TimeZone (you used to do this with type-casting as well...). They just say: "It works, I'll use it". The only thing, aside from a big/bold "WARNING" right next to the function/description, that they will likely see is another function that seems to do the same thing. They will probably look through and compare both functions closely in order to figure out why two functions are being provided that seem to do the same thing. Also, is this coercion noted in the documentation anywhere? I looked in the obvious locations (Data Type, Function, Appendix B). There should probably be something obvious, in the Data Type section, like: "When a Time Stamp with time zone is created the 'effective' time zone is determined and the input value is evaluated according to that time zone. If, due to Daylight Savings Time changes, the indicated point-in-time does not exist the time component is interpreted as if it were Standard Time and then converted to DST (commonly +1 hours) For example: '2007-12-30 00:30:00 ART' does not exist because '2007-12-30' is the day of the change to DST; the attempt to create a timestamptz with this value will result in '2007-12-30 01:30:00 ART' which then is stored as '2007-12-29 10:30:00 GMT' (ART = GMT - 3). Be aware that during DST-to-STD changeover there are no 'missing' times but there is no way to reliably specify whether you are dealing with the first or the second occurrence of the time on that particular day. The TimeZone specification does not allow one to specifically state '1:30AM during DST (1)' or '1:30AM during STD (2)'." David J.
On Thursday, June 23, 2011 6:18:18 pm David Johnston wrote: > Also, is this coercion noted in the documentation anywhere? I looked in > the obvious locations (Data Type, Function, Appendix B). There should > probably be something obvious, in the Data Type section, like: > > "When a Time Stamp with time zone is created the 'effective' time zone is > determined and the input value is evaluated according to that time zone. > If, due to Daylight Savings Time changes, the indicated point-in-time does > not exist the time component is interpreted as if it were Standard Time and > then converted to DST (commonly +1 hours) For example: '2007-12-30 > 00:30:00 ART' does not exist because '2007-12-30' is the day of the change > to DST; the attempt to create a timestamptz with this value will result in > '2007-12-30 01:30:00 ART' which then is stored as '2007-12-29 10:30:00 > GMT' (ART = GMT - 3). Be aware that during DST-to-STD changeover there > are no 'missing' times but there is no way to reliably specify whether you > are dealing with the first or the second occurrence of the time on that > particular day. The TimeZone specification does not allow one to > specifically state '1:30AM during DST (1)' or '1:30AM during STD (2)'." As I understand it, documentation patches are welcomed:) > > David J. -- Adrian Klaver adrian.klaver@gmail.com
On 06/23/2011 02:45 PM, David Johnston wrote: > ... > As for "Time handling has lots of subtleties that take time to digest"; a > good programmer and API do their best to minimize the number of hidden > subtleties to be learned.... I meant that time-calculations themselves have lots of issues and subtleties. The length of a day, month and a year are all varying periods of time leading to things like adding then subtracting a month does not return the original date: select '2011-01-31'::date + '1 month'::interval - '1 month'::interval; ?column? --------------------- 2011-01-28 00:00:00 The 30/360 accounting method takes care of this by simply assuming that every month has 30 days and there are 360 days in a year. There are plenty of shift-work systems and contracts that simply decree a shift to be 8-hours regardless if your shift is actually 7- or 9-hours due to DST changeover. Since DST changes are not synchronized to a common point-in-time worldwide, one can easily attempt to scheduled synchronized early-morning jobs between East and West coast that will fail when the East changes DST several hours ahead of the West. Even tracking what rule to apply is tricky. Parts of Arizona observe daylight savings. Others do not. And while we're at it, what about those pesky leap-seconds? Calculations for long prior dates/times have things like a few minute jump when (at least in the US) an interval crosses Sunday, November 18, 1883 ("the day of two noons"). And although October 1582 (Catholic regions) or September 1752 (Protestant regions/Unix-assumption) or later (Orthodox) are missing 10-days, PostgreSQL follows the SQL standard which does not show those dates as missing at all. There is also an assumption that date calculations continue backward in history prior to the actual development of the concept of time-zones. And, lacking prescience, calculations for future dates assume that time-zone definitions won't change so the answer you get today may not be the answer you get if you run the same calculation tomorrow. There are different definitions of when a year starts so be sure not to grab the wrong week-number or day-number - ISO and Julian are not the same. And, of course, everything starts with the ethnocentric assumption of what calendar system to use. From my experience, there is not a lot of good SQL support for data using Islamic, Chinese, Hebrew, Hindu, Iranian, Coptic or Ethiopian calendars. Until one considers which of the many issues inherent to date calculation may be important, one will not even know what assumptions to check for in the software being used. Cheers, Steve
> I meant that time-calculations themselves have lots of issues and subtleties. Fair enough, and I agree there is no magic API to solve the difficulties of adapting rational, logic based systems to a Calendar system last edited by the Pope and based upon the imperfect movement of Sol relative to Earth. But we've already detailed why this specific case could use some more attention. > Calculations for long prior dates/times have things like a few minute jump > when (at least in the US) an interval crosses Sunday, November 18, > 1883 ("the day of two noons"). And although October 1582 (Catholic > regions) or September 1752 (Protestant regions/Unix-assumption) or later > (Orthodox) are missing 10-days, PostgreSQL follows the SQL standard which > does not show those dates as missing at all. > This really falls into application knowledge since the vast majority of use-cases use fairly recent dates. Those who deal with long-ago dates should be expected to understand the limitations of their reality and would devise means to accommodate them. Likewise, from the omitted next paragraph, those who are relying on time need to take into consideration that changes happen. The effort to deal with that change is then trade-off against the cost of the failure occurring. In some/most cases, over a short timeframe, the proper solution is to be flexible and/or relative. Examples: Scheduler: Run the schedule the first chance you get when the "system" hour/minute is after/greater than the "schedule" hour/minute. You may or may not need to ensure that "schedule" hour/minute entries are sorted so those with a longer delay are completed first - just as they would be in normal circumstances. Hospital: Dispense the next dose 5 hours from now (about the broadest unit you can safely use is DAY). In this case the software should be able to "count" forward minute-by-minute, using the TimeZone rules to skip around if necessary, and determine whether 5 hours from 0:30 is 4:30, 5:30, or 6:30. The API implements this "counting" via the "addition operator". In theory adding "months/years" should be forbidden and a "procedure" that applies a consistent "rule set" should be used instead. Some standard ones can be provided and the user can always create their own. One possible rule would be that adding or subtracting months to a date that is the last day of its month always returns the last of the resultant month. Another rule/function could implement the current behavior where the day does not change (and then you have two variations - if the new date is invalid you either fail or coerce). While the discussion or core vs. extension comes up consider that many users and evaluators are going to look at the core first and, as I've said before, if they see something that appears like it will work they will just use it. So you'd either want to have no (or very minimal) time-oriented API or have it be full-featured (and also have a "save me from myself" quality to it - given time's complexities). All this said, I am getting worked up by this particular thread but, in reality, the status-quo does not truly harm me that I know of - but my usage of PostgreSQL is very light/flexible (lots of spare capacity). I'm coming at this both from a desire to learn more and also "what would I do if I was starting from scratch?". The best approach, since we are not starting from scratch, would be for interested and capable parties to work on a full-blown "time" extension that, while maybe less user friendly, is safer to use and much more explicit. However, there are likely to be some components in such an extension that would be forward-only and thus could be introduced to the core API without any difficulty (a "to_timestamp_abstract(text, text)->timestamp" function comes to mind - note the name change; see other recent post for reason). And since interested and capable are not mutually inclusive those who are interested but not capable would probably appreciate more than just "here is a workaround" from the community. At the same time, interested parties need to put together a precise and coherent proposal that can be discussed and acted upon - with a clear (even if possibly incorrect) assertion about why something is either wrong or difficult to use. > > There are different definitions of when a year starts so be sure not to grab > the wrong week-number or day-number - ISO and Julian are not the same. > Agreed; but people who are going to choose a calendar other than the Gregorian Calendar should be expected to learn and abide by the rules of that Calendar. The responsibility of the API is to correctly apply those rules (and help the user abide by them where possible). > And, of course, everything starts with the ethnocentric assumption of what > calendar system to use. From my experience, there is not a lot of good SQL > support for data using Islamic, Chinese, Hebrew, Hindu, Iranian, Coptic or > Ethiopian calendars. Supply and Demand. Iran is thinking about creating their own Internet - let THEM fund and develop a PostgreSQL extension for their Calendar... Taking Hernan's comments even further a "point-in-time" is: { Calendar, Location, DATE, TIME, LocationCode } which would allow you to say: "[TIME=1:30 AM] {LocationCode=DST} on [DATE], while in {Location=America_NewYork}, using {Calendar=Gregorian}". Unless converted even if you are in Vienna if you look at that particular time that is what you see. Using the rules, since every valid instant exists everywhere, the corresponding time in a different location can always be calculated. Whether or how to convert between Calendars would be different matter altogether. Also, do NOT go down the path of whether a particular Calendar is in use (has rules) for a particular location on a particular date (i.e., is an Iranian Calendar in America_NewYork even valid?). At some point you simply put down default rules that will apply when more specific rules are not provided. David J. Note: I am writing this and a response to Hernan at the same time (no pun intended...)
First: I would suggest your use of “Local Time” is incorrect and that you would be better off thinking of it as “Abstract Time”. My responses below go into more detail but in short you obtain a “Local” time by “Localizing” and “Abstract” time. The process of “Localization” requires a relevant “Locale” input which, for date/time values, is a “TimeZone”. Since you define your “Local Time” as being “Without Timezone” this is an inconsistency and so, because we want to define something without a TimeZone we need to rename “Local Time” to “Abstract Time”. And, no, “Wall Time” will not work either since a “Wall” exists “Somewhere” and thus is “Localized”.
You say: (I am applying the above directly to your two definitions)
1) “TimestampTZ” is an “INSTANT” - but what exactly is an Instant? In this case we have a “Wall Calendar” and a “Wall Clock” as a means of describing an instant. However, that “Wall” has to be “Somewhere” and, in combination, the calendar and clock have to display real and valid values according to that physical location. So Instant, by definition, means Local, which requires a TimeZone. So “TimeStampTZ” DOES imply a “TimeZone” via a definition of “INSTANT”.
2) “Timestamp” is a[n] [ABSTRACT]DATETIME (reworded to remove the prefix LOCAL which, from above, I feel is misleading). That this does not use TimeZone is correct.
On the other hand, "TIMESTAMP WITHOUT TIMEZONE" is a wholy different concept (neither 'wider' or narrow' type than the other). It's just the local calendar time, it's (conceptually) like a tuple of numbers{year,month,day,hour,min,sec, fracsecs}, the "date and time" as it's understood in the business-civil world, with the timezone information missing.
This is the type that should be used for that concept, when it's critical for me that 'If I stored "2011,09,11 at 23:59:59", I want to get that precise calendar date…
Your definition of “calendar time” is incomplete (though I do get your point). The date component is “local” because you have (implicitly) specified that you are using a “Gregorian Calendar-like” rule set. However, by omitting the “Location Time Rules (TimeZone)” you are in fact creating an Abstract Time and not anything that is guaranteed to be valid (meaningful) when “Localized”. As soon as you say “local” you must tell the computer what “local” means by specifying a TimeZone. Otherwise you simply have an Abstract Time based on the (I think) Babylonian system.
Docs should make clear this, and the fact that Postgresql currently lacks a "FULL" datetime type. This would correspond conceptually to the tuple {INSTANT,TIMEZONE} (which implies univocally also a LOCALDATETIME). Nor does it implemented the SQL standard "TIMESTAMP WITH TIMEZONE" (actually a {INSTANT,OFFSET} tuple). I'm ok with that, I agree that the later is crippled, and the former is complex and not standard, and that both have little support from upper layers.
Fair enough; but in reality, other than the 25 hour day issue the chosen implementation is quite useful. Once you have created a valid instance of a “timestamptz” you can change it to any TimeZone, using the proper rules, and are guaranteed to still have a valid value. So you might as well normalize the “storage” TimeZone as PostgreSQL does. My only qualm is coercing the input so that a valid “timestamptz” is always created. But even that isn’t a big deal if you indeed want to ensure that the value entered is a valid “timestampz”.
In this case you are using a function that returns a “timestamptz” while you are working with “timestamp”. You just said that they are completely different so the fact that this fails should be of no surprise. That a suitable function, that returns a “timestamp”, does not exist is the only real complaint. It has already been shown that the described behavior of a PostgreSQL “timestamp” is consistent with what you describe it should be. That it can be auto-casted to a “timestamptz” is a debatable point. But this again comes simply back to the decision to coerce the input of “timestamptz”. That is, in the vast majority of cases where the conversion makes sense the ability to directly cast is great. Casting is always necessary IF you want to convert your Abstract Time (i.e., “timestamp”) into a Local Time (i.e., “timestamptz”).
I’m ignoring the concept of “OFFSET” intentionally as that is likely to confuse the issue and I haven’t had time to fully contemplate that aspect of things.
David J.
Note: I am writing this post and a response to Steve at the same time (no pun intended…)
First: I would suggest your use of “Local Time” is incorrect and that you would be better off thinking of it as “Abstract Time”. My responses below go into more detail but in short you obtain a “Local” time by “Localizing” and “Abstract” time. The process of “Localization” requires a relevant “Locale” input which, for date/time values, is a “TimeZone”.
An instant is a point in the universal time, it's a physical concept, unrelated to world calendars. The time point at which the man first landed on the moon is an instant, as is the moment at which my server restarted. It is not related to a Timezone at all. We can specified it by some arbitrary convention (milliseconds passed since the first atomic explosion at Hiroshima), or by some human calendar at some place/moment: for example, the "wall date and clock used at New York". If (only if) you use a Gregorian Calendar to specify/show a instant, you need a date, a time and a timezone. (but you have many timezones to choose from - as you have several calendars - a timezone is not determined by an instant). A full datetime (date, time, timezone) implies an instant - but an instant does not imply a timezone.
I suggest to take a look at the Joda time API, which is one of the very few date-time API ("key concepts") that is generally though to cover quite completely and consistently these issues.
Hernán J. González
http://hjg.com.ar/
On Friday, June 24, 2011 10:37:43 am hernan gonzalez wrote: > > As I understand it, documentation patches are welcomed:) > > I'd indeed wish some radical changes to the documentation. > > To start with, the fundamental data type names are rather misleading; SQL > standard sucks here, true, but Postgresql also has its idiosincracies, and > the docs do not help much: > > http://www.postgresql.org/docs/9.0/interactive/datatype-datetime.html > > The most confusing thing is that PG's "TIMESTAMP WITH TIMEZONE" is NOT... a > timestamp with timezone! (not even in the crippled sense of the SQL > standard, which thinks of a "timestamp with offset"). It actually has no > relation with timezones. It's actually a plain "timestamp", as the world is > commonly used in computing, the "INSTANT" of time in which (typically) an > event happened. It's a physical concept, not related with civil things > (even with calendars). The problem is you are over complicating things. The Postgres timestamp with time zone is just that. It is a timestamp with timezone of UTC and an offset of 0. That is how it is stored. Given that, it is related to all timezones and is calendar aware. The confusion arises in how the value is represented to the end user. That is determined by the needs of the programmer/application/user. > > This is the type that should normally be used to record the time at which > an event happened (typically a record modification - like MYSQL uses the > world "TIMESTAMP"). > > On the other hand, "TIMESTAMP WITHOUT TIMEZONE" is a wholy different > concept (neither 'wider' or narrow' type than the other). It's just the > local calendar time, it's (conceptually) like a tuple of > numbers{year,month,day,hour,min,sec, fracsecs}, the "date and time" as it's > understood in the business-civil world, with the timezone information > missing. I am not sure what business-civil world you live in, but the one I live in is very timezone aware:) Local time is only relevant locally. The problem is sooner or later you will need to interface with someone who is not local and then the time comparison issues rear their head. At that point time zones become essential. > Hernan J Gonzalez -- Adrian Klaver adrian.klaver@gmail.com
On Sunday, June 26, 2011 12:57:15 pm hernan gonzalez wrote: > > An instant is a point in the universal time, it's a physical concept, > unrelated to world calendars. The time point at which the man first landed > on the moon is an instant, as is the moment at which my server restarted. > It is not related to a Timezone at all. We can specified it by some > arbitrary convention (milliseconds passed since the first atomic explosion > at Hiroshima), or by some human calendar at some place/moment: for > example, the "wall date and clock used at New York". If (only if) you use > a Gregorian Calendar to specify/show a instant, you need a date, a time > and a timezone. (but you have many timezones to choose from - as you have > several calendars - a timezone is not determined by an instant). A full > datetime (date, time, timezone) implies an instant - but an instant does > not imply a timezone. You might want to review the Theories of Relativity, which pretty much blew away the notion of an absolute time and introduced the notion of frame of reference for time. -- Adrian Klaver adrian.klaver@gmail.com
You might want to review the Theories of Relativity, which pretty much blew away
the notion of an absolute time and introduced the notion of frame of reference
for time.
--
Hernán J. González
http://hjg.com.ar/
On Sunday, June 26, 2011 12:57:15 pm hernan gonzalez wrote:
>
> An instant is a point in the universal time, it's a physical concept,
> unrelated to world calendars. The time point at which the man first landed
> on the moon is an instant, as is the moment at which my server restarted.
> It is not related to a Timezone at all. We can specified it by some
> arbitrary convention (milliseconds passed since the first atomic explosion
> at Hiroshima), or by some human calendar at some place/moment: for
> example, the "wall date and clock used at New York". If (only if) you use
> a Gregorian Calendar to specify/show a instant, you need a date, a time
> and a timezone. (but you have many timezones to choose from - as you have
> several calendars - a timezone is not determined by an instant). A full
> datetime (date, time, timezone) implies an instant - but an instant does
> not imply a timezone.
>
> I suggest to take a look at the Joda time API, which is one of the very few
> date-time API ("key concepts") that is generally though to cover quite
> completely and consistently these issues.
Took you advice and looked up the Joda API definition of an instant:
http://joda-time.sourceforge.net/key_instant.html
"The most frequently used concept in Joda-Time is that of the instant. An Instant is defined as an instant in the datetime continuum specified as a number of milliseconds from 1970-01-01T00:00Z. This definition of milliseconds is consistent with that of the JDK in Date or Calendar. Interoperating between the two APIs is thus simple. "
Look a lot like the Unix Epoch:)
"Within Joda-Time an instant is represented by the ReadableInstant interface. There are four implementations of the interface provided:
- Instant - A simple immutable implementation which is restricted to the UTC time zone and is intended for time zone and calendar neutral data transfer
- DateTime - The most commonly used class in the library, and an immutable representation of a date and time with calendar and time zone
- DateMidnight - Similar to DateTime and also immutable but with the time component forced to be midnight (at the start of a day)
- MutableDateTime - A mutable representation of date and time with calendar and time zone
We recommend the immutable implementations for general usage. "
There are those pesky time zones and calendars again.
--
Adrian Klaver
adrian.klaver@gmail.com
On Sunday, June 26, 2011 1:36:32 pm hernan gonzalez wrote: > > You might want to review the Theories of Relativity, which pretty much > > blew away > > the notion of an absolute time and introduced the notion of frame of > > reference > > for time. > > Well, I give up. As it happens I am currently reading a biography of Einstein, so my college physics memories are being refreshed. Makes time zone issues look mundane:) -- Adrian Klaver adrian.klaver@gmail.com
On Sunday, June 26, 2011 12:57:15 pm hernan gonzalez wrote:
>
> An instant is a point in the universal time, it's a physical concept,
> unrelated to world calendars. The time point at which the man first landed
> on the moon is an instant, as is the moment at which my server restarted.
> It is not related to a Timezone at all. We can specified it by some
> arbitrary convention (milliseconds passed since the first atomic explosion
> at Hiroshima), or by some human calendar at some place/moment: for
> example, the "wall date and clock used at New York". If (only if) you use
> a Gregorian Calendar to specify/show a instant, you need a date, a time
> and a timezone. (but you have many timezones to choose from - as you have
> several calendars - a timezone is not determined by an instant). A full
> datetime (date, time, timezone) implies an instant - but an instant does
> not imply a timezone.
While I agree with the relativity comment I would offer that you do have a solid (or at least practically accepted) understanding of date/time but seem to lack the ability to describe exactly what it is that the PostgreSQL modeling of date/time lacks. Using you “physical” definition for “Instant” what can you not do with a PostgreSQL “timestamptz” (or is much harder than you would like) that you would like to do?
I wholly agree that TimeZones are an “arbitrary convention” that are not “required” in order to describe time. However, they are in use by humans and ultimately the databases we create are “models” and thus those models reflect human conventions. Thus, it really does not matter, in the context of PostgreSQL, whether you can or even should describe an “Instant” without using a TimeZone. You have already said that a “TimeStampTZ” should represent an “Instant” and we have shown that it indeed does so.
Also, generally, the concept of “TimeZone” is not restricted only to a “Gregorian Calendar” and in fact could be used with any calendar. With respect to PostgreSQL currently you are correct. But now you’ve mixed “physical” concept assertions and “conventional” concept assertions into the same paragraph…
In PostgreSQL: An instant is represented by a “timestamptz” which by definition uses a TimeZone and thus “Instant”, in PostgreSQL, implies that there IS “TimeZone” (not which one, since, as you said, the choice of TimeZone is arbitrary on output). However, on input, this implication means that there HAS TO BE an in-effect TimeZone when interpreting and storing an “Instant”. Since you did not specify that you were dealing with an “Abstract” Instant in your original post we presume, this being a PostgreSQL list, that your definitions where meant to be taken in context of PostgreSQL.
Back to my original point; you seem to have something positive to contribute but for whatever reason you are failing to communicate in a way that we can understand. If it is because you are focusing on some theoretically perfect model of date/time I would suggest come down off the theory and put things into more practical terms. Focus less on definitions and more on properties and behaviors. If you cannot do that, giving up (and maybe coming back to it later) is probably a good idea. But don’t let a one-liner scare you off. Take it as a sign that you probably need to change your approach. Or, in this specific case, your abrupt introduction of “physical concept” 10 posts into the thread provoked an off-hand remark kind of like: “why didn’t you say you wanted to discuss theory – your bashing of the timestamptz data type made us think you actually wanted to deal with something practical”.
FWIW
David J.
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of hernan gonzalez
Sent: Sunday, June 26, 2011 3:57 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] to_timestamp() and timestamp without time zone
An analogy: a "localtime" is like a "relative path" in a filesystem.
A full datetime spefication (date, time, timezone) would correspond to a
absolute path.
So let us call it “Relative Time” instead of “Abstract/Local Time”. I like this better since the term relative begs the question: “relative to what” which then needs to be answered (by adding a TimeZone) to possibly represent an Instant (or absolute time). However there is no guarantee that the process of turning the relative path into an absolute path will succeed just as there is no guarantee that a relative time exists when associated with a specific TimeZone.
First: I would suggest your use of “Local Time” is incorrect and that you would be better off thinking of it as “Abstract Time”. My responses below go into more detail but in short you obtain a “Local” time by “Localizing” and “Abstract” time. The process of “Localization” requires a relevant “Locale” input which, for date/time values, is a “TimeZone”.
That's not the way in which the expression "Local (date) time" is normally used,
rather the opossite.
A "localtime" is normally a time which is to be understood relatively to some
unespecified timezone.
Precisely, when we (in common usage)
specify a datetime, we say a date, a time and then either add some
specific timezone OR do not state it : just say "localtime". That is, we either say
"The event happened 2011-10-03 12:00:00 , Eastern Time"
OR we say
"The event happened 2011-10-03 12:00:00 (localtime)"
"International offices will close the first semester at 2011-06-31 23:59:00 (localtime)"
In all the above examples the use of the “localtime” simply means “in the timezone in which the event physically happened or in which the post offices are physically located. It is exactly in this sense that I suggest LocalTime be used. The TimeZone itself may or may not be known but it is implied and present none-the-less. This make sense because any time specification of a physical event can and is specified by an instant. In fact, in pretty all common usage we are dealing with Instances and not “Relative Time”. This is because we are dealing with the past and thus the “location” is already known. It is when the location is unknown, in the future or if there could be multiple, that you supply the Abstract Time to the user and let them figure out at what Instant they should do something. So, saying “go to bed at 2AM local time” means “bed time is 2AM no matter where you are; when you arrive someplace attach your current TimeZone to the 2AM Abstract Time and go to bed at that Instant.
That's also how the word is used in APIs (see Joda time)
The Joda-Time API indeed defines its “LocalTime” is the same manner as I’ve suggested defining “Relative Time”; I would argue that their use of “LocalTime” is also mis-guided for the reasons already stated. Saying that someone else does something is generally a poor defense for a position. Group-think is useful but, as in this case, your supporting group is fallible; they made the same mistake as you (or rather your parroting of their position reflected their mistake as well). It is a subtle and inconsequential mistake since most people would make the same one and the actual definitions and implementations are consistent and serve the needed purpose. In the same way “timestamp” and “timestamptz” have their own “mistakes” surrounding them (due to the re-characterization during the version 7.x timeline) but they are consistently defined and implemented and provide the same two models that Joda-Time does – just under different labels. One sensible paraphrase of “LocalTime” - to include the prefix “local” - is “A time value that needs to be localized in order to possibly represent an instant.” This works but for this, as for other things generally, I first determine what it is I need to represent and then I scan the documentation for Names that seems to do similar things. I then read the description so see whether I have found the correct one. Naming perfection is not required; as long as the label is close in meaning I’ll generally find what is needed if it is there. Then I just associate that API/label with the behavior/feature I need and use it simply as-is – without an in-depth analysis of whether it is fully logical since – in programming – it is difficult to fix public “mistakes”.
David J.