Thread: timestamp with time zone a la sql99
I've made a partial implementation of a datatype "timestamp with time zone" as described in the sql standard. The current type "timestamptz" does not store the time zone as a standard one should do. So I've made a new type I've called timestampstdtz that does store the time zone as the standard demands. Let me show a bit of what currently works in my implementation: dennis=# CREATE TABLE foo ( a timestampstdtz, primary key (a) ); dennis=# INSERT INTO foo VALUES ('1993-02-04 13:00 UTC'); dennis=# INSERT INTO foo VALUES ('1999-06-0114:00 CET'); dennis=# INSERT INTO foo VALUES ('2003-08-21 15:00 PST'); dennis=# SELECT a FROM foo; a ------------------------ 1993-02-04 13:00:00+00 1999-06-01 14:00:00+01 2003-08-2115:00:00-08 dennis=# SELECT a AT TIME ZONE 'CET' FROM foo; timezone ------------------------ 1993-02-0414:00:00+01 1999-06-01 14:00:00+01 2003-08-22 00:00:00+01 My plan is to make a GUC variable so that one can tell PG that constructs like "timestamp with time zone" will map to timestampstdtz instead of timestamptz (some old databases might need the old so unless we want to break old code this is the easiest solution I can find). I've made an implicit cast from timestampstdtz to timestamptz that just forgets about the time zone. In the other direction I've made an assignment cast that make a timestamp with time zone 0 (that's what a timestamptz is anyway). Would it be possible to make it implicit in both directions? I currently don't think that you want that, but is it possible? With the implicit cast in place I assume it would be safe to change functions like now() to return a timestampstdtz? I've not tried yet but I will. As far as I can tell the cast would make old code that use now() to still work as before. Any comments before I invest more time into this subject? -- /Dennis Björklund
Dennis Bjorklund <db@zigo.dhs.org> writes: > I've made a partial implementation of a datatype "timestamp with time > zone" as described in the sql standard. The current type "timestamptz" > does not store the time zone as a standard one should do. I'm aware that there are aspects of the spec behavior that appear to require that, but is it really an improvement over the implementation we have? This is an area in which the standard is pretty brain-dead --- the entire concept of a "time with time zone" datatype is rather suspect, for instance. In particular, I wonder how you will handle daylight-savings issues. The spec definition seems to preclude doing anything intelligent with DST, as they equate a timezone with a fixed offset from UTC. That's not how it works in (large parts of) the real world. regards, tom lane
On Thu, 21 Oct 2004, Tom Lane wrote: > > I've made a partial implementation of a datatype "timestamp with time > > zone" as described in the sql standard. The current type "timestamptz" > > does not store the time zone as a standard one should do. > > I'm aware that there are aspects of the spec behavior that appear to > require that, but is it really an improvement over the implementation > we have? Improvement and improvement. The actual time value is of course the same (the utc part of a timestamp) and the only thing extra you get is that the time zone is stored. The extra information you do have now, when stored in this way, is that you store both a utc time and a local time. Will any application ever need that? Who knows? I think it makes sense and is an easier model to think about then what pg uses today. So I would use it even if it means using 2 bytes more storage then what timestamptz do Just that it is standard also makes it useful. The more things of the standard we support the easier it is to move between databases. This is important to me. I also want to make a general statement that I think that whenever we use standard syntax we should give it a standard semantics. I don't mind extensions at all, but as much as we can we should make sure that they don't clash with standard syntax and semantics. > This is an area in which the standard is pretty brain-dead > --- the entire concept of a "time with time zone" datatype is rather > suspect, for instance. I havn't look that much at "time with time zone" yet, just timestamps. I can't see why time with time zone should not also be supported. I can't really imagine it being used without a date, but if someone wants to store timestamps as a date+time with time zone, then why not. It would be extra work tu is it instead of a timestamp (especially for cases where the time wraps over to the prev/next day), but hey. > In particular, I wonder how you will handle daylight-savings issues. > The spec definition seems to preclude doing anything intelligent with > DST, as they equate a timezone with a fixed offset from UTC. That's > not how it works in (large parts of) the real world. The tz in the standard is a offset from utc, yes. So when you store a value you tell it what offset you use. If you are using daylight-savings time it might be +02 and if not dst it might be +01. What else would you want to do with it? It's not like you can do anything else with it in pg as of today, can you? The stored tz does not say what region of the globe you are in, it says the distance away from utc in minutes that you are. I could imagine another datatype that stores the time zone as name, but that's not what timestamp with time zone does. -- /Dennis Björklund
On Thursday 21 October 2004 11:01, Dennis Bjorklund wrote: > On Thu, 21 Oct 2004, Tom Lane wrote: > > I'm aware that there are aspects of the spec behavior that appear to > > require that, but is it really an improvement over the implementation > > we have? > > Improvement and improvement. The actual time value is of course the same > (the utc part of a timestamp) and the only thing extra you get is that the > time zone is stored. The extra information you do have now, when stored in > this way, is that you store both a utc time and a local time. Will any > application ever need that? Who knows? I think it makes sense and is an > easier model to think about then what pg uses today. So I would use it > even if it means using 2 bytes more storage then what timestamptz do > In a fit of early morning, pre-coffee thoughts, I'm thinking this might be just what I've been looking for. In one of my apps we take calls from around the country for customers and store the time that call came in. Unfortunately we need to know things like how many calls did we take in an hour across customers, but also how many calls did we take at 6AM local time to the customer. The way PostgreSQL works now, you have to store some extra bits of info in another column and then reassemble it to be able to determine those two queries, but it sounds like your timestampstdtz would allow that information to be stored together, as it should be. -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes: > In a fit of early morning, pre-coffee thoughts, I'm thinking this might be > just what I've been looking for. In one of my apps we take calls from around > the country for customers and store the time that call came in. Unfortunately > we need to know things like how many calls did we take in an hour across > customers, but also how many calls did we take at 6AM local time to the > customer. The way PostgreSQL works now, you have to store some extra bits > of info in another column and then reassemble it to be able to determine > those two queries, but it sounds like your timestampstdtz would allow that > information to be stored together, as it should be. As far as I can tell, Dennis is planning slavish adherence to the spec, which will mean that the datatype is unable to cope effectively with daylight-savings issues. So I'm unconvinced that it will be very helpful to you for remembering local time in addition to true (universal) time. regards, tom lane
On Fri, 22 Oct 2004, Tom Lane wrote: > As far as I can tell, Dennis is planning slavish adherence to the spec, > which will mean that the datatype is unable to cope effectively with > daylight-savings issues. So I'm unconvinced that it will be very > helpful to you for remembering local time in addition to true > (universal) time. And exactly what issues is it that you see? The only thing I can think of is if you have a timestamp and then add an interval to it so we jump past the daylight saving time change date. Then the new timestamp will keep the old timezone data of say +01 even though we now have jumped into the daylight saving period of +02. If you are just storing actual timestamps then the standard definition works just fine. If I store '2004-10-22 16:20:04 +02' then that's exactly what I get back. No problem what so ever. There is no DST problem with that. It's possible that I will introduce some daylight saving bit or something like that, I'm not sure yet and I will not commit to anything until I've thought it over. I don't think there are that much of a problem as you claim however. Could you give a concret example where it will be a problem? My current thinking is that storing the time zone value as HH:MM is just fine and you avoid all the problems with political changes of when the DST is in effect or not. -- /Dennis Björklund
On Fri, Oct 22, 2004 at 16:28:12 +0200, Dennis Bjorklund <db@zigo.dhs.org> wrote: > On Fri, 22 Oct 2004, Tom Lane wrote: > > > As far as I can tell, Dennis is planning slavish adherence to the spec, > > which will mean that the datatype is unable to cope effectively with > > daylight-savings issues. So I'm unconvinced that it will be very > > helpful to you for remembering local time in addition to true > > (universal) time. > > And exactly what issues is it that you see? The only thing I can think of > is if you have a timestamp and then add an interval to it so we jump past > the daylight saving time change date. Then the new timestamp will keep the > old timezone data of say +01 even though we now have jumped into the > daylight saving period of +02. I think for just storing values you are fine. When it comes to adding or subtracting intervals you might get some unexpected results.
Dennis Bjorklund <db@zigo.dhs.org> writes: > And exactly what issues is it that you see? The only thing I can think of > is if you have a timestamp and then add an interval to it so we jump past > the daylight saving time change date. Then the new timestamp will keep the > old timezone data of say +01 even though we now have jumped into the > daylight saving period of +02. Isn't that sufficient? You can't design a datatype by thinking only of the data values it stores; you have to think about the operations you intend to provide as well. A non-DST-capable timestamp datatype is inherently a few bricks shy of a load. (BTW we really need to fix the interval type as well...) At bottom, what I want to be able to do is say'2004-10-22 10:50:16.916003 America/New_York' and have the datatype preserve *all* of the information in that. You are complaining because the existing type only remembers the equivalent universal time and not the timezone spec. Why should I be satisfied if it stores only the GMT offset and not the knowledge of which timezone this really is? > My current thinking is that storing the time zone value as HH:MM is > just fine and you avoid all the problems with political changes of when > the DST is in effect or not. This is fundamentally misguided. Time zones *are* political whether you like it or not, and people *do* expect DST-awareness whether you like it or not. If you still use any computer systems that need to be reset twice a year because their designers thought DST was not their problem, don't you roundly curse them every time you have to do it? If you were planning to store a real (potentially DST-aware) timezone spec in the data values, I'd be happy. But storing a fixed GMT offset is going to be a step backwards compared to existing functionality. The fact that it's sufficient to satisfy the DST-ignorant SQL spec does not make it a reasonable design for the real world. One way to do this would be to create a system catalog with entries for all known timezones, and then represent timestamptz values as universal time plus an OID from that catalog. There are other ways that small integer codes could be mapped to timezones of course. regards, tom lane
On Fri, 22 Oct 2004, Tom Lane wrote: > At bottom, what I want to be able to do is say > '2004-10-22 10:50:16.916003 America/New_York' Yes, that's what we said in the last mail and I think there is a value in having something like this. > universal time and not the timezone spec. Why should I be satisfied if > it stores only the GMT offset and not the knowledge of which timezone > this really is? You don't need to be satisfied with it. I think a type like the above would be fine to have. It should however not be called "TIMESTAMP WITH TIME ZONE" because there is already a definition of that type. We can not hijack standard types. I would not mind a type like TIMESTAMP WITH TIME ZONE NAME (or some other name). I could even imagine that I could implement something like that one day. > > My current thinking is that storing the time zone value as HH:MM is > > just fine and you avoid all the problems with political changes of when > > the DST is in effect or not. > > This is fundamentally misguided. Time zones *are* political whether you > like it or not, and people *do* expect DST-awareness whether you like it > or not. And I never said that time zones are not political, just that HH:MM is a usable approximation that works fairly well. > But storing a fixed GMT offset is going to be a step backwards compared > to existing functionality. It's not a step backwards since you can do everything you can do with the current type plus a little bit more. It's however not a step to the datatype discussed above. > One way to do this would be to create a system catalog with entries for > all known timezones, and then represent timestamptz values as universal > time plus an OID from that catalog. There are other ways that small > integer codes could be mapped to timezones of course. This is just fine. You try to make it sound like I am against such a datatype, I am not. It's however not the datatype that we can expect applications and other databases to use. So why should we settle for only that type. Just because you can make a perfect datatype it doesn't mean that the standard datatype should just be ignored. What would you store when the user supplies a timestamp like '2004-10-22 17:21:00 +0200'. Should you reject that because you don't know the time zone name? So your datatype will not work for applications that try to be compatable with many databases by using the standard? Maybe one could make a datatype called TIMESTAMP WITH TIME ZONE that can accept both HH:MM and TimeZoneName. Whenever you store values with HH:MM time zones you will get the same problem when you add an interval as the standard type has. -- /Dennis Björklund
Dennis Bjorklund <db@zigo.dhs.org> writes: > You don't need to be satisfied with it. I think a type like the above > would be fine to have. It should however not be called "TIMESTAMP WITH > TIME ZONE" because there is already a definition of that type. We can not > hijack standard types. Sure we can, as long as they are upward compatible with the standard behavior. The spec says you can put a numeric-GMT-offset zone in and get a numeric-GMT-offset zone out. We can do that and also support named, possibly DST-aware zones. This seems a whole lot better to me than having two different types (the idea of a GUC variable to choose which one is selected by a given type name is just horrid). >> But storing a fixed GMT offset is going to be a step backwards compared >> to existing functionality. > It's not a step backwards since you can do everything you can do with the > current type plus a little bit more. ... except get useful answers from interval addition ... > What would you store when the user supplies a timestamp like '2004-10-22 > 17:21:00 +0200'. Should you reject that because you don't know the > time zone name? You are attacking a straw man. We have put a great deal of work into 8.0 to add the ability to support real-world zones fully. We did not import src/timezone because we needed it to implement the SQL spec; we did so because we needed it to implement what real users want. We are not fully there yet (can't do AT TIME ZONE conversions with all zones yet, for instance) but I am hoping to be there by 8.1. It would be folly to invent a timestamp with time zone type that is going in the other direction while we are trying to bring the rest of the system up to full speed by allowing all timezone kinds everywhere. regards, tom lane
On Fri, 22 Oct 2004, Tom Lane wrote: > than having two different types (the idea of a GUC variable to choose > which one is selected by a given type name is just horrid). That is needed no matter what change you do if you want old programs that use the current timestamp with time zone to work. Today you don't get back the same time zone as you insert, programs might depend on that. > We are not fully there yet (can't do AT TIME ZONE conversions with all > zones yet, for instance) Why is that? When one start with a utc value, performing a AT TIME ZONE operation doesn't look so complicated. -- /Dennis Björklund
Dennis Bjorklund <db@zigo.dhs.org> writes: > On Fri, 22 Oct 2004, Tom Lane wrote: >> than having two different types (the idea of a GUC variable to choose >> which one is selected by a given type name is just horrid). > That is needed no matter what change you do if you want old programs that > use the current timestamp with time zone to work. Today you don't get back > the same time zone as you insert, programs might depend on that. [ shrug... ] We've made much larger changes than that in the name of standards compliance. In practice I think the majority of apps are working in contexts where they will get back the same zone as they inserted, if they inserted a zone explicitly at all, so the risk of breakage is not that high. Having a GUC variable that changes the semantics underneath you is *much* riskier, to judge by past experience. >> We are not fully there yet (can't do AT TIME ZONE conversions with all >> zones yet, for instance) > Why is that? Because it's not done yet. There's a set of GMT-offset-only zone names wired into the datetime code (look in the "datetime token table") and those are what AT TIME ZONE knows how to deal with. We need to unify that old stuff with the src/timezone code, but we ran out of time to do it in 8.0. The way I see it, we have three sorts of zones to deal with: fixed numeric offsets from UTC, names that represent fixed offsets (eg, "EST" is the same as UTC-5), and names that represent DST-variable offsets (eg, "EST5EDT"). For what are now entirely historical reasons, various parts of the system cope with different subsets of these three types. I want to get to a state where you can use any of them in any context and it Just Works. (While we are at it, we need to make the set of recognized zone names user-configurable; the australian_timezones kluge satisfies our contributors Down Under, but there are a lot of unhappy people still, because for instance IST means different things in Israel and India.) regards, tom lane
>> That is needed no matter what change you do if you want old programs that >> use the current timestamp with time zone to work. Today you don't get back >> the same time zone as you insert, programs might depend on that. > [ shrug... ] We've made much larger changes than that in the name of > standards compliance. BTW, even if you do want output like that, that doesn't make two datatypes a good idea. It'd be better to add a couple of DateStyle-like formatting options:* rotate all timestamps into current TimeZone for display, or not;* display the timezone numerically,or as originally given. A DateStyle kind of GUC variable is a lot less dangerous than what you were proposing, because getting it wrong doesn't mean you have the wrong data stored in the database ... regards, tom lane
On Fri, 22 Oct 2004, Tom Lane wrote: > behavior. The spec says you can put a numeric-GMT-offset zone in and > get a numeric-GMT-offset zone out. We can do that and also support > named, possibly DST-aware zones. So if I understand you correctly you are planning to extend the current timestamp type to work with both named time zones and HH:MM ones? I didn't think you wanted the last one since your plan was to store a UTC+OID where the OID pointed to a named time zone. And I guess that you don't plan to add 00:00, 00:01, 00:02, ... as named zones with an OID. -- /Dennis Björklund
Tom, > As far as I can tell, Dennis is planning slavish adherence to the spec, > which will mean that the datatype is unable to cope effectively with > daylight-savings issues. So I'm unconvinced that it will be very > helpful to you for remembering local time in addition to true > (universal) time. As somebody who codes calendar apps, I have to say that I have yet to see an implementation of time zones which is at all useful for this purpose, including the current implementation. My calendar apps on PostgreSQL 7.4 use "timestamp without time zone" and keep the time zone in a seperate field. The reason is simple: our current implementation, which does include DST, does not include any provision for the exceptions to DST -- such as Arizona -- or for the difference between "1 day" and "24 hours". (Try adding "30 days" to "2004-10-05 10:00 PDT", you'll see what I mean). Nor do I see a way out of this without raising the complexity, and configurability, level of timezones significantly. So if we're going to be broken (at least from the perspective of calendar applications) we might as well be broken in a spec-compliant way. -- Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > The reason is simple: our current implementation, which does include DST, > does not include any provision for the exceptions to DST -- such as Arizona Say what? regression=# set timezone to 'MST7MDT'; SET regression=# select now(); now -------------------------------2004-10-25 11:52:47.093538-06 (1 row) regression=# set timezone to 'US/Arizona'; SET regression=# select now(); now -------------------------------2004-10-25 10:52:49.441559-07 (1 row) > -- or for the difference between "1 day" and "24 hours". (Try adding "30 > days" to "2004-10-05 10:00 PDT", you'll see what I mean). This is the point about how interval needs to treat "day" as different from "24 hours". I agree with that; the fact that it's not done already is just a reflection of limited supply of round tuits. I think it's orthogonal to the question of how flexible timestamp with time zone needs to be, though. > Nor do I see a way out of this without raising the complexity, and > configurability, level of timezones significantly. This does not seem to me to be an argument why timestamp with time zone ought to be incapable of dealing with DST-aware time zones. That simply guarantees that calendar apps won't be able to use the datatype. If they still can't use it when it can do that, then we can look at the next blocking factor. > So if we're going to be broken (at least from the perspective of calendar > applications) we might as well be broken in a spec-compliant way. I have not said that we can't comply with the spec. I have said that our ambitions need to be higher than merely complying with the spec. regards, tom lane
Tom, > regression=# set timezone to 'US/Arizona'; > SET > regression=# select now(); > now > ------------------------------- > 2004-10-25 10:52:49.441559-07 Wow! When did that get fixed? How do I keep track of this stuff if you guys keep fixing it? ;-) Of course, it would be very helpful if the result above could display "Arizona" instead of the non-specific "-07", but I'm pretty sure that's already a TODO. > This is the point about how interval needs to treat "day" as different > from "24 hours". I agree with that; the fact that it's not done already > is just a reflection of limited supply of round tuits. Well, when I first brought up the issue (2001) I was shot down on the basis of spec-compliance, since SQL92 recognizes only Year/Month and Day/Hour/Minute/etc. partitions. Glad it's up for consideration again. Come to think of it, it was Thomas Lockhart who shot down the idea of fixing Interval, and he's retired now ... > This does not seem to me to be an argument why timestamp with time zone > ought to be incapable of dealing with DST-aware time zones. That simply > guarantees that calendar apps won't be able to use the datatype. If > they still can't use it when it can do that, then we can look at the > next blocking factor. That's definitely a progressive attitude .... pardon me for being pessimistic. > I have not said that we can't comply with the spec. I have said that > our ambitions need to be higher than merely complying with the spec. Hmmm ... well, does the spec specifically prohibit DST, or just leave it out? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: >> regression=# set timezone to 'US/Arizona'; >> SET >> regression=# select now(); >> now >> ------------------------------- >> 2004-10-25 10:52:49.441559-07 > Wow! When did that get fixed? How do I keep track of this stuff if you > guys keep fixing it? ;-) > Of course, it would be very helpful if the result above could display > "Arizona" instead of the non-specific "-07", but I'm pretty sure that's > already a TODO. Well, that is *exactly what I'm talking about*. I want timestamp with time zone to carry "US/Arizona" not just "-07". Obviously there needs to be some option to get the latter displayed when that's all you want, but internally a value of the datatype needs to be able to carry full knowledge of which timezone it's supposed to be in. Dumbing that down to a simple numeric GMT offset isn't good enough. >> I have not said that we can't comply with the spec. I have said that >> our ambitions need to be higher than merely complying with the spec. > Hmmm ... well, does the spec specifically prohibit DST, or just leave it out? It just doesn't talk about it AFAICS. To comply with the spec we definitely need to be *able* to support timezone values that are simple numeric GMT offsets. But I think we ought also to be able to store values that are references to any of the zic database entries. This looks to me like a straightforward extension of the spec. We went to all the trouble of importing src/timezone in order that we could make a significant upgrade in our timezone capability, and now it's time to take the steps that that enables. Before we were limited to the lowest-common-denominator of the libc timezone routines on all our different platforms, but now we are not... regards, tom lane
On Mon, 25 Oct 2004, Josh Berkus wrote: > Hmmm ... well, does the spec specifically prohibit DST, or just leave it > out? It doesn't discuss it. According to the spec a timestamp with time zone is a UTC value + a HH:MM offset from GMT. And intervals in the spec is either a year-month value or a day-time value. One can only compare year-month values with each other and day-time values with each other. So they avoid the problem of the how many days is a month by not allowing it. The spec is not a full solution, it's also not a useless solution. I'm happy as long as the spec is a subset of what pg implements. If not then I would like to be able to have both but with different names or something similar (but I think that should not be needed). -- /Dennis Björklund
Dennis, > It doesn't discuss it. According to the spec a timestamp with time zone is > a UTC value + a HH:MM offset from GMT. And intervals in the spec is either > a year-month value or a day-time value. One can only compare year-month > values with each other and day-time values with each other. So they avoid > the problem of the how many days is a month by not allowing it. That's not what Tom and I were talking about. The issue is that the spec defines Days/Weeks as being an agglomeration of hours and not an atomic entity like Months/Years are. This leads to some wierd and calendar-breaking behavior when combined with DST, for example: template1=> select '2004-10-09 10:00 PDT'::TIMESTAMPTZ + '45 days'::INTERVAL template1-> ; ?column? ------------------------2004-11-23 09:00:00-08 (1 row) Because of the DST shift, you get an hour shift which is most decidely not anything real human beings would expect from a calendar. The answer is to try-partition INTERVAL values, as: Hour/Minute/Second/ms Day/Week Month/Year However, this could be considered to break the spec; certainly Thomas thought it did. My defense is that the SQL committee made some mistakes, and interval is a big one. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: >> It doesn't discuss it. According to the spec a timestamp with time zone is >> a UTC value + a HH:MM offset from GMT. And intervals in the spec is either >> a year-month value or a day-time value. One can only compare year-month >> values with each other and day-time values with each other. So they avoid >> the problem of the how many days is a month by not allowing it. > That's not what Tom and I were talking about. The issue is that the spec > defines Days/Weeks as being an agglomeration of hours and not an atomic > entity like Months/Years are. I think though that these points are closely related. The reason the spec does that is exactly that they are ignoring DST and so they can assume that 1 day == 24 hours == 86400 seconds. In a DST-aware world you have to make a separation between days and the smaller units, just as months are separated from smaller units because there's not a fixed conversion factor. To some extent the interval and timestamptz issues are orthogonal, but I think it would be good to fix them in the same release if possible. There will undoubtedly be some backwards-compatibility problems, and I suppose that users would prefer to take them all at once than via the chinese water torture method ... > However, this could be considered to break the spec; certainly Thomas > thought it did. My defense is that the SQL committee made some > mistakes, and interval is a big one. I'm not clear to what extent we have to actually break the spec, as opposed to extend it, in order to do this to the "interval" type. To do everything the spec says we need to do, we'll have to be able to make some comparisons that aren't strictly valid (which amounts to assuming that 1 day == 24 hours for some limited purposes) but we already do much the same things with respect to months. (See other thread about whether 1 year == 360 days...) regards, tom lane
On Mon, 25 Oct 2004, Josh Berkus wrote: > Dennis, > > > It doesn't discuss it. According to the spec a timestamp with time zone is > > a UTC value + a HH:MM offset from GMT. And intervals in the spec is either > > a year-month value or a day-time value. One can only compare year-month > > values with each other and day-time values with each other. So they avoid > > the problem of the how many days is a month by not allowing it. > > That's not what Tom and I were talking about. You wanted to know what the standard said, and I told what I knew. > The issue is that the spec defines Days/Weeks as being an agglomeration > of hours and not an atomic entity like Months/Years are. I don't know what you mean with this. The standard does treat them as year month day hour minute second (with fractions) There is no weeks there, if that is what you mean. > This leads to some wierd and calendar-breaking behavior when combined > with DST, for example: > > template1=> select '2004-10-09 10:00 PDT'::TIMESTAMPTZ + '45 days'::INTERVAL > template1-> ; > ?column? > ------------------------ > 2004-11-23 09:00:00-08 > (1 row) > > Because of the DST shift, you get an hour shift which is most decidely not > anything real human beings would expect from a calendar. I don't see how the above can be caused by the representation of an interval. The above timestamp is 2004-10-09 10:00 PDT which in the standard would be 2004-10-09 10:00 -07 and after the additon would be 2004-11-23 10:00:00-07 Here the time zone is wrong since the standard does not know about named zones and dst. An implementation like the one Tom (and I) want would start with 2004-10-09 10:00 PDT and then after the addition one would get 2004-11-23 10:00:00 PST At least that's my understanding of what we want and what we can get (plus that we also need to support HH:MM tz values since those also exist in the world, check this emails header for example). It's possible that you discuss something else, but that has been lost on me so far. -- /Dennis Björklund
On Mon, 25 Oct 2004, Josh Berkus wrote: > Hour/Minute/Second/ms > Day/Week > Month/Year And just when I pressed "send" on the previous mail I got the problem :-) -- /Dennis Björklund
On Mon, 25 Oct 2004, Josh Berkus wrote: > Hour/Minute/Second/ms > Day/Week > Month/Year This is embarrasing. I'm still a bit confused :-) The standard treat days as a separate entry, it does not assume that a day is 24 hours. It restricts the hour field to the interval 0-23 so one can never have something like 25 hours. So it does not need to worry about how many days that translate to. And why do we need weeks also? Well, this is the last mail I send before I've been thinking about this for a while more :-) -- /Dennis Björklund
On Mon, Oct 25, 2004 at 21:18:52 +0200, Dennis Bjorklund <db@zigo.dhs.org> wrote: > On Mon, 25 Oct 2004, Josh Berkus wrote: > > > Hour/Minute/Second/ms > > Day/Week > > Month/Year > > This is embarrasing. I'm still a bit confused :-) > > The standard treat days as a separate entry, it does not assume that a day > is 24 hours. It restricts the hour field to the interval 0-23 so one can > never have something like 25 hours. So it does not need to worry about how > many days that translate to. > > And why do we need weeks also? For convenience. Just like years are a group of months, weeks are a group of days.
Dennis Bjorklund <db@zigo.dhs.org> writes: > The standard treat days as a separate entry, it does not assume that a day > is 24 hours. SQL92 says 4.5.2 Intervals There are two classes of intervals. One class, called year-month intervals, has an express or implied datetimeprecision that in- cludes no fields other than YEAR and MONTH, though not both are required. The otherclass, called day-time intervals, has an ex- press or implied interval precision that can include any fields other than YEAR or MONTH. AFAICS the reason for this rule is that they expect all Y/M intervals to be comparable (which they are) and they also expect all D/H/M/S intervals to be comparable, which you can only do by assuming that 1 D == 24 H. It seems to me though that we can store days separately and do interval comparisons with the assumption 1 D == 24 H, and be perfectly SQL-compatible as far as that goes, and still make good use of the separate day info when adding to a timestamptz that has a DST-aware timezone. In a non-DST-aware timezone the addition will act the same as if we weren't distinguishing days from h/m/s. Therefore, an application using only the spec-defined features (ie, only fixed-numeric-offset timezones) will see no deviation from the spec behavior. regards, tom lane
On Mon, 25 Oct 2004, Tom Lane wrote: > There are two classes of intervals. One class, called year-month > intervals, has an express or implied datetime precision that in- > cludes no fields other than YEAR and MONTH, though not both are > required. The other class, called day-time intervals, has an ex- > press or implied interval precision that can include any fields > other than YEAR or MONTH. > > AFAICS the reason for this rule is that they expect all Y/M intervals to > be comparable (which they are) and they also expect all D/H/M/S intervals > to be comparable, which you can only do by assuming that 1 D == 24 H. I said I was not going to send any more mails, but here we go again :-) The standard restrict the hour field to the interval 0-23, so there can never be any compare between for example '1 day 1 hour' and '25 hours'. This means that one can not add two intervals together to get a bigger one but that it would still work to do timestamp+interval+interval. > It seems to me though that we can store days separately and do interval > comparisons with the assumption 1 D == 24 H, and be perfectly > SQL-compatible as far as that goes, and still make good use of the > separate day info when adding to a timestamptz that has a DST-aware > timezone. In a non-DST-aware timezone the addition will act the same as > if we weren't distinguishing days from h/m/s. Therefore, an application > using only the spec-defined features (ie, only fixed-numeric-offset > timezones) will see no deviation from the spec behavior. I agree with this. -- /Dennis Björklund
Dennis, > An implementation like the one Tom (and I) want would start with > > 2004-10-09 10:00 PDT > > and then after the addition one would get > > 2004-11-23 10:00:00 PST Sounds like we're on the same page then. > The standard restrict the hour field to the interval 0-23, so there can > never be any compare between for example '1 day 1 hour' and '25 hours'. > This means that one can not add two intervals together to get a bigger > one but that it would still work to do timestamp+interval+interval. Hour field of the timestamp, or hour field of interval? There a world of difference. As long as we're willing to live with the understanding that +1day 1 hour may produce a slightly different result than + 25 hours, I don't see the problem. Currently I can add +900 hours if I like, postgreSQL will support it. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
On Mon, 25 Oct 2004, Josh Berkus wrote: > > The standard restrict the hour field to the interval 0-23, so there can > > never be any compare between for example '1 day 1 hour' and '25 hours'. > > This means that one can not add two intervals together to get a bigger > > one but that it would still work to do timestamp+interval+interval. > > Hour field of the timestamp, or hour field of interval? There a world of > difference. Hour field of an interval can be 0-23 according to the spec (doesn't say that we need that restriction, but we do need to understand what the spec say). -- /Dennis Björklund
Josh Berkus <josh@agliodbs.com> writes: > As long as we're willing to live with the understanding that +1day 1 hour may > produce a slightly different result than + 25 hours, I don't see the problem. Right, which is exactly why we can't accept the spec's restriction that the hour field be limited to 0-23. People may legitimately want to add 48 hours to a timestamp, and *not* have that mean the same as adding "2 days". Besides, we would have a backwards-compatibility problem if we tried to forbid it, since as you note we've always accepted such input. regards, tom lane
>>>regression=# set timezone to 'US/Arizona'; >>>SET >>>regression=# select now(); >>>now >>>------------------------------- >>>2004-10-25 10:52:49.441559-07 > > >>Wow! When did that get fixed? How do I keep track of this stuff if you >>guys keep fixing it? ;-) That's worked for ages. What doesn't work is this: usatest=# select current_timestamp at time zone 'US/Arizona'; ERROR: time zone "us/arizona" not recognized Chris
Dennis Bjorklund <db@zigo.dhs.org> writes: > So if I understand you correctly you are planning to extend the current > timestamp type to work with both named time zones and HH:MM ones? I didn't > think you wanted the last one since your plan was to store a UTC+OID where > the OID pointed to a named time zone. And I guess that you don't plan to > add 00:00, 00:01, 00:02, ... as named zones with an OID. I missed getting back to you on this, but I think we can do both. Some random points: * Once we expand timestamptz to bigger than 8 bytes, there's essentially zero cost to making it 12 bytes, and for that matter we could go to 16 without much penalty, because of alignment considerations. So there's plenty of space. * What we need is to be able to represent either a fixed offset from UTC or a reference of some kind to a zic database entry. The most bit-splurging way of doing the former is a signed offset in seconds from Greenwich, which would take 17 bits. It'd be good enough to represent the offset in minutes, which needs only 11 bits. * I suggested OIDs for referencing zic entries, but we don't have to do that; any old mapping table will do. 16 bits would surely be plenty to assign a unique label to every present and future zic entry. * My inclination therefore is to extend timestamptz with two 16-bit fields, one being the offset from UTC (in minutes) and one being the zic identifier. If the identifier is zero then it's a straight numeric offset from UTC and the offset field is all you need (this is the SQL spec compatible case). If the identifier is not zero then it gives you an index to look up the timezone rules. However, there is no need for the offset field to go to waste; we should store the offset anyway, since that might save a trip to the zic database in some cases. * It's not clear to me yet whether the stored offset in the second case should be the zone's standard UTC offset (thus always the same for a given zone ID) or the current-time offset for the timestamp (thus different if the timestamp is in daylight-savings or standard time). * If we store the current-time offset then it almost doesn't matter whether the timestamp itself is stored as a UTC or local time value; you can trivially translate either to the other by adding or subtracting the offset (*60). But I'm inclined to store UTC for consistency with past practice, and because it will make comparisons a bit faster: you can compare the timestamps without adjusting first. Generally I think comparisons ought to be the best-optimized operations in a Postgres datatype, because index operations will do a ton of 'em. (We definitely do NOT want to have to visit the zic database in order to compare two timestamptz values.) regards, tom lane
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes: > That's worked for ages. What doesn't work is this: > usatest=# select current_timestamp at time zone 'US/Arizona'; > ERROR: time zone "us/arizona" not recognized Right, and similarly you can do regression=# select '2004-10-25 21:32:33.430222 MST'::timestamptz; timestamptz -------------------------------2004-10-26 00:32:33.430222-04 (1 row) but not regression=# select '2004-10-25 21:32:33.430222 US/Arizona'::timestamptz; ERROR: invalid input syntax for type timestamp with time zone: "2004-10-25 21:32:33.430222 US/Arizona" I would like to see both of these cases working in 8.1; and furthermore I'd like to see the timezone specs coming back as entered, not as bare numeric offsets. (This will need to be adjustable via a DateStyle option, of course, but I want the information to be in there whether it is displayed or not.) regards, tom lane
Added to TODO: * Once we expand timestamptz to bigger than 8 bytes, there's essentially --------------------------------------------------------------------------- Tom Lane wrote: > Dennis Bjorklund <db@zigo.dhs.org> writes: > > So if I understand you correctly you are planning to extend the current > > timestamp type to work with both named time zones and HH:MM ones? I didn't > > think you wanted the last one since your plan was to store a UTC+OID where > > the OID pointed to a named time zone. And I guess that you don't plan to > > add 00:00, 00:01, 00:02, ... as named zones with an OID. > > I missed getting back to you on this, but I think we can do both. Some > random points: > > * Once we expand timestamptz to bigger than 8 bytes, there's essentially > zero cost to making it 12 bytes, and for that matter we could go to 16 > without much penalty, because of alignment considerations. So there's > plenty of space. > > * What we need is to be able to represent either a fixed offset from UTC > or a reference of some kind to a zic database entry. The most > bit-splurging way of doing the former is a signed offset in seconds from > Greenwich, which would take 17 bits. It'd be good enough to represent > the offset in minutes, which needs only 11 bits. > > * I suggested OIDs for referencing zic entries, but we don't have to do > that; any old mapping table will do. 16 bits would surely be plenty to > assign a unique label to every present and future zic entry. > > * My inclination therefore is to extend timestamptz with two 16-bit > fields, one being the offset from UTC (in minutes) and one being the > zic identifier. If the identifier is zero then it's a straight numeric > offset from UTC and the offset field is all you need (this is the SQL > spec compatible case). If the identifier is not zero then it gives you > an index to look up the timezone rules. However, there is no need for > the offset field to go to waste; we should store the offset anyway, > since that might save a trip to the zic database in some cases. > > * It's not clear to me yet whether the stored offset in the second case > should be the zone's standard UTC offset (thus always the same for a > given zone ID) or the current-time offset for the timestamp (thus > different if the timestamp is in daylight-savings or standard time). > > * If we store the current-time offset then it almost doesn't matter > whether the timestamp itself is stored as a UTC or local time value; > you can trivially translate either to the other by adding or subtracting > the offset (*60). But I'm inclined to store UTC for consistency with > past practice, and because it will make comparisons a bit faster: you > can compare the timestamps without adjusting first. Generally I think > comparisons ought to be the best-optimized operations in a Postgres > datatype, because index operations will do a ton of 'em. (We definitely > do NOT want to have to visit the zic database in order to compare two > timestamptz values.) > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
This thread has been added as a link on the TODO list under TODO.detail. --------------------------------------------------------------------------- Dennis Bjorklund wrote: > I've made a partial implementation of a datatype "timestamp with time > zone" as described in the sql standard. The current type "timestamptz" > does not store the time zone as a standard one should do. So I've made a > new type I've called timestampstdtz that does store the time zone as the > standard demands. > > Let me show a bit of what currently works in my implementation: > > dennis=# CREATE TABLE foo ( > a timestampstdtz, > > primary key (a) > ); > dennis=# INSERT INTO foo VALUES ('1993-02-04 13:00 UTC'); > dennis=# INSERT INTO foo VALUES ('1999-06-01 14:00 CET'); > dennis=# INSERT INTO foo VALUES ('2003-08-21 15:00 PST'); > > dennis=# SELECT a FROM foo; > a > ------------------------ > 1993-02-04 13:00:00+00 > 1999-06-01 14:00:00+01 > 2003-08-21 15:00:00-08 > > dennis=# SELECT a AT TIME ZONE 'CET' FROM foo; > timezone > ------------------------ > 1993-02-04 14:00:00+01 > 1999-06-01 14:00:00+01 > 2003-08-22 00:00:00+01 > > My plan is to make a GUC variable so that one can tell PG that constructs > like "timestamp with time zone" will map to timestampstdtz instead of > timestamptz (some old databases might need the old so unless we want to > break old code this is the easiest solution I can find). > > I've made an implicit cast from timestampstdtz to timestamptz that just > forgets about the time zone. In the other direction I've made an > assignment cast that make a timestamp with time zone 0 (that's what a > timestamptz is anyway). Would it be possible to make it implicit in both > directions? I currently don't think that you want that, but is it > possible? > > With the implicit cast in place I assume it would be safe to change > functions like now() to return a timestampstdtz? I've not tried yet but I > will. As far as I can tell the cast would make old code that use now() to > still work as before. > > Any comments before I invest more time into this subject? > > -- > /Dennis Bj?rklund > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073