Re: Timezones (in 8.5?) - Mailing list pgsql-hackers

From hgonzalez@gmail.com
Subject Re: Timezones (in 8.5?)
Date
Msg-id 000e0cd70ece3a08d90478be4580@google.com
Whole thread Raw
In response to Re: Timezones (in 8.5?)  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
On Nov 19, 2009 1:18am, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:<br />> <br />> Right, but
includingmore data in a single type is the wrong approach,<br />> since it complicates the semantics and interferes
withnormalization.<br />> For example, if you have a type T which incorporates a timestamp and a<br />> timezone,
whatsemantics does the T = T operator have? What semantics<br />> apply if the definitions of timezones change? <br
/><br/>I dont get the thing about normalization, there's complete ortoghonality in my approach.<br />And when you say
"complicatesthe semantic" I'd say "enrich the semantics" (and even<br />clarify it) so that it fits more neatly to the
typicalusage of dates and times in real life.<br />For "datetimes with tz" the equality (and comparison) operator is
nottrivial, roughly <br />in the same sense that date-time arithmetic is not trivial when one stops thinking of<br
/>datetimesas "physical time". So is life. Should the datetimes <br />'2010-07-27 9:30 Chile' and '2010-07-27 10:30
Argentine'(GMT+4 and GMT+3 respec)<br />be considered equal? It's arguable; but the ambiguity (just a matter of
adoption)reflects<br />reality. We can discuss it and adopt some consistent criteria.<br /><br />> What if you're
storing<br/>> times of events at specific places; in that case you want to associate<br />> the timezone with the
_place_not the event (so that if the timezone<br />> rules change, moving the place from one timezone to another,
youonly<br />> have to change the place, not all the events that refer to it).<br /><br />I'm not sure I undestand
youhere. I'm claiming that timezone rules alterations<br />(zic files changes) should always be supported by the db
implementation,without<br />needing of touching your data. And I believe that timestamps (i.e. physical times)<br />are
inpractice almost never associated to timezone information. If you want to store<br />"the instant of last solar
eclipse"you normally store the timestamp, a timezone<br />might only be useful for displaying (or as an adittional
info,not really associated to the event)<br />A border case would be "store the instant of the death of John Lennon".
Youmight<br />store the TZ here if you are interested in the civil time (so you can answer, for example,<br />¿how many
rockstars died in morning/afternoon?). But then, again, you are here actually<br />storing a civil date (local
date-timeplus TZ). The only problematic case i can envision<br />is to intend to store a physical time in the future
withTZ, but frankly it is difficult to<br />think of this scenario (and even more difficult to think of needing to
operatewith that<br />data as a whole; hence, in this case, to store the two fields separatadely makes sense).<br /><br
/>I'mbeing dense, and this might be a lost cause, but anyway, perhaps some day in the future<br />this might be of some
use:<br/><br />I strongly believe that, if one could sample the real needings and usage of date-time types in <br
/>applicationsin this world, and taking apart types DATE (very frequent, but rather straightforward), <br />and TIME
(notso relevant) and intervals (other issues here, much related to datetimes), the <br />overwhelming majority would
fall( conceptually) into these three types:<br /><br />- TIMESTAMP (physical time - no TZ - no civil time implied)<br
/>-LOCAL DATETIME (civil time, no TZ)<br />- DATETIME (civil time with TZ => togheter with zic tables, implies a
physicaltime)<br /><br />And of these three -I'd bet- the first is (conceptually) the most common, by a wide margin.
<br/><br />As the name TIMESTAMP implies, it frequently records the moment of a event (in the DB corresponds<br />
frequentlyto the creation or alteration of a record, frequently via a "now()" default or such). <br />Examples: the
timestampof messages in a mailing list, or issues in a bugtracker, or posts/articles in a blog/Cms.<br />Sometimes it
ismodifiable by the user. Sometimes it is displayed (as a civil date, of course) according <br />to some TZ implied
somewhereelse. It's normal that users with differnt TZ sees this event each <br />with its own TZ; and one is not
directlyinterested on obtaining (say) an "inherent" civil datetime for the<br />event (for example one is not
interestedin asking what posts where generated at midnight<br />acording to the localtime of the user that created
it).<br/><br />The LOCAL DATETIME is only of use for civil date-times, when one is not directly interested in<br
/>asociateevents with real (physicial time) - this cannot be compared with a real time (it cant trigger alarms, eg)<br
/>Or,more rarely, when the TZ is implied somehere else (in the application, not it the DB server!).<br /><br />The
DATETIMEis equivalent to the compound type {LOCAL_DATETIME,TZ}. Here the "civil date-time" is again<br />the primary
conceptone deals with, but in a given place in the world (TZ), so it implies also (with the assistance of a zic
table)<br/>a real time. This type is, IMHO, less frequent than the others. The typical use is for calendars or
schedulers.<br/><br />One could, a propos Andrew's observation, consider a fourth type: TIMESTAMP WITH TZ. But it seems
overkill:<br />except for ZIC changes, the correspondence with DATETIME is univocal (BTW, this is why in the Jodatime
API<br/>-which does not deal with persistence- this concepts are strictly equivalent). Given this nearly-equivalence,
andthat<br /> the needing of this type in real life is (IMO) almost null, I think that DATETIME is the one to
survive.<br/><br />(One could even propose a fifht type: a TIMESTAMP WITH GMT OFFSET (roughly the ANSI proposal),
which<br/>would be equivalent to have a timestamp AND a local datetime; this is more easy to deal with than
timezones,<br/>but (as was discussed here before) is too limited (does not allow artithmetic) and is not orthogonal
withthe real useful types.)<br /><br />Regarding implementation:<br /><br /> TIMESTAMP is straightforward, more or less
thesame as today: stored as UTC, can be <br />input/output in ISO 8601 format (the client/server can use the offset
theylike, internally it's translated to GMT+0)<br /><br />LOCAL_DATETIME also is straightforward, also stored as UTC
(asin GMT+0) . BUT <br />- input/output in ISO 8601 format should not allow/produce GMT offset<br />- the similarity of
implementationshould not leak upwards. This types are incompatible, cannot be compared, etc<br /><br />DATETIME is the
difficultone, of course.<br /> - Equivalent to the pair {LOCAL_DATETIME,TX_id} (occupies more space)<br /> - Requires
somecatalog table or something akin to codify consistently the timezones as numbers (included in pg_dump output?)<br />
-Requires new definitions for input/output (and deal with some ambiguities, particulary in DST transitions)<br /> -
Requiressome semantic definitions (orderig, equality) <br /> - Some arithmetics (which involve convertion to physic
time)may be expensive, might require some aggresive caching of time (zic) calculations.<br /><br />Global
considerations:<br /> - Backward compatibility?<br /> - SQL spec compatibility? (is worsened?) (deprecate TIMESTAMP
WITHTIMEZONE?)<br /> - implement conversion functions - castings (how strict?)<br /> - discuss/implement interval
types/functions<br/> - interfaces (JDBC...)<br /><br />A bit of work, granted... (I might help)... but I bet that the
mostdifficult work, by far, is to <br />reach an agreement :-)<br /><br />Thanks for reading.<br /><br />Hernán J.
González

pgsql-hackers by date:

Previous
From: hernan gonzalez
Date:
Subject: Re: Timezones (in 8.5?)
Next
From: Itagaki Takahiro
Date:
Subject: Re: Deleted WAL files held open by backends in Linux