Re: Timestamp/Interval proposals: Part 2 - Mailing list pgsql-hackers

From Thomas Lockhart
Subject Re: Timestamp/Interval proposals: Part 2
Date
Msg-id 3D01626F.1A5C9072@fourpalms.org
Whole thread Raw
In response to Timestamp/Interval proposals: Part 2  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Timestamp/Interval proposals: Part 2
Re: Timestamp/Interval proposals: Part 2
List pgsql-hackers
> Please give me feedback on this...
> There are a few problems currently with the Interval data type.  The biggest
> is that the current rules give us no clear path for implementation of a full
> set of operators.  The SQL92 standard is no help here; its implementation is
> unintuitive and extremely limited ... more limited, in fact, than the current
> incomplete implementation in PostgreSQL.

Please define "a full set of operators". Or do the subsequent proposals
defining new behaviors and some operations constitute that list?

> Proposal #3:  We should support the addition of "whole days".
> Description: Interval should support a "Weeks to Days" increment which is
> atomic per day, and not as a aggregate of hours.
> Reason: Currently, the "days" increment in Interval is treated as "x 24 hours"
> and not as whole days.  This can cause some confusion when date calculations
> break over a DST change; users do *not* expect events to get an hour earlier
> or later in the fall or the spring.  The current result is that a lot of
> users give up on utilizing time zones because they can't deal with the time
> shift in calendar applications.

You are overstating the problem imho, but there is a problem for some
users. SQL9x avoids the issue by defining *only* constant offsets for
time zones. That doesn't work in the real world :/

We would expand the storage size by at least 4 bytes to accomodate the
"qualitative day" information. Currently takes 12 bytes, and will take
16 or more. We will need to check for overflows during date/time math,
we will need some heuristics for conversions between hours and days
during calculations, and some users will need to cope with the changed
behavior. Operations like math and comparisons will be more expensive
(though may not be a hugely noticable effect).

> Proposal #4:  Create to_char(INTERVAL, 'format string') Function.
> Reason:  self-evident, I think.

Oh. Didn't know it wasn't already there.

> Proposal #5:  Two alternate proposals for overhaul of the interval data type.
> Description:  Interval needs some radical changes to calculations and
> operators.
> Reason:  Currently, it is nearly impossible to conceive of an implementation
> for a full set of operators for the interval data type ( + - / * ) because of
> the variability of conversions from one interval increment to another.  For
> example, what exactly should be the result of '3 months' / '4 days'?  Here
> are two alternatives.
> Alternative #1: Treat Interval Increments as Atomic, and Round

Yuck (imho of course ;)

> If we implemented this, each of the 3 sub-types of Interval (Year to Month,
> Week to Day, and Hour to Millesecond per proposal #3) would be treated as
> "atomic" and not renderable in terms of smaller increments, in the same way
> that integers are not divisible beyond a prime.  In fact, rather than
> expressing remainders in smaller increments, the modulo ( % ) operator would
> be used to express the remainder.
> 
> Further, we would need to create a set of casting functions that allows for
> the conversion of one interval subtype into another, using rounding by
> approximates, such as 1 year = 365 days, 1 month = 30 days, 1 day = 24 hours,
> etc.  This is not that different from how the DATE data type works.  If users
> attempt multiplication and division with intervals of different subtypes, an
> implicit cast would be made into the subtype of the smallest value.


> Finally, multiplication and division by floats would be disallowed and
> replaced by multiplication and division by integers.  Thus:

Overly restrictive I think. There *is* a use for maintaining precision
during math operations, though apparently not for your use cases.

> '1 month' + '33 days' = '1 month 33 days'
> '1 month 33 days'::INTERVAL WEEK TO DAY = '63 days'
> '1 month' + '33 days'::INTERVAL YEAR TO MONTH = '2 months'
> '5 months' / '2 months' = 2
> '5 months' % '2 months' = '1 month'
> '5 months' / 2 = '2 months'
> '5 months' % 2 = '1 month'
> '9 months' / '2 weeks' = '270 days' / '14 days' = 19
> '15 hours' * 20 = '300 hours' (not '12 days 12 hours')
> etc.
> 
> Pros:  It's simple and relatively intuitive.   This approach also is similar
> to the SQL92 spec, which focuses on interval subtypes.
> Cons:  It requires an annoying implementation of subtypes, which is cumbersome
> and difficult to manage when you have mixed intervals (e.g. '4 days 8 hours 9
> minutes').  And, with every operation, rounding is being used which can
> result in some ghastly inequalities:
> '1 year'/12 --> '1 month'::INTERVAL WEEK TO DAY --> '30 days' * 12
> --> '360 days' / '1 year' = 0
> 
> Alternative #2: Tie Intervals to a Specific Timestamp

Double yuck. You already have this capability by your choice of schema;
intervals are intervals and timestamps are timestamps. The behaviors you
discuss above (both current and possible) handle this.

> ---------------------------------------------
> And, a re-hash of Part I:
> 
> PROPOSAL FOR ADJUSTMENTS OF POSTGRESQL TIMESTAMP AND INTERVAL HANDLING
> Draft 0.2
> Proposal #2: We need more time zones.
> Description:  We need to add, or be able to add, many new time zones to
> Postgresql.  Ideal would be some kind of "create time zone" statement.
> Reason:  Current included time zones do not cover all real-world time zones,
> and the situation is likely to get worse as various governments play with
> their calendars.  For example, there is no current time zone which would be
> appropriate for the state of Arizona, i.e. "Central Standard Time without
> Daylight Savings Time".

Bad example, and I'm not following your argument here. PostgreSQL
supports *many* time zones (Peter E. has said "too many") and any change
for the Arizona example will be at odds with how dates and times are
expected to be handled in, uh, Arizona. They use Mountain Standard Time
(MST), except for years when they didn't, and are covered by specifying
"MST" on input and "SET TIME ZONE 'America/Phoenix'" (and perhaps others
too; it seems that "MST6" gives me consistant behavior on my Linux box).

> Further:  A CREATE TIME ZONE statement would have the following syntax:
> CREATE TIME ZONE GMT_adjustment, abbreviation, uses_DST, DST_starts
> (optional),
> DST_ends (optional)
> This would allow, to some degree, DBA creation of time zones to take into
> account local laws and wierdnesses.
> Alternative:  We can allow users to designate timezones according to GMT
> offset and whether or not they support DST. Example "-8:00 DST" for PST/PDT,
> and "-7:00 NDS" for the Arizona example above.

I can't imagine that you are not finding a workable solution with the
current capabilities. That said, we are considering adopting the
historic zinc package to support time zones within PostgreSQL (sounds
like you might be doing some of the development ;). And for time zone
lookup (not supported in the zinc API) it *would* be nice to move to a
DBMS table-based implementation, rather than the hardcoded tables we
have now. They may have been good enough for the last 12 years, but
certainly lookup stuff seems like it should be in a database table, eh?
                 - Thomas


pgsql-hackers by date:

Previous
From: "Rod Taylor"
Date:
Subject: postgresql.conf -> debug_level
Next
From: Bruce Momjian
Date:
Subject: Re: [SQL] PostgreSQL on AIX