Re: Range types - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Range types
Date
Msg-id 1260914527.13414.1977.camel@monkey-cat.sm.truviso.com
Whole thread Raw
In response to Re: Range types  (David Fetter <david@fetter.org>)
Responses Re: Range types  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2009-12-15 at 11:49 -0800, David Fetter wrote:
> That sounds like a recipe for disaster.  Whatever timestamp ranges
> are, float and int64 should be treated the same way so as not to get
> "surprises" due to implementation details.

It's a compile-time option that will change the semantics of timestamps
regardless. Anyone who changes between float and int64 timestamps may
experience problems for a number of different reasons. What was unique
before might no longer be unique, for instance.

> FWIW, I think it would be a good idea to treat timestamps as
> continuous in all cases.

I disagree. There is a lot of value in treating timestamp ranges as
discrete.

One big reason is that the ranges can be translated between the
different input/output forms, and there's a canonical form. As we know,
a huge amount of the value in an RDBMS is unifying data from multiple
applications with different conventions.

So, let's say one application uses (] and another uses [). If you are
mixing the data and returning it to the application, you want to be able
to provide the result according to its convention. You can't do that
with a continuous range.

And things get more interesting: if you mix (] and [), then range_union
will produce () and range_intersect will produce []. So now you have all
four conventions floating around the same database.

Another reason you might mix conventions: say you have log data from
several sources, and some sources provide timestamps for an event which
is essentially "instantaneous" and other sources will log the period of
time over which the event occurred, in [) format. To mix the data
coherently, the correct thing to do is call the instantaneous points a
singleton range; but the only way to represent a singleton continuous
range is by using [].

Whenever you mix conventions, you either have to be able to change the
format (which is only possible with discrete ranges) or teach the
application to understand your convention. And if you don't have a
canonical form (which is only possible with discrete ranges), you can't
reliably compare values for equality, or see if they are adjacent.

Saying that discrete ranges are unnecessary is essentially saying that
there's only a use for one convention; or that the conventions will
never be mixed; or that applications will always be smart enough to sort
it out for themselves.

Regards,Jeff Davis



pgsql-hackers by date:

Previous
From: Scott Bailey
Date:
Subject: Re: Range types
Next
From: Marko Kreen
Date:
Subject: Re: Patch: Remove gcc dependency in definition of inline functions