Thread: mark the timestamptz variant of date_bin() as stable
(Starting a new thread for greater visibility)
The attached is a fairly straightforward correction. I did want to make sure it was okay to bump the catversion in the PG14 branch also. I've seen fixes where doing that during beta was in question.
--
Attachment
John Naylor <john.naylor@enterprisedb.com> writes: > (Starting a new thread for greater visibility) > The attached is a fairly straightforward correction. I did want to make > sure it was okay to bump the catversion in the PG14 branch also. I've seen > fixes where doing that during beta was in question. Yeah, you need to bump catversion. regards, tom lane
On Tue, Aug 31, 2021 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > (Starting a new thread for greater visibility)
> > The attached is a fairly straightforward correction. I did want to make
> > sure it was okay to bump the catversion in the PG14 branch also. I've seen
> > fixes where doing that during beta was in question.
>
> Yeah, you need to bump catversion.
Done, thanks for confirming.
--
John Naylor
EDB: http://www.enterprisedb.com
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > (Starting a new thread for greater visibility)
> > The attached is a fairly straightforward correction. I did want to make
> > sure it was okay to bump the catversion in the PG14 branch also. I've seen
> > fixes where doing that during beta was in question.
>
> Yeah, you need to bump catversion.
Done, thanks for confirming.
--
John Naylor
EDB: http://www.enterprisedb.com
John Naylor <john.naylor@enterprisedb.com> writes: > On Tue, Aug 31, 2021 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah, you need to bump catversion. > Done, thanks for confirming. For future reference --- I think it's potentially confusing to use the same catversion number in different branches, except for the short time after a new branch where the initial catalog contents are actually identical. So the way I'd have done this is to use 202108311 in the back branch and 202108312 in HEAD. It's not terribly important, but something to note for next time. regards, tom lane
Hi John, By looking at timestamptz_bin() implementation I don't see why it should be STABLE. Its return value depends only on the input values. It doesn't look at the session parameters. timestamptz_in() and timestamptz_out() are STABLE, that's true, but this is no concern of timestamptz_bin(). Am I missing something? -- Best regards, Aleksander Alekseev
On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> By looking at timestamptz_bin() implementation I don't see why it
> should be STABLE. Its return value depends only on the input values.
> It doesn't look at the session parameters. timestamptz_in() and
> timestamptz_out() are STABLE, that's true, but this is no concern of
> timestamptz_bin().
I'm not quite willing to bet the answer couldn't change if the timezone changes, but it's possible I'm the one missing something.
--
John Naylor
EDB: http://www.enterprisedb.com
John Naylor <john.naylor@enterprisedb.com> writes: > On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev <aleksander@timescale.com> > wrote: >> By looking at timestamptz_bin() implementation I don't see why it >> should be STABLE. Its return value depends only on the input values. >> It doesn't look at the session parameters. timestamptz_in() and >> timestamptz_out() are STABLE, that's true, but this is no concern of >> timestamptz_bin(). > I'm not quite willing to bet the answer couldn't change if the timezone > changes, but it's possible I'm the one missing something. After playing with it for awhile, it seems like the behavior is indeed not TZ-dependent, but the real question is should it be? As an example, regression=# set timezone to 'America/New_York'; SET regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz); date_bin ------------------------ 2021-10-31 00:00:00-04 (1 row) regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz); date_bin ------------------------ 2021-11-08 23:00:00-05 (1 row) I see that these two answers are both exactly multiples of 24 hours away from the given origin. But if I'm binning on the basis of "days" or larger units, I would sort of expect to get local midnight, and I'm not getting that once I cross a DST boundary. If this is indeed the behavior we want, I concur with Aleksander that date_bin isn't TZ-sensitive and needn't be marked STABLE. regards, tom lane
Justin Pryzby <pryzby@telsasoft.com> writes: > On Wed, Sep 01, 2021 at 01:26:26PM -0400, John Naylor wrote: >> I'm not quite willing to bet the answer couldn't change if the timezone >> changes, but it's possible I'm the one missing something. > ts=# SET timezone='-12'; > ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01'); > date_bin | 2021-07-01 00:00:00-12 > ts=# SET timezone='+12'; > ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01'); > date_bin | 2021-07-02 00:00:00+12 Yeah, but those are the same timestamptz value. Another problem with this example as written is that the origin values being used are not the same in the two cases ... so I think it's a bit accidental that the answers come out the same. regards, tom lane
On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> regression=# set timezone to 'America/New_York';
> SET
> regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
> date_bin
> ------------------------
> 2021-10-31 00:00:00-04
> (1 row)
>
> regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
> date_bin
> ------------------------
> 2021-11-08 23:00:00-05
> (1 row)
>
> I see that these two answers are both exactly multiples of 24 hours away
> from the given origin. But if I'm binning on the basis of "days" or
> larger units, I would sort of expect to get local midnight, and I'm not
> getting that once I cross a DST boundary.
Hmm, that's seems like a reasonable expectation. I can get local midnight if I recast to timestamp:
# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp, '2021-09-01 00:00 -04'::timestamptz::timestamp);
date_bin
---------------------
2021-11-09 00:00:00
(1 row)
date_bin
---------------------
2021-11-09 00:00:00
(1 row)
It's a bit unintuitive, though.
John Naylor <john.naylor@enterprisedb.com> writes: > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I see that these two answers are both exactly multiples of 24 hours away >> from the given origin. But if I'm binning on the basis of "days" or >> larger units, I would sort of expect to get local midnight, and I'm not >> getting that once I cross a DST boundary. > Hmm, that's seems like a reasonable expectation. I can get local midnight > if I recast to timestamp: > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp, > '2021-09-01 00:00 -04'::timestamptz::timestamp); > date_bin > --------------------- > 2021-11-09 00:00:00 > (1 row) Yeah, and then back to timestamptz if that's what you really need :-( > It's a bit unintuitive, though. Agreed. If we keep it like this, adding some documentation around the point would be a good idea I think. regards, tom lane
On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin. But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> > date_bin
> > ---------------------
> > 2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed. If we keep it like this, adding some documentation around
> the point would be a good idea I think.
Having heard no votes on changing this behavior (and it would be a bit of work), I'll start on a documentation patch. And I'll go ahead and re-mark the function as immutable tomorrow barring objections.
--
John Naylor
EDB: http://www.enterprisedb.com
On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin. But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> > date_bin
> > ---------------------
> > 2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed. If we keep it like this, adding some documentation around
> the point would be a good idea I think.
Attached is a draft doc patch using the above examples. Is there anything else that would be useful to mention?
--
John Naylor
EDB: http://www.enterprisedb.com
Attachment
> On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > John Naylor <john.naylor@enterprisedb.com> writes:
> > > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> I see that these two answers are both exactly multiples of 24 hours away
> > >> from the given origin. But if I'm binning on the basis of "days" or
> > >> larger units, I would sort of expect to get local midnight, and I'm not
> > >> getting that once I cross a DST boundary.
> >
> > > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > > if I recast to timestamp:
> >
> > > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> > > date_bin
> > > ---------------------
> > > 2021-11-09 00:00:00
> > > (1 row)
> >
> > Yeah, and then back to timestamptz if that's what you really need :-(
> >
> > > It's a bit unintuitive, though.
> >
> > Agreed. If we keep it like this, adding some documentation around
> > the point would be a good idea I think.
>
> Attached is a draft doc patch using the above examples. Is there anything else that would be useful to mention?
Any thoughts on the doc patch?
--
John Naylor
EDB: http://www.enterprisedb.com
Hi John, > Any thoughts on the doc patch? It so happened that I implemented a similar feature in TimescaleDB [1]. I discovered that it's difficult from both developer's and user's perspectives to think about the behavior of the function in the context of given TZ and its complicated rules, as you are trying to do in the doc patch. So what we did instead is saying: for timestamptz the function works as if it was timestamp. E.g. time_bucket_ng("3 month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no matter what TZ it is and what rules (DST, corrections, etc) it has. It seems to be not only logical and easy to understand, but also easy to implement [2]. Do you think it would be possible to adopt a similar approach in respect of documenting for date_bin()? To be honest, I didn't try to figure out what is the actual implementation of date_bin() for TZ case. [1]: https://docs.timescale.com/api/latest/hyperfunctions/time_bucket_ng/ [2]: https://github.com/timescale/timescaledb/blob/master/src/time_bucket.c#L470 -- Best regards, Aleksander Alekseev
Hi hackers, > the function works as if it was timestamp. E.g. time_bucket_ng("3 > month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no That was a typo. What I meant was: time_bucket_ng("3 month", "2021 Feb 03 12:34:56 TZ") February, not October. Sorry for the confusion. -- Best regards, Aleksander Alekseev
On Thu, Sep 23, 2021 at 4:13 AM Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> Hi John,
>
> > Any thoughts on the doc patch?
>
> It so happened that I implemented a similar feature in TimescaleDB [1].
>
> I discovered that it's difficult from both developer's and user's
> perspectives to think about the behavior of the function in the
> context of given TZ and its complicated rules, as you are trying to do
> in the doc patch. So what we did instead is saying: for timestamptz
> the function works as if it was timestamp. E.g. time_bucket_ng("3
> month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no
> matter what TZ it is and what rules (DST, corrections, etc) it has. It
> seems to be not only logical and easy to understand, but also easy to
> implement [2].
>
> Do you think it would be possible to adopt a similar approach in
> respect of documenting for date_bin()? To be honest, I didn't try to
> figure out what is the actual implementation of date_bin() for TZ
> case.
I think you have a point that it could be stated more simply and generally. I'll try to move in that direction.
I wrote:
> I think you have a point that it could be stated more simply and generally. I'll try to move in that direction.
On second thought, I don't think this is helpful. Concrete examples are easier to reason about.
--
John Naylor
EDB: http://www.enterprisedb.com