Thread: mark the timestamptz variant of date_bin() as stable

mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:
(Starting a new thread for greater visibility)

The attached is a fairly straightforward correction. I did want to make sure it was okay to bump the catversion in the PG14 branch also. I've seen fixes where doing that during beta was in question.

--
Attachment

Re: mark the timestamptz variant of date_bin() as stable

From
Tom Lane
Date:
John Naylor <john.naylor@enterprisedb.com> writes:
> (Starting a new thread for greater visibility)
> The attached is a fairly straightforward correction. I did want to make
> sure it was okay to bump the catversion in the PG14 branch also. I've seen
> fixes where doing that during beta was in question.

Yeah, you need to bump catversion.

            regards, tom lane



Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:
On Tue, Aug 31, 2021 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > (Starting a new thread for greater visibility)
> > The attached is a fairly straightforward correction. I did want to make
> > sure it was okay to bump the catversion in the PG14 branch also. I've seen
> > fixes where doing that during beta was in question.
>
> Yeah, you need to bump catversion.

Done, thanks for confirming.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
Tom Lane
Date:
John Naylor <john.naylor@enterprisedb.com> writes:
> On Tue, Aug 31, 2021 at 3:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, you need to bump catversion.

> Done, thanks for confirming.

For future reference --- I think it's potentially confusing to use
the same catversion number in different branches, except for the
short time after a new branch where the initial catalog contents
are actually identical.  So the way I'd have done this is to use
202108311 in the back branch and 202108312 in HEAD.  It's not
terribly important, but something to note for next time.

            regards, tom lane



Re: mark the timestamptz variant of date_bin() as stable

From
Aleksander Alekseev
Date:
Hi John,

By looking at timestamptz_bin() implementation I don't see why it
should be STABLE. Its return value depends only on the input values.
It doesn't look at the session parameters. timestamptz_in() and
timestamptz_out() are STABLE, that's true, but this is no concern of
timestamptz_bin().

Am I missing something?

-- 
Best regards,
Aleksander Alekseev



Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> By looking at timestamptz_bin() implementation I don't see why it
> should be STABLE. Its return value depends only on the input values.
> It doesn't look at the session parameters. timestamptz_in() and
> timestamptz_out() are STABLE, that's true, but this is no concern of
> timestamptz_bin().

I'm not quite willing to bet the answer couldn't change if the timezone changes, but it's possible I'm the one missing something.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
Tom Lane
Date:
John Naylor <john.naylor@enterprisedb.com> writes:
> On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev <aleksander@timescale.com>
> wrote:
>> By looking at timestamptz_bin() implementation I don't see why it
>> should be STABLE. Its return value depends only on the input values.
>> It doesn't look at the session parameters. timestamptz_in() and
>> timestamptz_out() are STABLE, that's true, but this is no concern of
>> timestamptz_bin().

> I'm not quite willing to bet the answer couldn't change if the timezone
> changes, but it's possible I'm the one missing something.

After playing with it for awhile, it seems like the behavior is indeed
not TZ-dependent, but the real question is should it be?
As an example,

regression=# set timezone to 'America/New_York';
SET
regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
        date_bin
------------------------
 2021-10-31 00:00:00-04
(1 row)

regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
        date_bin
------------------------
 2021-11-08 23:00:00-05
(1 row)

I see that these two answers are both exactly multiples of 24 hours away
from the given origin.  But if I'm binning on the basis of "days" or
larger units, I would sort of expect to get local midnight, and I'm not
getting that once I cross a DST boundary.

If this is indeed the behavior we want, I concur with Aleksander
that date_bin isn't TZ-sensitive and needn't be marked STABLE.

            regards, tom lane



Re: mark the timestamptz variant of date_bin() as stable

From
Tom Lane
Date:
Justin Pryzby <pryzby@telsasoft.com> writes:
> On Wed, Sep 01, 2021 at 01:26:26PM -0400, John Naylor wrote:
>> I'm not quite willing to bet the answer couldn't change if the timezone
>> changes, but it's possible I'm the one missing something.

> ts=# SET timezone='-12';
> ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
> date_bin | 2021-07-01 00:00:00-12

> ts=# SET timezone='+12';
> ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
> date_bin | 2021-07-02 00:00:00+12

Yeah, but those are the same timestamptz value.

Another problem with this example as written is that the origin
values being used are not the same in the two cases ... so I
think it's a bit accidental that the answers come out the same.

            regards, tom lane



Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> regression=# set timezone to 'America/New_York';
> SET
> regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
>         date_bin      
> ------------------------
>  2021-10-31 00:00:00-04
> (1 row)
>
> regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
>         date_bin      
> ------------------------
>  2021-11-08 23:00:00-05
> (1 row)
>
> I see that these two answers are both exactly multiples of 24 hours away
> from the given origin.  But if I'm binning on the basis of "days" or
> larger units, I would sort of expect to get local midnight, and I'm not
> getting that once I cross a DST boundary.

Hmm, that's seems like a reasonable expectation. I can get local midnight if I recast to timestamp:

# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp, '2021-09-01 00:00 -04'::timestamptz::timestamp);
      date_bin
---------------------
 2021-11-09 00:00:00
(1 row)

It's a bit unintuitive, though.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
Tom Lane
Date:
John Naylor <john.naylor@enterprisedb.com> writes:
> On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I see that these two answers are both exactly multiples of 24 hours away
>> from the given origin.  But if I'm binning on the basis of "days" or
>> larger units, I would sort of expect to get local midnight, and I'm not
>> getting that once I cross a DST boundary.

> Hmm, that's seems like a reasonable expectation. I can get local midnight
> if I recast to timestamp:

> # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> '2021-09-01 00:00 -04'::timestamptz::timestamp);
>       date_bin
> ---------------------
>  2021-11-09 00:00:00
> (1 row)

Yeah, and then back to timestamptz if that's what you really need :-(

> It's a bit unintuitive, though.

Agreed.  If we keep it like this, adding some documentation around
the point would be a good idea I think.

            regards, tom lane



Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin.  But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> >       date_bin
> > ---------------------
> >  2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed.  If we keep it like this, adding some documentation around
> the point would be a good idea I think.

Having heard no votes on changing this behavior (and it would be a bit of work), I'll start on a documentation patch. And I'll go ahead and re-mark the function as immutable tomorrow barring objections.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> John Naylor <john.naylor@enterprisedb.com> writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin.  But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> >       date_bin
> > ---------------------
> >  2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed.  If we keep it like this, adding some documentation around
> the point would be a good idea I think.

Attached is a draft doc patch using the above examples. Is there anything else that would be useful to mention?

--
John Naylor
EDB: http://www.enterprisedb.com
Attachment

Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

> On Wed, Sep 1, 2021 at 3:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > John Naylor <john.naylor@enterprisedb.com> writes:
> > > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> I see that these two answers are both exactly multiples of 24 hours away
> > >> from the given origin.  But if I'm binning on the basis of "days" or
> > >> larger units, I would sort of expect to get local midnight, and I'm not
> > >> getting that once I cross a DST boundary.
> >
> > > Hmm, that's seems like a reasonable expectation. I can get local midnight
> > > if I recast to timestamp:
> >
> > > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> > > '2021-09-01 00:00 -04'::timestamptz::timestamp);
> > >       date_bin
> > > ---------------------
> > >  2021-11-09 00:00:00
> > > (1 row)
> >
> > Yeah, and then back to timestamptz if that's what you really need :-(
> >
> > > It's a bit unintuitive, though.
> >
> > Agreed.  If we keep it like this, adding some documentation around
> > the point would be a good idea I think.
>
> Attached is a draft doc patch using the above examples. Is there anything else that would be useful to mention?

Any thoughts on the doc patch?

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
Aleksander Alekseev
Date:
Hi John,

> Any thoughts on the doc patch?

It so happened that I implemented a similar feature in TimescaleDB [1].

I discovered that it's difficult from both developer's and user's
perspectives to think about the behavior of the function in the
context of given TZ and its complicated rules, as you are trying to do
in the doc patch. So what we did instead is saying: for timestamptz
the function works as if it was timestamp. E.g. time_bucket_ng("3
month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no
matter what TZ it is and what rules (DST, corrections, etc) it has. It
seems to be not only logical and easy to understand, but also easy to
implement [2].

Do you think it would be possible to adopt a similar approach in
respect of documenting for date_bin()? To be honest, I didn't try to
figure out what is the actual implementation of date_bin() for TZ
case.

[1]: https://docs.timescale.com/api/latest/hyperfunctions/time_bucket_ng/
[2]: https://github.com/timescale/timescaledb/blob/master/src/time_bucket.c#L470

-- 
Best regards,
Aleksander Alekseev



Re: mark the timestamptz variant of date_bin() as stable

From
Aleksander Alekseev
Date:
Hi hackers,

> the function works as if it was timestamp. E.g. time_bucket_ng("3
> month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no

That was a typo. What I meant was:

time_bucket_ng("3 month", "2021 Feb 03 12:34:56 TZ")

February, not October. Sorry for the confusion.

-- 
Best regards,
Aleksander Alekseev



Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

On Thu, Sep 23, 2021 at 4:13 AM Aleksander Alekseev <aleksander@timescale.com> wrote:
>
> Hi John,
>
> > Any thoughts on the doc patch?
>
> It so happened that I implemented a similar feature in TimescaleDB [1].
>
> I discovered that it's difficult from both developer's and user's
> perspectives to think about the behavior of the function in the
> context of given TZ and its complicated rules, as you are trying to do
> in the doc patch. So what we did instead is saying: for timestamptz
> the function works as if it was timestamp. E.g. time_bucket_ng("3
> month", "2021 Oct 03 12:34:56 TZ") = "2021 Jan 01 00:00:00 TZ" no
> matter what TZ it is and what rules (DST, corrections, etc) it has. It
> seems to be not only logical and easy to understand, but also easy to
> implement [2].
>
> Do you think it would be possible to adopt a similar approach in
> respect of documenting for date_bin()? To be honest, I didn't try to
> figure out what is the actual implementation of date_bin() for TZ
> case.

I think you have a point that it could be stated more simply and generally. I'll try to move in that direction.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: mark the timestamptz variant of date_bin() as stable

From
John Naylor
Date:

I wrote:

> I think you have a point that it could be stated more simply and generally. I'll try to move in that direction.

On second thought, I don't think this is helpful. Concrete examples are easier to reason about. 

--
John Naylor
EDB: http://www.enterprisedb.com