Thread: BUG #14294: Problem in generate series between dates

BUG #14294: Problem in generate series between dates

From
pablopumarino@gmail.com
Date:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDI5NApMb2dnZWQgYnk6ICAg
ICAgICAgIFBhYmxvIFB1bWFyaW5vCkVtYWlsIGFkZHJlc3M6ICAgICAgcGFi
bG9wdW1hcmlub0BnbWFpbC5jb20KUG9zdGdyZVNRTCB2ZXJzaW9uOiA5LjUu
NApPcGVyYXRpbmcgc3lzdGVtOiAgIDQuNC4wLTM0LWdlbmVyaWMgIzUzLVVi
dW50dQpEZXNjcmlwdGlvbjogICAgICAgIAoKV2hlbiBJIHVzZSBwb3N0Z3Jl
cyBmdW5jdGlvbjoNCg0KU0VMRUNUIGdlbmVyYXRlX3NlcmllcygnMjAxNi0w
OC0wMScsICcyMDE2LTA4LTMxJywgJzEKZGF5Jzo6aW50ZXJ2YWwpOjpkYXRl
DQoNCml0IGdpdmVzIG1lIHRoZSBjb3JyZXNwb25kaW5nIGxpc3Qgb2YgZGF5
cyBidXQgZm9yIHNvbWUgcmVhc29uIGl0IG1pc3NlcyB0aGUKZGF5IDIwMTYt
MDgtMzEuIA0KLUkndmUgdHJpZWQgdGhpcyB3aXRoIG90aGVyIG1vbnRocyBh
bmQgaXQgd29ya3MgY29ycmVjdGx5LCBhbHNvIGlmIEkgdXNlIGl0CmZvciB0
aGUgc2FtZSBkYXRlcyBidXQgb24gMjAxNSBpdCB3b3JrcyBmaW5lLiANCi1U
aGlzIHN0YXJ0ZWQgZmFpbGluZyBhZnRlciB3ZSB1cGdyYWRlZCBvdXIgc3lz
dGVtIHRvIHRoZSB2ZXJzaW9uIDkuNS40LApJJ3ZlIHRyeWVkIGl0IGluIGFu
IG9sZGVyIHJlbGVhc2UgdGhhdCB3ZSBoYWQgaW5zdGFsbGVkIGFuZCBpdCB3
b3JrZWQuDQoNCkJlc3QgcmVnYXJkcy4KCg==

Re: BUG #14294: Problem in generate series between dates

From
Andrew Gierth
Date:
>>>>> "pablopumarino" == pablopumarino  <pablopumarino@gmail.com> writes:

 pablopumarino> SELECT generate_series('2016-08-01', '2016-08-31', '1 day'::interval)::date

Does it work better if you do:

SELECT generate_series(timestamp '2016-08-01', '2016-08-31', '1 day'::interval)::date;

?

(Your version is using timestamp with time zone, which is probably not a
good idea)

Also, what is the output you get for:

SELECT d, d::date FROM generate_series('2016-08-01', '2016-08-31', '1 day'::interval) AS d;

--
Andrew (irc:RhodiumToad)

Re: BUG #14294: Problem in generate series between dates

From
Tom Lane
Date:
pablopumarino@gmail.com writes:
> When I use postgres function:
> SELECT generate_series('2016-08-01', '2016-08-31', '1
> day'::interval)::date
> it gives me the corresponding list of days but for some reason it misses the
> day 2016-08-31.

In addition to Andrew's questions:

* What time zone are you using?
* In that zone, is there a DST rule change near 2016-08-31?

            regards, tom lane

Re: BUG #14294: Problem in generate series between dates

From
John R Pierce
Date:
On 8/24/2016 11:29 AM, pablopumarino@gmail.com wrote:
> SELECT generate_series('2016-08-01', '2016-08-31', '1
> day'::interval)::date
>
> it gives me the corresponding list of days but for some reason it misses the
> day 2016-08-31.
> -I've tried this with other months and it works correctly, also if I use it
> for the same dates but on 2015 it works fine.
> -This started failing after we upgraded our system to the version 9.5.4,
> I've tryed it in an older release that we had installed and it worked.

I would write that as...

select generate_series(timestamp '2016-08-01', timestamp '2016-08-31',
interval '1 day')::date;


the notation   type 'string value'   is a native constant in that type,
while the notation 'string value'::type  generates a text constant and
typecasts it.

--
john r pierce, recycling bits in santa cruz

Re: BUG #14294: Problem in generate series between dates

From
Tom Lane
Date:
John R Pierce <pierce@hogranch.com> writes:
> the notation   type 'string value'   is a native constant in that type,
> while the notation 'string value'::type  generates a text constant and
> typecasts it.

Actually, no, there's no difference.  If the argument of ::type is
a simple string literal, it's just fed to the type's input parser.

            regards, tom lane

Re: BUG #14294: Problem in generate series between dates

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> When I use postgres function:
 >> SELECT generate_series('2016-08-01', '2016-08-31', '1
 >> day'::interval)::date
 >> it gives me the corresponding list of days but for some reason it misses the
 >> day 2016-08-31.

 Tom> In addition to Andrew's questions:

 Tom> * What time zone are you using?

I discovered that this is reproducible in America/Santiago

 Tom> * In that zone, is there a DST rule change near 2016-08-31?

The DST change seems to be at midnight local on 2016-08-14 (who puts
their DST changes at midnight? that breaks things here)

This _looks_ wrong:

set timezone = 'America/Santiago';
select timestamptz '2016-08-13', timestamptz '2016-08-13' + interval '1 day';
      timestamptz       |        ?column?
------------------------+------------------------
 2016-08-13 00:00:00-04 | 2016-08-14 01:00:00-03
(1 row)

but I guess it's inevitable, since 2016-08-14 00:00:00 doesn't exist in
the local time, so there's no other possible result to return.

--
Andrew (irc:RhodiumToad)

Re: BUG #14294: Problem in generate series between dates

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> This _looks_ wrong:

> set timezone = 'America/Santiago';
> select timestamptz '2016-08-13', timestamptz '2016-08-13' + interval '1 day';
>       timestamptz       |        ?column?
> ------------------------+------------------------
>  2016-08-13 00:00:00-04 | 2016-08-14 01:00:00-03
> (1 row)

> but I guess it's inevitable, since 2016-08-14 00:00:00 doesn't exist in
> the local time, so there's no other possible result to return.

Yeah.  And after that, the shift persists, eg.

# select '2016-08-14 01:00:00-03'::timestamptz + '1 day'::interval;
        ?column?
------------------------
 2016-08-15 01:00:00-03
(1 row)

That's a bit annoying, because it works if you skip over that day:

# select '2016-08-13'::timestamptz + '2 days'::interval;
        ?column?
------------------------
 2016-08-15 00:00:00-03
(1 row)

In other words, we could make this scenario "work" if we defined
generate_series as base plus N times the increment, rather than as
repeated addition of the increment.  But I wouldn't be surprised
if that would break other corner cases (and it would certainly
be slower).  Daylight-savings time is not one of the more consistent
things in our world ... not that anything at all about the civil
calendar is mathematically nice :-(.

Certainly, the right answer in this example case is to use
the timestamp not timestamptz flavor of generate_series.

Or you could use the integer flavor and add the results to
a base date using the date + integer operator.

            regards, tom lane

Re: BUG #14294: Problem in generate series between dates

From
Pablo Pumarino Delgado
Date:
Hello,

Thanks to everyone, actually the timezone is America/Santiago (no idea how
you noticed this really), and it actually appears that the problem is with
the 14th 00:00:00 not existing.

I've tried what you suggested SELECT generate_series(timestamp
'2016-08-01', '2016-08-31', '1 day'::interval)::date; and it worked.

Another solution, that is the one i'm actually using right now is:

SELECT ('2016-08-01'::date + (interval '1' day * generate_series(0,30)) )::
date

Thanks for your help.

Best regards,



2016-08-24 17:10 GMT-03:00 Tom Lane <tgl@sss.pgh.pa.us>:

> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> > This _looks_ wrong:
>
> > set timezone = 'America/Santiago';
> > select timestamptz '2016-08-13', timestamptz '2016-08-13' + interval '1
> day';
> >       timestamptz       |        ?column?
> > ------------------------+------------------------
> >  2016-08-13 00:00:00-04 | 2016-08-14 01:00:00-03
> > (1 row)
>
> > but I guess it's inevitable, since 2016-08-14 00:00:00 doesn't exist in
> > the local time, so there's no other possible result to return.
>
> Yeah.  And after that, the shift persists, eg.
>
> # select '2016-08-14 01:00:00-03'::timestamptz + '1 day'::interval;
>         ?column?
> ------------------------
>  2016-08-15 01:00:00-03
> (1 row)
>
> That's a bit annoying, because it works if you skip over that day:
>
> # select '2016-08-13'::timestamptz + '2 days'::interval;
>         ?column?
> ------------------------
>  2016-08-15 00:00:00-03
> (1 row)
>
> In other words, we could make this scenario "work" if we defined
> generate_series as base plus N times the increment, rather than as
> repeated addition of the increment.  But I wouldn't be surprised
> if that would break other corner cases (and it would certainly
> be slower).  Daylight-savings time is not one of the more consistent
> things in our world ... not that anything at all about the civil
> calendar is mathematically nice :-(.
>
> Certainly, the right answer in this example case is to use
> the timestamp not timestamptz flavor of generate_series.
>
> Or you could use the integer flavor and add the results to
> a base date using the date + integer operator.
>
>                         regards, tom lane
>



--
Pablo Pumarino D.
+569 82781776

Re: BUG #14294: Problem in generate series between dates

From
Christoph Berg
Date:
Re: Tom Lane 2016-08-24 <31168.1472069440@sss.pgh.pa.us>
> In other words, we could make this scenario "work" if we defined
> generate_series as base plus N times the increment, rather than as
> repeated addition of the increment.  But I wouldn't be surprised
> if that would break other corner cases (and it would certainly
> be slower).  Daylight-savings time is not one of the more consistent
> things in our world ... not that anything at all about the civil
> calendar is mathematically nice :-(.

To avoid DST problems, wouldn't a "date"-based generate_series be the
safe way to go?

Christoph

Re: BUG #14294: Problem in generate series between dates

From
Andrew Gierth
Date:
>>>>> "Christoph" == Christoph Berg <myon@debian.org> writes:

 Christoph> To avoid DST problems, wouldn't a "date"-based
 Christoph> generate_series be the safe way to go?

The lack of generate_series(date,date,integer) is sometimes annoying,
even though it can be worked around using the timestamp-without-timezone
variant of generate_series.

--
Andrew (irc:RhodiumToad)

Re: BUG #14294: Problem in generate series between dates

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Christoph" == Christoph Berg <myon@debian.org> writes:
>  Christoph> To avoid DST problems, wouldn't a "date"-based
>  Christoph> generate_series be the safe way to go?

> The lack of generate_series(date,date,integer) is sometimes annoying,
> even though it can be worked around using the timestamp-without-timezone
> variant of generate_series.

Or you can do it with "base_date + generate_series(integer...)".

I think we looked at this when the timestamp generate_series functions
were put in, and were worried about overloading the name so far that
common use-cases would get ambiguous-function failures.  If that can
be shown not to happen, though, it'd be worth adding such a function
IMO.

            regards, tom lane

Re: BUG #14294: Problem in generate series between dates

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> The lack of generate_series(date,date,integer) is sometimes
 >> annoying, even though it can be worked around using the
 >> timestamp-without-timezone variant of generate_series.

 Tom> Or you can do it with "base_date + generate_series(integer...)".

 Tom> I think we looked at this when the timestamp generate_series
 Tom> functions were put in, and were worried about overloading the name
 Tom> so far that common use-cases would get ambiguous-function
 Tom> failures.  If that can be shown not to happen, though, it'd be
 Tom> worth adding such a function IMO.

I don't see why there would be ambiguity. date_part already has
overloads for every date/time type without causing any issues (but
date_trunc does not, which is another source of subtle timezone bugs).

Some experimentation with creating
pg_catalog.generate_series(date,date,integer) and trying the usual
use-cases doesn't seem to turn up any issues. Looking at the list of
implicit casts also doesn't suggest that there would be any problems.

--
Andrew (irc:RhodiumToad)

Re: BUG #14294: Problem in generate series between dates

From
Dean Rasheed
Date:
On 31 August 2016 at 23:07, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> Some experimentation with creating
> pg_catalog.generate_series(date,date,integer) and trying the usual
> use-cases doesn't seem to turn up any issues. Looking at the list of
> implicit casts also doesn't suggest that there would be any problems.
>

Adding generate_series(date,date,integer) was attempted a recently
[1], but led to some ambiguities that broke some queries that
currently work. At the time the new function was presented merely as a
convenience to save having to cast results, but I think this
discussion adds more weight to the argument for such a function.
Perhaps adding generate_series(date,date,interval) might work.

[1] https://www.postgresql.org/message-id/flat/56EAB874.9040205%40pgmasters.net

Regards,
Dean

Re: BUG #14294: Problem in generate series between dates

From
Dean Rasheed
Date:
On 1 September 2016 at 00:05, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> Perhaps adding generate_series(date,date,interval) might work.

On second thoughts, that would change the return type of some existing
queries, which would be problematic.

Regards,
Dean

Re: BUG #14294: Problem in generate series between dates

From
Andrew Gierth
Date:
>>>>> "Dean" == Dean Rasheed <dean.a.rasheed@gmail.com> writes:

 >> Perhaps adding generate_series(date,date,interval) might work.

 Dean> On second thoughts, that would change the return type of some
 Dean> existing queries, which would be problematic.

But as this bug report is showing, those existing queries are at best
fragile and at worst silently wrong...

--
Andrew (irc:RhodiumToad)

Re: BUG #14294: Problem in generate series between dates

From
Dean Rasheed
Date:
On 1 September 2016 at 00:39, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Dean" == Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>
>  >> Perhaps adding generate_series(date,date,interval) might work.
>
>  Dean> On second thoughts, that would change the return type of some
>  Dean> existing queries, which would be problematic.
>
> But as this bug report is showing, those existing queries are at best
> fragile and at worst silently wrong...
>

Hmm, maybe, but since this is timezone-dependent, the existing code
might be perfectly safe in the user's part of the world. I don't
really have a feel for how likely this is to break people's code, but
I think it's something we have to consider.

Regards,
Dean

Re: BUG #14294: Problem in generate series between dates

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 1 September 2016 at 00:39, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> "Dean" == Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>>>> Perhaps adding generate_series(date,date,interval) might work.

> Hmm, maybe, but since this is timezone-dependent, the existing code
> might be perfectly safe in the user's part of the world. I don't
> really have a feel for how likely this is to break people's code, but
> I think it's something we have to consider.

Consider this perfectly reasonable use-case:

# select generate_series(current_date,current_date+1,interval '1 hour');
    generate_series
------------------------
 2016-09-01 00:00:00-04
 2016-09-01 01:00:00-04
 2016-09-01 02:00:00-04
 2016-09-01 03:00:00-04
 2016-09-01 04:00:00-04
 2016-09-01 05:00:00-04
 2016-09-01 06:00:00-04
 2016-09-01 07:00:00-04
 2016-09-01 08:00:00-04
 2016-09-01 09:00:00-04
 2016-09-01 10:00:00-04
 2016-09-01 11:00:00-04
 2016-09-01 12:00:00-04
 2016-09-01 13:00:00-04
 2016-09-01 14:00:00-04
 2016-09-01 15:00:00-04
 2016-09-01 16:00:00-04
 2016-09-01 17:00:00-04
 2016-09-01 18:00:00-04
 2016-09-01 19:00:00-04
 2016-09-01 20:00:00-04
 2016-09-01 21:00:00-04
 2016-09-01 22:00:00-04
 2016-09-01 23:00:00-04
 2016-09-02 00:00:00-04
(25 rows)

We can't make the data type of the output dependent on the interval size,
so decreeing that this now produces date not timestamp would break any
case with a fractional-day interval.

After reviewing the previous thread, I have no real desire to take
this up again.  The consensus then was that the added utility didn't
outweigh the likelihood of breaking existing queries, and we've not
covered anything here that wasn't discussed before.

Maybe we just need an example in the docs about working with dates.

            regards, tom lane