Thread: generate_series() Interpretation

generate_series() Interpretation

From
"David E. Wheeler"
Date:
Hackers,

I'm curious about behavior such as this:

bric=# select generate_series('2011-05-31'::timestamp , '2012-04-01'::timestamp, '1 month');  generate_series
---------------------2011-05-31 00:00:002011-06-30 00:00:002011-07-30 00:00:002011-08-30 00:00:002011-09-30
00:00:002011-10-3000:00:002011-11-30 00:00:002011-12-30 00:00:002012-01-30 00:00:002012-02-29 00:00:002012-03-29
00:00:00

It seems to me that this is subject to interpretation. If I was building a calendaring app, for example, I might rather
thatthe results were: 
  generate_series
---------------------2011-05-31 00:00:002011-06-30 00:00:002011-07-31 00:00:002011-08-31 00:00:002011-09-30
00:00:002011-10-3100:00:002011-11-30 00:00:002011-12-31 00:00:002012-01-31 00:00:002012-02-29 00:00:002012-03-31
00:00:00

Is there some way to change the interpretation of interval calculation like this? Or would I just have to write my own
functionto do it the way I want? 

Thanks,

David



Re: generate_series() Interpretation

From
Steve Crawford
Date:
On 06/27/2011 10:49 AM, David E. Wheeler wrote:
> Hackers,
>
> I'm curious about behavior such as this:
>
> bric=# select generate_series('2011-05-31'::timestamp , '2012-04-01'::timestamp, '1 month');
>     generate_series
> ---------------------
>   2011-05-31 00:00:00
>   2011-06-30 00:00:00
>   2011-07-30 00:00:00
>   2011-08-30 00:00:00
>   2011-09-30 00:00:00
>   2011-10-30 00:00:00
>   2011-11-30 00:00:00
>   2011-12-30 00:00:00
>   2012-01-30 00:00:00
>   2012-02-29 00:00:00
>   2012-03-29 00:00:00
>
> It seems to me that this is subject to interpretation. If I was building a calendaring app, for example, I might
ratherthat the results were:
 
>
>     generate_series
> ---------------------
>   2011-05-31 00:00:00
>   2011-06-30 00:00:00
>   2011-07-31 00:00:00
>   2011-08-31 00:00:00
>   2011-09-30 00:00:00
>   2011-10-31 00:00:00
>   2011-11-30 00:00:00
>   2011-12-31 00:00:00
>   2012-01-31 00:00:00
>   2012-02-29 00:00:00
>   2012-03-31 00:00:00
>
> Is there some way to change the interpretation of interval calculation like this? Or would I just have to write my
ownfunction to do it the way I want?
 
>
> Thanks,
>
> David
>
>

That's just how intervals that represent varying periods of time work. 
You would need to write your own. But a series of end-of-month dates is 
pretty easy:
select generate_series('2011-06-01'::timestamp , 
'2012-04-01'::timestamp, '1 month') - '1 day'::interval;      ?column?
--------------------- 2011-05-31 00:00:00 2011-06-30 00:00:00 2011-07-31 00:00:00 2011-08-31 00:00:00 2011-09-30
00:00:002011-10-31 00:00:00 2011-11-30 00:00:00 2011-12-31 00:00:00 2012-01-31 00:00:00 2012-02-29 00:00:00 2012-03-31
00:00:00

Cheers,
Steve



Re: generate_series() Interpretation

From
"David E. Wheeler"
Date:
On Jun 27, 2011, at 10:54 AM, Steve Crawford wrote:

> That's just how intervals that represent varying periods of time work. You would need to write your own. But a series
ofend-of-month dates is pretty easy: 
> select generate_series('2011-06-01'::timestamp , '2012-04-01'::timestamp, '1 month') - '1 day'::interval;

Yeah, but it's trickier if you have a calendaring app and don't know that date a user has chosen for a monthly
recurringevent. They might have selected June 30, in which case only February would ever need to be different than the
default.

Best,

David



Re: generate_series() Interpretation

From
"Kevin Grittner"
Date:
"David E. Wheeler" <david@kineticode.com> wrote:
> generate_series   
> ---------------------
>  2011-05-31 00:00:00
>  2011-06-30 00:00:00
>  2011-07-31 00:00:00
>  2011-08-31 00:00:00
>  2011-09-30 00:00:00
>  2011-10-31 00:00:00
>  2011-11-30 00:00:00
>  2011-12-31 00:00:00
>  2012-01-31 00:00:00
>  2012-02-29 00:00:00
>  2012-03-31 00:00:00
> 
> Is there some way to change the interpretation of interval
> calculation like this? Or would I just have to write my own
> function to do it the way I want?
It is precisely to support such fancy things that some products
support a more abstract date type which allows 31 days in any month,
and then normalizes to real dates as needed.  The PostgreSQL
developer community has generally not been receptive to such use
cases.  I think you need to iterate through month intervals and add
those to the starting date for now.  If you want to start with the
last day of a month with less than 31 days, you may need to back up
a month or two to find a suitable month and offset your intervals by
the appropriate number of months.
I'd bet that if you encapsulate all that in a PostgreSQL function,
you're not the only one who would find it useful.
-Kevin


Re: generate_series() Interpretation

From
"David E. Wheeler"
Date:
On Jun 27, 2011, at 11:03 AM, Kevin Grittner wrote:

> It is precisely to support such fancy things that some products
> support a more abstract date type which allows 31 days in any month,
> and then normalizes to real dates as needed.  The PostgreSQL
> developer community has generally not been receptive to such use
> cases.  I think you need to iterate through month intervals and add
> those to the starting date for now.  If you want to start with the
> last day of a month with less than 31 days, you may need to back up
> a month or two to find a suitable month and offset your intervals by
> the appropriate number of months.
>
> I'd bet that if you encapsulate all that in a PostgreSQL function,
> you're not the only one who would find it useful.

Yeah, did that a while ago:
 http://www.justatheory.com/computers/databases/postgresql/recurring_events.html

I think it could be simpler now, with generate_series() for some intervals.

Best,

David



Re: generate_series() Interpretation

From
Steve Crawford
Date:
On 06/27/2011 10:56 AM, David E. Wheeler wrote:
> On Jun 27, 2011, at 10:54 AM, Steve Crawford wrote:
>
>> That's just how intervals that represent varying periods of time work. You would need to write your own. But a
seriesof end-of-month dates is pretty easy:
 
>> select generate_series('2011-06-01'::timestamp , '2012-04-01'::timestamp, '1 month') - '1 day'::interval;
> Yeah, but it's trickier if you have a calendaring app and don't know that date a user has chosen for a monthly
recurringevent. They might have selected June 30, in which case only February would ever need to be different than the
default.
>
> Best,
>
> David
>
>
>
The query is marginally trickier. But the better calendaring apps give a 
variety of options when selecting "repeat": A user who selects June 30, 
2011 and wants a monthly repeat might want:

30th of every month - skip months without a 30th
30th of every month - move to end-of-month if 30th doesn't exist
Last day of every month
Last Thursday of every month

Typical payday repeats are "the 15th and last -day-of-month if a workday 
or the closest preceding workday if not", "second and last Friday", 
"every other Friday"...

No matter how '1 month' is interpreted in generate_series, the 
application programmer will still need to write the queries required to 
handle whatever calendar-repeat features are deemed necessary.

Cheers,
Steve




Re: generate_series() Interpretation

From
"David E. Wheeler"
Date:
On Jun 27, 2011, at 11:36 AM, Steve Crawford wrote:

> The query is marginally trickier. But the better calendaring apps give a variety of options when selecting "repeat":
Auser who selects June 30, 2011 and wants a monthly repeat might want: 
>
> 30th of every month - skip months without a 30th
> 30th of every month - move to end-of-month if 30th doesn't exist
> Last day of every month
> Last Thursday of every month
>
> Typical payday repeats are "the 15th and last -day-of-month if a workday or the closest preceding workday if not",
"secondand last Friday", "every other Friday"... 
>
> No matter how '1 month' is interpreted in generate_series, the application programmer will still need to write the
queriesrequired to handle whatever calendar-repeat features are deemed necessary. 

Yeah, which is why I said it was subject to interpretation. Of course there's no way to tell generate_series() which to
use,which is what I figured. 

Thanks,

David




Re: generate_series() Interpretation

From
Michael Nolan
Date:


On Mon, Jun 27, 2011 at 1:38 PM, David E. Wheeler <david@kineticode.com> wrote:

Yeah, which is why I said it was subject to interpretation. Of course there's no way to tell generate_series() which to use, which is what I figured.

generate_series() is doing exactly what it was designed to do, the imprecision regarding adding '1 month' to something that may or may not have been intended to be 'last day of the month' is a limitation in the interval code.

One way to change this would be to implement another interval type such as 'full_month'  which would take a date that is know to be the last day of the month and make it the last day of the appropriate month.  If the starting date is NOT the last day of a month, the existing logic would suffice. 

Or you can do as I have done and create your own last_day() function that takes any date and makes it the last day of that month, and apply it to the output of generate_series();
--
Mike Nolan
nolan@tssi.com

Re: generate_series() Interpretation

From
Steve Crawford
Date:
> Yeah, which is why I said it was subject to interpretation. Of course there's no way to tell generate_series() which
touse, which is what I figured.
 
>
Fortunately PostgreSQL uses the same interpretation for '1 month'  when 
used in generate_series that it does everywhere else - to do otherwise 
would be hella confusing. :)

Cheers,
Steve



Re: generate_series() Interpretation

From
Christopher Browne
Date:
On Mon, Jun 27, 2011 at 1:49 PM, David E. Wheeler <david@kineticode.com> wrote:
> Hackers,
>
> I'm curious about behavior such as this:
>
> bric=# select generate_series('2011-05-31'::timestamp , '2012-04-01'::timestamp, '1 month');
>   generate_series
> ---------------------
>  2011-05-31 00:00:00
>  2011-06-30 00:00:00
>  2011-07-30 00:00:00
>  2011-08-30 00:00:00
>  2011-09-30 00:00:00
>  2011-10-30 00:00:00
>  2011-11-30 00:00:00
>  2011-12-30 00:00:00
>  2012-01-30 00:00:00
>  2012-02-29 00:00:00
>  2012-03-29 00:00:00
>
> It seems to me that this is subject to interpretation. If I was building a calendaring app, for example, I might
ratherthat the results were: 
>
>   generate_series
> ---------------------
>  2011-05-31 00:00:00
>  2011-06-30 00:00:00
>  2011-07-31 00:00:00
>  2011-08-31 00:00:00
>  2011-09-30 00:00:00
>  2011-10-31 00:00:00
>  2011-11-30 00:00:00
>  2011-12-31 00:00:00
>  2012-01-31 00:00:00
>  2012-02-29 00:00:00
>  2012-03-31 00:00:00
>
> Is there some way to change the interpretation of interval calculation like this? Or would I just have to write my
ownfunction to do it the way I want? 

It's not hugely difficult to get something pretty appropriate:

emp@localhost->  select generate_series('2011-06-01'::timestamp ,
'2012-04-01'::timestamp, '1 month')- '1 day' ::interval;     ?column?
---------------------2011-05-31 00:00:002011-06-30 00:00:002011-07-31 00:00:002011-08-31 00:00:002011-09-30
00:00:002011-10-3100:00:002011-11-30 00:00:002011-12-31 00:00:002012-01-31 00:00:002012-02-29 00:00:002012-03-31
00:00:00
(11 rows)

That's more or less a bit of "cleverness."  But it's not so grossly
clever as to seem too terribly frightful.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: generate_series() Interpretation

From
Christopher Browne
Date:
On Mon, Jun 27, 2011 at 2:36 PM, Steve Crawford
<scrawford@pinpointresearch.com> wrote:
> On 06/27/2011 10:56 AM, David E. Wheeler wrote:
>>
>> On Jun 27, 2011, at 10:54 AM, Steve Crawford wrote:
>>
>>> That's just how intervals that represent varying periods of time work.
>>> You would need to write your own. But a series of end-of-month dates is
>>> pretty easy:
>>> select generate_series('2011-06-01'::timestamp , '2012-04-01'::timestamp,
>>> '1 month') - '1 day'::interval;
>>
>> Yeah, but it's trickier if you have a calendaring app and don't know that
>> date a user has chosen for a monthly recurring event. They might have
>> selected June 30, in which case only February would ever need to be
>> different than the default.
>>
>> Best,
>>
>> David
>>
>>
>>
> The query is marginally trickier. But the better calendaring apps give a
> variety of options when selecting "repeat": A user who selects June 30, 2011
> and wants a monthly repeat might want:
>
> 30th of every month - skip months without a 30th
> 30th of every month - move to end-of-month if 30th doesn't exist
> Last day of every month
> Last Thursday of every month
>
> Typical payday repeats are "the 15th and last -day-of-month if a workday or
> the closest preceding workday if not", "second and last Friday", "every
> other Friday"...
>
> No matter how '1 month' is interpreted in generate_series, the application
> programmer will still need to write the queries required to handle whatever
> calendar-repeat features are deemed necessary.

If you look up David Skoll's "remind" application
<http://www.roaringpenguin.com/products/remind>, you'll find something
that does this kind of pattern matching quite, quite well, at a rather
sophisticated level.

I find that I don't want to go through the struggle of figuring out
how to correctly describe those recurrences.

The other way of doing this sort of thing, which seems to be generally
more intuitive, is to treat these calendars as sets, which are a
structure that SQL is rather will designed to manipulate, and use
inclusions/exclusions/intersections to determine which days are of
interest.

I wrote something on this on pgsql-general about 5 years ago that
still seems pretty relevant.

http://archives.postgresql.org/pgsql-general/2006-02/msg00159.php
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: generate_series() Interpretation

From
"David E. Wheeler"
Date:
On Jun 27, 2011, at 12:36 PM, Christopher Browne wrote:

> I wrote something on this on pgsql-general about 5 years ago that
> still seems pretty relevant.
>
> http://archives.postgresql.org/pgsql-general/2006-02/msg00159.php

iwantsandy.com (now defunct) originally had a solution like this. However it supported a slew of recurrences:

* hours
* 2xday
* days
* weeks
* months
* quarters
* years
* decades

We had materializations of all of these going out 5 years or so. It took up an incredible amount of database space and
wasreally slow. I replaced it with a variation on the code described in this blog post: 
 http://www.justatheory.com/computers/databases/postgresql/recurring_events.html

The database was a fraction of the original size and, because views were usually limited to a month at most, the number
ofrows generated for a query to show recurring events was quite limited (no one had an hourly reminder that when for
morethan a couple of days). Queries were a lot faster, too. 

So I think the materialization of dates can work in certain limited cases such as your "vacations 2005" example, and
willbe easier to use thanks to JOINs, I found that it performed poorly and was unnecessarily resource-intensive for our
usage.And I suspect the same would be try for anyone building a calendar app with more than one simple kind of limited
recurrence.

Best,

David