Thread: Fastest way / best practice to calculate "next birthdays"
The subject basically says it all, I'm looking for the fastest (indexable) way to calculate the next birthdays relative to NOW() from a dataset of about 1 million users. I'm currently using a function based index, but leap year handling / mapping February 29 to February 28 gives me some headaches. Is there any best practice to do that in PostgreSQL? -- Best regards, Hannes Dorbath
Hannes Dorbath wrote: > The subject basically says it all, I'm looking for the fastest > (indexable) way to calculate the next birthdays relative to NOW() from a > dataset of about 1 million users. > > I'm currently using a function based index, but leap year handling / > mapping February 29 to February 28 gives me some headaches. > > Is there any best practice to do that in PostgreSQL? postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'), current_date + '1 Year'::interval || ' a ' || to_char(current_date + '1 Year'::interval, 'Day') as next_birthday; ?column? | next_birthday ------------------------+--------------------------------- 2008-05-04 a Sunday | 2009-05-04 00:00:00 a Monday ? Sincerely, Joshua D. Drake
Joshua D. Drake wrote: > postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'), > current_date + '1 Year'::interval || ' a ' || to_char(current_date + '1 > Year'::interval, 'Day') as next_birthday; > ?column? | next_birthday > ------------------------+--------------------------------- > 2008-05-04 a Sunday | 2009-05-04 00:00:00 a Monday > > ? Sorry, I think I phrased the question badly. What I'm after basically is: http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/ -- Best regards, Hannes Dorbath
Hannes Dorbath írta: > Joshua D. Drake wrote: >> postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'), >> current_date + '1 Year'::interval || ' a ' || to_char(current_date + >> '1 Year'::interval, 'Day') as next_birthday; >> ?column? | next_birthday >> ------------------------+--------------------------------- >> 2008-05-04 a Sunday | 2009-05-04 00:00:00 a Monday >> >> ? > > Sorry, I think I phrased the question badly. What I'm after basically is: > > http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/ If you define the same functional index as in the above link: CREATE OR REPLACE FUNCTION indexable_month_day(date) RETURNS TEXT as $BODY$ SELECT to_char($1, 'MM-DD'); $BODY$ language 'sql' IMMUTABLE STRICT; create table user_birthdate ( id serial not null primary key, birthdate date ); create index user_birthdate_day_idx on user_birthdate ( indexable_month_day(birthdate) ); Then you can use this query: select count(*) from user_birthdate where indexable_month_day(birthdate) > '02-28' and indexable_month_day(birthdate) <= '03-01'; In a generic and parametrized way: select * from user_birthdate where indexable_month_day(birthdate) > indexable_month_day(now()::date) and indexable_month_day(birthdate) <= indexable_month_day((now() + '1 days'::interval)::date); This will still use the index and it will work for the poor ones who have birthday every 4 years, too. Assume, it's 02-08 today, 03-01 the next day. The now() < X <= now() + 1 day range will find 02-29. -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH http://www.postgresql.at/
Hannes Dorbath wrote: > Sorry, I think I phrased the question badly. What I'm after basically is: > > http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/ > OK So what I came up with is - (the times are from a G4 1.25Ghz) CREATE TABLE birthdaytest ( id serial PRIMARY KEY, birthdate date ); CREATE INDEX idx_bday_month ON birthdaytest USING btree(extract(month from birthdate)); CREATE INDEX idx_bday_day ON birthdaytest USING btree(extract(day from birthdate)); insert into birthdaytest (birthdate) values ('1930-01-01'::date+generate_series(0,365*70)); ... I repeated this another 15 times to load some data vacuum analyse birthdaytest; \timing select count(*) from birthdaytest; > count > -------- > 408816 > (1 row) > > Time: 233.501 ms select * from birthdaytest where extract(month from birthdate) = 5 and extract(day from birthdate) between 6 and 12; > id | birthdate > --------+------------ > 126 | 1930-05-06 > 127 | 1930-05-07 > 128 | 1930-05-08 > ... > ... > 408613 | 1999-05-11 > 408614 | 1999-05-12 > (7840 rows) > > Time: 211.237 ms select * from birthdaytest where extract(month from birthdate) = extract(month from current_date) and extract(day from birthdate) between extract(day from current_date) and extract(day from current_date+14); > id | birthdate > --------+------------ > 125 | 1930-05-05 > 126 | 1930-05-06 > 127 | 1930-05-07 > ... > ... > 408619 | 1999-05-17 > 408620 | 1999-05-18 > 408621 | 1999-05-19 > (16800 rows) > > Time: 483.915 ms -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz
Re: Fastest way / best practice to calculate "next birthdays"
From
"er.tejaspatel88@gmail.com"
Date:
If I have to find upcoming birthdays in current week and the current week fall into different months - how would you handle that? -- View this message in context: http://postgresql.nabble.com/Fastest-way-best-practice-to-calculate-next-birthdays-tp2068398p5849705.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
On Monday, May 18, 2015, er.tejaspatel88@gmail.com <er.tejaspatel88@gmail.com> wrote:
If I have to find upcoming birthdays in current week and the current week
fall into different months - how would you handle that?
Extract(week from timestamptz_column)
ISO weeks are not affected by month boundaries but do start on Monday.
David J.
On 05/20/15 20:22, David G. Johnston wrote: > On Monday, May 18, 2015, er.tejaspatel88@gmail.com < > er.tejaspatel88@gmail.com> wrote: > >> If I have to find upcoming birthdays in current week and the current week >> fall into different months - how would you handle that? >> > > Extract(week from timestamptz_column) > > ISO weeks are not affected by month boundaries but do start on Monday. There is the year start/end boundary conditions to worry about there. If the current week covers Dec28-Jan02 then week of year won't help for a birthday on Jan01 or Jan02 if 'today' is in the Dec portion. Ditto for birthday in Dec portion when 'today' is in the Jan portion. There is probably a better way to do it than what I'm showing here, but here's an example: with x as ( select now() - (extract(dow from now()) || ' days')::interval as weekstart ) select to_char(x.weekstart, 'YYYY-MM-DD') as first_day, to_char(x.weekstart + '6 days', 'YYYY-MM-DD') as last_day from x; You could probably make some of that into a function that accepts a timestamptz and generates the two days. Or even does the compare too. HTH. Bosco.
On Thursday, May 21, 2015, Bosco Rama <postgres@boscorama.com> wrote:
On 05/20/15 20:22, David G. Johnston wrote:
> On Monday, May 18, 2015, er.tejaspatel88@gmail.com <
> er.tejaspatel88@gmail.com> wrote:
>
>> If I have to find upcoming birthdays in current week and the current week
>> fall into different months - how would you handle that?
>>
>
> Extract(week from timestamptz_column)
>
> ISO weeks are not affected by month boundaries but do start on Monday.
There is the year start/end boundary conditions to worry about there.
If the current week covers Dec28-Jan02 then week of year won't help for
a birthday on Jan01 or Jan02 if 'today' is in the Dec portion. Ditto
for birthday in Dec portion when 'today' is in the Jan portion.
You need to read the documentation regarding ISO year and ISO week more carefully. There is no issue with years only ensuring that your definition of week starts with Monday and contains 7 days. The ISO year for January 1st can be different than the Gregorian year for the same.
David J.
On 5/21/15 11:15 AM, Bosco Rama wrote: > You could probably make some of that into a function that accepts a > timestamptz and generates the two days. You'll be better off if instead of 2 days it gives you a daterange: http://www.postgresql.org/docs/9.4/static/rangetypes.html I don't know about the exact ISO details, but your approach is the correct one: find the date that the current week started on and then build a range of [week start, week start + 7 days). Also, note the use of [ vs ). That is the ONLY correct way to do this if you're comparing to a timestamp. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
"er.tejaspatel88@gmail.com" <er.tejaspatel88@gmail.com> wrote: > If I have to find upcoming birthdays in current week and the > current week fall into different months - how would you handle > that? If you don't need to cross from December into January, I find the easiest is: SELECT * FROM person WHERE (EXTRACT(MONTH FROM dob), EXTRACT(DAY FROM dob)) BETWEEN (6, 28) AND (7, 4); That is logicically the same as: SELECT * FROM person WHERE (EXTRACT(MONTH FROM dob) >= 6 AND (EXTRACT(MONTH FROM dob) > 6 OR (EXTRACT(DAY FROM dob) >= 28))) AND (EXTRACT(MONTH FROM dob) <= 7 AND (EXTRACT(MONTH FROM dob) < 7 OR (EXTRACT(DAY FROM dob) <= 4))); That's the generalized case; with the months adjacent, this simpler form is also equivalent: SELECT * FROM person WHERE (EXTRACT(MONTH FROM dob) = 6 AND EXTRACT(DAY FROM dob) >= 28) OR (EXTRACT(MONTH FROM dob) = 7 AND EXTRACT(DAY FROM dob) <= 4); The first query I showed is faster than either of the alternatives, especially if there is an index on dob. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company