Re: group by week (ww), plus -S performance - Mailing list pgsql-general

From Michael Blakeley
Subject Re: group by week (ww), plus -S performance
Date
Msg-id p04320400b557aa32162c@blakeley.com
Whole thread Raw
In response to Re: group by week (ww), plus -S performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: group by week (ww), plus -S performance
Re: group by week (ww), plus -S performance
List pgsql-general
At 12:54 AM -0400 5/29/2000, Tom Lane wrote:
>Michael Blakeley <mike@blakeley.com> writes:
>>  SELECT to_date(date_part('year',stamp),'yyyy')-5+7*date_part('week',stamp)
>>    as week,count(*) FROM EVENTS GROUP BY week ORDER BY week;
>
>Seems like the hard way.  You are relying on an implicit conversion from
>the float8 output of date_part() to the text input to_date expects,
>which is kind of slow (and IMHO it's a bug in 7.0 that it will do such
>a conversion silently, anyhow).  Better to use date_trunc to save the
>conversion step:
>
>SELECT date_trunc('year',stamp)-5+7*date_part('week',stamp)
>    as week,count(*) FROM EVENTS GROUP BY week ORDER BY week;

Thanks - that seems to be about 10% faster. I also got a 25% boost by
recompiling PG7 with -O3.

>But the real problem is that this is going to use a start-of-week day
>that is offset five days from whatever day-of-the-week 1 January is.
>If you'd tried a few other years than 2000 you'd likely have been
>dissatisfied with the results...

Here's something that works a bit better, at the expense of falling
in line with 'dow' and starting each week on Sunday, which I don't
mind. I'd only gone with Monday in the first place because that was
what 'ww' seemed to do.

$ cal 12 1999
    December 1999
  S  M Tu  W Th  F  S
           1  2  3  4
  5  6  7  8  9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31

$ cal 1 2000
    January 2000
  S  M Tu  W Th  F  S
                    1
  2  3  4  5  6  7  8
  9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

$ cal 1 2001
    January 2001
  S  M Tu  W Th  F  S
     1  2  3  4  5  6
  7  8  9 10 11 12 13
14 15 16 17 18 19 20
21 22 23 24 25 26 27
28 29 30 31

$ D="'2000-01-01'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
   ?column?
------------
  1999-12-26
(1 row)

$ D="'2001-01-01'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
   ?column?
------------
  2000-12-31
(1 row)

$ D="'2000-01-15'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
   ?column?
------------
  2000-01-09
(1 row)

$ D="'2000-01-16'"; psql -c "SELECT $D::date-date_part('dow',$D::date);"
   ?column?
------------
  2000-01-16
(1 row)

It's also about 35% faster, and hits me with a healthy "duh!" factor
since it's so obvious and so much cleaner. The total time for my
original query has been cut by more than half - from 70 sec to 28
sec. Thanks!

>It seems like the right answer is that date_trunc('week',stamp) ought
>to do what you want, but it doesn't seem to be implemented.  That's
>definitely a shortcoming --- anyone want to fix it?

You know what they say - if you want the right answer, just post the
wrong answer and wait to be corrected :-).

If you can't find fault with the query above, maybe it could be
plugged into date_trunc('week', stamp) as-is? I haven't even looked
at the source code, myself....

-- Mike

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: group by week (ww), plus -S performance
Next
From: Erich
Date:
Subject: Re: group by week (ww), plus -S performance