Thread: generate_series with left join
Greetings. I'm having some difficulties with my first use of the generate_series function. Situation:cause | integer date | timestamp(2) without time zone cause | date ------+------------+-----------+ 1 | 2006-03-23 15:07:53.63 | 2 | 2006-02-02 12:13:23.11 | 2 | 2006-11-12 16:43:11.45 | 1 | 2005-03-13 18:34:44.13 | 3 | 2006-01-23 11:24:41.31 | (etc) What i need to do, is to count the 'cause' column for the values '1' and '2', and group them by year, using left joins in order to also have the serialized years with empty values in the output. My needed output for a series of (2005,2007) would be:year | one | two ------+------+------2005 | 1 | 02006 | 1 | 22007 | 0 | 0 I have tried something like #select s, (select count(cause) from mytable where cause=1 ) as one, COUNT (cause) as two from generate_series(2006,2009) AS s(d) left JOIN mytable o ON (substr(o.date,1,4) = s.d and cause=2) GROUP BY s.d ORDER BY 1; which obviously is wrong, because of the results: s | one | two ------+------+------2006 | 3769 | 16582007 | 3769 | 02008 | 3769 | 02009 | 3769 | 0 As far as the 'two', the left join was successful, however i can not find a way to join the 'one'. The output value is correct, but the result shown should be only for the year 2006, not for all the values of the series. Maybe i've looked at it TOO much or maybe i'm completely failing to find a working logic. Any suggestions? Any and all help is humbly appreciated. \\pb -- This message has been scanned for viruses and dangerous content at MsgLab.com and is believed to be clean.
How about one of these two:
select
year_list.year,
count(one.*),
count(two.*)
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as one on (
date_part('year', one.date) = year_list.years
and one.cause = 1
)
left outer join mytable as two on (
date_part('year', two.date) = year_list.years
and two.cause = 2
)
group by
year_list.year
;
select
year_list.year,
mytable.cause,
count(mytable.*)
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable on (
date_part('year', mytable.date) = year_list.years
)
group by
year_list.year,
mytable.cause
;
I think one of the problems many people have is the writing of their SQL in paragraph form. It makes the SQL really hard to read and even harder to understand and debug. Formatting your SQL like I did above may make it easier to see what is wrong.
-Aaron Bono
select
year_list.year,
count(one.*),
count(two.*)
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as one on (
date_part('year', one.date) = year_list.years
and one.cause = 1
)
left outer join mytable as two on (
date_part('year', two.date) = year_list.years
and two.cause = 2
)
group by
year_list.year
;
select
year_list.year,
mytable.cause,
count(mytable.*)
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable on (
date_part('year', mytable.date) = year_list.years
)
group by
year_list.year,
mytable.cause
;
I think one of the problems many people have is the writing of their SQL in paragraph form. It makes the SQL really hard to read and even harder to understand and debug. Formatting your SQL like I did above may make it easier to see what is wrong.
-Aaron Bono
On 6/28/06, Pedro B. <pedro.borracha@msglab.com> wrote:
Greetings.
I'm having some difficulties with my first use of the generate_series
function.
Situation:
cause | integer
date | timestamp(2) without time zone
cause | date
------+------------+-----------+
1 | 2006-03-23 15:07:53.63 |
2 | 2006-02-02 12:13:23.11 |
2 | 2006-11-12 16:43:11.45 |
1 | 2005-03-13 18:34:44.13 |
3 | 2006-01-23 11:24:41.31 |
(etc)
What i need to do, is to count the 'cause' column for the values '1' and
'2', and group them by year, using left joins in order to also have the
serialized years with empty values in the output.
My needed output for a series of (2005,2007) would be:
year | one | two
------+------+------
2005 | 1 | 0
2006 | 1 | 2
2007 | 0 | 0
I have tried something like
#select s, (select count(cause) from mytable where cause=1 ) as one,
COUNT (cause) as two from generate_series(2006,2009) AS s(d) left JOIN
mytable o ON (substr(o.date,1,4) = s.d and cause=2) GROUP BY s.d ORDER
BY 1;
which obviously is wrong, because of the results:
s | one | two
------+------+------
2006 | 3769 | 1658
2007 | 3769 | 0
2008 | 3769 | 0
2009 | 3769 | 0
As far as the 'two', the left join was successful, however i can not
find a way to join the 'one'. The output value is correct, but the
result shown should be only for the year 2006, not for all the values of
the series.
Maybe i've looked at it TOO much or maybe i'm completely failing to find
a working logic.
Any suggestions?
Any and all help is humbly appreciated.
\\pb
On Wed, 2006-06-28 at 13:34 -0500, Aaron Bono wrote: > select > year_list.year, > count(one.*), > count(two.*) > from ( > select years > from generate_series(2006,2009) as years > ) year_list > left outer join mytable as one on ( > date_part('year', one.date) = year_list.years > and one.cause = 1 > ) > left outer join mytable as two on ( > date_part('year', two.date) = year_list.years > and two.cause = 2 > ) > group by > year_list.year > ; > > > select > year_list.year, > mytable.cause, > count(mytable.*) > from ( > select years > from generate_series(2006,2009) as years > ) year_list > left outer join mytable on ( > date_part('year', mytable.date) = year_list.years > ) > group by > year_list.year, > mytable.cause > ; > Aaron, Thank you so much for your reply. However, the 2 examples you provided have "weird" outputs: The first:years | count | count -------+---------+--------- 2009 | 0 | 0 2008 | 0 | 0 2007 | 0 | 0 2006 | 7802080 | 7802080 (4 rows) Time: 87110.753 ms << yay. The second: years | cause | count -------+---------+------- 2009 | | 0 2008 | | 0 2007 | | 0 2006 | 6 | 1 2006| 1 | 4030 2006 | 2 | 1936 2006 | 3 | 4078 2006 | 100 | 3159 2006 | 98 | 2659 2006 | 99 | 2549 My need is really to only group the counts of where cause=1 and cause=2 for each year, none of the others. > I think one of the problems many people have is the writing of their > SQL in paragraph form. It makes the SQL really hard to read and even > harder to understand and debug. Formatting your SQL like I did above > may make it easier to see what is wrong. Indeed. Note taken, i'll improve my formatting. \\pb -- This message has been scanned for viruses and dangerous content at MsgLab.com and is believed to be clean.
Sorry, I think I see the mistake - it is getting all the rows for 1 and all the rows for 2 and joining them. Try splitting up the query into two inner queries like so:
select
one_list.year,
one_list.one_count,
two_list.two_count
FROM
(
select
year_list.year,
count(one.*) as one_count
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as one on (
date_part('year', one.date) = year_list.years
and one.cause = 1
)
group by
year_list.year
) one_list,
(
select
year_list.year,
count(two.*) as two_count
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as two on (
date_part('year', two.date ) = year_list.years
and two.cause = 2
)
group by
year_list.year
) two_list
WHERE one_list.year = two_list.year
;
select
one_list.year,
one_list.one_count,
two_list.two_count
FROM
(
select
year_list.year,
count(one.*) as one_count
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as one on (
date_part('year', one.date) = year_list.years
and one.cause = 1
)
group by
year_list.year
) one_list,
(
select
year_list.year,
count(two.*) as two_count
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as two on (
date_part('year', two.date ) = year_list.years
and two.cause = 2
)
group by
year_list.year
) two_list
WHERE one_list.year = two_list.year
;
On 6/28/06, Pedro B. <pedro.borracha@msglab.com> wrote:
On Wed, 2006-06-28 at 13:34 -0500, Aaron Bono wrote:
> select
> year_list.year,
> count(one.*),
> count(two.*)
> from (
> select years
> from generate_series(2006,2009) as years
> ) year_list
> left outer join mytable as one on (
> date_part('year', one.date) = year_list.years
> and one.cause = 1
> )
> left outer join mytable as two on (
> date_part('year', two.date) = year_list.years
> and two.cause = 2
> )
> group by
> year_list.year
> ;
>
>
> select
> year_list.year,
> mytable.cause,
> count(mytable.*)
> from (
> select years
> from generate_series(2006,2009) as years
> ) year_list
> left outer join mytable on (
> date_part('year', mytable.date) = year_list.years
> )
> group by
> year_list.year,
> mytable.cause
> ;
>
Aaron,
Thank you so much for your reply.
However, the 2 examples you provided have "weird" outputs:
The first:
years | count | count
-------+---------+---------
2009 | 0 | 0
2008 | 0 | 0
2007 | 0 | 0
2006 | 7802080 | 7802080
(4 rows)
Time: 87110.753 ms << yay.
The second:
years | cause | count
-------+---------+-------
2009 | | 0
2008 | | 0
2007 | | 0
2006 | 6 | 1
2006 | 1 | 4030
2006 | 2 | 1936
2006 | 3 | 4078
2006 | 100 | 3159
2006 | 98 | 2659
2006 | 99 | 2549
My need is really to only group the counts of where cause=1 and cause=2
for each year, none of the others.
> I think one of the problems many people have is the writing of their
> SQL in paragraph form. It makes the SQL really hard to read and even
> harder to understand and debug. Formatting your SQL like I did above
> may make it easier to see what is wrong.
Indeed. Note taken, i'll improve my formatting.
\\pb
This should work too:
select
year_list.year,
one_list.one_count,
two_list.two_count
FROM (
select years
from generate_series(2006,2009) as years
) year_list
left outer join (
select
date_part('year', one.date) as one_year,
count(one.*) as one_count
from mytable as one
where one.cause = 1
group by
date_part('year', one.date)
) one_list on (year_list.years = one_year)
left outer join (
select
date_part('year', two.date) as two_year,
count(two.*) as two_count
from mytable as two
where two.cause = 2
group by
date_part('year', two.date)
) two_list on (year_list.years = two_year)
;
select
year_list.year,
one_list.one_count,
two_list.two_count
FROM (
select years
from generate_series(2006,2009) as years
) year_list
left outer join (
select
date_part('year', one.date) as one_year,
count(one.*) as one_count
from mytable as one
where one.cause = 1
group by
date_part('year', one.date)
) one_list on (year_list.years = one_year)
left outer join (
select
date_part('year', two.date) as two_year,
count(two.*) as two_count
from mytable as two
where two.cause = 2
group by
date_part('year', two.date)
) two_list on (year_list.years = two_year)
;
On 6/28/06, Aaron Bono <postgresql@aranya.com > wrote:
Sorry, I think I see the mistake - it is getting all the rows for 1 and all the rows for 2 and joining them. Try splitting up the query into two inner queries like so:
select
one_list.year,
one_list.one_count,
two_list.two_count
FROM
(
select
year_list.year,
count(one.*) as one_count
from (
select years
from generate_series(2006,2009) as years
) year_list
left outer join mytable as one on (
date_part('year', one.date) = year_list.years
and one.cause = 1
)group by
year_list.year) one_list,
(
select
year_list.year,
count(two.*) as two_count
from (
select years
from generate_series(2006,2009) as years
) year_listleft outer join mytable as two on (
date_part('year', two.date ) = year_list.years
and two.cause = 2
)
group by
year_list.year) two_list
WHERE one_list.year = two_list.year
;On 6/28/06, Pedro B. <pedro.borracha@msglab.com> wrote:On Wed, 2006-06-28 at 13:34 -0500, Aaron Bono wrote:
> select
> year_list.year,
> count(one.*),
> count(two.*)
> from (
> select years
> from generate_series(2006,2009) as years
> ) year_list
> left outer join mytable as one on (
> date_part('year', one.date) = year_list.years
> and one.cause = 1
> )
> left outer join mytable as two on (
> date_part('year', two.date) = year_list.years
> and two.cause = 2
> )
> group by
> year_list.year
> ;
>
>
> select
> year_list.year,
> mytable.cause,
> count(mytable.*)
> from (
> select years
> from generate_series(2006,2009) as years
> ) year_list
> left outer join mytable on (
> date_part('year', mytable.date) = year_list.years
> )
> group by
> year_list.year,
> mytable.cause
> ;
>
Aaron,
Thank you so much for your reply.
However, the 2 examples you provided have "weird" outputs:
The first:
years | count | count
-------+---------+---------
2009 | 0 | 0
2008 | 0 | 0
2007 | 0 | 0
2006 | 7802080 | 7802080
(4 rows)
Time: 87110.753 ms << yay.
The second:
years | cause | count
-------+---------+-------
2009 | | 0
2008 | | 0
2007 | | 0
2006 | 6 | 1
2006 | 1 | 4030
2006 | 2 | 1936
2006 | 3 | 4078
2006 | 100 | 3159
2006 | 98 | 2659
2006 | 99 | 2549
My need is really to only group the counts of where cause=1 and cause=2
for each year, none of the others.
> I think one of the problems many people have is the writing of their
> SQL in paragraph form. It makes the SQL really hard to read and even
> harder to understand and debug. Formatting your SQL like I did above
> may make it easier to see what is wrong.
Indeed. Note taken, i'll improve my formatting.
\\pb
On Wed, 2006-06-28 at 15:16 -0500, Aaron Bono wrote: > This should work too: > > select > year_list.year, > one_list.one_count, > two_list.two_count > FROM ( > select years > from generate_series(2006,2009) as years > ) year_list > left outer join ( > select > date_part('year', one.date) as one_year, > count(one.*) as one_count > from mytable as one > where one.cause = 1 > group by > date_part('year', one.date) > ) one_list on (year_list.years = one_year) > left outer join ( > select > date_part('year', two.date) as two_year, > count(two.*) as two_count > from mytable as two > where two.cause = 2 > group by > date_part('year', two.date) > ) two_list on (year_list.years = two_year) > ; Aaron, I confess i will take some time to digest the amazing code you just sent, but in the meantime, let me tell you right away that both work just as i needed. I will stop pulling my hairs now. Thank you so much. \\pb -- This message has been scanned for viruses and dangerous content at MsgLab.com and is believed to be clean.