Thread: JOIN and aggregate problem

JOIN and aggregate problem

From

Tarlika Elisabeth Schmitz

Date:

20 February 2009, 14:46:47

I have 2 tables T1 and T2

T1 has the columns: D, S, C. The combination of D,S,C is unique.
T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
not unique. 

I need to produce the following result for every occurrence of T1:
D,S,C, COUNT

COUNT is the number of matching D,S,C combinations in T2 where X = true.
There might be no matching pair in T2 or there might be match but X
is false.

How can I express this?



--


Best Regards,

Tarlika Elisabeth Schmitz

Re: JOIN and aggregate problem

From

Bob Henkel

Date:

20 February 2009, 15:03:32

I might be missing something but does this solve your issue?

CREATE TABLE t1(d INT,s INT, c INT);

CREATE UNIQUE INDEX idx01_t1
ON t1 USING btree (d, s, c);

INSERT INTO t1 (d, s, c)
VALUES (1,1,1),(2,2,2),(3,3,3),(4,4,4);

CREATE TABLE t2(d INT,s INT, c INT, x boolean);

INSERT INTO t2(d, s, c, x)
VALUES (1,1,1,TRUE),(1,1,1,FALSE),(2,2,2,FALSE),(4,4,4,FALSE),(4,4,4,FALSE);

SELECT t1.d, t1.s, t1.c, t2.x, COUNT(*)
FROM t1
INNER JOIN t2       ON t1.d = t2.d      AND t1.s = t2.s      AND t1.c = t2.c      WHERE t2.x = 'FALSE'
GROUP BY t1.d, t1.s, t1.c, t2.x;

On Fri, Feb 20, 2009 at 12:04 PM, Tarlika Elisabeth Schmitz
<postgresql@numerixtechnology.de> wrote:
> I have 2 tables T1 and T2
>
> T1 has the columns: D, S, C. The combination of D,S,C is unique.
> T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
> not unique.
>
> I need to produce the following result for every occurrence of T1:
> D,S,C, COUNT
>
> COUNT is the number of matching D,S,C combinations in T2 where X = true.
> There might be no matching pair in T2 or there might be match but X
> is false.
>
> How can I express this?
>
>
>
> --
>
>
> Best Regards,
>
> Tarlika Elisabeth Schmitz
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>

Re: JOIN and aggregate problem

From

Richard Huxton

Date:

20 February 2009, 15:06:57

Tarlika Elisabeth Schmitz wrote:
> I have 2 tables T1 and T2
> 
> T1 has the columns: D, S, C. The combination of D,S,C is unique.
> T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
> not unique. 
> 
> I need to produce the following result for every occurrence of T1:
> D,S,C, COUNT
> 
> COUNT is the number of matching D,S,C combinations in T2 where X = true.
> There might be no matching pair in T2 or there might be match but X
> is false.

try something like:

SELECT t1.d, t1.s, t1.c, count(*)
FROM t1
LEFT JOIN (SELECT d,s,c FROM t2 WHERE x
) AS t2_true USING (d,s,c)
GROUP BY t1.d, t1.s, t1.c;

Warning - not tested

--  Richard Huxton Archonet Ltd

Re: JOIN and aggregate problem

From

Bob Henkel

Date:

20 February 2009, 15:15:33

Scratch this one won't work for you.

On Fri, Feb 20, 2009 at 1:03 PM, Bob Henkel <bob.henkel@gmail.com> wrote:
> I might be missing something but does this solve your issue?
>
> CREATE TABLE t1(d INT,s INT, c INT);
>
> CREATE UNIQUE INDEX idx01_t1
> ON t1 USING btree (d, s, c);
>
> INSERT INTO t1 (d, s, c)
> VALUES (1,1,1),(2,2,2),(3,3,3),(4,4,4);
>
> CREATE TABLE t2(d INT,s INT, c INT, x boolean);
>
> INSERT INTO t2(d, s, c, x)
> VALUES (1,1,1,TRUE),(1,1,1,FALSE),(2,2,2,FALSE),(4,4,4,FALSE),(4,4,4,FALSE);
>
> SELECT t1.d, t1.s, t1.c, t2.x, COUNT(*)
> FROM t1
> INNER JOIN t2
>        ON t1.d = t2.d
>       AND t1.s = t2.s
>       AND t1.c = t2.c
>       WHERE t2.x = 'FALSE'
> GROUP BY t1.d, t1.s, t1.c, t2.x;
>
> On Fri, Feb 20, 2009 at 12:04 PM, Tarlika Elisabeth Schmitz
> <postgresql@numerixtechnology.de> wrote:
>> I have 2 tables T1 and T2
>>
>> T1 has the columns: D, S, C. The combination of D,S,C is unique.
>> T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
>> not unique.
>>
>> I need to produce the following result for every occurrence of T1:
>> D,S,C, COUNT
>>
>> COUNT is the number of matching D,S,C combinations in T2 where X = true.
>> There might be no matching pair in T2 or there might be match but X
>> is false.
>>
>> How can I express this?
>>
>>
>>
>> --
>>
>>
>> Best Regards,
>>
>> Tarlika Elisabeth Schmitz
>>
>> --
>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql
>>
>

Re: JOIN and aggregate problem

From

Stephan Szabo

Date:

20 February 2009, 15:15:37

On Fri, 20 Feb 2009, Tarlika Elisabeth Schmitz wrote:

> I have 2 tables T1 and T2
>
> T1 has the columns: D, S, C. The combination of D,S,C is unique.
> T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
> not unique.
>
> I need to produce the following result for every occurrence of T1:
> D,S,C, COUNT
>
> COUNT is the number of matching D,S,C combinations in T2 where X = true.
> There might be no matching pair in T2 or there might be match but X
> is false.
>
> How can I express this?

Maybe something like one of these barely tested queries?

select d, s, c, sum(case when t2.x then 1 else 0 end)
from t1 left outer join t2 using(d,s,c)
group by d, s, c;

or

select d,s,c,(select count(*) from t2 where t2.d=t1.d and t2.s=t1.s and t2.c=t1.c and t2.x)
from t1;

Re: JOIN and aggregate problem

From

Bob Henkel

Date:

20 February 2009, 15:23:53

How about this?

CREATE TABLE t1(d INT,s INT, c INT);

CREATE UNIQUE INDEX idx01_t1
ON t1 USING btree (d, s, c);

INSERT INTO t1 (d, s, c)
VALUES (1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5);

CREATE TABLE t2(d INT,s INT, c INT, x boolean);

INSERT INTO t2(d, s, c, x)
VALUES (1,1,1,TRUE),(1,1,1,FALSE),(2,2,2,FALSE),(4,4,4,FALSE),(4,4,4,FALSE),(5,5,5,TRUE),(5,5,5,TRUE);

SELECT t1.d, t1.s, t1.c, CASE WHEN t2.x IS NULL THEN 0 ELSE COUNT(*) END
FROM t1
LEFT OUTER JOIN t2       ON t1.d = t2.d      AND t1.s = t2.s      AND t1.c = t2.c      AND t2.x = TRUE
GROUP BY t1.d, t1.s, t1.c,t2.x;

--DROP TABLE t1;
--DROP TABLE t2;

On Fri, Feb 20, 2009 at 12:04 PM, Tarlika Elisabeth Schmitz
<postgresql@numerixtechnology.de> wrote:
> I have 2 tables T1 and T2
>
> T1 has the columns: D, S, C. The combination of D,S,C is unique.
> T2 has the columns: D, S, C, and boolean X. The combination of D,S,C is
> not unique.
>
> I need to produce the following result for every occurrence of T1:
> D,S,C, COUNT
>
> COUNT is the number of matching D,S,C combinations in T2 where X = true.
> There might be no matching pair in T2 or there might be match but X
> is false.
>
> How can I express this?
>
>
>
> --
>
>
> Best Regards,
>
> Tarlika Elisabeth Schmitz
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>

Re: JOIN and aggregate problem

From

Tarlika Elisabeth Schmitz

Date:

23 February 2009, 11:15:28

On Fri, 20 Feb 2009 11:15:09 -0800 (PST)
Stephan Szabo <sszabo@megazone.bigpanda.com> wrote:

> On Fri, 20 Feb 2009, Tarlika Elisabeth Schmitz wrote:
>
> > I have 2 tables T1 and T2
> >
> > T1 has the columns: D, S, C. The combination of D,S,C is unique.
> > T2 has the columns: D, S, C, and boolean X. The combination of
> > D,S,C is not unique.
> >
> > I need to produce the following result for every occurrence of T1:
> > D,S,C, COUNT
> >
> > COUNT is the number of matching D,S,C combinations in T2 where X =
> > true. There might be no matching pair in T2 or there might be match
> > but X is false.
> >
>
> Maybe something like one of these barely tested queries?
>
> select d, s, c, sum(case when t2.x then 1 else 0 end)
> from t1 left outer join t2 using(d,s,c)
> group by d, s, c;

this works

> or
>
> select d,s,c,
>  (select count(*)
>   from t2
>   where t2.d=t1.d and t2.s=t1.s and t2.c=t1.c and t2.x)
> from t1;

this works too


From a performance point of view, is one preferable to the other?

Many thanks for your help!

--
Tarlika Elisabeth Schmitz

Re: JOIN and aggregate problem

From

Tarlika Elisabeth Schmitz

Date:

23 February 2009, 11:15:38

On Fri, 20 Feb 2009 19:06:48 +0000
Richard Huxton <dev@archonet.com> wrote:

> Tarlika Elisabeth Schmitz wrote:
> > I have 2 tables T1 and T2
> > 
> > T1 has the columns: D, S, C. The combination of D,S,C is unique.
> > T2 has the columns: D, S, C, and boolean X. The combination of
> > D,S,C is not unique. 
> > 
> > I need to produce the following result for every occurrence of T1:
> > D,S,C, COUNT
> > 
> > COUNT is the number of matching D,S,C combinations in T2 where X =
> > true. There might be no matching pair in T2 or there might be match
> > but X is false.
> 
> try something like:
> 
> SELECT t1.d, t1.s, t1.c, count(*)
> FROM t1
> LEFT JOIN (
>     SELECT d,s,c FROM t2 WHERE x
> ) AS t2_true USING (d,s,c)
> GROUP BY t1.d, t1.s, t1.c;
> 
> Warning - not tested

Many thanks for the quick reply.


This suggestion does not work as it returns a count of 1 even when
there are no rows in t2 that match (d,s,c) in T1.



--


Best Regards,

Tarlika Elisabeth Schmitz


A: Because it breaks the logical sequence of discussion
Q: Why is top posting bad?

Re: JOIN and aggregate problem

From

Tarlika Elisabeth Schmitz

Date:

23 February 2009, 11:15:41

On Fri, 20 Feb 2009 13:23:47 -0600
Bob Henkel <bob.henkel@gmail.com> wrote:

> CREATE UNIQUE INDEX idx01_t1
> ON t1 USING btree (d, s, c);
> 
> [...]
> 
> SELECT t1.d, t1.s, t1.c, CASE WHEN t2.x IS NULL THEN 0 ELSE COUNT(*)
> END FROM t1
> LEFT OUTER JOIN t2
>         ON t1.d = t2.d
>        AND t1.s = t2.s
>        AND t1.c = t2.c
>        AND t2.x = TRUE
> GROUP BY t1.d, t1.s, t1.c,t2.x;
> 
> [...]
> 
> On Fri, Feb 20, 2009 at 12:04 PM, Tarlika Elisabeth Schmitz
> <postgresql@numerixtechnology.de> wrote:
> > I have 2 tables T1 and T2
> >
> > T1 has the columns: D, S, C. The combination of D,S,C is unique.
> > T2 has the columns: D, S, C, and boolean X. The combination of
> > D,S,C is not unique.
> >
> > I need to produce the following result for every occurrence of T1:
> > D,S,C, COUNT
> >
> > COUNT is the number of matching D,S,C combinations in T2 where X =
> > true. There might be no matching pair in T2 or there might be match
> > but X is false.

Thank you very much for taking the time to help.

This is what I had tried myself but it does not cover the cases where
1) (1,1,1) exists in T1 but not in T2
1) (1,1,1) exists in T1 and T2 but X = false


As an aside: I see you use UNIQUE INDEX. I had created T1 with PRIMARY
KEY (D,S,C) assuming that that would create a unique index.


--


Best Regards,

Tarlika Elisabeth Schmitz


A: Because it breaks the logical sequence of discussion
Q: Why is top posting bad?

Re: JOIN and aggregate problem

From

Richard Huxton

Date:

23 February 2009, 11:44:22

Tarlika Elisabeth Schmitz wrote:
> On Fri, 20 Feb 2009 19:06:48 +0000
> Richard Huxton <dev@archonet.com> wrote:
>> try something like:
>>
>> SELECT t1.d, t1.s, t1.c, count(*)
>> FROM t1
>> LEFT JOIN (
>>     SELECT d,s,c FROM t2 WHERE x
>> ) AS t2_true USING (d,s,c)
>> GROUP BY t1.d, t1.s, t1.c;
>>
>> Warning - not tested
> 
> Many thanks for the quick reply.
> 
> 
> This suggestion does not work as it returns a count of 1 even when
> there are no rows in t2 that match (d,s,c) in T1.

Ah, then rather than count(*) you'll want count(t2_true.d) so when you
get a null because of no match it's not counted. You can use any column
from t2_true.

--  Richard Huxton Archonet Ltd

Re: JOIN and aggregate problem

From

Tarlika Elisabeth Schmitz

Date:

23 February 2009, 11:57:19

On Mon, 23 Feb 2009 15:44:05 +0000
Richard Huxton <dev@archonet.com> wrote:

> Tarlika Elisabeth Schmitz wrote:
> > On Fri, 20 Feb 2009 19:06:48 +0000
> > Richard Huxton <dev@archonet.com> wrote:
> >> try something like:
> >>
> >> SELECT t1.d, t1.s, t1.c, count(*)
> >> FROM t1
> >> LEFT JOIN (
> >>     SELECT d,s,c FROM t2 WHERE x
> >> ) AS t2_true USING (d,s,c)
> >> GROUP BY t1.d, t1.s, t1.c;
> >>
> >> Warning - not tested
> > 
> > Many thanks for the quick reply.
> > 
> > 
> > This suggestion does not work as it returns a count of 1 even when
> > there are no rows in t2 that match (d,s,c) in T1.
> 
> Ah, then rather than count(*) you'll want count(t2_true.d) so when you
> get a null because of no match it's not counted. You can use any
> column from t2_true.

Indeed, that works. Now I am spoilt for choice!

--


Best Regards,

Tarlika Elisabeth Schmitz


A: Because it breaks the logical sequence of discussion
Q: Why is top posting bad?