Thread: multi-table join, final table is outer join count ...

multi-table join, final table is outer join count ...

From
The Hermit Hacker
Date:
Okay, not sure best way to try and describe this ... have multiple tables,
of a form like:

table agid intdata text

table bgid intdata text

table cgid intdata text

table dgid intdata text

I want to return:

a.gid,a.data,b.data,c.data,count(d.data)

where

a.gid = b.gid = c.gid = d.gid

*but* I want count(d.data) to return zero *if* there are no records in
table d ...

essentially, gid has to exist in tables a,b,c but not d ...

So, ignoring table d, i'd have:

SELECT a.gid,a.data,b.data,c.data FROM tablea a, tableb b, tablec cWHERE a.gid = b.gid  AND b.gid = c.gid;

How do I add 'tabled d' to the mix?

Thanks ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: multi-table join, final table is outer join count ...

From
The Hermit Hacker
Date:
Got it after a bit of fiddling ... actually, not bad code ...
 SELECT distinct s.gid, s.created, count(i.title) AS images   FROM status s LEFT JOIN images i ON (s.gid = i.gid AND
i.active),       personal_data pd, relationship_wanted rw  WHERE s.active AND s.status != 0    AND (s.gid = pd.gid AND
pd.gender= 0)    AND (s.gid = rw.gid AND rw.gender = 0 )
 
GROUP BY s.gid,s.created
ORDER BY  images desc;

The part that had confused me was the whole 'ON' part ... once I clued in
that that is essentially a WHERE, it actually made sense ...


On Sat, 12 May 2001, The Hermit Hacker wrote:

>
> Okay, not sure best way to try and describe this ... have multiple tables,
> of a form like:
>
> table a
>     gid int
>     data text
>
> table b
>     gid int
>     data text
>
> table c
>     gid int
>     data text
>
> table d
>     gid int
>     data text
>
> I want to return:
>
> a.gid,a.data,b.data,c.data,count(d.data)
>
> where
>
> a.gid = b.gid = c.gid = d.gid
>
> *but* I want count(d.data) to return zero *if* there are no records in
> table d ...
>
> essentially, gid has to exist in tables a,b,c but not d ...
>
> So, ignoring table d, i'd have:
>
> SELECT a.gid,a.data,b.data,c.data
>   FROM tablea a, tableb b, tablec c
>  WHERE a.gid = b.gid
>    AND b.gid = c.gid;
>
> How do I add 'tabled d' to the mix?
>
> Thanks ...
>
> Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
>
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: Re: multi-table join, final table is outer join count ...

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
>   SELECT distinct s.gid, s.created, count(i.title) AS images
>     FROM status s LEFT JOIN images i ON (s.gid = i.gid AND i.active),
>          personal_data pd, relationship_wanted rw
>    WHERE s.active AND s.status != 0
>      AND (s.gid = pd.gid AND pd.gender = 0)
>      AND (s.gid = rw.gid AND rw.gender = 0 )
> GROUP BY s.gid,s.created
> ORDER BY  images desc;

> The part that had confused me was the whole 'ON' part ... once I clued in
> that that is essentially a WHERE, it actually made sense ...

Right, but there's some fine points here.

When you're dealing with INNER JOINs, ON (or its variant USING) is
exactly equivalent to WHERE.  Write whichever you like.

When you're dealing with OUTER JOINs, ON is *not* quite the same as
WHERE, because it determines which rows are considered to "match"
and thus which rows will be extended with NULLs.  Let's take a
simplified version of your above example.  If you wrote
FROM status s LEFT JOIN images i ON (s.gid = i.gid)WHERE i.active AND ...other conditions...

then this would produce the regular inner join of status and images
where gid matches, *plus* a row for each unmatched status row (extended
with NULLs for the images columns).  This collection of rows would then
pass through your WHERE clauses, and whichever ones pass all the WHERE
conditions get into the result.  But, when you write
FROM status s LEFT JOIN images i ON (s.gid = i.gid AND i.active)WHERE ...other conditions...

then you get the inner join of status and images on gid, minus the rows
where i.active is false, plus a null-extended row for each status row
that does not have a matching *active* image row.  So the set of rows
that comes out of the join is different: there could be more
null-extended rows in this case than in the other one.  In particular,
you could see rows having i.active=NULL in the final result, which'd
never happen if you had put i.active into the WHERE clause instead of
the ON clause.

Bottom line: what you put in the ON part should just be the clauses that
determine whether you think there's a match between the two tables.
The WHERE part is additional restrictions that limit what you want to
see, but don't affect the semantics of whether there's a match.

In your above example, I'm not sure whether it's right to put i.active
in the ON part or in WHERE.  It depends on what you want to happen for
status rows that match only inactive images, and whether you consider
them different from status rows that match no images at all.
        regards, tom lane