Home > mailing lists

Re: Help me with this multi-table query - Mailing list pgsql-general

From	Nilesh Govindarajan
Subject	Re: Help me with this multi-table query
Date	March 24, 2010 08:58:36
Msg-id	4BA9FC90.7030805@itech7.com Whole thread Raw
In response to	Re: Help me with this multi-table query (Dean Rasheed <dean.a.rasheed@googlemail.com>)
List	pgsql-general

Tree view

On 03/24/2010 01:14 PM, Dean Rasheed wrote:
> On 24 March 2010 05:17, Nilesh Govindarajan<lists@itech7.com>  wrote:
>> On 03/24/2010 12:45 AM, Dean Rasheed wrote:
>>>
>>> On 23 March 2010 11:07, Nilesh Govindarajan<lists@itech7.com>    wrote:
>>>>
>>>> Hi,
>>>>
>>>> I want to find out the userid, nodecount and comment count of the userid.
>>>>
>>>> I'm going wrong somewhere.
>>>>
>>>> Check my SQL Code-
>>>>
>>>> select u.uid, count(n.nid) nc , count(c.cid) cc from users u left join
>>>> node
>>>> n on ( n.uid = u.uid ) left join comments c on ( c.uid = u.uid ) group by
>>>> u.uid having u.uid<>    0 order by u.uid;
>>>>
>>>
>>> I think you want select u.uid, count(distinct n.nid) nc ,
>>> count(distinct c.cid) cc from ...
>>> otherwise you're counting each node/comment multiple times as the rows in
>>> the
>>> join multiply up (note 85 x 174 = 14790).
>>>
>>> For big tables, this could start to become inefficient, and you might
>>> be better off
>>> doing your queries 1 and 2 above as sub-queries and joining them in an
>>> outer query.
>>>
>>> Regards,
>>> Dean
>>
>> Thanks a lot !! It worked.
>> How to do it using subqueries ?
>>
>
> Well the problem with the original joined query is that when it is
> executed there will be an intermediate step where it has to consider
> many thousands of rows (one for each combination of a node and comment
> for each user). You can see the number of rows processed from your
> original query by adding up the counts (about 17000). This problem
> would be compounded if you added more table joins and counts to the
> query.
>
> One way to re-write it using sub-queries would be something like
>
> select v1.uid, v1.nc, v2.cc from
> (select u.uid, count(n.nid) nc from users u left join node n on (
> n.uid = u.uid ) group by u.uid) as v1,
> (select u.uid, count(c.nid) cc from users u left join comments c on (
> c.uid = u.uid ) group by u.uid) as v2
> where v1.uid=v2.uid
> order by u.uid
>
> This is the equivalent of defining a couple of views for the counts
> and then selecting from those views.
>
> Another possibility would be something like
>
> select
>    u.uid,
>    (select count(n.nid) from node n where n.uid = u.uid) as nc,
>    (select count(c.nid) from comments c where c.uid = u.uid) as cc
> from users u order by u.uid
>
> There are probably other ways too. Which is best probably depends on
> the size and distribution of your data, and any indexes you have. You
> might benefit from indexes on the uid columns of node and comments, if
> you don't already have them. Try timing them in psql with \timing, and
> use EXPLAIN ANALYSE to see how each is executed.
>
> Regards,
> Dean

The second method is the best. It takes 3.311 ms to execute. The first
method suggested by you takes 5.7 ms, and the worst is my method which
takes 60ms (boo).

Thanks a lot :) :) :)

--
Nilesh Govindarajan
Site & Server Administrator
www.itech7.com

pgsql-general by date:

From: Frank jansen
Date: 24 March 2010, 07:22:24
Subject: Concatenate

From: yue peng
Date: 24 March 2010, 09:25:08
Subject: Out of Memory during Insert

Re: Help me with this multi-table query - Mailing list pgsql-general

Previous

Next