Re: 2 simple SQL questions: optimizing aggegate query - Mailing list pgsql-sql

From Alex Rice
Subject Re: 2 simple SQL questions: optimizing aggegate query
Date
Msg-id DC3C990C-7FF8-11D7-96E3-000393529642@ARCplanning.com
Whole thread Raw
In response to Re: 2 simple SQL questions: optimizing aggegate query  (Bruno Wolff III <bruno@wolff.to>)
List pgsql-sql
On Monday, May 5, 2003, at 07:28  PM, Bruno Wolff III wrote:

>> I am converting some existing query examples that use the form
>> SUM( dict.word IN ( word list...))
>> which doesn't work with pgsql.
>
> You want to do this outside the sum, not inside (and change sum to
> count). For example:
> select count(dict.word) from dict where word in (word list ...);
> (If word list is a select rather than a list of constants, you 
> currently
> want to use exists instead for performance reasons. In 7.4 that won't
> be a problem.)
>
> By putting things in the where close you don't have to examine all of
> the rows for each word - url pair. This may cut out a lot of rows that
> need to be examined if there is an index on dict.word and the number of
> words in a url is large relative to the number being searched on.
>

Bruno: Thanks! Using your advice, my queries are now < 1 sec! Using 
this form

SELECT url_id, url.url, url.txt, url.content_type,url.docsize, count(dict.word) AS rank
FROM url,dict WHERE word IN ('word1', 'word2', ...)
AND url.rec_id = dict.url_id
GROUP BY url_id, url, txt, content_type, docsize
ORDER BY rank DESC;

I have also upgraded to pgsql 7.3.2, and the query I posted yesterday 
"sum( case ... ) " is a lot faster as well (about 2 seconds now). So I 
think maybe there was also something wrong with my indices yesterday- 
in addition to there being something wrong with my SQL. :-)

Thanks for the SQL advice.

Alex Rice, Software Developer
Architectural Research Consultants, Inc.
alrice@ARCplanning.com
alrice@swcp.com



pgsql-sql by date:

Previous
From: Wei Weng
Date:
Subject: Re: pgsql Replication Proxy (was Re: Replication for a
Next
From: Michael A Nachbaur
Date:
Subject: Re: pgsql Replication Proxy (was Re: Replication for a