On Mon, May 05, 2003 at 14:03:18 -0600, Alex Rice <alrice@ARCplanning.com> wrote:
>
> On Monday, May 5, 2003, at 01:45 PM, Bruno Wolff III wrote:
>
> >>SELECT url.rec_id, url, title, content_type, txt,
> >>sum(
> >> case
> >> when dict.word = 'wordx' then 1
> >> when dict.word = 'wordx' then 1
> >> when dict.word = 'wordy' then 1
> >> else 0
> >> end
> >>) as rank
> >>FROM dict, url
> >>WHERE url.rec_id = dict.url_id
> >>GROUP BY rec_id, url, title, content_type, txt
> >>ORDER BY rank DESC
> >
> >Is there some reason you can't put the word list in the where clause?
> >(You can use coalesce to set NULL sums to 0.)
>
> Can you give an example of what you mean use coalesce and how to use it
> to produce a count of the matched words, the rank?
You don't really need coalesce. I mentioned that because if you take
the sum of zero rows (the case where no words match for one of the returned
rows) then you get NULL instead of 0 (which may or may not be a problem
for you).
> I am converting some existing query examples that use the form
> SUM( dict.word IN ( word list...))
> which doesn't work with pgsql.
You want to do this outside the sum, not inside (and change sum to
count). For example:
select count(dict.word) from dict where word in (word list ...);
(If word list is a select rather than a list of constants, you currently
want to use exists instead for performance reasons. In 7.4 that won't
be a problem.)
By putting things in the where close you don't have to examine all of
the rows for each word - url pair. This may cut out a lot of rows that
need to be examined if there is an index on dict.word and the number of
words in a url is large relative to the number being searched on.