Thread: Query help

Query help

From

Madison Kelly

Date:

08 March 2007, 02:22:24

Hi all,

   I've got a query that looks through a table I use for my little
search engine. It's something of a reverse-index but not quite, where a
proper reverse index would have 'word | doc1, doc3, doc4, doc7' showing
all the docs the keyword is in, mine has an entry for eac

   I've got a query like:

SELECT
    sch_id, sch_for_table, sch_ref_id, sch_instances
FROM
    search_index
WHERE
    (sch_keyword LIKE '%digi%' OR sch_keyword LIKE '%madi%')
AND
    sch_for_table!='client'
AND
    ... (more restrictions)
ORDER BY
    sch_instances DESC;

   This returns references to a data column (sch_ref_id) in a given
table (sch_for_table) for each matched keyword.

   The problem I am having is that two keywords might reference the same
table/column which would, in turn, give me two+ search results pointing
to the same entry.

   What I would like to do is, when two or more results match the same
'sch_ref_id' and 'sch_for_table' to merge the results. Specifically, the
'sch_instances' column is the number of times the given keyword is found
in the table/column. I'd like to add up the number in the duplicate
results (to give it a higher accuracy and move it up the search results).

   Is this possible or would I need to add this logic in my program? I'd
rather do it in PostgreSQL though, if I could.

   Here is the 'search_index' table I am using:

db=> \d search_index
                        Table "public.search_index"
     Column     |  Type   |                   Modifiers
---------------+---------+-----------------------------------------------
  sch_id        | integer | not null default nextval('sch_seq'::regclass)
  sch_keyword   | text    | not null
  sch_instances | integer | not null default 1
  sch_for_table | text    | not null
  sch_ref_id    | integer | not null
Indexes:
     "search_index_pkey" PRIMARY KEY, btree (sch_id)

   Thanks in advance to any help you might be able to give me!

Madison

Re: Query help

From

Shaun Johnston

Date:

08 March 2007, 02:45:55

Try

SELECT DISTINCT

rather than SELECT

That should return a result with unique records.

Madison Kelly wrote:
> Hi all,
>
>   I've got a query that looks through a table I use for my little
> search engine. It's something of a reverse-index but not quite, where
> a proper reverse index would have 'word | doc1, doc3, doc4, doc7'
> showing all the docs the keyword is in, mine has an entry for eac
>
>   I've got a query like:
>
> SELECT
>     sch_id, sch_for_table, sch_ref_id, sch_instances
> FROM
>     search_index
> WHERE
>     (sch_keyword LIKE '%digi%' OR sch_keyword LIKE '%madi%')
> AND
>     sch_for_table!='client'
> AND
>     ... (more restrictions)
> ORDER BY
>     sch_instances DESC;
>
>   This returns references to a data column (sch_ref_id) in a given
> table (sch_for_table) for each matched keyword.
>
>   The problem I am having is that two keywords might reference the
> same table/column which would, in turn, give me two+ search results
> pointing to the same entry.
>
>   What I would like to do is, when two or more results match the same
> 'sch_ref_id' and 'sch_for_table' to merge the results. Specifically,
> the 'sch_instances' column is the number of times the given keyword is
> found in the table/column. I'd like to add up the number in the
> duplicate results (to give it a higher accuracy and move it up the
> search results).
>
>   Is this possible or would I need to add this logic in my program?
> I'd rather do it in PostgreSQL though, if I could.
>
>   Here is the 'search_index' table I am using:
>
> db=> \d search_index
>                        Table "public.search_index"
>     Column     |  Type   |                   Modifiers
> ---------------+---------+-----------------------------------------------
>  sch_id        | integer | not null default nextval('sch_seq'::regclass)
>  sch_keyword   | text    | not null
>  sch_instances | integer | not null default 1
>  sch_for_table | text    | not null
>  sch_ref_id    | integer | not null
> Indexes:
>     "search_index_pkey" PRIMARY KEY, btree (sch_id)
>
>   Thanks in advance to any help you might be able to give me!
>
> Madison
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

Re: Query help

From

Richard Huxton

Date:

08 March 2007, 05:57:51

Madison Kelly wrote:
> Hi all,
>
>   I've got a query that looks through a table I use for my little search
> engine. It's something of a reverse-index but not quite, where a proper
> reverse index would have 'word | doc1, doc3, doc4, doc7' showing all the
> docs the keyword is in, mine has an entry for eac
>
>   I've got a query like:
>
> SELECT
>     sch_id, sch_for_table, sch_ref_id, sch_instances
> FROM
>     search_index
> WHERE
>     (sch_keyword LIKE '%digi%' OR sch_keyword LIKE '%madi%')
> AND
>     sch_for_table!='client'
> AND
>     ... (more restrictions)
> ORDER BY
>     sch_instances DESC;
>
>   This returns references to a data column (sch_ref_id) in a given table
> (sch_for_table) for each matched keyword.
>
>   The problem I am having is that two keywords might reference the same
> table/column which would, in turn, give me two+ search results pointing
> to the same entry.
>
>   What I would like to do is, when two or more results match the same
> 'sch_ref_id' and 'sch_for_table' to merge the results. Specifically, the
> 'sch_instances' column is the number of times the given keyword is found
> in the table/column. I'd like to add up the number in the duplicate
> results (to give it a higher accuracy and move it up the search results).

You'll want something like:

SELECT
     sch_id, sch_for_table, sch_ref_id,
     SUM(sch_instances) AS tot_instances
...
GROUP BY
     sch_id, sch_for_table, sch_ref_id
ORDER BY
     tot_instances DESC;

The key word to search the manuals on is "aggregates" (sum(), count() etc).

--
   Richard Huxton
   Archonet Ltd

Re: Query help

From

Madison Kelly

Date:

08 March 2007, 22:48:54

Richard Huxton wrote:
> Madison Kelly wrote:
>> Hi all,
>>
>>   I've got a query that looks through a table I use for my little
>> search engine. It's something of a reverse-index but not quite, where
>> a proper reverse index would have 'word | doc1, doc3, doc4, doc7'
>> showing all the docs the keyword is in, mine has an entry for eac
>>
>>   I've got a query like:
>>
>> SELECT
>>     sch_id, sch_for_table, sch_ref_id, sch_instances
>> FROM
>>     search_index
>> WHERE
>>     (sch_keyword LIKE '%digi%' OR sch_keyword LIKE '%madi%')
>> AND
>>     sch_for_table!='client'
>> AND
>>     ... (more restrictions)
>> ORDER BY
>>     sch_instances DESC;
>>
>>   This returns references to a data column (sch_ref_id) in a given
>> table (sch_for_table) for each matched keyword.
>>
>>   The problem I am having is that two keywords might reference the
>> same table/column which would, in turn, give me two+ search results
>> pointing to the same entry.
>>
>>   What I would like to do is, when two or more results match the same
>> 'sch_ref_id' and 'sch_for_table' to merge the results. Specifically,
>> the 'sch_instances' column is the number of times the given keyword is
>> found in the table/column. I'd like to add up the number in the
>> duplicate results (to give it a higher accuracy and move it up the
>> search results).
>
> You'll want something like:
>
> SELECT
>     sch_id, sch_for_table, sch_ref_id,
>     SUM(sch_instances) AS tot_instances
> ...
> GROUP BY
>     sch_id, sch_for_table, sch_ref_id
> ORDER BY
>     tot_instances DESC;
>
> The key word to search the manuals on is "aggregates" (sum(), count() etc).
>

This is *exactly* the pointer I needed, thank you!

Sad thing is that I even used "GROUP BY" before... had just forgotten
about it. ^_^;

Madison