Re: Question regarding contrib/fulltextindexing - Mailing list pgsql-novice

From Derek Barrett
Subject Re: Question regarding contrib/fulltextindexing
Date
Msg-id 20020714224102.22447.qmail@graffiti.net
Whole thread Raw
In response to Question regarding contrib/fulltextindexing  ("Derek Barrett" <derekbarrett@graffiti.net>)
Responses Re: Question regarding contrib/fulltextindexing  (Andrew McMillan <andrew@catalyst.net.nz>)
List pgsql-novice
Thanks Andrew, that solution seems to be easier in my mind.


Okay, so I create a lookup table for the description field.

CREATE TABLE lookup_description (

pk            integer
id            integer
search_word   varchar(50)

)

(pk is the primary key, id being a foreign key)


(By the way, any recommendation in sizing the search_word field? Are there strings that are large enough that aren't
worthindexing?)  

My user INSERTS the following string:

"The quick brown fox jumped over the moon and another fox."

In my code, I use a for loop, and take the text string and put it into an array, and fill up the lookup_description
table.Of course, I will create a noisewords filter to remove words like the, a, an from this list. What about duplicate
words?Should those be filtered out as well? In this example, fox is duplicated. I would assume that leaving in the
duplicates,might be useful later if I decide to implement a relevency-type of search engine. (Rank the results based on
howmany times, "fox" is found). 

TABLE lookup_description

pk id search_word
-- -- -----------
1  1  quick
2  1  brown
3  1  fox
4  1  jumped
5  1  over
6  1  moon
7  1  another
8  1  fox

Then I can create an index on the search_word column. Later when I do my SELECT query, I will join this lookup query to
themain query. 

Is that the idea?


Derek

----- Original Message -----
From: Andrew McMillan <andrew@catalyst.net.nz>
Date: 14 Jul 2002 12:16:59 +1200
To: Derek Barrett <derekbarrett@graffiti.net>
Subject: Re: [NOVICE] Question regarding contrib/fulltextindexing


> On Sun, 2002-07-14 at 08:53, Derek Barrett wrote:
> >
> > In my situation, I need to match exact words, so I've used regular expressions to search on a varchar(10000) field:
> >
> > SELECT *
> > FROM table
> > WHERE description ~* ('[^a-zA-Z0-9]($keyword[$x])[^a-zA-Z0-9]');
> >
> > Would this module still be useful in my situation?
>
> I was doing word search with a modified version of the fulltextindex
> code, but in the end I found it easier to write a perl program to do
> build the index table - the trigger approach was more work to manage
> than it seemed it should be.
>
> Regards,
>                     Andrew.
> --
> --------------------------------------------------------------------
> Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
> WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
> DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267
>        Are you enrolled at http://schoolreunions.co.nz/ yet?
>
>

--
_______________________________________________
Get your free email from http://www.graffiti.net

Powered by Outblaze

pgsql-novice by date:

Previous
From: "Norman Khine"
Date:
Subject: Re: Multiple table insert using a CSV list as the datasource
Next
From: Andrew McMillan
Date:
Subject: Re: Question regarding contrib/fulltextindexing