Re: full text searching - Mailing list pgsql-general

From Mitch Vincent
Subject Re: full text searching
Date
Msg-id 008a01c09083$91564700$0200000a@windows
In response to full text searching  (Culley Harrelson <culleyharrelson@yahoo.com>)
List pgsql-general
> > Well, the search engine isn't the database, IMHO. The search "engine" is
> > your application... The database will go get anything you tell it to, you
> > just have to know how to tell it and make sure that your application tells
> > it in the correct way.
> >
> > Teaching an application or database the English language is going to be a
> > hell of a project, good luck!
>
> Well, I don't want to write another search engine. What I would like to see
> is a way to integrate with different third party products. It would be cool
> with Lucene or some other free search engine as an optional add on
> for PostgreSQL.
>
> > Anyway. Moral of the story.. I'd like to see native PostgreSQL full text
> > indexing before we go adding on to the contrib'd trigger/function
> > implementation...
>
> Well, I think any attempt at a "complete" full text indexing implementation
> in the database itself is futile. Find a way to move this out of the
> database and integrate with another product.

Futile? Nah, I don't think it's futile any more than indexing for any other
field is futile. If you could have both then, well, that would rock.

I'm talking about indexing from the standpoint of fast searching, not really
smart searching (I wouldn't want a search for "woman" to return results with
"women"). I put it upon myself to generate the queries needed to give the
proper results.. I work for a custom software shop and so far every
application I've written needs a search and the client needs it to do a very
customized, very specific thing. I can't help but write it from scratch (of
course all I'm really doing is writing a frontend to PostgreSQL).. I'm not
sure that a generic search engine would work for me because all the clients
I've worked with have very specific needs.. PostgreSQL is my generic search
engine for all intents and purposes and I make it give me what I want..
With regard to FTI, I just want it to search large chunks of text faster...

> I've been using a variant of the FTI system in an application, but this is
> far from sufficient when it comes to matching. Speed is OK, but the quality
> of the results could have been a lot better.

Really? How are you using it? If it's better than the one I wrote (and it
almost has to be!) I'd love to take a look. Speed is OK on the machine I'm
searching through large text chunks with now because of a combination of a
good database (PostgreSQL) and a hefty machine (Dual PII 800, 512M ECC RAM,
Ultra 160 SCSI drives). Still, it's only doing sequential scans and using
LIKE to give me matches. My search is a generic

SELECT * FROM whatever WHERE textfield LIKE '%<searched word>%';

That's fairly fast, but it would be a hell of a lot faster if I could do an
index scan there. Of course it was faster when I had the text indexed; it's
just that updating and inserting suffered too much, something that will
happen anytime you're indexing large amounts of data on the fly. I would
just like to see it suffer less, which might happen if FTI was built into
PG. I'm just talking here; I don't know how FTI would be implemented better
if it was built in, other than I'm sure the person doing it would know more
about the internals of PG and more about C than me (sadly I'm not all that
good with C anymore).
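
To make the trade-off above concrete, here is a rough sketch in Python. It is
a toy word index, not the contrib FTI code and not anything inside PostgreSQL;
the class name ToyTextIndex, its methods, and the index_writes counter are all
made up for illustration. It only shows the general shape of the problem: a
scan touches every row per search, while an index answers a lookup directly
but has to be updated for every word on every insert.

```python
from collections import defaultdict


class ToyTextIndex:
    """Hypothetical word -> row-id index; an illustration only."""

    def __init__(self):
        self.rows = {}                    # row id -> original text
        self.postings = defaultdict(set)  # word -> ids of rows containing it
        self.index_writes = 0             # index entries written so far

    def insert(self, row_id, text):
        # Storing the row is one write; indexing adds one more write per
        # distinct word. This is the extra cost paid on every insert.
        self.rows[row_id] = text
        for word in set(text.split()):
            self.postings[word].add(row_id)
            self.index_writes += 1

    def scan_search(self, needle):
        # Rough analogue of WHERE textfield LIKE '%needle%':
        # every stored row is examined (case-sensitive substring match).
        return sorted(rid for rid, text in self.rows.items() if needle in text)

    def index_search(self, word):
        # Direct lookup of a whole word; no row text is examined.
        # Unlike the scan, this matches whole words only, not substrings.
        return sorted(self.postings.get(word, ()))


idx = ToyTextIndex()
idx.insert(1, "the woman wrote the search code")
idx.insert(2, "the women reviewed the index")

print(idx.scan_search("woman"))   # [1]
print(idx.index_search("woman"))  # [1]
print(idx.index_writes)           # 9
```

Two short rows already cost nine index writes, which hints at why inserts and
updates suffer when large chunks of text are indexed on the fly. Note also
that neither search returns the "women" row for "woman": this is fast exact
matching, not smart matching.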

Have a good one!


-Mitch






