mlw <markw@mohawksoft.com> writes:
> Then you call search with a string, such as "the long and winding road"
> or "software OR hardware AND engineer NOT sales." A few milliseconds
> later, a list of key/rank pairs are produced. This is FAR faster than
> the '~~~' operator because it never does a full table scan.
An index-associated operator doesn't imply a full table scan either.
The whole purpose of an index is to pull out the rows matched by the
WHERE expression without doing a full scan.
The thing that bothers me about the way you're doing it is that the
search result as such doesn't give you access to anything but the keys
themselves. Typically what you want to do is get the whole record(s)
in which the matching keys are located --- and that's why the notion
of SELECT ... WHERE textfield-matches-search-string looks so attractive.
You get the records immediately, in one step. Without that, your next
step after the search engine call is to do a join of the search result
table against your data table, and poof there goes much of your speed
gain. (At best, you can make the join reasonably quick by having an
index on the unique key field ... but that just means another index to
maintain.)
Another advantage of handling it as an index is that you don't have to
rely on a periodic recomputation of the index; you can do on-the-fly
updates each time the table is altered. (Of course, if your indexing
technology can't handle incremental updates efficiently, that might not
be of any value to you. But there's nothing in the system design that
precludes making an index type that's only updated by REINDEX.)
I realize this is probably not what you wanted to hear, since building a
new index type is a lot more work than I suppose you were looking for.
But if you want a full-text index that's integrated naturally into
Postgres, that's the path to travel. The way you're doing it is
swimming against the tide. Even when the function-returning-recordset
limitation is gone (maybe a version or two away), it's still going to
be an awkward and inefficient way to work.
regards, tom lane