Steve Atkins wrote:
>>What would be the performance of pgSQL text search vs MySQL vs Lucene
>>(flat file) for a 2 terabyte db?
>>thanks for any comments.
>
> My experience with tsearch2 has been that indexing even moderately
> large chunks of data is too slow to be feasible. Moderately large
> meaning tens of megabytes.

My experience with MySQL's full-text search, as well as the various
MySQL-based text indexing programs (I forget the names; it's been a
while), on some 10-20GB of mail archives has been pretty disappointing
too. My biggest gripe is the indexing speed: it literally takes days to
index fewer than a million documents.

I ended up using Swish++. Microsoft's CHM compiler also has pretty
amazing indexing speed (though it crashes quite often when encountering
bad HTML).
--
dave