Re: Text Search vs MYSQL vs Lucene - Mailing list pgsql-general

From David Garamond
Subject Re: Text Search vs MYSQL vs Lucene
Msg-id 414077B7.5080808@zara.6.isreserved.com
In response to Re: Text Search vs MYSQL vs Lucene  (Steve Atkins <steve@blighty.com>)
List pgsql-general
Steve Atkins wrote:
>>What would be performance of pgSQL text search vs MySQL vs Lucene (flat
>>file) for a 2 terabyte db?
>>thanks for any comments.
>
> My experience with tsearch2 has been that indexing even moderately
> large chunks of data is too slow to be feasible. Moderately large
> meaning tens of megabytes.

My experience with MySQL's full-text search, as well as the various
MySQL-based text indexing programs (I forget the names; it's been a while),
on some 10-20GB of mail archives has been pretty disappointing too. My
biggest gripe is the indexing speed: it literally takes days to
index fewer than a million documents.

I ended up using Swish++. Microsoft's CHM compiler also has pretty
amazing indexing speed (though it crashes quite often when encountering
bad HTML).

--
dave
