Re: Database-based alternatives to tsearch2? - Mailing list pgsql-general

From Richard Huxton
Subject Re: Database-based alternatives to tsearch2?
Msg-id 457F0FF0.3090205@archonet.com
In response to Database-based alternatives to tsearch2?  (Wes <wespvp@syntegra.com>)
List pgsql-general
Wes wrote:
>
> Indexes are too fragile.  Our documents will be offline, and re-indexing
> would be impossible.  Additionally, as I understand it, tsearch2 doesn't
> scale to the numbers I need (hundreds of millions of documents).

Jeff's right about tsvector - sounds like it's what you're looking for.

If you're worried about reindexing costs, perhaps look at partitioning the
table, or using partial indexes (so you could have multiple indexes for
each table, based on (id mod 100) or some such).
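
To make the partial-index idea concrete, here is a rough sketch that just
generates the CREATE INDEX statements, one per (id mod N) slice. The table
name "documents", the tsvector column "body_tsv", the integer "id" column
and the choice of a GIN index are all assumptions for illustration, not
anything from the original poster's schema.

```python
def partial_index_ddl(table, tsv_column, buckets):
    """Build one CREATE INDEX statement per (id % buckets) slice.

    Assumes the table has an integer "id" column and a tsvector column
    named by tsv_column; both names are hypothetical.
    """
    statements = []
    for k in range(buckets):
        statements.append(
            f"CREATE INDEX {table}_{tsv_column}_p{k} "
            f"ON {table} USING gin ({tsv_column}) "
            f"WHERE (id % {buckets}) = {k};"
        )
    return statements


# 100 slices, matching the (id mod 100) suggestion above.
ddl = partial_index_ddl("documents", "body_tsv", 100)
print(ddl[0])
```

Each slice can then be rebuilt on its own, so a reindex touches roughly
1/100th of the rows at a time. One caveat: the planner only considers a
partial index when the query's WHERE clause implies the index predicate, so
searches would need to carry the matching (id % 100) = k condition.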

Obviously, partitioning over multiple machines is usually quite do-able
for this sort of task too.

> Is anyone aware of any such solutions for PostgreSQL, open source or
> otherwise?

Without wishing to discourage a potential large user from PG, it might
be worth checking if Google/Yahoo/etc have a non-relational server that
meets your needs off-the-shelf.

--
   Richard Huxton
   Archonet Ltd
