Mat,
> 1. With tsearch2 I get very good query times up until I insert more
> records. For example with 100,000 records tsearch2 returns in around 6
> seconds, with 200,000 records tsearch2 returns in just under a minute.
> Is this due to the indices fitting entirely in memory with 100,000
> records?
Maybe, maybe not. If you want a definitive answer, post your EXPLAIN ANALYZE
results along with the original query.
I assume that you have run VACUUM ANALYZE first? Don't bother to respond
until you have.
> 2. As well as whole-word matching I also need to be able to do
> substring matching. Is the FTI module the way to approach this?
Yes.
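To show why a suffix table makes substring matching indexable, here is a
rough Python sketch of the general idea. It is not the FTI module's actual
code: the function names and the minimum suffix length of 2 are assumptions
made for illustration.

```python
def suffixes(word, min_len=2):
    """Return every suffix of `word` that is at least `min_len` characters long."""
    word = word.lower()
    return [word[i:] for i in range(len(word) - min_len + 1)]


def build_suffix_index(rows):
    """Map each suffix to the set of row ids whose text has a word with that suffix."""
    index = {}
    for row_id, text in rows.items():
        for word in text.split():
            for suffix in suffixes(word):
                index.setdefault(suffix, set()).add(row_id)
    return index


def substring_search(index, needle):
    """Find rows containing `needle` inside any word.

    A substring of a word is always a prefix of one of that word's suffixes,
    so a prefix test against the stored suffixes is enough.
    """
    needle = needle.lower()
    hits = set()
    for suffix, row_ids in index.items():
        if suffix.startswith(needle):
            hits |= row_ids
    return hits
```

In a database the prefix test would presumably be an anchored match against
an indexed suffix column rather than a scan over a dictionary. The trade-off
is a suffix table several times larger than the original text.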
> 3. I have just begun to look into distributed queries. Is there an
> existing solution for distributing a PostgreSQL database amongst
> multiple servers, so each has the same schema but only a subset of the
> total data?
No, it would be ad-hoc. So far, Moore's law has prevented us from needing to
devote serious effort to the above approach.
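For what "ad-hoc" might look like in practice, here is a minimal Python
sketch of application-side routing. The connection strings, the function
names, and the choice of a SHA-256 hash on the record key are all
assumptions; nothing like this ships with PostgreSQL.

```python
import hashlib

# Hypothetical connection strings, one per server holding the same schema.
SHARDS = [
    "host=db0 dbname=app",
    "host=db1 dbname=app",
    "host=db2 dbname=app",
]


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from a stable hash of the record key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def shards_for_search(shards=SHARDS):
    """A search that is not keyed must go to every shard; the application merges results."""
    return list(shards)
```

Inserts and key lookups touch a single server, while a text search has to
be sent to all of them and merged by the application. Changing the number
of servers also remaps most keys with this simple modulo scheme.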
> Any other helpful comments or suggestions on how to improve query times
> using different hardware or software techniques would be appreciated.
Read the archives of this list.
--
Josh Berkus
Aglio Database Solutions
San Francisco