> I promised that I will get back to the group with the reason. Well, of
> course was a query :). I do use a search engine file system
> based(lucene) that will take any desired entity saved into the database
> and find the primary keys and then do a select * from entity where id is
> in (:ids)If I get too many matches(3000-4000)... that will delay my
> postmaster and that postmaster associated with the query would take
> 10-15 minutes to process that query. So, now I limit that to 500,
> anything bigger than that will ask user to refine the query.However this
> whole investigation made me observe some things. On my server box 7.4.7
> I have some queries that are executing pretty slow 1.2-1.5secs it's a
> lot for a query that goes thru 5000 records. On my local environment 8.1
> the same queries on similar table size executes much faster like
> 200-400ms. Do you know if this is a known issue or my dev box is better
> than my server box?
PG developers are not Microsoft, hence, every version of PG gets meaner
and faster instead of slower and more bloated.
The speedup you see might simply be due to developer competence...
Anyway, if you use Lucene for full text search (I didn't try it, I tried
Xapian which seems similar) :
Users don't care about searches returning 5000 results ; they usually
only look at the first pages. It's the sorting that is important, and this
should be done by Lucene itself (sorting on phrase relevance, for
instance, is a search engine job). So, you are right in limiting the
results to 500, it could even be lower. The role of the search engine is
to sort the good results at the top of the list so users are happy.
Do you really display 500 results ? If you paginate the results, the
users will probably never go past page3. Fetching a lot of results is
therefore a waste of database power : just fetch the ids from Lucene and
grab the results from Postgres, but only for the page you are going to
display.
If you use Postgres to do the sorting (example : lucene matches the
records and you want to order them by a field which is not stored in
Lucene but in Postgres) obviously this advice does not apply.