Re: Multiple DB join - Mailing list pgsql-sql

From: Andrew Sullivan
Subject: Re: Multiple DB join
Date: Tue, 15 Aug 2006
Msg-id: 20060815141712.GB21939@phlogiston.dyndns.org
In response to: Re: Multiple DB join (Sumeet Ambre <sambre@indiana.edu>)
Responses: Re: Multiple DB join (Sumeet <asumeet@gmail.com>)
List: pgsql-sql
On Tue, Aug 15, 2006 at 10:11:41AM -0400, Sumeet Ambre wrote:
> The design of the database is like this because our organization wants to
> split up different datasets into different entities, and there is a
> possibility that they'll run different instances of postgres for each
> dataset.

It's this "different instances" thing that I'm having a tough time
with.  Is this because they want to be able to query local things
when disconnected, or something like that?  I can think of
applications for this, for sure; I'm just suggesting that you make
sure you're not building an optimisation that is (1) premature and
(2) possibly pessimal.

> records, the query runs damn slow... below is the sample schema for my
> base table and the query I try to run on it... it takes me more than
> 2-3 minutes to run a query... Is there any way I could speed this up?

The problem is not your design, nor even the size of the data
exactly, but the query: 

> e.g.  SELECT * FROM base WHERE ((title ILIKE '%something%') AND
>       (authors ~* '.*something.*') AND (db_name='something'));

You have two initially-unbound search terms there: a pattern that
starts with ILIKE '%' or ~* '.*' forces a sequential scan, because
there is nowhere in the index to start.  If you really want to do
this kind of unbound-string query, you need to look into full text
search.  The approach above is never going to be fast.
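
Roughly, the full text route looks like the sketch below.  It assumes
the built-in to_tsvector/to_tsquery machinery and GIN indexes of
current PostgreSQL (at the time of this thread the equivalent lived in
the contrib tsearch2 module); the table and column names are the ones
from your query, and the 'english' text search configuration is just a
guess.

    -- Precompute a searchable document from the columns being matched.
    ALTER TABLE base ADD COLUMN search_vec tsvector;
    UPDATE base
       SET search_vec = to_tsvector('english',
               coalesce(title, '') || ' ' || coalesce(authors, ''));

    -- A GIN index on the tsvector is what lets the planner avoid the seqscan.
    CREATE INDEX base_search_idx ON base USING gin (search_vec);

    -- The query becomes an indexed match instead of two unanchored pattern scans.
    SELECT *
      FROM base
     WHERE search_vec @@ to_tsquery('english', 'something')
       AND db_name = 'something';

You would also need a trigger (or, on newer versions, a stored
generated column) to keep search_vec current as rows change.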

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well.     --Dennis Ritchie

