Re: Dealing with complex queries - Mailing list pgsql-general

From Bruno Wolff III
Subject Re: Dealing with complex queries
Date
Msg-id 20030204160431.GB4209@wolff.to
Whole thread Raw
In response to Re: Dealing with complex queries  (Francisco Reyes <lists@natserv.com>)
List pgsql-general
On Tue, Feb 04, 2003 at 09:54:59 -0500,
  Francisco Reyes <lists@natserv.com> wrote:
> On Tue, 4 Feb 2003, Bruno Wolff III wrote:
>
> > On Mon, Feb 03, 2003 at 16:32:10 -0500,
> >   Francisco Reyes <lists@natserv.com> wrote:
> > > Any hints suggestions on dealing with complex queries.
> >
> > >         substring(ppl.carried_as from 1 for 3)
> > >                 = substring(jc.last from 1 for 3) and
> >
> > Joining on substrings is a good sign that your design needs rethinking.
>
> The tables in the substring comparison are some of our tables compared
> to tables supplied by another company. It is a fact that the names don't
> always match and have variations.
>
> The whole excersise is to match by names to then populate our table with
> their keys so in the future we can match by keys in feeds we receive from
> them.

I think you might be able to coerce use of an index by using like and
only using substring on one side. You will want to try it both ways
to see which generates a faster query. You probably also want to use
a C locale if possible. You will need an index on the column that is
not in the substring call and you will need to concatenate a '%' on to
the substring being used for the search.

So you might have a clause like:
ppl.carried_as like substring(jc.last from 1 for 3) || '%' and

Another idea if the list doesn't have too many keys is to just get a sorted
distinct list and have a human scan it for matches.

pgsql-general by date:

Previous
From: Francisco Reyes
Date:
Subject: Re: Dealing with complex queries
Next
From: Tom Lane
Date:
Subject: Re: UNION problem