Re: similarity and operator '%' - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: similarity and operator '%'
Date	May 30, 2016 20:05:50
Msg-id	CAMkU=1wtKJpkjBoL7ubjbZS=rOMAsNKum-BXZUQkpW70gntzSQ@mail.gmail.com
In response to	similarity and operator '%' (Volker Boehm <volker@vboehm.de>)
List	pgsql-performance

On Mon, May 30, 2016 at 10:53 AM, Volker Boehm <volker@vboehm.de> wrote:

> The reason for using the similarity function in place of the '%'-operator is
> that I want to use different similarity values in one query:
>
>     select name, street, zip, city
>     from addresses
>     where name % $1
>         and street % $2
>         and (zip % $3 or city % $4)
>         or similarity(name, $1) > 0.8

I think the best you can do through query writing alone is to use the
most lenient threshold everywhere, and then refilter with similarity()
wherever you need the stricter cutoff:

     select name, street, zip, city
     from addresses
     where name % $1
         and street % $2
         and (zip % $3 or city % $4)
         or (name % $1 and similarity(name, $1) > 0.8)
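
As a minimal sketch of the setup this relies on (the index names and the
0.3 value are illustrative assumptions, not taken from the original
post): the % operator compares against the session-wide threshold
managed by set_limit(), so that threshold is set to the loosest cutoff
needed anywhere in the query, and the stricter 0.8 cutoff comes from the
explicit similarity() recheck.

```sql
-- Sketch only: assumes pg_trgm is available and that the addresses
-- table from the query above exists. Index names are hypothetical.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigram indexes so that the % operator can be answered from an index.
CREATE INDEX addresses_name_trgm_idx
    ON addresses USING gin (name gin_trgm_ops);
CREATE INDEX addresses_street_trgm_idx
    ON addresses USING gin (street gin_trgm_ops);

-- Set the session threshold used by % to the most lenient cutoff.
-- 0.3 is the pg_trgm default; substitute the loosest value you need.
SELECT set_limit(0.3);

-- Check the threshold currently in effect.
SELECT show_limit();
```

With the threshold set this way, the name % $1 condition can narrow the
candidates through the index, and similarity(name, $1) > 0.8 is then
evaluated only on the rows that survive.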

If it were really important to me to get maximum performance, what I
would do is alter/fork the pg_trgm extension so that it had another
operator, say %%%, with a hard-coded cutoff that ignores the value set
by set_limit().  I'm not really sure how the planner would deal with
that, though.

Cheers,

Jeff
