Re: best practise/pattern for large OR / LIKE searches - Mailing list pgsql-general

From Jasen Betts
Subject Re: best practise/pattern for large OR / LIKE searches
Date
Msg-id h7dpkr$7ld$1@reversiblemaps.ath.cx
Whole thread Raw
In response to best practise/pattern for large OR / LIKE searches  (Ries van Twisk <pg@rvt.dds.nl>)
Responses Re: best practise/pattern for large OR / LIKE searches  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-general
On 2009-08-26, Ries van Twisk <pg@rvt.dds.nl> wrote:
>
> --Apple-Mail-1173-222712773
> Content-Type: text/plain;
>     charset=US-ASCII;
>     format=flowed;
>     delsp=yes
> Content-Transfer-Encoding: 7bit
>
> Hey All,
>
> I am wondering if there is a common pattern for these sort of queries :
>
> SELECT * FROM tbl WHERE datanumber LIKE '%12345%' OR LIKE '%54321%' OR
> LIKE '%8766%' OR LIKE '%009%', ..

SELECT * FROM tbl WHERE datanumber LIKE ANY ARRAY('%12345%','%54321%','%8766%'...)

> The number of OR/LIKES are in the order of 50-100 items...
> the table tbl is a couple of million rows.

regex might perfrom better than LIKE ANY

SELECT * FROM tbl WHERE '12345|54321|8766|009' ~ datanumber;

regex is compiled to a finite state machine and then the datanumber
column is scanned in a single pass (for each row)

> Searches are currently taking to long and we would like to optimize
> them, but before we dive into our own solution we
> where wondering if there already common solutions for this...

try regex first if that's too slow you may need to write a
dictionary function that splits datanuimber into it's components
and use full text index/search. (this will slow down updates as they will do
upto 20 inserts into the index)

searches should then be optimally fast

pgsql-general by date:

Previous
From: "Eric Comeau"
Date:
Subject: Re: New database or New Schema?
Next
From: Stephen Cuppett
Date:
Subject: Trouble using TG_TABLE_NAME in BEFORE INSERT OR UPDATE trigger