Thread: Optimizing t1.col like '%t2.col%'

Optimizing t1.col like '%t2.col%'

From: "Dan Kaplan"
Date: 27 February 2008, 15:47:51

I’ve got a lot of rows in one table and a lot of rows in another table. I want to do a bunch of queries on their join column. One of these is like this: t1.col like '%t2.col%'

I know that always sucks. I’m wondering how I can make it better. First, I should let you know that I can likely hold both of these tables entirely in RAM. Since that’s the case, would it be better to accomplish this with my programming language? Also, you should know that in most cases t1.col and t2.col are two words or fewer. I’m not sure whether that matters; I mention it because it may make tsearch2 perform badly.
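
For reference, a minimal sketch of what doing the match in application code might look like, assuming both columns have already been loaded into plain Python lists. The names `substring_join`, `t1_values`, and `t2_values` are hypothetical, and the condition is taken to mean that the value of t2.col appears somewhere inside t1.col (in SQL, `t1.col LIKE '%' || t2.col || '%'`). It still compares every pair of rows, so holding the data in memory removes I/O but not the quadratic number of comparisons.

```python
def substring_join(t1_values, t2_values):
    """Return (t1, t2) pairs where the t2 value occurs inside the t1 value.

    Assumption: plain, case-sensitive substring matching. Unlike LIKE,
    '%' and '_' inside a t2 value are treated as ordinary characters here.
    """
    matches = []
    for outer in t1_values:          # every row of t1 ...
        for inner in t2_values:      # ... against every row of t2
            if inner in outer:
                matches.append((outer, inner))
    return matches


if __name__ == "__main__":
    # Made-up sample data, two words or fewer per value as in the post.
    t1_values = ["red apple pie", "green pear", "apple"]
    t2_values = ["apple", "pear tart", "pear"]
    for pair in substring_join(t1_values, t2_values):
        print(pair)
```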

Re: Optimizing t1.col like '%t2.col%'

From: Tom Lane
Date: 27 February 2008, 16:26:47

"Dan Kaplan" <dkaplan@citizenhawk.com> writes:
> I've got a lot of rows in one table and a lot of rows in another table.  I
> want to do a bunch of queries on their join column.  One of these is like
> this: t1.col like '%t2.col%'

> I know that always sucks.  I'm wondering how I can make it better.

tsearch or pg_trgm could probably help.  Are you really after exact
substring-match semantics, or is this actually a poor man's substitute
for full text search?  If you just want substrings then see pg_trgm,
if you want text search see tsearch.

            regards, tom lane
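
To illustrate why trigrams help with exact substring matching, here is a rough Python sketch of the general idea: index every 3-character slice of the t1 values, use the slices of the search string to narrow down candidates, then recheck each candidate with a real substring test. This is an assumption-laden illustration, not pg_trgm's actual implementation or API; the function names (`trigrams`, `build_index`, `substring_matches`) are hypothetical.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(t1_values):
    """Map each trigram to the positions of the t1 values that contain it."""
    index = defaultdict(set)
    for pos, value in enumerate(t1_values):
        for gram in trigrams(value):
            index[gram].add(pos)
    return index


def substring_matches(pattern, t1_values, index):
    """Return the t1 values that contain pattern as a substring."""
    grams = trigrams(pattern)
    if grams:
        # A value containing the pattern must contain all of its trigrams.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Patterns shorter than 3 characters have no trigrams: scan everything.
        candidates = range(len(t1_values))
    # Sharing trigrams is necessary but not sufficient, so recheck each one.
    return [t1_values[pos] for pos in sorted(candidates)
            if pattern in t1_values[pos]]


if __name__ == "__main__":
    # Made-up sample data standing in for t1.col and one t2.col value.
    t1_values = ["red apple pie", "green pear", "apple", "maple pear"]
    index = build_index(t1_values)
    print(substring_matches("pear", t1_values, index))
```

The index is built once over t1 and reused for every t2 value, so most non-matching rows are skipped without a string comparison. Very short search strings get no benefit, since they produce no trigrams to filter on.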