> We can assume a certain portion of the text is included in the DB table,
> so I want to be able to do a substring match on "brown" and "green" and
> in this case return both "brown kitty" and "green doggy". However, the
> problem is that running the query once for each of my 300 rows against
> the 200,000 rows in my DB is entirely too slow. So I was hoping to use the IN clause
> to create an IN group of about 300 items to scan the DB once.
You can probably do it. However, you will have to pick a substring from your text field to
compare against. In this case you seem to be choosing the first word, i.e. "brown" and "green".
So maybe something like:
select t1.col1
from table1 as t1,
     ( select distinct split_part( tblFileDump.Col1, ' ', 1 ) as samplecol
       from tblFileDump ) as fd1
where t1.col1 like '%' || fd1.samplecol || '%';
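To see what the derived table produces on its own, you can run just the inner query (here
assuming tblFileDump holds your 300 search strings, one per row in Col1):

select distinct split_part( tblFileDump.Col1, ' ', 1 ) as samplecol
from tblFileDump;
-- e.g. for rows 'brown kitty' and 'green doggy' this returns 'brown' and 'green'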
This is just an idea. I've never used split_part() or built a pseudo-join this way. But it may
work, provided you can dump your text files into a temp table.
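If you do go the temp-table route, something along these lines should work (just a sketch; the
table layout and file path are assumptions on my part):

create temp table tblFileDump ( Col1 text );

-- server-side load; the path below is hypothetical and must be readable by the backend
copy tblFileDump ( Col1 ) from '/tmp/filedump.txt';

-- or, from psql, load from the client side instead:
-- \copy tblFileDump ( Col1 ) from 'filedump.txt'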
See:
http://www.postgresql.org/docs/8.1/interactive/functions-string.html
for the syntax of split_part().
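For a quick check of what split_part() does with your sample strings:

select split_part( 'brown kitty', ' ', 1 );   -- returns 'brown'
select split_part( 'green doggy', ' ', 1 );   -- returns 'green'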
Regards,
Richard Broersma Jr.