Re: How to optimize query that concatenates strings? - Mailing list pgsql-general

From Chander Ganesan
Subject Re: How to optimize query that concatenates strings?
Date
Msg-id 44AE8DCE.2070301@otg-nc.com
Whole thread Raw
In response to How to optimize query that concatenates strings?  ("badlydrawnbhoy" <badlydrawnbhoy@gmail.com>)
List pgsql-general
You could build a function-based index that contains the "simplified" version of each URL (in your case, the field with the '/' stripped).  Then use the same function on the URL going in.  In that case PostgreSQL will use the index that you created already.

Take a look at the PostgreSQL documentation for function-based indexes.

select from ... where simplify(url) <> url_col;

In the example above 'url_col' would have a function-based index that was based on 'simplify(url_col)'
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999


badlydrawnbhoy wrote:
Hi all,

I've got a database of URLs, and when inserting new data into it I want
to make sure that there are no functionally equivalent URLs already
present. For example, 'umist.ac.uk' is functionally the same as
'umist.ac.uk/'.

I find that searching for the latter form, using string concatentation
to append the trailing slash, is much slower than searching for a
simple string - the index on URL name isn't used to speed up the
search.

Here's an illustration

url=# explain select exists(select * from url where url = 'umist.ac.uk'
or url || '/' = 'umist.ac.uk') as present;                                         QUERY PLAN

-----------------------------------------------------------------------------------------------Result  (cost=47664.01..47664.02 rows=1 width=0)  InitPlan    ->  Seq Scan on url  (cost=0.00..47664.01 rows=6532 width=38)          Filter: ((url = 'umist.ac.uk'::text) OR ((url || '/'::text)
= 'umist.ac.uk'::text))
(4 rows)

url=# explain select exists(select * from url where url =
'umist.ac.uk') as present;                                QUERY PLAN
----------------------------------------------------------------------------Result  (cost=5.97..5.98 rows=1 width=0)  InitPlan    ->  Index Scan using url_idx on url  (cost=0.00..5.97 rows=1
width=38)          Index Cond: (url = 'umist.ac.uk'::text)
(4 rows)


Is there any way I can force postgres to use the index when using the
string concatenation in the query?

Thanks in advance,

BBB


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster 

pgsql-general by date:

Previous
From: "Parang Saraf"
Date:
Subject: How to insert .xls files into database
Next
From: "Joshua D. Drake"
Date:
Subject: Re: VACUUM FULL versus CLUSTER ON