testing/predicting optimization using indexes - Mailing list pgsql-sql

From TJ O'Donnell
Subject testing/predicting optimization using indexes
Date
Msg-id 41F1A387.5040402@acm.org
Whole thread Raw
Responses Re: testing/predicting optimization using indexes  (PFC <lists@boutiquenumerique.com>)
List pgsql-sql
I have several questions reagaring the kind of increase in speed I can
expect when I use a multi-column index. Here's what I've done so far.

I've written some search functions which operate on character varying
data used to represent molecular structures.  We call this a Smiles string.
I want to optimize the search using an index.  As a test, I've created
9 integer columns in the tables containting atom counts, e.g.
number of carbon atoms, oxygen, aromatic carbon, etc.
I then made a multi-column index.  Here are some samples times

1. When the table contains only smiles, no 9 integer columns and no index:
Select count(smiles) from structure where oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 15 seconds.

2. When the table contains smiles and the 9 integer columns as an index:
Select count(smiles) from structure where oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 20 seconds.

3. When the table contains smiles and the 9 integer columns as an index:
Select smiles,id from structure where (nc,nn,no,ns,"n-arom-c","n-arom-n","n-arom-o","n-arom-s",nhalo) >=
(3,1,1,0,6,0,0,0,0)
and oe_matches(smiles,'c1ccccc1CC(=O)NC');
1313 rows in about 7 seconds.

I'm quite happy with the speedup in 3, but puzzled over the slowdown in 2.
Here are my questions.
1. Why does the search slow down after I've created the extra columns and
index, even when I don't ask to use the index in the SQL, as in 2.
2. Since I get such a nice speedup in 3, should I go to the trouble to
create a new datatype (smiles) and define how it should be indexed in a way
analogous to the 9 integer columns?  In other words, could I expect an even
greater speedup using a new datatype and index?

Thanks,
TJ



pgsql-sql by date:

Previous
From: Richard Huxton
Date:
Subject: Re: OID's
Next
From: "adam etienne"
Date:
Subject: update from multiple rows