Re: Strange (?) Index behavior? - Mailing list pgsql-performance

From Matt Clark
Subject Re: Strange (?) Index behavior?
Date
Msg-id 008401c4c366$127d5260$8300a8c0@solent
Whole thread Raw
In response to Re: Strange (?) Index behavior?  (Allen Landsidel <alandsidel@gmail.com>)
Responses Re: Strange (?) Index behavior?
List pgsql-performance
> With that many rows, and a normal index on the field,
> postgres figures the best option for say "I%" is not an index
> scan, but a sequential scan on the table, with a filter --
> quite obviously this is slow as heck, and yes, I've run
> analyze several times and in fact have the vacuum analyze automated.

Ah, so "like 'I%'" uses a very slow seq scan, but "like 'ABC%'" uses an
ordinary index OK?  If so then...

The planner would usually assume (from what Tom usually says) that 1/26
selectivity isn't worth doing an index scan for, but in your case it's wrong
(maybe because the rows are very big?)

You may be able to get the planner to go for an index scan on "like 'I%'" by
tweaking the foo_cost variables in postgresql.conf

Or you could have the app rewrite "like 'I%'" to "like 'IA%' or like 'IB%'
... ", or do that as a stored proc.

> With the partial index the index scan is used and the cost
> drops from 0..2million to 0..9000 -- a vast improvement.

So there are really only 9000 rows out of 76 million starting with 'I'?  How
about combining some techniques - you could create an index on the first two
chars of the field (should be selective enough to give an index scan),
select from that, and select the actual data with the like clause.

CREATE INDEX idx_firstletters ON table (substr(field, 1, 2));
CREATE INDEX idx_all ON table (field);
SELECT field FROM (SELECT field FROM table WHERE substr(field, 1, 2) = 'DE')
AS approx WHERE field LIKE 'DE%';

Any good?


pgsql-performance by date:

Previous
From: patrick ~
Date:
Subject: Re: vacuum analyze slows sql query
Next
From: "Matthew T. O'Connor"
Date:
Subject: Re: What is the difference between these?