Thread: How does the planner deal with multiple possible indexes?

How does the planner deal with multiple possible indexes?

From
"Jim C. Nasby"
Date:
Client has a table with 9 indexes; one is on just ident_id and takes up
75181 pages, the other is on ident_id and another field and uses 117461
pages.  

"bdata__ident_filed_departure" btree (ident_id, filed_departuretime), tablespace "array4"
"bdata_ident" btree (ident_id), tablespace "array4"


Whats interesting is that even a simple

SELECT * FROM table WHERE ident_id=1234

uses bdata__ident_filled_departure, even though it would require less IO
to use bdata_ident.

\d does list bdata__ident_filed_departure before bdata_ident; I'm
wondering if the planner is finding the first index with ident_id in it
and stopping there?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How does the planner deal with multiple possible indexes?

From
"Gregory Maxwell"
Date:
On 7/19/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
[snip]
> \d does list bdata__ident_filed_departure before bdata_ident; I'm
> wondering if the planner is finding the first index with ident_id in it
> and stopping there?

From my own experience it was grabbing the first that has the
requested field as its first member.. I haven't looked at the code to
see if that is the intended behavior.


Re: How does the planner deal with multiple possible indexes?

From
Tom Lane
Date:
"Gregory Maxwell" <gmaxwell@gmail.com> writes:
> On 7/19/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> [snip]
>> \d does list bdata__ident_filed_departure before bdata_ident; I'm
>> wondering if the planner is finding the first index with ident_id in it
>> and stopping there?

>> From my own experience it was grabbing the first that has the
> requested field as its first member.. I haven't looked at the code to
> see if that is the intended behavior.

Ordering would only matter if the estimated costs were exactly the same,
which they probably shouldn't be for indexes with such different sizes.
However, if the estimated number of matching rows were real small, you
might be winding up with a "one page to fetch" estimate in either case.
Jim didn't provide enough details to guess what the cost estimates
actually are...
        regards, tom lane


Re: How does the planner deal with multiple possible indexes?

From
"Jim C. Nasby"
Date:
On Wed, Jul 19, 2006 at 07:00:40PM -0400, Tom Lane wrote:
> "Gregory Maxwell" <gmaxwell@gmail.com> writes:
> > On 7/19/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> > [snip]
> >> \d does list bdata__ident_filed_departure before bdata_ident; I'm
> >> wondering if the planner is finding the first index with ident_id in it
> >> and stopping there?
> 
> >> From my own experience it was grabbing the first that has the
> > requested field as its first member.. I haven't looked at the code to
> > see if that is the intended behavior.
> 
> Ordering would only matter if the estimated costs were exactly the same,
> which they probably shouldn't be for indexes with such different sizes.
> However, if the estimated number of matching rows were real small, you
> might be winding up with a "one page to fetch" estimate in either case.
> Jim didn't provide enough details to guess what the cost estimates
> actually are...

Indeed, if I find a case where there's a large enough number of rows it
will choose the smaller index. But I'm wondering if it would be better
to always favor the smaller index, since it would (presumably) be easier
to keep it in cache?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How does the planner deal with multiple possible indexes?

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Indeed, if I find a case where there's a large enough number of rows it
> will choose the smaller index. But I'm wondering if it would be better
> to always favor the smaller index, since it would (presumably) be easier
> to keep it in cache?

AFAICS, in existing releases that should happen, because the cost
estimate varies with the size of the index.  And it does happen for me
in simple tests.  You did not provide the requested information to help
us find out why it's not happening for you.

(I'm a bit worried about whether CVS HEAD may have broken this behavior
with the recent changes in the indexscan cost equations ... but unless
you are working with HEAD that's not relevant.)
        regards, tom lane


Re: How does the planner deal with multiple possible indexes?

From
"Jim C. Nasby"
Date:
On Wed, Jul 19, 2006 at 07:54:49PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > Indeed, if I find a case where there's a large enough number of rows it
> > will choose the smaller index. But I'm wondering if it would be better
> > to always favor the smaller index, since it would (presumably) be easier
> > to keep it in cache?
> 
> AFAICS, in existing releases that should happen, because the cost
> estimate varies with the size of the index.  And it does happen for me
> in simple tests.  You did not provide the requested information to help
> us find out why it's not happening for you.
> 
> (I'm a bit worried about whether CVS HEAD may have broken this behavior
> with the recent changes in the indexscan cost equations ... but unless
> you are working with HEAD that's not relevant.)

No, this is 8.1.3, and it's a production machine so I'd prefer not to go
about dropping indexes to get cost comparisons; unless there's some way
to disable the use of an index in a given backend? Otherwise I'll try
and come up with a test case.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How does the planner deal with multiple possible

From
Hannu Krosing
Date:
Ühel kenal päeval, R, 2006-07-21 kell 08:29, kirjutas Jim C. Nasby:

> No, this is 8.1.3, and it's a production machine so I'd prefer not to go
> about dropping indexes to get cost comparisons; unless there's some way
> to disable the use of an index in a given backend?

Currently the closest thing is 

BEGIN;
DROP INDEX xxx;

test query here

ABORT;

>  Otherwise I'll try
> and come up with a test case.
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: How does the planner deal with multiple possible indexes?

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> No, this is 8.1.3, and it's a production machine so I'd prefer not to go
> about dropping indexes to get cost comparisons; unless there's some way
> to disable the use of an index in a given backend?

The traditional hack for that is
    begin;    drop index foo;    explain whatever;    rollback;

The DROP acquires an exclusive lock on the table, but it's only held for
a very short time while you EXPLAIN (you might want to put the whole
thing in a script file instead of relying on human typing speed).  So
unless you've got seriously strict response time requirements, this is
generally OK even in production DBs.  You do have to watch out for long
running transactions holding non-exclusive locks, eg don't try this
while a VACUUM is running on the table --- else the DROP blocks on the
vacuum and all other accesses start to queue up behind the DROP.

If the online-index-build patch gets in, there will be a cleaner option
which is to just mark the index disabled in pg_index.  That doesn't
require any exclusive lock, indeed won't be visible to other backends at
all if you do it within a transaction as above.
        regards, tom lane