Re: Planner chooses multi-column index in 9.2 when maybe it should not - Mailing list pgsql-general

From Tom Lane
Subject Re: Planner chooses multi-column index in 9.2 when maybe it should not
Date
Msg-id 1474.1349890289@sss.pgh.pa.us
Whole thread Raw
In response to Planner chooses multi-column index in 9.2 when maybe it should not  (Greg Sabino Mullane <greg@endpoint.com>)
Responses Re: Planner chooses multi-column index in 9.2 when maybe it should not  (Greg Sabino Mullane <greg@endpoint.com>)
Re: Planner chooses multi-column index in 9.2 when maybe it should not  (Greg Sabino Mullane <greg@endpoint.com>)
List pgsql-general
Greg Sabino Mullane <greg@endpoint.com> writes:
> We are seeing a performance regression when moving to 9.2. There is a
> complex query that is doing a self-join, but long story short, it
> is choosing to use a multi-column index when it really ought not to be.
> I was not able to develop a standalone test case without resorting
> to changing enable_seqscan, but this does show the difference:

> CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT);
> CREATE INDEX gregtest_i ON gregtest(b,a);
> SET enable_seqscan = off;
> EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak';

> On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives:

>  Seq Scan on gregtest  (cost=10000000000.00..10000000022.90 rows=855 width=0)
>    Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))

> Which makes sense - I would imagine that b = 'yak' would use the index,
> but the negation means the index is not very useful?

> However, on 9.2, this gives:

>  Bitmap Heap Scan on gregtest  (cost=8.76..31.66 rows=855 width=0)
>    Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))
>    ->  Bitmap Index Scan on gregtest_i  (cost=0.00..8.55 rows=860 width=0)

The above doesn't seem like a regression to me.  You told it not to use
a seqscan, and it didn't.  (The reason it now considers the index is
that an index-only scan is possible; before 9.2 there was no reason to
consider an indexscan at all given this query, so you got the seqscan
despite the attempted disable.)


> The above was tested on stock versions of Postgres, with no changes
> made to postgresql.conf. In the actual query, the result is something like
> this on 9.2 (columns explained below):

>  Nested Loop  (cost=0.00..6050226723847.12 rows=282638194054762 width=8)
>    Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
>    ->  Index Only Scan using index_i on foobar m  (cost=0.00..889187.83 rows=16998032 width=8)
>    ->  Materialize  (cost=0.00..1079773.42 rows=16627702 width=8)
>          ->  Index Only Scan using index_i on foobar o  (cost=0.00..931682.91 rows=16627702 width=8)
>                Filter: ((status)::text <> 'split'::text)

It's not obvious that this is a worse plan than a seqscan --- the
index-only scans will only have to read the index not the heap, at least
if the heap is all-visible.  If it's coming out slower, then that's a
question of whether the cost estimates match reality.  I'd wonder how
many heap fetches occur anyway, and also whether you've tweaked the
planner cost parameters.

You should be able to force it back to the seqscan based plan by turning
off enable_indexscan or enable_indexonlyscan.  It would be useful to
see EXPLAIN ANALYZE (not just EXPLAIN) results for both this plan and
the seqscan plan in 9.2.

            regards, tom lane


pgsql-general by date:

Previous
From: Greg Sabino Mullane
Date:
Subject: Planner chooses multi-column index in 9.2 when maybe it should not
Next
From: Greg Sabino Mullane
Date:
Subject: Re: Planner chooses multi-column index in 9.2 when maybe it should not