Planner chooses multi-column index in 9.2 when maybe it should not - Mailing list pgsql-general

From Greg Sabino Mullane
Subject Planner chooses multi-column index in 9.2 when maybe it should not
Date
Msg-id 20121010170631.GK9910@tinybird.home
Whole thread Raw
Responses Re: Planner chooses multi-column index in 9.2 when maybe it should not  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
We are seeing a performance regression when moving to 9.2. There is a
complex query that is doing a self-join, but long story short, it
is choosing to use a multi-column index when it really ought not to be.
I was not able to develop a standalone test case without resorting
to changing enable_seqscan, but this does show the difference:

CREATE TABLE gregtest (a TEXT PRIMARY KEY, b TEXT);
CREATE INDEX gregtest_i ON gregtest(b,a);
SET enable_seqscan = off;
EXPLAIN SELECT 1 FROM gregtest WHERE a <> 'panda' AND b <> 'yak';

On versions 8.2, 8.3, 8.4, 9.0, and 9.1, this gives:

 Seq Scan on gregtest  (cost=10000000000.00..10000000022.90 rows=855 width=0)
   Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))

Which makes sense - I would imagine that b = 'yak' would use the index,
but the negation means the index is not very useful?

However, on 9.2, this gives:

 Bitmap Heap Scan on gregtest  (cost=8.76..31.66 rows=855 width=0)
   Filter: ((a <> 'panda'::text) AND (b <> 'yak'::text))
   ->  Bitmap Index Scan on gregtest_i  (cost=0.00..8.55 rows=860 width=0)

The above was tested on stock versions of Postgres, with no changes
made to postgresql.conf. In the actual query, the result is something like
this on 9.2 (columns explained below):

 Nested Loop  (cost=0.00..6050226723847.12 rows=282638194054762 width=8)
   Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
   ->  Index Only Scan using index_i on foobar m  (cost=0.00..889187.83 rows=16998032 width=8)
   ->  Materialize  (cost=0.00..1079773.42 rows=16627702 width=8)
         ->  Index Only Scan using index_i on foobar o  (cost=0.00..931682.91 rows=16627702 width=8)
               Filter: ((status)::text <> 'split'::text)

But like this on 8.3:

 Nested Loop  (cost=1003294.60..8207409555713.15 rows=283931552087940 width=8)
   Join Filter: ((o.foo)::text <> ((m.foo)::text || 'panda'::text))
   ->  Seq Scan on foobar o  (cost=0.00..962314.95 rows=16672204 width=8)
         Filter: ((status)::text <> 'split'::text)
   ->  Materialize  (cost=1003294.60..1240121.96 rows=17030236 width=8)
         ->  Seq Scan on foobar m  (cost=0.00..919739.36 rows=17030236 width=8)

In the above, foobar has a primary key on foo, and an index named
index_i on foobar(status, foo). In another variation of the query,
8.3 uses foobar_pkey as well, rather than index_i, and filters that.
Matter of fact, index_i is never used.

At any rate, the change causes the original query to run much, much
slower. Problem on 9.2? Something wrong with our system and/or query?
More information needed from me?


--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

Attachment

pgsql-general by date:

Previous
From: Chris Ernst
Date:
Subject: Re: pg_upgrade not detecting version properly
Next
From: Tom Lane
Date:
Subject: Re: Planner chooses multi-column index in 9.2 when maybe it should not