Thread: BUG #11500: PRIMARY KEY index not being used

BUG #11500: PRIMARY KEY index not being used

From
marko@joh.to
Date:
The following bug has been logged on the website:

Bug reference:      11500
Logged by:          Marko Tiikkaja
Email address:      marko@joh.to
PostgreSQL version: 9.1.12
Operating system:   Linux
Description:

Hi,

We've been observing a performance problem where a PRIMARY KEY index is not
being used.  The problem looks like this:

pg2=#* explain analyze select * from events where eventid = 132685185 and
processed = 0;
                                                            QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------
 Index Scan using index_events_processed on events  (cost=0.00..7.73 rows=1
width=106) (actual time=31.808..31.808 rows=0 loops=1)
   Index Cond: (processed = 0)
   Filter: (eventid = 132685185)
 Total runtime: 31.852 ms
(4 rows)

pg2=#* explain analyze select * from events where eventid = 132685185 and
processed+0 = 0;
                                                      QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
 Index Scan using events_pkey on events  (cost=0.00..12.38 rows=1 width=106)
(actual time=0.071..0.071 rows=0 loops=1)
   Index Cond: (eventid = 132685185)
   Filter: ((processed + 0) = 0)
 Total runtime: 0.109 ms
(4 rows)


I'm guessing that's happening because the index on processed is smaller
(7GB, relpages=900880 vs 3.7GB, relpages=478225).  The statistics say that
there are no rows where processed=0 (and it's not far from the truth), but
it's still a risky plan compared to the PK lookup.  The index
index_events_processed is an index on events(processed), which should
probably be a partial index on  WHERE processed = 0, but I thought I'd
report this plan anyway.

Any thoughts?

Re: BUG #11500: PRIMARY KEY index not being used

From
Marti Raudsepp
Date:
On Fri, Sep 26, 2014 at 11:02 AM,  <marko@joh.to> wrote:
> The statistics say that
> there are no rows where processed=0 (and it's not far from the truth), but
> it's still a risky plan compared to the PK lookup.

> Any thoughts?

PostgreSQL 9.0 introduced this optimization for greater/less operators:

> When looking up statistics for greater/less-than comparisons, if the
> comparison value is in the first or last histogram bucket, use an index
> (if available) to fetch the current actual column minimum or maximum.
> This greatly improves the accuracy of estimates for comparison values
> near the ends of the data range, particularly if the range is constantly
> changing due to addition of new data.

Not sure whether it's a good idea a bad idea, but perhaps a solution
is to expand this to equality lookups too?

Does using "WHERE processed <= 0" work around the problem? (Assuming
you don't have any negative numbers in this column).

> The index
> index_events_processed is an index on events(processed), which should
> probably be a partial index on  WHERE processed = 0, but I thought I'd
> report this plan anyway.

I guess you would still have this problem, unless your new index
contains the eventid column.

Regards,
Marti

Re: BUG #11500: PRIMARY KEY index not being used

From
Marko Tiikkaja
Date:
On 9/26/14 2:02 PM, Marti Raudsepp wrote:
> On Fri, Sep 26, 2014 at 11:02 AM,  <marko@joh.to> wrote:
>> The statistics say that
>> there are no rows where processed=0 (and it's not far from the truth), but
>> it's still a risky plan compared to the PK lookup.
>
>> Any thoughts?
>
> PostgreSQL 9.0 introduced this optimization for greater/less operators:
>
>> When looking up statistics for greater/less-than comparisons, if the
>> comparison value is in the first or last histogram bucket, use an index
>> (if available) to fetch the current actual column minimum or maximum.
>> This greatly improves the accuracy of estimates for comparison values
>> near the ends of the data range, particularly if the range is constantly
>> changing due to addition of new data.
>
> Not sure whether it's a good idea a bad idea, but perhaps a solution
> is to expand this to equality lookups too?

I'm not sure that's the right idea to be honest.  The problem is that
the planner is taking a risk by using an index which could contain
(theoretically) any number of matching rows, instead of using the
primary key which is guaranteed to only contain 0 or 1 rows.  Sure,
peeking into the index to see that there are indeed some processed=0
rows would probably discourage the planner from using it, but why bother?

> Does using "WHERE processed <= 0" work around the problem? (Assuming
> you don't have any negative numbers in this column).

I unfortunately already dropped the problematic index, so I can't answer
that.


.marko

Re: BUG #11500: PRIMARY KEY index not being used

From
Tom Lane
Date:
marko@joh.to writes:
> PostgreSQL version: 9.1.12

> We've been observing a performance problem where a PRIMARY KEY index is not
> being used.  The problem looks like this:

In 9.2 and again in 9.3, we significantly changed the planner's modeling
of index access costs, with a view towards making better choices when
there are multiple plausible indexes to use.  If you can reproduce this
misbehavior in 9.3 or later it would be worth looking into; but we're not
going to change 9.1's estimation rules at this late date.  People running
on back branches tend to want plan stability in my experience.

            regards, tom lane