Re: Planner chooses multi-column index in 9.2 when maybe it should not - Mailing list pgsql-general

From Tom Lane
Subject Re: Planner chooses multi-column index in 9.2 when maybe it should not
Date
Msg-id 15816.1349918672@sss.pgh.pa.us
Whole thread Raw
In response to Re: Planner chooses multi-column index in 9.2 when maybe it should not  (Greg Sabino Mullane <greg@endpoint.com>)
Responses Re: Planner chooses multi-column index in 9.2 when maybe it should not  (Greg Sabino Mullane <greg@endpoint.com>)
List pgsql-general
Greg Sabino Mullane <greg@endpoint.com> writes:
>      ->  Bitmap Heap Scan on foobar o  (C=30389..835271 R=8980 W=8) (AT=0.06..0.07 R=1 L=1)
>            Recheck Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.999999')))
>            Filter: (((status) <> ALL ('{panda,penguin}'[])) \
>              AND ((id) ~ (('^' || (m.id)) || '(\.[0-9.]+)?$')))
>        ->  Bitmap Index Scan on foobar_pkey  (C=0..30386 R=1888670 W=0) (AT=0.02..0.02 R=1 L=1)
>              Index Cond: (((id) >= (m.id)) AND ((id) <= ((m.id) || '.999999')))

Actually, looking closer, I think the problem is not with the estimation
of the index-only scan on the other index; the planner is estimating
that as pretty expensive, which it is.  The problem is that it thinks
the above bitmap scan is pretty expensive, when it isn't.  And the
reason evidently is that it's totally off in the weeds about the
selectivity of the range condition on foobar.id.  Anytime you've got
1888670 estimated rows and 1 actual row, you've got a problem.

This is related to the problem I was on about a couple weeks ago:
http://archives.postgresql.org/message-id/17655.1348874742@sss.pgh.pa.us
namely that the planner fails to recognize pairs of clauses as a range
constraint if they're join clauses.  If it had recognized that, you'd
have gotten an estimate that would still be far more than "1 row", but
would be more than an order of magnitude less than this one, which would
be enough to fix this problem.

I'd not been thinking of that change as something we'd risk
back-patching, but maybe we should consider putting it into 9.2.  It
seems like the index-only scan support has put a new level of premium on
the quality of the planner's rowcount estimates.

Meanwhile, that range condition in itself looks a tad, er, klugy.
Do you really need that, or is this a crummy way of stating
foobar.id = m.id?

            regards, tom lane


pgsql-general by date:

Previous
From: Gavin Flower
Date:
Subject: Re: Index only scan
Next
From: Steve A
Date:
Subject: Can two “SELECT FOR UPDATE” statements on the same table cause a deadlock?