Re: planner makes provably suboptimal index selection choices with partial indexes - Mailing list pgsql-hackers

From Tom Lane
Subject Re: planner makes provably suboptimal index selection choices with partial indexes
Date
Msg-id 273238.1767389739@sss.pgh.pa.us
Whole thread Raw
In response to planner makes provably suboptimal index selection choices with partial indexes  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers
Merlin Moncure <mmoncure@gmail.com> writes:
> I have a situation with a highly volatile table where the planner seems to
> make the incorrect choice.  The problem is definitely statistics related,
> but I don't think statistics ought to be related to the determination of
> which index to consider.

Why would you think that?  Statistics are critical in estimating how
many index entries are likely to be hit.

> Consider a table:

> CREATE TABLE foo(foo_key bigint, tt1 TIMESTAMPTZ, t2 TIMESTAMPTZ int);
> ...and two partial indexes

> CREATE INDEX foo_a_idx ON foo(a) WHERE t2 IS NOT NULL;
> CREATE INDEX foo_t1_idx ON foo(t1) WHERE t2 IS NOT NULL;

> ...and the following query
> SELECT * FROM foo f WHERE a = ? AND t2 IS NOT NULL;

> ...the database seems to want to pick foo_t1_idx when the estimated row
> counts are similar even though ISTM there is no way it could give better
> results for any distribution of dat; only the same, or worse (is this
> true?).

It might be that foo_t1_idx is enough smaller than foo_a_idx that the
estimated scanning cost comes out less even though more rows are
predicted to be read.  That could be because the individual index
entries are smaller (not possible for bigint vs timestamptz, but
your example is pretty unclear about what the data type of "a" is),
or because one index is bloated compared to the other thanks to a more
erratic insertion pattern.

Hard to tell exactly what's happening without a more concrete example,
but there are plenty of factors involved in these choices.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pgsql: Ignore PlaceHolderVars when looking up statistics
Next
From: Corey Huinker
Date:
Subject: Re: Can we remove support for standard_conforming_strings = off yet?