Scott Ribe <scott_ribe@elevated-dev.com> writes:
> PG 9.3, consider a table test like:
> tz timestamp not null,
> cola varchar not null,
> colb varchar not null
> 2 compound indexes:
> tz_cola on (tz, cola)
> tz_colb on (tz, colb varchar_pattern_ops)
> now a query, for some start & end timestamps:
> select * from test where tz >= start and tz < end and colb like '%foobar%'
> Assume that the tz restriction is somewhat selective, say 1% of the table, and the colb restriction is extremely
selective,say less than 0.00001%.
> It seems to me that the fastest way to resolve this query is to use the tz_colb index directly, scanning the range
betweentz >= start and tz < end for the colb condition.
> But pg wants to use the pg_cola index to find all rows in the time range, then filter those rows for the colb
condition.(FYI, cola contains only very small values, while colb's values are typically several times longer.)
The reason you're losing on this is that the "select *" command eliminates
the possibility of an index-only scan (I'm assuming that that selects some
columns that aren't in the index). Given that a plain indexscan will
always involve fetching each heap row that satisfies the indexable
condition (the one on tz), the planner figures it might as well use the
physically-smaller index.
It's true that in principle we could use the index-only-scan index AM
machinery to retrieve colb from the index, and then check the LIKE
predicate on that value before we go to the heap to get the other values;
but the code isn't factored that way at the moment. I'm not entirely sure
that such cases arise often enough to be worth making it happen. I think
there was discussion of this point back when the index-only-scan patch was
being written, and we decided it didn't seem worth pursuing at the time.
regards, tom lane