Hi-
Can someone tell me how the cost is estimated for retrieving a column based
on a function that is indexed?
The issue I'm wrestling with is a query which works nicely when based on a
plain "LIKE field", but poorly using "LIKE upper(field)". There is an
index on both.
Since this field has only uppercase strings in it currently, I know that
both indexes are equally selective, but the planner judges that "LIKE field"
will return 1 row, and "LIKE upper(field)" will return 2168 rows. In both
cases, the index is used, but in the next step, joining to another table,
the regular version uses an index on the other table, while the "upper"
version uses a seq scan. I'm guessing that the scan is used because if we
are going after 2168 rows in the adjoining table, the index is no longer a
good choice.
I'm able to see the stats for the field using pg_stats, but don't see
anything connected to the function, so I'm guessing that real stats aren't
kept & some sort of default is used. perhaps I can modify this default.
Also, even with 2168 rows to gather, my experience based on cases where
several thousand rows really are returned indicates that the index would
still be a good choice. Is there a way to make the planner favor index scans
a bit more? (Other than the drastic set enable_seqscan to off.)
Thanks
-Nick
--------------------------------------------------------------------------
Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/