Thread: Re: [PATCHES] Minimum selectivity estimate for LIKE 'prefix%'
Am Donnerstag, 6. März 2008 schrieb Tom Lane: > What I propose doing about this is a small variant on Peter's original > suggestion: compute the estimated selectivity for > col = 'prefix' > and clamp the result of prefix_selectivity to be at least that. OK, first results with this patch are in: The selectivity estimations are adjusted nicely, but the cost calculation doesn't change at all. Before: Index Scan using foo_idx_3 on foo foo (cost=0.01..6.03 rows=1 width=8) After: Index Scan using foo_idx_3 on foo foo (cost=0.01..6.03 rows=627 width=8) How is that possible? Btw., the corresponding query plan for the LIKE 'constant' case is: Index Scan using foo_idx_3 on foo foo (cost=0.00..2527.84 rows=627 width=8) This is what we had hoped to get in the "after" case.
Peter Eisentraut <peter_e@gmx.net> writes: > OK, first results with this patch are in: The selectivity estimations are > adjusted nicely, but the cost calculation doesn't change at all. Before: I've forgotten the context ... what's the whole query and plan again? And which PG version exactly? regards, tom lane
Am Montag, 31. März 2008 schrieb Tom Lane: > Peter Eisentraut <peter_e@gmx.net> writes: > > OK, first results with this patch are in: The selectivity estimations are > > adjusted nicely, but the cost calculation doesn't change at all. Before: > > I've forgotten the context ... what's the whole query and plan again? > And which PG version exactly? Please see http://archives.postgresql.org/pgsql-hackers/2008-01/msg00048.php
Peter Eisentraut <peter_e@gmx.net> writes: > Am Montag, 31. M�rz 2008 schrieb Tom Lane: >> I've forgotten the context ... what's the whole query and plan again? >> And which PG version exactly? > Please see http://archives.postgresql.org/pgsql-hackers/2008-01/msg00048.php Hm. Now that I think about it, the index scan cost estimate is made using a separate estimate of rows fetched (since this will depend on the specific index qual clauses in use, whereas the overall row estimate for the relation doesn't vary with index). For the case at hand, the index quals that it's looking at are the >= and < clauses with close-together comparison values, and so it comes out with a rock-bottom rowcount estimate. The clamping occuring over in prefix_selectivity isn't relevant here. Your original complaint was that the bad overall rowcount estimate was leading to a bad join plan, and that should be fixed even though the cost estimate for the indexscan itself is unrealistically small. Changing the indexscan cost estimate would require patching the main range-constraint-estimation code in clausesel.c. I don't see any very good fix for that, since it has to deal with much more general cases than this. In particular it doesn't really know whether it's dealing with >= or >. regards, tom lane