Re: Bad estimate on LIKE matching - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Bad estimate on LIKE matching
Date	January 18, 2006 11:38:03
Msg-id	10172.1137598679@sss.pgh.pa.us
In response to	Re: Bad estimate on LIKE matching (Simon Riggs <simon@2ndquadrant.com>)
Responses	Re: Bad estimate on LIKE matching
List	pgsql-hackers

Simon Riggs <simon@2ndquadrant.com> writes:
> On Tue, 2006-01-17 at 13:53 +0100, Magnus Hagander wrote:
>> Any way to teach the planner about this?

> In a recent thread on -perform, I opined that this case could best be
> solved by using dynamic random block sampling at plan time followed by a
> direct evaluation of the LIKE against the sample. This would yield a
> more precise selectivity and lead to the better plan. So it can be
> improved for the next release.

I find it exceedingly improbable that we'll ever install any such thing.
On-the-fly sampling of enough rows to get a useful estimate would
increase planning time by orders of magnitude --- and most of the time
the extra effort would be unhelpful.  In the particular case exhibited
by Magnus, it is *really* unlikely that any such method would do better
than we are doing now.  He was concerned because the planner failed to
tell the difference between selectivities of about 1e-4 and 1e-6.
On-the-fly sampling will do better only if it manages to find some of
those rows, which it is unlikely to do with a sample size less than
1e5 or so rows.  With larger tables the problem gets rapidly worse.
        regards, tom lane
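
To put rough numbers on the sample-size argument above: assuming rows are sampled independently and uniformly (a simplification; block sampling on clustered data would behave differently), the chance that a sample of n rows contains at least one matching row at selectivity s is 1 - (1 - s)^n. The sketch below is illustrative Python, not planner code, and the function names are hypothetical.

```python
import math


def expected_matches(selectivity: float, sample_size: int) -> float:
    """Expected number of matching rows in a uniform random sample."""
    return selectivity * sample_size


def p_at_least_one(selectivity: float, sample_size: int) -> float:
    """Probability that the sample contains at least one matching row.

    Computes 1 - (1 - s)^n using log1p/expm1 so that very small
    selectivities do not lose precision.
    Assumption: rows are drawn independently (table much larger than sample).
    """
    return -math.expm1(sample_size * math.log1p(-selectivity))


if __name__ == "__main__":
    # The two selectivities from the message, against a few sample sizes.
    for s in (1e-4, 1e-6):
        for n in (1_000, 10_000, 100_000):
            print(
                f"s={s:g} n={n}: "
                f"expected matches={expected_matches(s, n):g}, "
                f"P(at least one)={p_at_least_one(s, n):.4f}"
            )
```

Under these assumptions, a 1e5-row sample expects about 10 matches at selectivity 1e-4 but only 0.1 at 1e-6, where it finds no matching row roughly 90% of the time. Smaller samples cannot reliably tell the two cases apart.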
