Re: query very slow when enable_seqscan=on - Mailing list pgsql-bugs

From Tom Lane
Subject Re: query very slow when enable_seqscan=on
Msg-id 13853.1152021389@sss.pgh.pa.us
In response to Re: query very slow when enable_seqscan=on  (Tomasz Ostrowski <tometzky@batory.org.pl>)
Responses Re: query very slow when enable_seqscan=on  (Tomasz Ostrowski <tometzky@batory.org.pl>)
List pgsql-bugs
Tomasz Ostrowski <tometzky@batory.org.pl> writes:
> I think because there is no good solution to this - no statistical
> information is going to predict how much data will match a regular
> expression.

Well, it's certainly hard to imagine simple stats that would let the
code guess that, say, "warsa" and "warsaw" match nearly the same
(large) number of rows while "warsawq" matches nothing.
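
To make that concrete, here is a minimal sketch in Python rather than
anything from the planner. The rows are made-up toy data and
count_matches is a hypothetical helper. It only illustrates how one extra
character in a pattern can take the match count from "most of the table"
to zero, which no simple per-column statistic would predict.

```python
import re

# Hypothetical toy data standing in for a text column; not from the thread.
rows = [
    "warsaw office", "warsaw branch", "warsaw depot",
    "krakow office", "gdansk port", "warsaw hq",
]

def count_matches(pattern, rows):
    """Count rows where the regex matches anywhere in the string."""
    rx = re.compile(pattern)
    return sum(1 for r in rows if rx.search(r))

# Three patterns that differ by a single trailing character each.
counts = {p: count_matches(p, rows) for p in ("warsa", "warsaw", "warsawq")}
print(counts)
```

On this toy data "warsa" and "warsaw" hit the same rows and "warsawq"
hits none, even though the three patterns look almost identical.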

I think the real problem here is that regex matching is the wrong tool
for the job.  Have you looked into a full-text index (tsearch2)?
With something like that, the index operator has at least got the
correct conceptual model, i.e., looking for indexed words.  I'm not sure
if they have any decent statistical support for it :-( but in theory
that seems doable, whereas regex estimation will always be a crapshoot.
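
As a rough illustration of the "indexed words" model, the following
Python sketch builds a tiny inverted index over made-up documents. It is
not how tsearch2 is implemented; docs, build_index and matching_rows are
hypothetical names chosen for the example. The point is only that when
the unit of search is a whole word, a per-word row count is a natural
statistic to keep, which is not the case for arbitrary regex fragments.

```python
from collections import defaultdict

# Hypothetical toy documents keyed by row id; not from the thread.
docs = {
    1: "Warsaw office",
    2: "Warsaw branch",
    3: "Krakow office",
    4: "Gdansk port",
}

def build_index(docs):
    """Map each lowercased word to the set of row ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

index = build_index(docs)

def matching_rows(word):
    """Number of rows containing the exact word (0 if it was never indexed)."""
    return len(index.get(word.lower(), ()))

print(matching_rows("warsaw"), matching_rows("warsa"))
```

Here the count for a word falls straight out of the index, and a
fragment such as "warsa" simply is not a word, so it matches nothing.
A real planner would presumably work from sampled word frequencies
rather than exact counts.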

            regards, tom lane
