> > unfortunately '^whatever.*' is what I'm trying to locate (ie: all words
> > starting with whatever, but with nay trailing text), the problem seems to be in
> > the termination of the index scan, not in the actual regex match (which actually
> > seems very good, speed wise..) otherwise I could just use ='whatever', which
> > runs very very fast.
>
> Isn't "all words that start with whatever but without trailing text" the
> same as = 'whatever'? From a regex point of view '^whatever' and
> '^whatever.*' are exactly equivalent, but I can see where one could fail
> to optimize properly.
OK, let's turn from speculations to facts (have just gotten off my
rear end and verified each).:
1. '^whatever.*' and '^whatever' are equivalent regular expressions.
2. The version of regexp used in postgres is aware of this equivalence.
3. Btree index is used in the queries involving anchored expressions:
emp=> explain select * from ps where ps ~ '^EDTA';
NOTICE: QUERY PLAN:
Index Scan using psix on ps (cost=2373.21 rows=1 width=62)
emp=> explain select * from ps where ps ~ '^EDTA.*';
NOTICE: QUERY PLAN:
Index Scan using psix on ps (cost=2373.21 rows=1 width=62)
(ps is a 250k-row table; the result is returned immediately when
indexed and in about 3 seconds when not)
However,
4. Hash index is never used
===========================
Observations made with 6.5 on RedHat 5.1.
--Gene