Re: [GENERAL] indexed regex select optimisation missing? - Mailing list pgsql-general

From Gene Selkov, Jr.
Subject Re: [GENERAL] indexed regex select optimisation missing?
Date
Msg-id 199911041927.NAA14283@antares
Whole thread Raw
In response to Re: [GENERAL] indexed regex select optimisation missing?  (Peter Eisentraut <e99re41@DoCS.UU.SE>)
Responses Re: [GENERAL] indexed regex select optimisation missing?
List pgsql-general
> > unfortunately '^whatever.*' is what I'm trying to locate (ie: all words
> > starting with whatever, but with nay trailing text), the problem seems to be in
> > the termination of the index scan, not in the actual regex match (which actually
> > seems very good, speed wise..) otherwise I could just use ='whatever', which
> > runs very very fast.
>
> Isn't "all words that start with whatever but without trailing text" the
> same as = 'whatever'? From a regex point of view '^whatever' and
> '^whatever.*' are exactly equivalent, but I can see where one could fail
> to optimize properly.

OK, let's turn from speculations to facts (have just gotten off my
rear end and verified each).:

1. '^whatever.*' and '^whatever' are equivalent regular expressions.

2. The version of regexp used in postgres is aware of this equivalence.

3. Btree index is used in the queries involving anchored expressions:

emp=> explain select * from ps where ps ~ '^EDTA';
NOTICE:  QUERY PLAN:

Index Scan using psix on ps  (cost=2373.21 rows=1 width=62)

emp=> explain select * from ps where ps ~ '^EDTA.*';
NOTICE:  QUERY PLAN:

Index Scan using psix on ps  (cost=2373.21 rows=1 width=62)

(ps is a 250k-row table; the result is returned immediately when
indexed and in about 3 seconds when not)

However,

4. Hash index is never used
===========================

Observations made with 6.5 on RedHat 5.1.


--Gene

pgsql-general by date:

Previous
From: "S. van der Tol"
Date:
Subject: Problem with initdb
Next
From: "Jeffrey D. Paquette"
Date:
Subject: query during transaction?