Re: [HACKERS] Broken select on regular expression !!! - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] Broken select on regular expression !!!
Date
Msg-id 199905210321.XAA16218@candle.pha.pa.us
In response to Re: [HACKERS] Broken select on regular expression !!!  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: [HACKERS] Broken select on regular expression !!!  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
> All of this oddness is caused by the parser (makeIndexable). When
> makeIndexable sees ~* '^41|^des' , it tries to rewrite the target
> regexp so that an index can be used. The rewritten query might be
> something like:
> 
> fld1 ~* '^41|^des' and fld1 >= '41|^' and fld1 <= '41|^\377'
> 
> Apparently this is wrong. This is because makeIndexable does not
> understand '|' and '^' appearing in the middle of the regexp. On the
> other hand, 
> 
> >regression=> select * from regdemo where fld1 ~* '^des|^41';
> >regression=> select * from regdemo where fld1 ~* '^sou|^des';
> 
> will work since makeIndexable gives up the optimization if the op is
> "~*" and the character right after '^' is *alphabetic*.
> 
> Note that:
> 
> >regression=> select * from regdemo where fld1 ~ '^sou|^des';
> 
> will not work because the op is *not* "~*".
> 
> It seems that the only solution is to check whether '|' appears
> in the target regexp and to give up the optimization in that case.
> 
> One might think that ~* '^41|^des' can be rewritten like:
> 
> fld1 ~* '^41' or fld1 ~* '^des'
> 
> To me this does not seem to be a good idea. To accomplish this, we would
> have to deeply parse the regexp (consider that we might have arbitrarily
> complex regexps), and that kind of thing is a job regexp() should
> do.
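
To make the failure described above concrete, here is a minimal Python sketch. It is not the real makeIndexable code; naive_prefix, KNOWN_META and in_rewritten_range are hypothetical names, and the list of metacharacters is an assumption. The only point is that a prefix extractor which treats '|' and '^' in the middle of the pattern as literal text produces the range '41|^' .. '41|^\377', and that range excludes rows the regexp itself matches.

```python
import re

# Hypothetical stand-in for the prefix extraction, NOT the real makeIndexable.
# Assumption: it stops at a few metacharacters but, like the behaviour
# described above, treats '|' and a mid-pattern '^' as ordinary text.
KNOWN_META = set(".*+?[]()$\\")


def naive_prefix(pattern, case_insensitive):
    """Return the literal prefix used for the range test, or None to give up."""
    if not pattern.startswith("^"):
        return None
    prefix = []
    for ch in pattern[1:]:
        if ch in KNOWN_META:
            break
        # For "~*" a letter cannot be used in a range test, so stop there.
        if case_insensitive and ch.isalpha():
            break
        prefix.append(ch)
    return "".join(prefix) or None


def in_rewritten_range(value, prefix):
    """Mimic: fld1 >= prefix and fld1 <= prefix || '\\377'."""
    return prefix <= value <= prefix + "\xff"


bad_prefix = naive_prefix("^41|^des", case_insensitive=True)    # '41|^'
gave_up = naive_prefix("^des|^41", case_insensitive=True)       # None
cs_prefix = naive_prefix("^sou|^des", case_insensitive=False)   # 'sou|^des'

for value in ("41abc", "description"):
    matches = re.search("^41|^des", value, re.IGNORECASE) is not None
    print(value, "regexp matches:", matches,
          "passes range test:", in_rewritten_range(value, bad_prefix))
```

In this sketch both sample values match the regexp but fail the added range condition, so the rewritten query would drop them. The '^des|^41' case escapes only because the extractor gives up at the first letter under "~*".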

Again, very clear, and caused by the index optimization for regexes, as
you suggest. I can easily look for '|' in the string and skip the
optimization when it is present. Is that the only special case I need to add?
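
As a rough illustration of the check proposed here, the following Python sketch shows the guard in isolation. The function name should_skip_index_rewrite is hypothetical and this is not the actual parser code. It is deliberately conservative: it also skips patterns where '|' is escaped or sits inside a bracket expression, which costs an index optimization but never gives a wrong result.

```python
def should_skip_index_rewrite(pattern):
    """Return True when the regexp should be left alone (no range conditions).

    Hypothetical helper: any '|' anywhere in the pattern disables the
    rewrite, because the text after '^' is then not a prefix shared by
    every possible match.
    """
    return "|" in pattern


for pattern in ("^41|^des", "^sou|^des", "^41", "^des"):
    action = "skip optimization" if should_skip_index_rewrite(pattern) else "may optimize"
    print(pattern, "->", action)
```

Whether other constructs need the same treatment is left open here; the sketch covers only the '|' case discussed in this thread.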


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

