Thread: regex (from TODO)

regex (from TODO)

From: Karel Zak - Zakkr
The PostgreSQL TODO list contains the item "Get faster regex() code from
Henry Spencer..".

I looked at the regex code currently used in, for example, Apache and PHP.
But when I diff PostgreSQL's regex against the newer regex in PHP4, the two
are almost the same. The differences are that in the newer regex code all
variables are marked as 'register', and that the newer regex does not support
MULTIBYTE. Otherwise there are no relevant changes (or is 'register' really
that much faster?).

What does the TODO item mean?

PG's regex uses malloc -- why not a MemoryContext?
                        Karel



Re: [HACKERS] regex (from TODO)

From: Bruce Momjian
>  The PostgreSQL TODO list contains the item "Get faster regex() code from
> Henry Spencer..".
>
>  I looked at the regex code currently used in, for example, Apache and PHP.
> But when I diff PostgreSQL's regex against the newer regex in PHP4, the two
> are almost the same. The differences are that in the newer regex code all
> variables are marked as 'register', and that the newer regex does not support
> MULTIBYTE. Otherwise there are no relevant changes (or is 'register' really
> that much faster?).
>
>  What does the TODO item mean?
>
>  PG's regex uses malloc -- why not a MemoryContext?

Henry has new code that is faster, and he has put it only in TCL so far.
I am waiting for a library version of it that we can include.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] regex (from TODO)

From: Tatsuo Ishii
>  The PostgreSQL TODO list contains the item "Get faster regex() code from
> Henry Spencer..".
>
>  I looked at the regex code currently used in, for example, Apache and PHP.
> But when I diff PostgreSQL's regex against the newer regex in PHP4, the two
> are almost the same. The differences are that in the newer regex code all
> variables are marked as 'register', and that the newer regex does not support
> MULTIBYTE.

Actually Henry has never supported MULTIBYTE:-) We modified his code
so that we could support it.

> Otherwise there are no relevant changes (or is 'register' really that much
> faster?).

I vaguely recall that we decided that 'register' did nothing good with
modern compilers, and it'd be better to let the optimizer determine
what variables should be assigned to registers.
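
To illustrate the point about 'register', here is a minimal sketch. The
function names are hypothetical and are not taken from the PostgreSQL or PHP
sources. The two functions differ only in the storage-class hint, so they must
return the same result. Whether the hint changes the generated code at all is
up to the compiler, and an optimizing compiler is free to ignore it.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not from the regex sources. */

/* Counts occurrences of ch in s, with the loop variables hinted 'register'. */
size_t count_char_register(const char *s, char ch)
{
    register const char *p = s;
    register size_t n = 0;

    while (*p != '\0')
    {
        if (*p == ch)
            n++;
        p++;
    }
    return n;
}

/* The same loop without the hint; register allocation is left to the
 * optimizer. */
size_t count_char_plain(const char *s, char ch)
{
    const char *p = s;
    size_t n = 0;

    while (*p != '\0')
    {
        if (*p == ch)
            n++;
        p++;
    }
    return n;
}
```

One way to check a particular compiler is to build both functions with
optimization enabled and compare the assembly output.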

>  What does the TODO item mean?

That means "get faster code from Henry and modify it if it does not
support MULTIBYTE" -- I guess.

>  PG's regex uses malloc -- why not a MemoryContext?

Probably because the regex code caches the results of regcomp in a static
area that points to malloc'ed memory allocated while compiling a
regular expression. However, for the regexec stage we might be able to
use palloc instead of malloc. I'm not sure whether this would result in
any better performance, though.
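
A rough sketch of the caching pattern described above, written against the
POSIX <regex.h> interface rather than PostgreSQL's own copy of the code. The
single-entry cache and all identifiers below are assumptions made for
illustration; they are not the actual backend names. The point is that the
compiled pattern lives in static storage and owns malloc'ed memory, so it
survives from one call to the next, which memory allocated with palloc in a
short-lived memory context would not.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-entry cache; names are illustrative only. */
static char   *cached_pattern = NULL;  /* malloc'ed copy of the pattern text */
static regex_t cached_regex;           /* regcomp() output, owns malloc'ed memory */
static int     cache_valid = 0;
static int     compile_count = 0;      /* how many times regcomp() actually ran */

/* Returns 1 on match, 0 on no match, -1 on compile or allocation failure. */
int cached_regex_match(const char *pattern, const char *text)
{
    if (!cache_valid || strcmp(cached_pattern, pattern) != 0)
    {
        char *copy;

        /* Different pattern: release the old compiled form first. */
        if (cache_valid)
        {
            regfree(&cached_regex);
            free(cached_pattern);
            cached_pattern = NULL;
            cache_valid = 0;
        }

        copy = malloc(strlen(pattern) + 1);
        if (copy == NULL)
            return -1;
        strcpy(copy, pattern);

        if (regcomp(&cached_regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        {
            free(copy);
            return -1;
        }

        cached_pattern = copy;
        cache_valid = 1;
        compile_count++;
    }

    /* Cache hit or freshly compiled: only the match step runs here. */
    return regexec(&cached_regex, text, 0, NULL, 0) == 0 ? 1 : 0;
}

/* Exposes the compile counter so the caching can be observed. */
int cached_regex_compile_count(void)
{
    return compile_count;
}
```

Calling cached_regex_match() twice in a row with the same pattern compiles it
only once. Only the scratch memory used during the regexec step is
short-lived, which is why that stage is the candidate for palloc.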
--
Tatsuo Ishii