Re: Some regular-expression performance hacking - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Some regular-expression performance hacking
Date
Msg-id 3028106.1613748380@sss.pgh.pa.us
Whole thread Raw
In response to Re: Some regular-expression performance hacking  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers
"Joel Jacobson" <joel@compiler.org> writes:
> On Thu, Feb 18, 2021, at 19:53, Tom Lane wrote:
>> (Having said that, I can't help noticing that a very large fraction
>> of those usages look like, eg, "[\w\W]".  It seems to me that that's
>> a very expensive and unwieldy way to spell ".".  Am I missing
>> something about what that does in Javascript?)

> I think this is a non-POSIX hack to match any character, including newlines,
> which are not included unless the "s" flag is set.

> "foo\nbar".match(/([\w\W]+)/)[1];
> "foo
> bar"

Oooh, that's very interesting.   I guess the advantage of that over using
the 's' flag is that you can have different behaviors at different places
in the same regex.

I was just wondering about this last night in fact, while hacking on
the code to get it to accept \W etc in bracket expressions.  I see that
right now, our code thinks that NLSTOP mode ('n' switch, the opposite
of 's') should cause \W \D \S to not match newline.  That seems a little
weird, not least because \S should probably be different from the other
two, and it isn't.  And now we see it'd mean that you couldn't use the 'n'
switch to duplicate Javascript's default behavior in this area.  Should we
change it?  (I wonder what Perl does.)

            regards, tom lane



pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: Problem with accessing TOAST data in stored procedures
Next
From: Pavel Stehule
Date:
Subject: Re: Problem with accessing TOAST data in stored procedures