Re: BUG #18956: Observing an issue in regexp_count() - Mailing list pgsql-bugs

From hubert depesz lubaczewski
Subject Re: BUG #18956: Observing an issue in regexp_count()
Date
Msg-id aErDmEWoyiNrhY1M@depesz.com
In response to BUG #18956: Observing an issue in regexp_count()  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
On Thu, Jun 12, 2025 at 08:03:25AM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      18956
> Logged by:          Anudeep Vattikonda
> Email address:      anudeepvattikonda0404@gmail.com
> PostgreSQL version: 17.5
> Operating system:   Mac
> Description:        
> 
> Hi
> I am trying to run the below query
> select REGEXP_COUNT('cat at the flat', '\Bat\b') ;
> I was expecting it to return 2 but I see Postgres is returning 0. I see that
> there are two matches, cat and flat. All it should do is to look for the
> word at whose left side shouldn't be a word boundary while the right side
> should be a word boundary

What makes you think that \B and \b have anything to do with word boundaries?

Docs
(https://www.postgresql.org/docs/17/functions-matching.html#FUNCTIONS-POSIX-REGEXP)
show:

\b - backspace, as in C
\B - synonym for backslash (\) to help reduce the need for backslash doubling
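
To see what the reported pattern really looks for, one can feed it an input
that contains a literal backslash, "at", and a backspace character. A quick
sketch, assuming the default standard_conforming_strings = on; the E''
literal is only there to build the test input:

```sql
-- Input: backslash, 'a', 't', backspace (0x08), built with an escape string.
-- Pattern is the one from the report: \B = literal backslash, \b = backspace.
select regexp_count(E'\\at\b', '\Bat\b');
```

Going by the two docs entries above, this should count one match, while
'cat at the flat' contains neither a backslash nor a backspace, hence 0.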

Pg regexps do have word-boundary escapes, but under different names: \y, \Y,
\m and \M (see the constraint escapes table on the same docs page).

You could get 2 by changing the regexp to something that actually works:

$ select REGEXP_COUNT('cat at the flat', '[a-z]at(?![a-z])');
 regexp_count
──────────────
            2
(1 row)
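
The constraint escapes table on the same docs page lists \m (only at the
beginning of a word), \M (only at the end of a word), \y (at either) and \Y
(at a point that is neither), so the original intent can also be written
directly. A sketch, assuming the default standard_conforming_strings = on so
that the backslashes reach the regexp engine untouched:

```sql
-- \Y: a point that is not the beginning or end of a word
-- \M: only at the end of a word
-- Should match the "at" in "cat" and "flat", but not the standalone word "at".
select regexp_count('cat at the flat', '\Yat\M');
```

This should return 2 as well, without having to spell out [a-z] by hand.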

Best regards,

depesz



