On Fri, 2025-09-12 at 21:53 -0400, Tom Lane wrote:
>
> Ah, got it. But this logic definitely deserves more comments.
> What do you think of something like
>
> if (pchar == ']' && charclass_start > 2)
> {
> /* found the real end of a bracket pair */
> charclass_depth--;
> /* past start of outer brackets, so keep charclass_start > 2 */
> }
> else if (pchar == '[')
> {
> /* start of a nested bracket pair */
> charclass_depth++;
> /* leading ^ or ] in this context is not special */
> charclass_start = 3;
> }
> else if (pchar == '^')
> {
> /* okay to increment charclass_start even if already > 1 */
> charclass_start++;
> }
> else
> {
> /* otherwise (including case of leading ']') */
> charclass_start = 3; /* definitely past the start */
> }
>
> > Perhaps s/charclass_depth/bracket_depth/ would be a good idea.
>
> Wouldn't object to that. Maybe we can think of a better name for
> "charclass_start" too --- that sounds like a boolean condition.
> The best I'm coming up with right now is "charclass_count", but
> that's not that helpful.
I came up with the attached patch set.
0001 is the actual bug fix: in addition to my previous patch, I fixed two
more cases in which a closing bracket might not be recognized as ending
the character class (one is from your patch above). I think that these
could not lead to bad query results, but I am not certain.
0002 is the cosmetic improvement: I renamed "charclass_depth" to "bracket_depth"
and "bracket_depth" to "bracket_pos", rewrote the "else if" cascade as you
suggested above and put some love into additional comments.
I used two separate patches for clarity and ease of review, but both
should get backpatched.
Yours,
Laurenz Albe