Re: Regular expression question - Mailing list pgsql-general

From: Tom Lane
Subject: Re: Regular expression question
Msg-id: 7216.976548607@sss.pgh.pa.us
In response to: Regular expression question (Steve Heaven <steve@thornet.co.uk>)
List: pgsql-general

Steve Heaven <steve@thornet.co.uk> writes:
> Does the regular expression parser have anything equivalent to Perl's \w
> word boundary metacharacter?

src/backend/regex/re_format.7 contains the whole scoop (for some reason
this page doesn't seem to get installed with the rest of the
documentation).  In particular:

    There are two special cases of bracket expressions:
    the bracket expressions `[[:<:]]' and `[[:>:]]' match the null
    string at the beginning and end of a word respectively.
    A word is defined as a sequence of word characters
    which is neither preceded nor followed by word characters.
    A word character is an alnum character (as defined by ctype(3))
    or an underscore.  This is an extension, compatible with but not
    specified by POSIX 1003.2, and should be used with caution in
    software intended to be portable to other systems.

    ...

    BUGS

    The syntax for word boundaries is incredibly ugly.

POSIX bracket expressions are pretty ugly anyway, and this is no worse
than the rest.  However, if you prefer Perl or Tcl, I'd recommend that
you just *use* Perl or Tcl ;-).  plperl and pltcl make great
implementation languages for text-mashing functions...
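
As a rough illustration of the word-boundary definition quoted above, here is
a minimal Python sketch. It does not use the PostgreSQL regex engine: Python's
re module has no `[[:<:]]' or `[[:>:]]', so the sketch rewrites them into
lookaround assertions built from the quoted definition. The helper names
(translate_word_boundaries, matches) are made up for this example, and "alnum"
is assumed to mean ASCII letters and digits, whereas ctype(3) is
locale-dependent.

```python
import re

# Assumption: a word character is an ASCII alnum character or an underscore.
WORD_CHAR = "[A-Za-z0-9_]"

# Null string at the start of a word: no word character before, one after.
WORD_START = f"(?<!{WORD_CHAR})(?={WORD_CHAR})"
# Null string at the end of a word: a word character before, none after.
WORD_END = f"(?<={WORD_CHAR})(?!{WORD_CHAR})"


def translate_word_boundaries(pattern: str) -> str:
    """Rewrite the two special bracket expressions as Python lookarounds."""
    return pattern.replace("[[:<:]]", WORD_START).replace("[[:>:]]", WORD_END)


def matches(pattern: str, text: str) -> bool:
    """Unanchored search, similar in spirit to an SQL regex match."""
    return re.search(translate_word_boundaries(pattern), text) is not None


if __name__ == "__main__":
    for sample in ("the cat sat", "concatenate", "cat_1", "cat."):
        print(sample, matches("[[:<:]]cat[[:>:]]", sample))
```

Under these assumptions the pattern matches "cat" only as a whole word, and
the underscore in "cat_1" counts as a word character, so no end-of-word
boundary is found after "cat" there.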

            regards, tom lane
