Thread: Inside the Regex Engine
Kind people,

As a perl weenie, I'm used to being able to do things with regexes like

    $text =~ s/(foo|bar|baz)/NO UNIX WEENIES HERE/;
    $got_it = $1;

While PL/Perl is great, it's not available everywhere, and I'd like to be able to grab atoms from a regex match in, say, a SELECT. Is there some way to get access to them?

TIA for any pointers on this :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 cell: +1 415 235 3778

Civil government, so far as it is instituted for the security of property, is in reality instituted for the defense of the rich against the poor, or of those who have some property against those who have none at all.
    Adam Smith
On Tue, Dec 02, 2003 at 07:52:57PM -0600, David Fetter wrote:
> As a perl weenie, I'm used to being able to do things with regexes
> like
>
> $text =~ s/(foo|bar|baz)/NO UNIX WEENIES HERE/;
> $got_it = $1;
>
> While PL/Perl is great, it's not available everywhere, and I'd like to
> be able to grab atoms from a regex match in, say, a SELECT. Is there
> some way to get access to them?

Huh, the best I am able to do is

    alvh=> select substring('bazfoo fubar', 'fu(foo|bar)');
     substring
    -----------
     bar
    (1 row)

The choice of the name for the function seems weird to me. Also note that you can use only one set of parentheses (i.e. the first is picked up and the rest are ignored).

If you need to extract further things, there's a tip in the documentation that reads "If you have pattern matching needs that go beyond this, consider writing a user-defined function in Perl or Tcl."

It does not appear to be that difficult to add the functionality needed to extract arbitrary atoms, but there's some hacking involved.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"When there is no humility, people become degraded" (A. Christie)
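For readers who want to play with the behaviour Alvaro describes, here is a rough sketch of the two-argument substring() semantics: return the text matched by the first parenthesized subexpression if the pattern has one, otherwise the whole match. It is written in Python with the `re` module purely for illustration. `substring_like` is a hypothetical name, and Python's regex flavour is not the POSIX engine PostgreSQL uses, so treat it as an approximation rather than a description of the server code.

```python
import re


def substring_like(text, pattern):
    """Approximate substring(text, pattern) as described above.

    Hypothetical helper: returns the first parenthesized group when the
    pattern contains one, else the whole match, else None on no match.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    # m.re.groups is the number of capturing groups in the pattern.
    return m.group(1) if m.re.groups >= 1 else m.group(0)


print(substring_like('bazfoo fubar', 'fu(foo|bar)'))  # bar
```

Only the first group is ever returned, which is exactly the limitation noted in the message: any further sets of parentheses are silently ignored.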
david@fetter.org (David Fetter) writes:
> While PL/Perl is great, it's not available everywhere, and I'd like to
> be able to grab atoms from a regex match in, say, a SELECT. Is there
> some way to get access to them?

There's a three-parameter variant of substring() that allows extraction of a portion of a regex match --- unfortunately it uses SQL99's brain-dead notion of regex, which will not satisfy any Perl weenie :-(

I think it'd be worth our while to define some comparable functionality that depends only on the POSIX regex engine ...

regards, tom lane
Tom Lane wrote:
> david@fetter.org (David Fetter) writes:
>> While PL/Perl is great, it's not available everywhere, and I'd like to
>> be able to grab atoms from a regex match in, say, a SELECT. Is there
>> some way to get access to them?
>
> There's a three-parameter variant of substring() that allows extraction
> of a portion of a regex match --- unfortunately it uses SQL99's
> brain-dead notion of regex, which will not satisfy any Perl weenie :-(
>
> I think it'd be worth our while to define some comparable functionality
> that depends only on the POSIX regex engine ...

Substitute should be relatively straightforward, I guess; split and match maybe less so. What do you return? An array? Or you could require an explicit subscript to get a particular return value, as in split_part(), which would be potentially inefficient if you want more than one (although I guess results could be cached).

cheers

andrew
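The two return conventions Andrew weighs can be sketched side by side. The example below is in Python and only illustrates the trade-off; `match_groups` and `match_part` are hypothetical names, not existing or proposed PostgreSQL functions. The first returns every captured group at once (the "array" option). The second takes an explicit 1-based subscript in the style of split_part() and leans on a small cache so that asking for several parts of the same match does not rerun the regex each time.

```python
from functools import lru_cache
import re


@lru_cache(maxsize=128)
def match_groups(text, pattern):
    """'Array' style: all captured groups as a tuple, or None if no match."""
    m = re.search(pattern, text)
    return m.groups() if m else None


def match_part(text, pattern, n):
    """Subscript style: the n-th captured group (1-based), or None."""
    groups = match_groups(text, pattern)  # served from the cache on repeats
    if groups is None or not 1 <= n <= len(groups):
        return None
    return groups[n - 1]


print(match_groups('2003-12-02', r'(\d+)-(\d+)-(\d+)'))   # ('2003', '12', '02')
print(match_part('2003-12-02', r'(\d+)-(\d+)-(\d+)', 2))  # 12
```

Without the cache, the subscript form repeats the whole match for every part requested, which is the inefficiency mentioned above.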
Andrew Dunstan <andrew@dunslane.net> wrote:
> Tom Lane wrote:
>> david@fetter.org (David Fetter) writes:
>>> While PL/Perl is great, it's not available everywhere, and I'd like
>>> to be able to grab atoms from a regex match in, say, a SELECT. Is
>>> there some way to get access to them?
>>
>> There's a three-parameter variant of substring() that allows
>> extraction of a portion of a regex match --- unfortunately it uses
>> SQL99's brain-dead notion of regex, which will not satisfy any Perl
>> weenie :-(
>>
>> I think it'd be worth our while to define some comparable
>> functionality that depends only on the POSIX regex engine ...
>
> substitute should be relatively straightforward, I guess; split and
> match maybe less so - what do you return? An array?

That would be great.

> Or you could require an explicit subscript to get a particular
> return value as in split_part(), which would be potentially
> inefficient if you want more than one (although I guess results
> could be cached).

That'd be good, too.

Cheers,
D
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> david@fetter.org (David Fetter) writes:
>> While PL/Perl is great, it's not available everywhere, and I'd like
>> to be able to grab atoms from a regex match in, say, a SELECT. Is
>> there some way to get access to them?
>
> There's a three-parameter variant of substring() that allows
> extraction of a portion of a regex match --- unfortunately it uses
> SQL99's brain-dead notion of regex, which will not satisfy any Perl
> weenie :-(
>
> I think it'd be worth our while to define some comparable
> functionality that depends only on the POSIX regex engine ...

What pieces of the source code would be involved?

Cheers,
D