question regarding regular expressions - Mailing list pgsql-general

From Cindy
Subject question regarding regular expressions
Date
Msg-id 13509.1032203086@stephanus.tlg.uci.edu
Whole thread Raw
Responses Re: question regarding regular expressions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
If I've got something like "...AND citation ~ 'x[0-9]*'..." in my
query, is there any way to extract the portion that matched?  In
many languages if you surround with parentheses the part of the regexp
you're interested in, you can retrieve what was matched with $1, etc.

What I'm trying to do is given a possible citation value of arbitrary
letters/number sequences (eg a123b345c876), I would like to be able to
find the next sequence for a particular embedded sequence.  Let's say
I'm looking at a123b345, and I want to find the next b sequence (which
might be 346 or 347, or some number larger than 345) that I have,
ignoring the values of any of the c sequences (and d, and e, whatever
else is afterwards).

The problem I'm having is extracting (SUBSTR) the right number of
numerals in the match I've made for the ORDER BY clause, since the
value can be 0 to 9999 and substring either picks up all the
characters after the given start, or picks up a fixed number of chars
after the starting point.

Yes, I realize that citations shouldn't be a single varchar and should
instead be a set of fields but this is something I've inherited and I
would have to have a very good reason for changing this (and breaking
all kinds of things)...

My query at this point is something like this:

SELECT byteloc, citation FROM citations where
citation ~ '$prefix[0-9]*$postfix' AND
TO_NUMBER(SUBSTR(citation, $location), '999999') > $curcitefield
ORDER BY TO_NUMBER(SUBSTR(citation, $location), '999999') DESC LIMIT 1;

This only works for the very last letter/number field in the string, of
course.  If I want to find a letter/number field at the start or in
the middle of citation, I would have to add something to the substr
function to keep from picking up all the rest of the chars (some of which
would be non numeric) and that's where I got stuck.

Thoughts?

Thanks,
--Cindy
--
ctmoore@uci.edu

pgsql-general by date:

Previous
From: Elaine Lindelef
Date:
Subject: Re: Panic - Format has changed
Next
From: Alex Rice
Date:
Subject: find overlapping address ranges