Thread: Regexp confusion

Trying to match some numbers, and I'm having some regexp problems.
I've boiled it down to the following:

/* (1) */ select '3.14' similar to E'^\\d+\\.\\d+$';     -- true
/* (2) */ select '3.14' similar to E'^\\d+(\\.\\d+)$';   -- true
/* (3) */ select '3.14' similar to E'^\\d+(\\.\\d+)*$';  -- true
/* (4) */ select '3.14' similar to E'^\\d+(\\.\\d+)?$';  -- false
/* (5) */ select '3.14' similar to E'^\\d+(\\.\\d+)+$';  -- true

So, based on (1) and (2), the pattern '\.\d+' occurs once. So why
does (4) return false? Between (3), (4), and (5), it appears as
though the group is matching multiple times.

Thanks,

-- 
Doug Gorley | doug.gorley@gmail.com

Doug Gorley wrote:
> Trying to match some numbers, and I'm having some regexp problems.
> I've boiled it down to the following:
>
> /* (1) */ select '3.14' similar to E'^\\d+\\.\\d+$';     -- true
> /* (2) */ select '3.14' similar to E'^\\d+(\\.\\d+)$';   -- true
> /* (3) */ select '3.14' similar to E'^\\d+(\\.\\d+)*$';  -- true
> /* (4) */ select '3.14' similar to E'^\\d+(\\.\\d+)?$';  -- false
> /* (5) */ select '3.14' similar to E'^\\d+(\\.\\d+)+$';  -- true
>
> So, based on (1) and (2), the pattern '\.\d+' occurs once. So why
> does (4) return false? Between (3), (4), and (5), it appears as
> though the group is matching multiple times.

I think the confusion is about what SIMILAR TO supports. The ?
quantifier isn't supported. See here:
http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP

You probably want to use ~ instead of SIMILAR TO.

(SIMILAR TO is a weird beast that the SQL committee came up with,
vaguely based on regular expressions.)

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Doug Gorley wrote:
>> Trying to match some numbers, and I'm having some regexp problems.
>> I've boiled it down to the following:
>>
>> /* (1) */ select '3.14' similar to E'^\\d+\\.\\d+$';     -- true
>> /* (2) */ select '3.14' similar to E'^\\d+(\\.\\d+)$';   -- true
>> /* (3) */ select '3.14' similar to E'^\\d+(\\.\\d+)*$';  -- true
>> /* (4) */ select '3.14' similar to E'^\\d+(\\.\\d+)?$';  -- false
>> /* (5) */ select '3.14' similar to E'^\\d+(\\.\\d+)+$';  -- true
>>
>> So, based on (1) and (2), the pattern '\.\d+' occurs once. So why
>> does (4) return false? Between (3), (4), and (5), it appears as
>> though the group is matching multiple times.

> I think the confusion is about what SIMILAR TO supports. The ?
> quantifier isn't supported. See here:
> http://www.postgresql.org/docs/8.4/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP
> You probably want to use ~ instead of SIMILAR TO.
> (SIMILAR TO is a weird beast that the SQL committee came up with,
> vaguely based on regular expressions.)

Hmm ... actually I think *none* of those should have succeeded,
because ^ and $ are not supposed to be metacharacters in SIMILAR TO.
We are failing to quote them, but apparently we need to --- it looks
like the regexp engine processes ^^ at the start of the pattern the
same as ^, and likewise for $$ at the end.

			regards, tom lane

Alvaro Herrera <alvherre@commandprompt.com> writes:
> I think the confusion is about what SIMILAR TO supports. ? it doesn't.

Actually, upon looking into SQL:2008, it seems it's supposed to
support ? now, and also {m,n} style bounds. Those weren't there in
SQL99 ...

I've changed the similar_escape code to not escape ? and {, so that
those things will work now, and to escape ^ and $ instead.

			regards, tom lane
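
For anyone following along, the escaping change described above can be modelled roughly in Python. This is only a sketch, not the actual similar_escape code: the function names, the two sets of escaped characters, and the wrapping of the translated pattern in `^(?:...)$` are all assumptions inferred from this thread (the wrapping is what would produce the `^^` and `$$` mentioned earlier). Bracket expressions and custom ESCAPE clauses are ignored, and Python's `re` module stands in for the PostgreSQL regexp engine.

```python
import re

# Characters that each variant backslash-escapes before handing the
# pattern to the regex engine. Both sets are assumptions for this sketch.
OLD_ESCAPED = set(".?{")   # before the change: ? and { are literal
NEW_ESCAPED = set(".^$")   # after the change: ^ and $ are literal


def similar_to_regex(pattern, escaped, escape_char="\\"):
    """Translate a SIMILAR TO pattern into an anchored regex (simplified)."""
    out = []
    after_escape = False
    for ch in pattern:
        if after_escape:
            # An escaped character is passed through with a backslash.
            out.append("\\" + ch)
            after_escape = False
        elif ch == escape_char:
            after_escape = True
        elif ch == "%":
            out.append(".*")      # SQL wildcard: any sequence of characters
        elif ch == "_":
            out.append(".")       # SQL wildcard: any single character
        elif ch in escaped:
            out.append("\\" + ch)
        else:
            out.append(ch)
    if after_escape:
        raise ValueError("pattern ends with a dangling escape character")
    # SIMILAR TO must match the whole string, hence the anchors.
    return "^(?:" + "".join(out) + ")$"


def similar_to(text, pattern, escaped):
    """Return True if text matches the translated pattern."""
    return re.search(similar_to_regex(pattern, escaped), text) is not None


if __name__ == "__main__":
    for label, escaped in (("old", OLD_ESCAPED), ("new", NEW_ESCAPED)):
        for pat in (r"^\d+(\.\d+)?$", r"\d+(\.\d+)?"):
            print(label, pat, similar_to_regex(pat, escaped),
                  similar_to("3.14", pat, escaped))
```

Under these assumptions, the old variant turns the `?` in query (4) into a literal question mark, which would explain the false result. The new variant lets `?` act as a quantifier but treats `^` and `$` as literal characters, so the anchors would have to be dropped from the pattern.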