Thread: Finding broken regex'es
Hi folks, I'm looking for some way to find broken regex'es in some column to kick them off. For now I'm regularily fetching all regexes from an PHP script, try an preg_match() and so find the broken ones to later remove them. Is there any way to do this directly within the db ? thx -- ---------------------------------------------------------------------Enrico Weigelt == metux IT service - http://www.metux.de/ ---------------------------------------------------------------------Please visit the OpenSource QM Taskforce: http://wiki.metux.de/public/OpenSource_QM_TaskforcePatches/ Fixes for a lot dozens of packages in dozens of versions:http://patches.metux.de/ ---------------------------------------------------------------------
On Oct 2, 2007, at 3:23 PM, Enrico Weigelt wrote: > > Hi folks, > > > I'm looking for some way to find broken regex'es in some column > to kick them off. For now I'm regularily fetching all regexes > from an PHP script, try an preg_match() and so find the broken > ones to later remove them. > > Is there any way to do this directly within the db ? IIRC, if they're PERL compatible which it would seem from the php function you're using, no. Postgres supports POSIX regexes but not (right now anyway) PERL regexes. Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Erik Jones <erik@myemma.com> writes: > IIRC, if they're PERL compatible which it would seem from the php > function you're using, no. Postgres supports POSIX regexes but not > (right now anyway) PERL regexes. Actually what we support are Tcl (Henry Spencer) regexes, which are about as powerful as Perl's but have a few minor incompatibilities. Tcl and Perl regexes are both supersets of the POSIX requirements. regards, tom lane
On Oct 2, 2007, at 10:48 PM, Tom Lane wrote: > Erik Jones <erik@myemma.com> writes: >> IIRC, if they're PERL compatible which it would seem from the php >> function you're using, no. Postgres supports POSIX regexes but not >> (right now anyway) PERL regexes. > > Actually what we support are Tcl (Henry Spencer) regexes, which are > about as powerful as Perl's but have a few minor incompatibilities. > Tcl and Perl regexes are both supersets of the POSIX requirements. > Excellent point of information. I was really just trying to focus on why the Postgres regex engine wouldn't be equivalent to the matching functionality of php's preg_* functions. One example would be look behind assertions. Btw, am I crazy or do I remember someone mentioning that support for Perl regexes possibly being added to Postgres in the future. Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Erik Jones <erik@myemma.com> writes: > Btw, am I crazy or do I remember someone > mentioning that support for Perl regexes possibly being added to > Postgres in the future. You can already get exact-Perl-behavior regexes by means of a plperl wrapper function. I can't really see any plausible argument for us supporting both behaviors natively --- IMHO the three regex flavors supported by Spencer's library are already two too many. regards, tom lane
On 10/2/07, Enrico Weigelt <weigelt@metux.de> wrote: > > Hi folks, > > > I'm looking for some way to find broken regex'es in some column > to kick them off. For now I'm regularily fetching all regexes > from an PHP script, try an preg_match() and so find the broken > ones to later remove them. > > Is there any way to do this directly within the db ? Of course. Exceptions is what You need! CREATE FUNCTION regex_is_broken(r text) RETURNS boolean AS $$ BEGIN PERFORM '' ~ r; RETURN 'f'; EXCEPTION WHEN INVALID_REGULAR_EXPRESSION THEN RETURN 't'; END; $$ LANGUAGE PLpgSQL STRICT IMMUTABLE; ...and then you could do something like: DELETE FROM table WHERE regex_is_broken(rx_col); You don't need PLpgSQL to prevent such invalid regexes in the first place. You could use CHECK constraint for it: CREATE TABLE rx_check ( rx text CHECK ('' ~ rx IN ('t','f')) ); postgres=> INSERT INTO rx_check (rx) VALUES ('.*'); INSERT 0 1 Time: 13.660 ms postgres=> INSERT INTO rx_check (rx) VALUES ('234234'); INSERT 0 1 Time: 2.282 ms postgres=> INSERT INTO rx_check (rx) VALUES ('par).*'); ERROR: invalid regular expression: parentheses () not balanced Regards, Dawid
2007/10/3, Dawid Kuroczko <qnex42@gmail.com>: > CREATE TABLE rx_check ( > rx text CHECK ('' ~ rx IN ('t','f')) > ); wow. This is beautiful :)
On 10/3/07, Filip Rembiałkowski <plk.zuber@gmail.com> wrote: > 2007/10/3, Dawid Kuroczko <qnex42@gmail.com>: > > > CREATE TABLE rx_check ( > > rx text CHECK ('' ~ rx IN ('t','f')) > > ); > > wow. This is beautiful :) Personally I would wrap it around DOMAIN, i.e.: CREATE DOMAIN regex AS text CHECK ('' ~ VALUE IN (TRUE,FALSE)); And then use 'regex' type instead of 'text'. For a nice look&feel: CREATE TABLE rx_test ( rx regex ); qnex=>insert into rx_test values ('.*'); INSERT 0 1 qnex=>insert into rx_test values ('qwe'); INSERT 0 1 qnex=>insert into rx_test values ('aaa(aaa'); ERROR: invalid regular expression: parentheses () not balanced
Hi, I'm working on a row level plpgsql trigger running after delete, using a 8.0.3 server. It gets a parameter which is a field name of the OLD record. How can that field be accessed? I'd like to do something like: for recvar in 'select OLD.'||quote_ident(TG_ARGV[0])... Daniel