Thread: Could regexp_matches be immutable?
I tried making a functional index based on an expression containing the 2 argument regexp_matches() function. Is there a reason why this function is not marked immutable instead of normal? regards, Rod Taylor
Rod Taylor <rod.taylor@gmail.com> writes: > I tried making a functional index based on an expression containing > the 2 argument regexp_matches() function. Is there a reason why this > function is not marked immutable instead of normal? regex_flavor affects its result. regards, tom lane
On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote: > Rod Taylor <rod.taylor@gmail.com> writes: > > I tried making a functional index based on an expression > > containing the 2 argument regexp_matches() function. Is there a > > reason why this function is not marked immutable instead of > > normal? > > regex_flavor affects its result. Speaking of which, can we see about deprecating and removing this GUC? I've yet to hear of anyone using a flavor other than the default. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote: > On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote: > >> Rod Taylor <rod.taylor@gmail.com> writes: >> >>> I tried making a functional index based on an expression >>> containing the 2 argument regexp_matches() function. Is there a >>> reason why this function is not marked immutable instead of >>> normal? >>> >> regex_flavor affects its result. >> > > Speaking of which, can we see about deprecating and removing this GUC? > I've yet to hear of anyone using a flavor other than the default. > > > You have now. I have a client who sadly uses a non-default setting. And on 8.4, what is more. There are more things under heaven and earth .... cheers andrew
David Fetter <david@fetter.org> wrote: > On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote: >> Rod Taylor <rod.taylor@gmail.com> writes: >> > I tried making a functional index based on an expression >> > containing the 2 argument regexp_matches() function. Is there a >> > reason why this function is not marked immutable instead of >> > normal? >> >> regex_flavor affects its result. > > Speaking of which, can we see about deprecating and removing this > GUC? +1 It would seem to me to be more valuable to have the benefits of IMMUTABLE than preserve pre-7.4 compatibility forever. -Kevin
On 10/14/09 2:07 PM, David Fetter wrote: > On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote: >> Rod Taylor <rod.taylor@gmail.com> writes: >>> I tried making a functional index based on an expression >>> containing the 2 argument regexp_matches() function. Is there a >>> reason why this function is not marked immutable instead of >>> normal? >> regex_flavor affects its result. > > Speaking of which, can we see about deprecating and removing this GUC? > I've yet to hear of anyone using a flavor other than the default. Actually, *we* (PGX) have a client who does. You just haven't worked on their stuff. --Josh
On Wed, Oct 14, 2009 at 05:14:31PM -0400, Andrew Dunstan wrote: > David Fetter wrote: >> On Wed, Oct 14, 2009 at 04:51:03PM -0400, Tom Lane wrote: >> >>> Rod Taylor <rod.taylor@gmail.com> writes: >>> >>>> I tried making a functional index based on an expression >>>> containing the 2 argument regexp_matches() function. Is there a >>>> reason why this function is not marked immutable instead of >>>> normal? >>>> >>> regex_flavor affects its result. >>> >> >> Speaking of which, can we see about deprecating and removing this GUC? >> I've yet to hear of anyone using a flavor other than the default. > > You have now. I have a client who sadly uses a non-default setting. And > on 8.4, what is more. OK, now I've heard of one. I still think we should deprecate and remove. Say, deprecate this next release and remove for the following one? > There are more things under heaven and earth .... My philosophy doesn't include infinite backward compatibility. Neither do heaven and earth, come to think of it. :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
> +1 It would seem to me to be more valuable to have the benefits of > IMMUTABLE than preserve pre-7.4 compatibility forever. Just create a shell function which calls it in a specific flavor, and make that immutable. --Josh
Andrew Dunstan <andrew@dunslane.net> writes: > David Fetter wrote: >> Speaking of which, can we see about deprecating and removing this GUC? >> I've yet to hear of anyone using a flavor other than the default. > You have now. I have a client who sadly uses a non-default setting. And > on 8.4, what is more. How critical is it to them? It would be nice to get rid of that source of variability. It would be possible to keep using old-style regexes even without the GUC, if they can interpose anything that can stick an "embedded options" prefix on the pattern strings. See 9.7.3.4: http://developer.postgresql.org/pgdocs/postgres/functions-matching.html regards, tom lane
On Wed, Oct 14, 2009 at 06:06:23PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > David Fetter wrote: > >> Speaking of which, can we see about deprecating and removing this GUC? > >> I've yet to hear of anyone using a flavor other than the default. > > > You have now. I have a client who sadly uses a non-default setting. And > > on 8.4, what is more. > > How critical is it to them? It would be nice to get rid of that source > of variability. > > It would be possible to keep using old-style regexes even without the > GUC, if they can interpose anything that can stick an "embedded options" > prefix on the pattern strings. See 9.7.3.4: > http://developer.postgresql.org/pgdocs/postgres/functions-matching.html Switching it to just embedded options solves the issue of leaving the feature in while cutting the surprises down for those not using it. :) The "embedded options" method is also doable by search-and-replace, as they only work in AREs, which such people wouldn't be using. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> David Fetter wrote: >> >>> Speaking of which, can we see about deprecating and removing this GUC? >>> I've yet to hear of anyone using a flavor other than the default. >>> > > >> You have now. I have a client who sadly uses a non-default setting. And >> on 8.4, what is more. >> > > How critical is it to them? It would be nice to get rid of that source > of variability. > > It would be possible to keep using old-style regexes even without the > GUC, if they can interpose anything that can stick an "embedded options" > prefix on the pattern strings. See 9.7.3.4: > http://developer.postgresql.org/pgdocs/postgres/functions-matching.html > > > They are probably quite open to changing it, but IIRC it is a setting imposed by OpenACS, which is what they are based on. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > They are probably quite open to changing it, but IIRC it is a setting > imposed by OpenACS, which is what they are based on. I seem to recall having asked this before ... but does OpenACS even know what they're doing here? The difference between ERE mode and ARE mode is awfully slight. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> They are probably quite open to changing it, but IIRC it is a setting >> imposed by OpenACS, which is what they are based on. >> > > I seem to recall having asked this before ... but does OpenACS even > know what they're doing here? The difference between ERE mode and ARE > mode is awfully slight. > That's not the worst of it :-( See <http://openacs.org/xowiki/How_to_install_in_Postgres_8.x> cheers andrew
On Wed, Oct 14, 2009 at 11:51:13PM -0400, Andrew Dunstan wrote: > > > Tom Lane wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >> >>> They are probably quite open to changing it, but IIRC it is a >>> setting imposed by OpenACS, which is what they are based on. >> >> I seem to recall having asked this before ... but does OpenACS even >> know what they're doing here? The difference between ERE mode and >> ARE mode is awfully slight. > > That's not the worst of it :-( See > <http://openacs.org/xowiki/How_to_install_in_Postgres_8.x> This just illustrates the fact that at least as far as PostgreSQL is concerned, OpenACS is a dead project. It's been at least 3 major versions since they even attempted to keep compatible with PostgreSQL. OpenACS could be revived as a PostgreSQL-supporting piece of software, and that might be a good thing. It would entail adjusting OpenACS to the things PostgreSQL has changed rather than the reverse. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Wed, Oct 14, 2009 at 11:51:13PM -0400, Andrew Dunstan wrote: >> That's not the worst of it :-( See >> <http://openacs.org/xowiki/How_to_install_in_Postgres_8.x> > This just illustrates the fact that at least as far as PostgreSQL is > concerned, OpenACS is a dead project. It's been at least 3 major > versions since they even attempted to keep compatible with PostgreSQL. That seems pretty harsh, considering that there's plenty of discussion of how to use OpenACS with PG on their website. What it looks like to me is that the recommendation about regex_flavor might be just cargo-cult programming. Somebody had some trouble with an updated PG version, turned on every backwards-compatibility option he could find, and when it worked he didn't inquire too closely into which settings were actually important. And ever since then that's been the received wisdom about how to make OpenACS run with Postgres. It would be interesting to try turning off these options one at a time to see which ones do matter. (I'd bet lunch that the one about add_missing_from is bogus, too, or could easily be made so. mysql isn't forgiving about missing FROM items, so it's hard to believe that they have a lot of such things no matter how little they care about Postgres.) regards, tom lane
> (I'd bet lunch that the one about add_missing_from is bogus, too, > or could easily be made so. mysql isn't forgiving about missing > FROM items, so it's hard to believe that they have a lot of such > things no matter how little they care about Postgres.) OpenACS does the old-style DELETEs without a subselect, so they rely on add-missing-from for that. I had to debug this for another user. --Josh Berkus
On Thu, Oct 15, 2009 at 10:22:52AM -0700, Josh Berkus wrote: > > (I'd bet lunch that the one about add_missing_from is bogus, too, > > or could easily be made so. mysql isn't forgiving about missing > > FROM items, so it's hard to believe that they have a lot of such > > things no matter how little they care about Postgres.) > > OpenACS does the old-style DELETEs without a subselect, so they rely > on add-missing-from for that. I had to debug this for another user. Is OpenACS getting enough new deployments to fix this? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Rod Taylor <rod.taylor@gmail.com> writes: > I tried making a functional index based on an expression containing > the 2 argument regexp_matches() function. Is there a reason why this > function is not marked immutable instead of normal? So I went to see about making the changes to remove regex_flavor, and was astonished to find that all the regex-related functions are already marked immutable, and AFAICS always have been. This is clearly wrong, and we would have to fix it if we weren't about to remove the GUC. (In principle we should advise people to change the markings in existing databases, but given the lack of complaints it's probably not worth the trouble --- I doubt many applications change regex_flavor on the fly.) So, having dismissed my original off-the-cuff answer to Rod, the next question is what's really going wrong for him. I get this from a quick trial: regression=# create table tt1(f1 text, f2 text); CREATE TABLE regression=# create index tt1i on tt1(regexp_matches(f1,f2)); ERROR: index expression cannot return a set IOW the problem is that regexp_matches returns SETOF, not that it's marked stable (as it should have been). I'm not sure what semantics you were expecting the index to have, but we don't have any useful support for indexes on sets. regards, tom lane
On Tue, 2009-10-20 at 20:48 -0400, Tom Lane wrote: > So I went to see about making the changes to remove regex_flavor, and > was astonished to find that all the regex-related functions are already > marked immutable, and AFAICS always have been. This is clearly wrong, > and we would have to fix it if we weren't about to remove the GUC. > (In principle we should advise people to change the markings in existing > databases, but given the lack of complaints it's probably not worth the > trouble --- I doubt many applications change regex_flavor on the fly.) Are you sure this wasn't intentional, because it breaks performance and we doubted that many applications would change regex_flavor on the fly?
Peter Eisentraut <peter_e@gmx.net> writes: > On Tue, 2009-10-20 at 20:48 -0400, Tom Lane wrote: >> So I went to see about making the changes to remove regex_flavor, and >> was astonished to find that all the regex-related functions are already >> marked immutable, and AFAICS always have been. This is clearly wrong, >> and we would have to fix it if we weren't about to remove the GUC. > Are you sure this wasn't intentional, because it breaks performance and > we doubted that many applications would change regex_flavor on the fly? Intentional or not, it's wrong :-( In practice I doubt there are many cases where constant-folding a regex would be possible or performance-critical. The real use of having it be immutable is probably Rod's, ie, using it in an index. And that is *obviously* really dangerous if there's a GUC affecting the results. regards, tom lane
> So, having dismissed my original off-the-cuff answer to Rod, the next > question is what's really going wrong for him. I get this from > a quick trial: I wish I had kept specific notes on what I was actually trying to do. I tried to_number first then the expression as seen below. I guess I saw the error again and assumed it was the same as for to_number. sk=# BEGIN; BEGIN sk=# sk=# create table t1 (col1 text); CREATE TABLE sk=# INSERT INTO t1 values ('Z342432'); INSERT 0 1 sk=# INSERT INTO t1 values ('REW9432'); INSERT 0 1 sk=# sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;regexp_matches ----------------3424329432 (2 rows) sk=# sk=# create index t1_idx ON t1 (( (regexp_matches(col1, '(\d+)$'))[1] )); ERROR: index expression cannot return a set sk=# sk=# ROLLBACK; ROLLBACK It is interesting that "citext" seems to be functional with exactly the same statements. sk=# BEGIN; BEGIN sk=# sk=# create table t1 (col1 citext); CREATE TABLE sk=# INSERT INTO t1 values ('Z342432'); INSERT 0 1 sk=# INSERT INTO t1 values ('REW9432'); INSERT 0 1 sk=# sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;regexp_matches ----------------3424329432 (2 rows) sk=# sk=# create index t1_idx ON t1 (( (regexp_matches(col1, '(\d+)$'))[1] )); CREATE INDEX sk=# sk=# ROLLBACK; ROLLBACK The function regexp_replace(col1, '^[^0-9]+', '') does seem to do the trick for text.
Rod Taylor <rod.taylor@gmail.com> writes: > It is interesting that "citext" seems to be functional with exactly > the same statements. Huh, it looks to me like that's an error in the declaration of the citext versions of regexp_matches --- they should be declared to return setof text[], the same as the underlying text functions. David, do you agree? regards, tom lane
On Oct 21, 2009, at 7:27 AM, Tom Lane wrote: > Huh, it looks to me like that's an error in the declaration of the > citext versions of regexp_matches --- they should be declared to > return > setof text[], the same as the underlying text functions. David, do > you > agree? Ooh, yeah, dunno how I missed that. Best, David
"David E. Wheeler" <david@kineticode.com> writes: > On Oct 21, 2009, at 7:27 AM, Tom Lane wrote: >> Huh, it looks to me like that's an error in the declaration of the >> citext versions of regexp_matches --- they should be declared to >> return >> setof text[], the same as the underlying text functions. David, do >> you agree? > Ooh, yeah, dunno how I missed that. I think we're probably stuck in 8.4, but we should fix it going forward. Would you make a quick check if any of the other citext functions have the same bug? regards, tom lane
On Oct 21, 2009, at 9:37 AM, Tom Lane wrote: >> Ooh, yeah, dunno how I missed that. > > I think we're probably stuck in 8.4, but we should fix it going > forward. Would you make a quick check if any of the other citext > functions have the same bug? I've fixed it in my [version for 8.3](https://svn.kineticode.com/citext/trunk ). Is there a straight-foward way to check such a thing programmatically, with a query perhaps? Or should I just put aside an hour to do an audit? Best, David
On Oct 21, 2009, at 9:40 AM, David E. Wheeler wrote: > On Oct 21, 2009, at 9:37 AM, Tom Lane wrote: > >>> Ooh, yeah, dunno how I missed that. >> >> I think we're probably stuck in 8.4, but we should fix it going >> forward. Would you make a quick check if any of the other citext >> functions have the same bug? > > I've fixed it in my [version for 8.3](https://svn.kineticode.com/citext/trunk > ). Is there a straight-foward way to check such a thing > programmatically, with a query perhaps? Or should I just put aside > an hour to do an audit? FWIW, I think that this is a bug, and that the variation from the text version will be unexpected. I recommend fixing it for 8.4.2. Best, David
"David E. Wheeler" <david@kineticode.com> writes: > FWIW, I think that this is a bug, and that the variation from the text > version will be unexpected. I recommend fixing it for 8.4.2. Well, it's certainly a bug, but I don't think it's back-patchable. A back-patch will not affect existing installations anyway. What it will do is break user code that is expecting the existing behavior (for instance, Rod's index). I think it's something we can only change at a major version boundary. regards, tom lane
"David E. Wheeler" <david@kineticode.com> writes: > Is there a straight-foward way to check such a thing > programmatically, with a query perhaps? Or should I just put aside an > hour to do an audit? I was wondering whether you could query pg_proc to look for functions with the same name and different arguments/results. It's a bit tricky though because you'd expect s/citext/text/ in at least some positions (maybe not all)? regards, tom lane
On Oct 21, 2009, at 9:48 AM, Tom Lane wrote: > I was wondering whether you could query pg_proc to look for functions > with the same name and different arguments/results. It's a bit tricky > though because you'd expect s/citext/text/ in at least some positions > (maybe not all)? Yeah, almost all. I'll poke around, though it might be a day or two… Best, David