Thread: pg_config wrongly marked as not parallel safe?
Hi, Triggered by the thread at [1] I looked for functions marked as immutable but not parallel safe. postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND proparallel!= 's'; ┌─────────────┬─────────────┬─────────────┐ │ oid │ provolatile │ proparallel │ ├─────────────┼─────────────┼─────────────┤ │ pg_config() │ i │ r │ └─────────────┴─────────────┴─────────────┘ (1 row) # pg_config { oid => '3400', descr => 'pg_config binary as a function', proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r', prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}', proargmodes => '{o,o}', proargnames => '{name,setting}', prosrc => 'pg_config' }, so that function is marked as immutable but not parallel safe, without an explanation for that odd combination. Now obviously I don't think it practially matters for pg_config(), but it seems unnecessarily confusing as a general matter. I think we probably should fix this specific case, and then add a check to opr_sanity.sql or such. As far as I can tell pg_config() was marked as such since its addition in [2]. Joe, I assume this wasn't intentional? Greetings, Andres Freund [1] https://www.postgresql.org/message-id/CA+6d-n4dOakgLu2gsTfk9uD2CC9ueNCg+z_mnXA2-=Qaod1Wuw@mail.gmail.com [2] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a5c43b886942e96ec5c745041f2d6a50c3205147
On 11/26/18 6:45 PM, Andres Freund wrote: > Hi, > > Triggered by the thread at [1] I looked for functions marked as > immutable but not parallel safe. > > postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND proparallel!= 's'; > ┌─────────────┬─────────────┬─────────────┐ > │ oid │ provolatile │ proparallel │ > ├─────────────┼─────────────┼─────────────┤ > │ pg_config() │ i │ r │ > └─────────────┴─────────────┴─────────────┘ > (1 row) > > # pg_config > { oid => '3400', descr => 'pg_config binary as a function', > proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r', > prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}', > proargmodes => '{o,o}', proargnames => '{name,setting}', > prosrc => 'pg_config' }, > > so that function is marked as immutable but not parallel safe, without > an explanation for that odd combination. > > Now obviously I don't think it practially matters for pg_config(), but > it seems unnecessarily confusing as a general matter. > > I think we probably should fix this specific case, and then add a check > to opr_sanity.sql or such. > > As far as I can tell pg_config() was marked as such since its addition > in [2]. Joe, I assume this wasn't intentional? Not intentional. Though, sitting here chatting with Stephen about it, I am now wondering if pg_config() should actually be marked immutable: select * from pg_config() where name = 'VERSION'; name | setting ---------+----------------- VERSION | PostgreSQL 10.5 (1 row) [...upgrade the postgres binaries...] select * from pg_config() where name = 'VERSION'; name | setting ---------+----------------- VERSION | PostgreSQL 10.6 (1 row) So the correct answer is probably to mark pg_config() stable, but it still seems to be parallel safe to me. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
On 2018-11-26 19:04:46 -0500, Joe Conway wrote: > On 11/26/18 6:45 PM, Andres Freund wrote: > > Hi, > > > > Triggered by the thread at [1] I looked for functions marked as > > immutable but not parallel safe. > > > > postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND proparallel!= 's'; > > ┌─────────────┬─────────────┬─────────────┐ > > │ oid │ provolatile │ proparallel │ > > ├─────────────┼─────────────┼─────────────┤ > > │ pg_config() │ i │ r │ > > └─────────────┴─────────────┴─────────────┘ > > (1 row) > > > > # pg_config > > { oid => '3400', descr => 'pg_config binary as a function', > > proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r', > > prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}', > > proargmodes => '{o,o}', proargnames => '{name,setting}', > > prosrc => 'pg_config' }, > > > > so that function is marked as immutable but not parallel safe, without > > an explanation for that odd combination. > > > > Now obviously I don't think it practially matters for pg_config(), but > > it seems unnecessarily confusing as a general matter. > > > > I think we probably should fix this specific case, and then add a check > > to opr_sanity.sql or such. > > > > As far as I can tell pg_config() was marked as such since its addition > > in [2]. Joe, I assume this wasn't intentional? > > Not intentional. Though, sitting here chatting with Stephen about it, I > am now wondering if pg_config() should actually be marked immutable: > > select * from pg_config() where name = 'VERSION'; > name | setting > ---------+----------------- > VERSION | PostgreSQL 10.5 > (1 row) > > [...upgrade the postgres binaries...] > > select * from pg_config() where name = 'VERSION'; > name | setting > ---------+----------------- > VERSION | PostgreSQL 10.6 > (1 row) > > So the correct answer is probably to mark pg_config() stable, but it > still seems to be parallel safe to me. I don't think we should consider immutability to mean anything across major versions. What'd be helped by doing that? We'd have to rule out any behaviour change to any immutable function for that to make sense. Including making an immutable function not immutable anymore. - Andres
On 11/26/18 7:08 PM, Andres Freund wrote: > On 2018-11-26 19:04:46 -0500, Joe Conway wrote: >> Not intentional. Though, sitting here chatting with Stephen about it, I >> am now wondering if pg_config() should actually be marked immutable: >> >> select * from pg_config() where name = 'VERSION'; >> name | setting >> ---------+----------------- >> VERSION | PostgreSQL 10.5 >> (1 row) >> >> [...upgrade the postgres binaries...] >> >> select * from pg_config() where name = 'VERSION'; >> name | setting >> ---------+----------------- >> VERSION | PostgreSQL 10.6 >> (1 row) >> >> So the correct answer is probably to mark pg_config() stable, but it >> still seems to be parallel safe to me. > > I don't think we should consider immutability to mean anything across > major versions. What'd be helped by doing that? We'd have to rule out > any behaviour change to any immutable function for that to make > sense. Including making an immutable function not immutable anymore. Umm, this is a minor version not major. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-11-26 19:04:46 -0500, Joe Conway wrote: > > So the correct answer is probably to mark pg_config() stable, but it > > still seems to be parallel safe to me. > > I don't think we should consider immutability to mean anything across > major versions. What'd be helped by doing that? We'd have to rule out > any behaviour change to any immutable function for that to make > sense. Including making an immutable function not immutable anymore. Then we have to require that all indexes built with immutable functions be rebuilt when someone does a pg_upgrade from one major version to the next. Not to mention that the issue at hand isn't a major version upgrade anyway, it's a minor version change... Thanks! Stephen
Attachment
On 2018-11-26 19:14:24 -0500, Joe Conway wrote: > On 11/26/18 7:08 PM, Andres Freund wrote: > > On 2018-11-26 19:04:46 -0500, Joe Conway wrote: > >> Not intentional. Though, sitting here chatting with Stephen about it, I > >> am now wondering if pg_config() should actually be marked immutable: > >> > >> select * from pg_config() where name = 'VERSION'; > >> name | setting > >> ---------+----------------- > >> VERSION | PostgreSQL 10.5 > >> (1 row) > >> > >> [...upgrade the postgres binaries...] > >> > >> select * from pg_config() where name = 'VERSION'; > >> name | setting > >> ---------+----------------- > >> VERSION | PostgreSQL 10.6 > >> (1 row) > >> > >> So the correct answer is probably to mark pg_config() stable, but it > >> still seems to be parallel safe to me. > > > > I don't think we should consider immutability to mean anything across > > major versions. What'd be helped by doing that? We'd have to rule out > > any behaviour change to any immutable function for that to make > > sense. Including making an immutable function not immutable anymore. > > Umm, this is a minor version not major. Oops. Greetings, Andres Freund
Hi, On 2018-11-26 19:16:00 -0500, Stephen Frost wrote: > Greetings, > > * Andres Freund (andres@anarazel.de) wrote: > > On 2018-11-26 19:04:46 -0500, Joe Conway wrote: > > > So the correct answer is probably to mark pg_config() stable, but it > > > still seems to be parallel safe to me. > > > > I don't think we should consider immutability to mean anything across > > major versions. What'd be helped by doing that? We'd have to rule out > > any behaviour change to any immutable function for that to make > > sense. Including making an immutable function not immutable anymore. > > Then we have to require that all indexes built with immutable functions > be rebuilt when someone does a pg_upgrade from one major version to the > next. Happened a couple times. The harm from overaggressively removing immutability, and thus not even being able to add such indexes and obviously worse query plans, seems much bigger than avoiding the need to rebuild indexes in some cases. > Not to mention that the issue at hand isn't a major version upgrade > anyway, it's a minor version change... Yea, that obviously makes this different. Apparently the old versioning scheme is stuck in my head on a deep enough level... Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-11-26 19:16:00 -0500, Stephen Frost wrote: > > * Andres Freund (andres@anarazel.de) wrote: > > > On 2018-11-26 19:04:46 -0500, Joe Conway wrote: > > > > So the correct answer is probably to mark pg_config() stable, but it > > > > still seems to be parallel safe to me. > > > > > > I don't think we should consider immutability to mean anything across > > > major versions. What'd be helped by doing that? We'd have to rule out > > > any behaviour change to any immutable function for that to make > > > sense. Including making an immutable function not immutable anymore. > > > > Then we have to require that all indexes built with immutable functions > > be rebuilt when someone does a pg_upgrade from one major version to the > > next. > > Happened a couple times. The harm from overaggressively removing > immutability, and thus not even being able to add such indexes and > obviously worse query plans, seems much bigger than avoiding the need to > rebuild indexes in some cases. These two things seem entirely independent in my view, so I'm really not sure that I'm following what you're getting at. We have required specific kinds of indexes to be rebuilt across major version upgrades, yes, but those have almost uniformly been cases where the index was very rarely used or only used in some corner cases. I certainly do not believe we should invalidate all indexes built with immutable functions across major versions upgrades. Perhaps in some cases when we've decided that an immutable function today should work in some different way in a new version then we should complain about or invalidate those indexes, but that's a special case. Taken to an extreme, you could argue that partial indexes should also be rebuilt on major version upgrades, in case we decide that 1 is suddently less than 2, but instead of baking that assumption in, let's just deal with those cases when they come up and write logic into pg_upgrade to detect such usages and make the user aware of them. If you feel that immutability is being overaggressively removed from something, that's fair, but should also be addressed on a case-by-case basis. In this particular case, I don't see any concern that we're going to break user's indexes by changing pg_config() to be stable instead of immutable, and I don't really think we even need to write logic into pg_upgrade for it- unless it could be generalized to detect such cases between versions, in which case I'd be all for that in case we have other such changes. Not sure it's really worth it though, we don't have that happen all that often. Thanks! Stpehen
Attachment
On 2018-11-26 19:34:03 -0500, Stephen Frost wrote: > These two things seem entirely independent in my view, so I'm really not > sure that I'm following what you're getting at. All I said is that I don't think it's a reasonable policy to mark all functions that potentially could change across major versions as immutable. I've no problem with changing pg_config in particular, especially as it - as has been pointed - clearly can change in minor versions / recompiles.
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2018-11-26 19:34:03 -0500, Stephen Frost wrote: > > These two things seem entirely independent in my view, so I'm really not > > sure that I'm following what you're getting at. > > All I said is that I don't think it's a reasonable policy to mark all > functions that potentially could change across major versions as > immutable. I've no problem with changing pg_config in particular, > especially as it - as has been pointed - clearly can change in minor > versions / recompiles. I'm... not following. If a function's results can change across minor or major versions, we shouldn't be marking it as immutable because, by definition, it's not immutable. We, today, have a baked in assumption that any function marked as immutable will remain immutable across all major versions that we allow indexes to be retained through, which is all of them since about 8.3 at this point. We absolutely need a policy that if you decide to change the results of some immutable function across a major version change, you need to consider the results on indexes and possibly write into pg_upgrade checks to try and detect any usage of that immutable function. I hope we're in agreement there. In other words, maybe it isn't a sealed-in-concrete policy that you can't go around changing what an immutable function returns, but you certainly better have a really good justification for it, and write all of the code to detect any cases where that could cause incorrect results from the database, including pg_upgrade throwing a big error if it finds any indexes (or maybe even functions...) using it, and even then I'm pretty darn skeptical about accepting it. Thanks! Stephen
Attachment
On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote: > If a function's results can change across minor or major versions, we > shouldn't be marking it as immutable because, by definition, it's not > immutable. > > We, today, have a baked in assumption that any function marked as > immutable will remain immutable across all major versions that we allow > indexes to be retained through, which is all of them since about 8.3 at > this point. > > We absolutely need a policy that if you decide to change the results of > some immutable function across a major version change, you need to > consider the results on indexes and possibly write into pg_upgrade > checks to try and detect any usage of that immutable function. I hope > we're in agreement there. It's hard to make a guarantee that a function will never change in the future. What if we fix some rounding or overflow problem in a floating point function? If we went that far, hardly any function could be IMMUTABLE. I think it is best to use IMMUTABLE whenever we don't expect it to change and it is a function useful for indexes, and if it happens to change nonetheless, write into the release notes that certain indexes have to be rebuilt after upgrade. Of course, there is no problem to mark pg_config as stable, because there is little chance it will be used in an index anyway. Yours, Laurenz Albe
At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in <8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at> > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote: > > If a function's results can change across minor or major versions, we > > shouldn't be marking it as immutable because, by definition, it's not > > immutable. > > > > We, today, have a baked in assumption that any function marked as > > immutable will remain immutable across all major versions that we allow > > indexes to be retained through, which is all of them since about 8.3 at > > this point. > > > > We absolutely need a policy that if you decide to change the results of > > some immutable function across a major version change, you need to > > consider the results on indexes and possibly write into pg_upgrade > > checks to try and detect any usage of that immutable function. I hope > > we're in agreement there. > > It's hard to make a guarantee that a function will never change in the > future. What if we fix some rounding or overflow problem in a floating > point function? +1. Actually, some geometric comparisons are performed counting tolerance margin, the validity of which is in doubt. Their behavior has been changed in recent major version and still has a room for improvement, and the functions are parallel-safe and immutable. Immutablity is mentiond mainly in the light of optimization in the documentation. https://www.postgresql.org/docs/current/xfunc-volatility.html > If we went that far, hardly any function could be IMMUTABLE. > > I think it is best to use IMMUTABLE whenever we don't expect it to > change and it is a function useful for indexes, and if it happens to > change nonetheless, write into the release notes that certain indexes > have to be rebuilt after upgrade. > Of course, there is no problem to mark pg_config as stable, because > there is little chance it will be used in an index anyway. Agreed. We have the following description in the documentation. | A common error is to label a function IMMUTABLE when its results | depend on a configuration parameter. For example, a function that | manipulates timestamps might well have results that depend on the | TimeZone setting. For safety, such functions should be labeled | STABLE instead. Still this isn't mentioning upgrades. We could add to this that something like that immutablity is not guaranteed beyond PostgreSQL versions, the results of immutable functions might change for certain reasons including bug fixes, blah, blah.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Greetings, * Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote: > At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in <8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at> > > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote: > > > If a function's results can change across minor or major versions, we > > > shouldn't be marking it as immutable because, by definition, it's not > > > immutable. > > > > > > We, today, have a baked in assumption that any function marked as > > > immutable will remain immutable across all major versions that we allow > > > indexes to be retained through, which is all of them since about 8.3 at > > > this point. > > > > > > We absolutely need a policy that if you decide to change the results of > > > some immutable function across a major version change, you need to > > > consider the results on indexes and possibly write into pg_upgrade > > > checks to try and detect any usage of that immutable function. I hope > > > we're in agreement there. > > > > It's hard to make a guarantee that a function will never change in the > > future. What if we fix some rounding or overflow problem in a floating > > point function? > > +1. > > Actually, some geometric comparisons are performed counting > tolerance margin, the validity of which is in doubt. Their > behavior has been changed in recent major version and still has a > room for improvement, and the functions are parallel-safe and > immutable. Immutablity is mentiond mainly in the light of > optimization in the documentation. I really don't buy off on these arguments in the least. I also didn't say that a function wasn't allowed to change- but that the output of an immutable function, for a given input, shouldn't change and in the very rare case where we absolutely had to make a change, it had better be for a very good reason and we need to consider the impact on user indexes. I don't agree that we can simply declare that all functional and partial indexes have to be rebuilt between every major version upgrade, which is the alternative. There really isn't another option- either we're extremely careful and take immutability of functions seriously and make sure to preserve behavior, and therefore indexes, across major versions, or we don't and we require indexes to be rebuilt. When it comes to the documentation, perhaps we should improve it to make it clear that if you change the results from an immutable function then you need to rebuild any indexes which use it. > > If we went that far, hardly any function could be IMMUTABLE. I don't agree. > > I think it is best to use IMMUTABLE whenever we don't expect it to > > change and it is a function useful for indexes, and if it happens to > > change nonetheless, write into the release notes that certain indexes > > have to be rebuilt after upgrade. I don't agree with the notion of "if it happens to change" as that sounds pretty cavalier, and only noting something in the release notes is a good way to cause a lot of pain for users. If it's possible to do better, which it often is (see the list of things pg_upgrade already checks...) then we should do so. If there's reasons we can't, fine, it happens, but that needs to be taken into account when we're considering making such a change and its impact on users. > > Of course, there is no problem to mark pg_config as stable, because > > there is little chance it will be used in an index anyway. > > Agreed. We have the following description in the documentation. I'm pretty sure we all agree that pg_config should be marked as stable and that changing it from immutable to stable won't be an issue. > | A common error is to label a function IMMUTABLE when its results > | depend on a configuration parameter. For example, a function that > | manipulates timestamps might well have results that depend on the > | TimeZone setting. For safety, such functions should be labeled > | STABLE instead. > > Still this isn't mentioning upgrades. We could add to this that > something like that immutablity is not guaranteed beyond > PostgreSQL versions, the results of immutable functions might > change for certain reasons including bug fixes, blah, blah.. Which means we're telling every user out there that they have to rebuild every functional or partial index they have across every major version. I don't agree with that and I don't think we really have some pressing need to be that free with making changes to immutable functions. I do think it'd be good to make a mention when talking about immutable functions that they're allowed to be used in indexes and, as such, users need to be careful to make sure that they don't change the results from an immutable function unless they're sure it's not used in an index or they're prepared to rebuild all indexes where it is used. Thanks! Stephen
Attachment
At Wed, 28 Nov 2018 21:17:17 -0500, Stephen Frost <sfrost@snowman.net> wrote in <20181129021717.GT3415@tamriel.snowman.net> > Greetings, > > * Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote: > > At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in <8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at> > > > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote: > > > > If a function's results can change across minor or major versions, we > > > > shouldn't be marking it as immutable because, by definition, it's not > > > > immutable. > > > > > > > > We, today, have a baked in assumption that any function marked as > > > > immutable will remain immutable across all major versions that we allow > > > > indexes to be retained through, which is all of them since about 8.3 at > > > > this point. > > > > > > > > We absolutely need a policy that if you decide to change the results of > > > > some immutable function across a major version change, you need to > > > > consider the results on indexes and possibly write into pg_upgrade > > > > checks to try and detect any usage of that immutable function. I hope > > > > we're in agreement there. > > > > > > It's hard to make a guarantee that a function will never change in the > > > future. What if we fix some rounding or overflow problem in a floating > > > point function? > > > > +1. > > > > Actually, some geometric comparisons are performed counting > > tolerance margin, the validity of which is in doubt. Their > > behavior has been changed in recent major version and still has a > > room for improvement, and the functions are parallel-safe and > > immutable. Immutablity is mentiond mainly in the light of > > optimization in the documentation. > > I really don't buy off on these arguments in the least. I also didn't > say that a function wasn't allowed to change- but that the output of an > immutable function, for a given input, shouldn't change and in the very > rare case where we absolutely had to make a change, it had better be for > a very good reason and we need to consider the impact on user indexes. > > I don't agree that we can simply declare that all functional and partial > indexes have to be rebuilt between every major version upgrade, which is > the alternative. There really isn't another option- either we're > extremely careful and take immutability of functions seriously and make > sure to preserve behavior, and therefore indexes, across major versions, > or we don't and we require indexes to be rebuilt. Even though index is the major place where the results of immutable funcions are stored without uesrs' clear consciousness, things are not restricted to indexes. Just as a thouht experiment with an extreme case, we can store the results of immutable functions into tables even whether it makes sense or not. We can dump'n-load onto PostgreSQL running on another platform. I'm not sure the new database can work as the same with the original one. I can't express my feeling well, but this is a problem of how far we guarantee the immutability after all. We guarantee planning-execution duration immutability, yes. We guarantee using plan caches, yes. Beyond restart, yes. Beyond minor-versions, maybe. Beyond major-versions, presumably? Beyond platform, improbably? For example, avg(), sum(), stddev() are immutable but I'm not sure they are absolutely and in-exact-meaning immutable in all of the senses above. (However, I don't think the error makes problematic differences..) # Yes, as I said above this is a very extreme case. We might need to split the immutable into two kinds of volatility level. Immutable as it can be used for indexes, and immutable-2 that immutable at least on running server. I'm not sure there are any function or how many of the funcitons are categorized to the latter, or we can sort them out. And I thing that it would be inconvenient if avg and sum were moved out of the former category.. > When it comes to the documentation, perhaps we should improve it to make > it clear that if you change the results from an immutable function then > you need to rebuild any indexes which use it. > > > > If we went that far, hardly any function could be IMMUTABLE. > > I don't agree. > > > > I think it is best to use IMMUTABLE whenever we don't expect it to > > > change and it is a function useful for indexes, and if it happens to > > > change nonetheless, write into the release notes that certain indexes > > > have to be rebuilt after upgrade. > > I don't agree with the notion of "if it happens to change" as that > sounds pretty cavalier, and only noting something in the release notes > is a good way to cause a lot of pain for users. If it's possible to do > better, which it often is (see the list of things pg_upgrade already > checks...) then we should do so. If there's reasons we can't, fine, it > happens, but that needs to be taken into account when we're considering > making such a change and its impact on users. > > > > Of course, there is no problem to mark pg_config as stable, because > > > there is little chance it will be used in an index anyway. > > > > Agreed. We have the following description in the documentation. > > I'm pretty sure we all agree that pg_config should be marked as stable > and that changing it from immutable to stable won't be an issue. > > > | A common error is to label a function IMMUTABLE when its results > > | depend on a configuration parameter. For example, a function that > > | manipulates timestamps might well have results that depend on the > > | TimeZone setting. For safety, such functions should be labeled > > | STABLE instead. > > > > Still this isn't mentioning upgrades. We could add to this that > > something like that immutablity is not guaranteed beyond > > PostgreSQL versions, the results of immutable functions might > > change for certain reasons including bug fixes, blah, blah.. > > Which means we're telling every user out there that they have to rebuild > every functional or partial index they have across every major version. > > I don't agree with that and I don't think we really have some pressing > need to be that free with making changes to immutable functions. > > I do think it'd be good to make a mention when talking about immutable > functions that they're allowed to be used in indexes and, as such, users > need to be careful to make sure that they don't change the results from > an immutable function unless they're sure it's not used in an index or > they're prepared to rebuild all indexes where it is used. I suppose that not a minor part of users aren't conscious of immutability so clearly.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, 2018-11-28 at 21:17 -0500, Stephen Frost wrote: > I don't agree that we can simply declare that all functional and partial > indexes have to be rebuilt between every major version upgrade, which is > the alternative. There really isn't another option- either we're > extremely careful and take immutability of functions seriously and make > sure to preserve behavior, and therefore indexes, across major versions, > or we don't and we require indexes to be rebuilt. I absolutely agree with that. I guess that when I said that we should declare functions IMMUTABLE even if they might change owing to bug fixes, I didn't make it clear that each such occurrence would necessitate a fat warning in the release notes that indexes using them have to be rebuilt. Yours, Laurenz Albe
Stephen Frost <sfrost@snowman.net> writes: > We, today, have a baked in assumption that any function marked as > immutable will remain immutable across all major versions that we allow > indexes to be retained through, which is all of them since about 8.3 at > this point. That's the ideal situation, yes. > I don't agree that we can simply declare that all functional and partial > indexes have to be rebuilt between every major version upgrade, which is > the alternative. I agree that that's quite impractical, but ... > There really isn't another option- either we're > extremely careful and take immutability of functions seriously and make > sure to preserve behavior, and therefore indexes, across major versions, > or we don't and we require indexes to be rebuilt. ... I find that position to be uselessly simplistic. There's a huge range of gray area here, and all the interesting cases fall in the gray area. For a couple of quick examples: * What about bugs? Should I not have fixed cbdb8b4c0 (wrong answer for ftoi4(INT_MAX)) on the grounds that doing so might break somebody's index? * A lot of relevant behavior isn't as much under our control as one might wish. For example, recompiling with a new compiler, or different optimization options, might affect the behavior of roundoff-sensitive floating point calculations. It's not practical to forbid people from doing that, nor to promise that it won't change results. At the same time, restricting the "immutable" label to things that seem very unlikely to be affected by any such considerations isn't a useful position for us to take. Functional indexes are just too useful, and the cases where they actually fail are not that frequent. Certainly, if we make a change that we know is likely to require people to reindex affected indexes, that should be documented in the release notes. But I think it's pointless to imagine that we can achieve perfection in identifying trouble cases, or even to spend significantly more resources than we do now on trying. I've not seen many field trouble reports that seem to trace back to such issues. But to bring this back to the thread title: pg_config is *intended* to change results across versions, or even just rebuilds. There's no definition of "immutable" under which it should qualify. At the same time, there's no obvious use-case for indexing its result, so that not marking it as immutable doesn't seem to have much downside. A contrary case that's worth remarking on is to_tsvector and related functions, which are marked immutable so that they can be indexed, even though we know darn well that their results depend on changeable text search configuration parameters. I remember complaining about this when the feature was first put in. Oleg and Teodor responded that it wouldn't cause big problems, and they were right (so far as I've heard about). That's partly because a change in to_tsvector's behavior might cause text searches to fail to match, but it won't result in index corruption or anything like that. So it's important to consider context and potential bad effects when deciding what we should allow here. Maybe the tl;dr version of that is that the immutable/stable/volatile division is too simplistic and we need to refine it. But I don't know what a better design would look like. Also, I suspect that real users are already hard put to it to label their functions correctly in the three-way regime. Making it more complicated might make things actively worse, if it increases the odds of functions being mislabeled. regards, tom lane
Greetings, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Certainly, if we make a change that we know is likely to require people to > reindex affected indexes, that should be documented in the release notes. > But I think it's pointless to imagine that we can achieve perfection in > identifying trouble cases, or even to spend significantly more resources > than we do now on trying. I've not seen many field trouble reports that > seem to trace back to such issues. Just to be clear- I think the way we've been operating is striking a good balance and that's because people do think about these things and consider what can happen in indexes when we make changes to immutable functions. This thread feels like there's people arguing we should reduce our level of effort there to a point where we don't care about on-disk representation in indexes across major versions when immutable functions are involved and I strongly disagree with that. > Maybe the tl;dr version of that is that the immutable/stable/volatile > division is too simplistic and we need to refine it. But I don't > know what a better design would look like. Also, I suspect that real > users are already hard put to it to label their functions correctly > in the three-way regime. Making it more complicated might make things > actively worse, if it increases the odds of functions being mislabeled. I haven't got any great ideas about what a better design would look like either. Just brainstorming, but perhaps having a flag that basically says "this can be used in indexes" might be useful as a way to filter out any immutable functions that shouldn't be used in indexes? I'm not sure that we've actually got any of those and I wouldn't want to arbitrairly limit what users can do without there being a good reason either. Maybe such a flag would make people (users and hackers alike) think a bit more about making changes and if they need to rebuild indexes or such? Might also just get in the way though. Thanks! Stephen
Attachment
On Wed, Nov 28, 2018 at 9:17 PM Stephen Frost <sfrost@snowman.net> wrote: > > Actually, some geometric comparisons are performed counting > > tolerance margin, the validity of which is in doubt. Their > > behavior has been changed in recent major version and still has a > > room for improvement, and the functions are parallel-safe and > > immutable. Immutablity is mentiond mainly in the light of > > optimization in the documentation. > > I really don't buy off on these arguments in the least. I also didn't > say that a function wasn't allowed to change- but that the output of an > immutable function, for a given input, shouldn't change and in the very > rare case where we absolutely had to make a change, it had better be for > a very good reason and we need to consider the impact on user indexes. I think you're getting a little carried away here. Kyotaro-san's argument seems extremely strong to me, so much so that I can't really see how you can argue against it. If we have a bug that says 1 + 1 = 3, we are going to fix it. We're not going say, oh, well, it's an immutable function, so we're just going to carry on having it return the wrong answer. That would be ridiculous. It feels to me like you are trying to turn this into some kind of principled stand against evil people who don't care about immutability, but rumors that the barbarians are at the gates seem greatly exaggerated from where I sit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greetings, * Robert Haas (robertmhaas@gmail.com) wrote: > On Wed, Nov 28, 2018 at 9:17 PM Stephen Frost <sfrost@snowman.net> wrote: > > > Actually, some geometric comparisons are performed counting > > > tolerance margin, the validity of which is in doubt. Their > > > behavior has been changed in recent major version and still has a > > > room for improvement, and the functions are parallel-safe and > > > immutable. Immutablity is mentiond mainly in the light of > > > optimization in the documentation. > > > > I really don't buy off on these arguments in the least. I also didn't > > say that a function wasn't allowed to change- but that the output of an > > immutable function, for a given input, shouldn't change and in the very > > rare case where we absolutely had to make a change, it had better be for > > a very good reason and we need to consider the impact on user indexes. > > I think you're getting a little carried away here. Kyotaro-san's > argument seems extremely strong to me, so much so that I can't really > see how you can argue against it. If we have a bug that says 1 + 1 = > 3, we are going to fix it. We're not going say, oh, well, it's an > immutable function, so we're just going to carry on having it return > the wrong answer. That would be ridiculous. > > It feels to me like you are trying to turn this into some kind of > principled stand against evil people who don't care about > immutability, but rumors that the barbarians are at the gates seem > greatly exaggerated from where I sit. I'd suggest you read through the rest of the thread and see my response to Tom. Thanks! Stephen
Attachment
On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote: > I'd suggest you read through the rest of the thread and see my response > to Tom. I already did that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greetings, * Robert Haas (robertmhaas@gmail.com) wrote: > On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote: > > I'd suggest you read through the rest of the thread and see my response > > to Tom. > > I already did that. Great, then it's unclear, to me at least, what you're getting at in your response here. Just to be clear as to which I was referring to, what I said there was that I felt the way we've been operating is striking a good balance and that my concern on this thread was that people appeared to be trying to move us away from that. Today, at least, I feel like we are careful and that we consider the impact on indexes when we're making changes to immutable functions, and that when we do make such changes, they're for very good reasons and that we make a note of them in release notes and possibly go farther, such as considering if we could check for their usage during pg_upgrade. Do you feel that's overly aggressive and that we shouldn't care about the impact of changing immutable functions that could be used in indexes across major versions? I don't get the impression that you do from your prior response, but then it doesn't square with the later comments about how I'm being carried away and trying to take some principled stand against evil, so I admit to being confused. Thanks! Stephen
Attachment
On Thu, Nov 29, 2018 at 4:01 PM Stephen Frost <sfrost@snowman.net> wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: > > On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote: > > > I'd suggest you read through the rest of the thread and see my response > > > to Tom. > > > > I already did that. > > Great, then it's unclear, to me at least, what you're getting at in your > response here. Well, I don't think it was particularly unclear. You seem to be attacking Kyotaro-san's position when, in my opinion, he's correct and you are wrong. It looks to me like Tom said more or less the same thing that Kyotaro-san did, just in different words, and you partially agreed with him, so I don't really know what's going on here, either. > Just to be clear as to which I was referring to, what I said there was > that I felt the way we've been operating is striking a good balance and > that my concern on this thread was that people appeared to be trying to > move us away from that. Today, at least, I feel like we are careful and > that we consider the impact on indexes when we're making changes to > immutable functions, and that when we do make such changes, they're for > very good reasons and that we make a note of them in release notes and > possibly go farther, such as considering if we could check for their > usage during pg_upgrade. > > Do you feel that's overly aggressive and that we shouldn't care about > the impact of changing immutable functions that could be used in indexes > across major versions? I don't get the impression that you do from your > prior response, but then it doesn't square with the later comments about > how I'm being carried away and trying to take some principled stand > against evil, so I admit to being confused. Generally, I think Andres is wrong to argue that immutability shouldn't mean *anything* across major versions. If we can readily foresee that something is going to change in the future, then we shouldn't mark it immutable. However: 1. Andres agrees that pg_config() should be marked stable, not immutable. If we agree about how the functions we ACTUALLY HAVE should be marked, it may not be 100% necessary to go on arguing about what we should do in hypothetical cases. 2. I do not believe that the fact that we have marked something immutable bars us from changing the behavior of that thing in a new major release if we decide we don't like the old behavior, as in the hypothetical 1 + 1 = 3 example I gave before, or the geometric type behavior Kyotaro-san mentioned, or the ftoi4(INT_MAX) case Tom mentioned. 2a. Also, while I believe that such changes should be release-noted like all other changes, I'm not sure we need to do any more than that unless the change is something that's likely to bite a lot of people. Since we are unlikely to get 1 + 1 wrong, most of the actual cases where this sort of thing comes up are going to be corner cases, and for many users a reindex will be unnecessary even if they pg_upgrade. If we make a change that we can foresee is going to cause widespread breakage, then more vigorous action is appropriate. *Maybe* it would be a good idea to somehow mark in the release notes all changes that might theoretically require a reindex ... but I'm not volunteering to do the work. In short, I don't see that there has been any big problem here in the past, and I don't see anyone making a proposal that would cause a big problem in the future. There is some mild disagreement about certain hypothetical situations and corner cases. Peace, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Greetings, * Robert Haas (robertmhaas@gmail.com) wrote: > Generally, I think Andres is wrong to argue that immutability > shouldn't mean *anything* across major versions. If we can readily > foresee that something is going to change in the future, then we > shouldn't mark it immutable. However: This sounds like we're in agreement then. I'll admit that I was responding to multiple comments up-thread in my response to Kyotaro-san and that it wasn't all intended against the specific comments brought up there, so, my apologies to Kyotaro-san for that. I should have been clearer as to what I was referring to. By and large I had viewed it as a furtherance of Andres' position, which was a mistake on my part. > In short, I don't see that there has been any big problem here in the > past, and I don't see anyone making a proposal that would cause a big > problem in the future. There is some mild disagreement about certain > hypothetical situations and corner cases. I don't believe there's been any big problem here in the past and my concern was what you mention also disagreeing with above- the idea that immutability shouldn't mean *anything* across major versions. While I definitely agree that we should be looking at things on an individual case basis, the implication that we don't need to worry about immutability across major versions definitely had me concerned. I'll grant that Andres' comments were likely more in passing and not intended as a proposal, and perhaps should have been glossed over until and unless an actual proposal or change was put forward. Thanks! Stephen
Attachment
Hi, On 2018-11-29 16:23:42 -0500, Robert Haas wrote: > Generally, I think Andres is wrong to argue that immutability > shouldn't mean *anything* across major versions. If we can readily > foresee that something is going to change in the future, then we > shouldn't mark it immutable. However: I was too glib/brief. All I meant is that we shouldn't take immutable to *guarantee* anything across major versions. We, of course, shouldn't break things willy-nilly, and consider the consequences of such potential breaking changes. Including having to reindex. It's not like that's only the case for changing immutable functions, the index storage itself etc also matter. Greetings, Andres Freund
At Thu, 29 Nov 2018 15:03:00 -0800, Andres Freund <andres@anarazel.de> wrote in <20181129230300.vkj3csjwk7jt2cfv@alap3.anarazel.de> > Hi, > > On 2018-11-29 16:23:42 -0500, Robert Haas wrote: > > Generally, I think Andres is wrong to argue that immutability > > shouldn't mean *anything* across major versions. If we can readily > > foresee that something is going to change in the future, then we > > shouldn't mark it immutable. However: > > I was too glib/brief. All I meant is that we shouldn't take immutable to > *guarantee* anything across major versions. We, of course, shouldn't > break things willy-nilly, and consider the consequences of such > potential breaking changes. Including having to reindex. It's not like > that's only the case for changing immutable functions, the index storage > itself etc also matter. FWIW, I agree to this. # And returning to the topic, I vote for pg_config should be "stable". regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote: > # And returning to the topic, I vote for pg_config should be "stable". And on that note, Does this change does warrant backpatching, or should be applied to master only? Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development
Attachment
Greetings, * Joe Conway (mail@joeconway.com) wrote: > On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote: > > # And returning to the topic, I vote for pg_config should be "stable". > > And on that note, Does this change does warrant backpatching, or should > be applied to master only? Given that it's a catalog change, I would think just master.. Thanks! Stephen
Attachment
Joe Conway <mail@joeconway.com> writes: > On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote: >> # And returning to the topic, I vote for pg_config should be "stable". > And on that note, Does this change does warrant backpatching, or should > be applied to master only? I don't think back-patching the catalog change is really a good idea. The amount of work involved (e.g. release-noting how to perform the update on existing databases) is way out of proportion to the benefit for this particular case. regards, tom lane
On 11/30/18 10:32 AM, Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: >> On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote: >>> # And returning to the topic, I vote for pg_config should be "stable". > >> And on that note, Does this change does warrant backpatching, or should >> be applied to master only? > > I don't think back-patching the catalog change is really a good idea. > The amount of work involved (e.g. release-noting how to perform the > update on existing databases) is way out of proportion to the benefit > for this particular case. Closing out at least this part of the thread, committed and pushed, master only. Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development