Thread: DETERMINISTIC as synonym for IMMUTABLE
Hello, While reviewing PostgreSQL 8's SQL:2003 conformance (see the documentation mailing list), I've come across a - possibly - easy-to-implement conformance improvement for CREATE FUNCTION: SQL:2003 uses the word DETERMINISTIC where PostgreSQL uses IMMUTABLE. And SQL:2003 uses "NOT DETERMINISTIC" where PostgreSQL uses STABLE or VOLATILE. I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's grammar for CREATE FUNCTION. I might try creating a patch, but I'd rather spend my time finishing the conformance review. -- Greetings from Troels Arvin, Copenhagen, Denmark
Troels Arvin <troels@arvin.dk> writes: > I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that > "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's > grammar for CREATE FUNCTION. These do NOT mean the same thing. regards, tom lane
On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote: >> I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that >> "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's >> grammar for CREATE FUNCTION. > > These do NOT mean the same thing. I'm having a hard time seeing the difference between DETERMINISTIC and IMMUTABLE. My suggestion for "NOT DETERMINISTIC"==VOLATILE is because VOLATILE seems to be the least strict of the three PostgreSQL volatility categories. Do you disagree on both, or just the last one? -- Greetings from Troels Arvin, Copenhagen, Denmark
Troels Arvin <troels@arvin.dk> writes: > On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote: >> These do NOT mean the same thing. > I'm having a hard time seeing the difference between DETERMINISTIC and > IMMUTABLE. Well, the spec is somewhat self-contradictory on the point, but I think their intention is to model it after their notion of a deterministic query: A <query expression> or <query specification> is possibly non- deterministic if an SQL-implementation might,at two different times where the state of the SQL-data is the same, produce results that differ by morethan the order of the rows due to General Rules that specify implementation-dependent behavior. [SQL99 4.17] Notice that it is okay for a deterministic query to produce different results when the content of the database changes; therefore this is not IMMUTABLE in our terms. It is however stronger than our STABLE condition (for example, "now()" is STABLE but is not deterministic per the above definition). It appears to me that they are thinking of functions like SELECT value FROM table WHERE pkey = $1 which is deterministic per their definition and also according to (what I think is) the common meaning of "deterministic". We could label this function as STABLE, but not IMMUTABLE; however we have no category that captures the notion that "it can't change as long as the database content doesn't change". What it actually says about deterministic functions in 4.23 is: An SQL-invoked routine is either deterministic or possibly non- deterministic. An SQL-invoked function thatis deterministic always returns the same return value for a given list of SQL argument values. An SQL-invokedprocedure that is deterministic always returns the same values in its output and inout SQL parameters for a given list of SQL argument values. An SQL-invoked routine is possibly non-deterministic if, during invocationof that SQL- invoked routine, an SQL-implementation might, at two different times when the stateof the SQL-data is the same, produce unequal results due to General Rules that specify implementation-dependent behavior. This is clearly bogus as written since it claims that there are only two possibilities when there are more than two. Any ordinary function that selects from the database will satisfy neither their "deterministic" nor their "possibly non-deterministic" definitions. I think that they meant to define SQL functions as nondeterministic if they act like or contain nondeterministic queries; for instance 13.5 says 3) An <SQL procedure statement> S is possibly non-deterministic if and only if at least one of the followingis satisfied: a) S is a <select statement: single row> that is possibly non- deterministic. b) S contains a <routine invocation> whose subject routine is an SQL-invoked routine that possiblymodifies SQL-data. c) S generally contains a <query specification> or a <query expression> that is possibly non-deterministic. d) S generally contains a <datetime value function>, CURRENT_ USER, CURRENT_ROLE, SESSION_USER, orSYSTEM_USER. Anybody know whether the SQL2003 text clarifies the intent at all? In any case, whether or not you think DETERMINISTIC means IMMUTABLE, I don't think it's very helpful to identify NOT DETERMINISTIC with VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it isn't VOLATILE. regards, tom lane
On Sun, 17 Oct 2004 19:36:16 -0400, Tom Lane wrote: > Well, the spec is somewhat self-contradictory on the point, but I think > their intention is to model it after their notion of a deterministic > query: > > A <query expression> or <query specification> is possibly non- > deterministic if an SQL-implementation might, at two different > times where the state of the SQL-data is the same, produce results > that differ by more than the order of the rows due to General Rules > that specify implementation-dependent behavior. [SQL99 4.17] This section has been removed in SQL:2003. Instead, a new section 4.16 ("Determinism") has been added. The first paragraph of the new section states: In general, an operation is deterministic if that operation assuredly computes identical results when repeated with identicalinput values. For an SQL-invoked routine, the values in the argument list are regarded as the input; otherwise,the SQL-data and the set of privileges by which they are accessed is regarded as the input. In my reading of the new section, there is nothing which indicates that determinism is related to whether stored data are changed or not. > What it actually says about deterministic functions in 4.23 is: [...] > An SQL-invoked routine > is possibly non-deterministic if, during invocation of that SQL- > invoked routine, an SQL-implementation might, at two different > times when the state of the SQL-data is the same, produce unequal > results [...] This paragraph has also been altered in SQL:2003. In SQL:2003's section 4.27.2, there is still a section on deterministic vs. possibly non-deterministic routines; it doesn't say anything about the "state of the SQL-data" any more. It says: An SQL-invoked routine is either deterministic or possibly non-deterministic. An SQL-invoked function that is deterministicalways returns the identical return value for a given list of SQL argument values. [... cut stuff about sql-invokedprocedures which PostgreSQL doesn't support yet...] An SQL-invoked routine is possibly non-deterministic if itmight produce nonidentical results when invoked with the identical list of SQL argument values. > I think that they meant to define SQL functions as nondeterministic if > they act like or contain nondeterministic queries; for instance 13.5 > says > > 3) An <SQL procedure statement> S is possibly non-deterministic if > and only if at least one of the following is satisfied: > > a) S is a <select statement: single row> that is possibly non- > deterministic. > > b) S contains a <routine invocation> whose subject routine is an > SQL-invoked routine that possibly modifies SQL-data. [...] This has also been changed in SQL:2003: 4) [<SQL procedure statement>] is possibly non-deterministic if and only if S is not an <SQL schema statement> and at least one of the following is satisfied: a) S is a <select statement: single row> that is possibly non-deterministic. b) S contains a <routine invocation>whose subject routine is an SQL-invoked routine that is possibly non-deterministic. c) S generally containsa <query specification> or a <query expression> that is possibly non-deterministic. d) S generally containsa <value expression> that is possibly non-deterministic. > Anybody know whether the SQL2003 text clarifies the intent at all? I think it's fair to say that SQL:2003 is more clear on this subject, and that PostgreSQL's IMMUTABLE is equivalent to SQL:2003's DETERMINISTIC. The remaining problem now becomes if NOT DETERMINISTIC could be introduced as an alias to VOLATILE: > I don't think it's very helpful to identify NOT DETERMINISTIC with > VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it > isn't VOLATILE. But does this have any semantic significance? - I mean: It's still safe to call a function including now() NOT DETERMINISTIC==VOLATILE; no unexpected results should result from this, except - potentially - lower performance. I think it's a common phenomenon that performance can sometimes be increased by utilizing certain product-specific expressions in stead of the standards-defined ones. -- Greetings from Troels Arvin, Copenhagen, Denmark
>Tom Lane wrote > In any case, whether or not you think DETERMINISTIC means IMMUTABLE, Tom, Your knowledge of the confusing bits of the standard puts us all to shame. Troels did have a point, which was to do with standards conformance and compatibility. The main point at issue is whether someone can run some ANSI compliant SQL against PostgreSQL and have it work. That's a worthy goal. AFAICS, your info shows that the standard's definition of DETERMINISTIC is confusing and contradictory. Most people's interpretation would be that DETERMINISTIC was the same as IMMUTABLE, so we should make the former a synonym for the latter and document the possible difference of interpretation. Seriously, if you can't put a blade of grass between them then they're OK to be equated. My understanding is that DETERMINISTIC in Oracle would work the same as IMMUTABLE in PostgreSQL... > I don't think it's very helpful to identify NOT DETERMINISTIC with > VOLATILE. As a counterexample, now() is NOT DETERMINISTIC, but it > isn't VOLATILE. > You're spot on again with your info. NOT DETERMINISTIC means either STABLE or VOLATILE in PostgreSQL terms, not just one of those. IMHO we should allow the use of NOT DETERMINISTIC and document that although it doesn't mean the same thing as VOLATILE, we should infer that meaning because that is the mapping that is always correct. If the user wishes to gain the possible performance advantages offered by STABLE, then they can alter their code to do so. We're allowed to have performance enhancing additions to the standard. This is a similar situation to PostgreSQL's implementation of transaction isolation levels. The implementation is both implemented according to the standard and transactionally correct, yet READ UNCOMMITTED doesn't work *exactly* as the standard says that level should, yet this is all clearly documented and we are happy with that. The standard ain't perfect, but we should get as close as possible and document the difference - as long as there's no loss of correctness, which I don't think is at issue here. I'll submit a patch unless there is substantial disagreement. Best Regards, Simon Riggs
"Simon Riggs" <simon@2ndquadrant.com> writes: > AFAICS, your info shows that the standard's definition of DETERMINISTIC is > confusing and contradictory. Most people's interpretation would be that > DETERMINISTIC was the same as IMMUTABLE, so we should make the former a > synonym for the latter and document the possible difference of > interpretation. Seriously, if you can't put a blade of grass between them > then they're OK to be equated. The problem is that you *can* put a blade of grass between them, and sooner or later we may wish to make the distinction. In particular I can think of optimization possibilities that rely on being able to identify a function as being a pure function of the database state. (This would mostly come into play if we ever try to support function result caching; we don't now, but certainly it's been asked for often enough.) If we equate DETERMINISTIC with IMMUTABLE then we won't be able to use DETERMINISTIC to describe cache-able functions. Troels's later followup says that SQL2003 has redefined DETERMINISTIC to in fact mean IMMUTABLE; if so I suppose we'll have to go with the flow. I have not looked closely at that version to see if I agree with his reading. regards, tom lane