Thread: DETERMINISTIC as synonym for IMMUTABLE

From: Troels Arvin

Hello,

While reviewing PostgreSQL 8's SQL:2003 conformance (see the documentation
mailing list), I've come across a - possibly - easy-to-implement
conformance improvement for CREATE FUNCTION:

SQL:2003 uses the word DETERMINISTIC where PostgreSQL uses IMMUTABLE.
And SQL:2003 uses "NOT DETERMINISTIC" where PostgreSQL uses STABLE
or VOLATILE.

I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
"NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
grammar for CREATE FUNCTION.

I might try creating a patch, but I'd rather spend my time finishing the
conformance review.

-- 
Greetings from Troels Arvin, Copenhagen, Denmark




Re: DETERMINISTIC as synonym for IMMUTABLE

From: Tom Lane

Troels Arvin <troels@arvin.dk> writes:
> I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
> "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
> grammar for CREATE FUNCTION.

These do NOT mean the same thing.

        regards, tom lane


Re: DETERMINISTIC as synonym for IMMUTABLE

From: Troels Arvin

On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote:

>> I suggest that DETERMINISTIC be added as a synonym for IMMUTABLE and that
>> "NOT DETERMINISTIC" be added as an alias for VOLATILE in PostgreSQL's
>> grammar for CREATE FUNCTION.
> 
> These do NOT mean the same thing.

I'm having a hard time seeing the difference between DETERMINISTIC and
IMMUTABLE.

My suggestion for "NOT DETERMINISTIC"==VOLATILE is because VOLATILE seems
to be the least strict of the three PostgreSQL volatility categories.

Do you disagree on both, or just the last one?

-- 
Greetings from Troels Arvin, Copenhagen, Denmark




Re: DETERMINISTIC as synonym for IMMUTABLE

From: Tom Lane

Troels Arvin <troels@arvin.dk> writes:
> On Sun, 17 Oct 2004 18:08:00 -0400, Tom Lane wrote:
>> These do NOT mean the same thing.

> I'm having a hard time seeing the difference between DETERMINISTIC and
> IMMUTABLE.

Well, the spec is somewhat self-contradictory on the point, but I think
their intention is to model it after their notion of a deterministic
query:

        A <query expression> or <query specification> is possibly non-
        deterministic if an SQL-implementation might, at two different
        times where the state of the SQL-data is the same, produce results
        that differ by more than the order of the rows due to General Rules
        that specify implementation-dependent behavior.   [SQL99 4.17]

Notice that it is okay for a deterministic query to produce different
results when the content of the database changes; therefore this is not
IMMUTABLE in our terms.  It is however stronger than our STABLE
condition (for example, "now()" is STABLE but is not deterministic per
the above definition).  It appears to me that they are thinking of
functions like

        SELECT value FROM table WHERE pkey = $1

which is deterministic per their definition and also according to (what
I think is) the common meaning of "deterministic".  We could label this
function as STABLE, but not IMMUTABLE; however we have no category that
captures the notion that "it can't change as long as the database
content doesn't change".
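
To make the categories concrete, here is a minimal sketch in PostgreSQL's own syntax. The table items and the function names add_one, item_value, current_stamp and dice are hypothetical, introduced only for illustration; the volatility labels follow the reasoning above and are not taken from any patch.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE items (pkey integer PRIMARY KEY, value text);

-- IMMUTABLE: the result depends only on the argument values,
-- never on database contents or session state.
CREATE FUNCTION add_one(integer) RETURNS integer
    AS 'SELECT $1 + 1' LANGUAGE SQL IMMUTABLE;

-- The lookup function discussed above: its result cannot change while the
-- database content is unchanged, but it does change when the row is updated.
-- STABLE is the closest existing label; IMMUTABLE would be wrong.
CREATE FUNCTION item_value(integer) RETURNS text
    AS 'SELECT value FROM items WHERE pkey = $1' LANGUAGE SQL STABLE;

-- Also STABLE, yet not deterministic per the SQL99 wording quoted above:
-- now() gives different results at different times even when the
-- stored data is identical.
CREATE FUNCTION current_stamp() RETURNS timestamp with time zone
    AS 'SELECT now()' LANGUAGE SQL STABLE;

-- VOLATILE: the result may differ on every call, even within one statement.
CREATE FUNCTION dice() RETURNS double precision
    AS 'SELECT random()' LANGUAGE SQL VOLATILE;
```

Under this reading, item_value and current_stamp share the STABLE label even though only the first is a pure function of the database state, which is the gap described above.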

What it actually says about deterministic functions in 4.23 is:

        An SQL-invoked routine is either deterministic or possibly non-
        deterministic. An SQL-invoked function that is deterministic always
        returns the same return value for a given list of SQL argument
        values. An SQL-invoked procedure that is deterministic always
        returns the same values in its output and inout SQL parameters
        for a given list of SQL argument values. An SQL-invoked routine
        is possibly non-deterministic if, during invocation of that SQL-
        invoked routine, an SQL-implementation might, at two different
        times when the state of the SQL-data is the same, produce unequal
        results due to General Rules that specify implementation-dependent
        behavior.

This is clearly bogus as written since it claims that there are only two
possibilities when there are more than two.  Any ordinary function that
selects from the database will satisfy neither their "deterministic" nor
their "possibly non-deterministic" definitions.

I think that they meant to define SQL functions as nondeterministic if
they act like or contain nondeterministic queries; for instance 13.5
says

        3) An <SQL procedure statement> S is possibly non-deterministic if
           and only if at least one of the following is satisfied:

           a) S is a <select statement: single row> that is possibly non-
              deterministic.

           b) S contains a <routine invocation> whose subject routine is an
              SQL-invoked routine that possibly modifies SQL-data.

           c) S generally contains a <query specification> or a <query
              expression> that is possibly non-deterministic.

           d) S generally contains a <datetime value function>,
              CURRENT_USER, CURRENT_ROLE, SESSION_USER, or SYSTEM_USER.

Anybody know whether the SQL2003 text clarifies the intent at all?

In any case, whether or not you think DETERMINISTIC means IMMUTABLE,
I don't think it's very helpful to identify NOT DETERMINISTIC with
VOLATILE.  As a counterexample, now() is NOT DETERMINISTIC, but it
isn't VOLATILE.

        regards, tom lane


Re: DETERMINISTIC as synonym for IMMUTABLE

From: Troels Arvin

On Sun, 17 Oct 2004 19:36:16 -0400, Tom Lane wrote:

> Well, the spec is somewhat self-contradictory on the point, but I think
> their intention is to model it after their notion of a deterministic
> query:
> 
>          A <query expression> or <query specification> is possibly non-
>          deterministic if an SQL-implementation might, at two different
>          times where the state of the SQL-data is the same, produce results
>          that differ by more than the order of the rows due to General Rules
>          that specify implementation-dependent behavior.   [SQL99 4.17]

This section has been removed in SQL:2003. Instead, a new section 4.16
("Determinism") has been added. The first paragraph of the new section
states:

         In general, an operation is deterministic if that operation
         assuredly computes identical results when repeated with identical
         input values. For an SQL-invoked routine, the values in the
         argument list are regarded as the input; otherwise, the SQL-data
         and the set of privileges by which they are accessed is regarded
         as the input.

In my reading of the new section, there is nothing which indicates that
determinism is related to whether stored data are changed or not.

> What it actually says about deterministic functions in 4.23 is:
[...]
>          An SQL-invoked routine
>          is possibly non-deterministic if, during invocation of that SQL-
>          invoked routine, an SQL-implementation might, at two different
>          times when the state of the SQL-data is the same, produce unequal
>          results
[...]

This paragraph has also been altered in SQL:2003. In SQL:2003's section
4.27.2, there is still a section on deterministic vs. possibly
non-deterministic routines; it doesn't say anything about the "state of
the SQL-data" any more. It says:

         An SQL-invoked routine is either deterministic or possibly
         non-deterministic. An SQL-invoked function that is deterministic
         always returns the identical return value for a given list of SQL
         argument values. [... cut stuff about SQL-invoked procedures which
         PostgreSQL doesn't support yet ...] An SQL-invoked routine is
         possibly non-deterministic if it might produce nonidentical results
         when invoked with the identical list of SQL argument values.

> I think that they meant to define SQL functions as nondeterministic if
> they act like or contain nondeterministic queries; for instance 13.5
> says
> 
>          3) An <SQL procedure statement> S is possibly non-deterministic if
>             and only if at least one of the following is satisfied:
> 
>             a) S is a <select statement: single row> that is possibly non-
>               deterministic.
> 
>             b) S contains a <routine invocation> whose subject routine is an
>               SQL-invoked routine that possibly modifies SQL-data.
[...]

This has also been changed in SQL:2003:

         4) [<SQL procedure statement>] is possibly non-deterministic if
            and only if S is not an <SQL schema statement> and at least one
            of the following is satisfied:

            a) S is a <select statement: single row> that is possibly
               non-deterministic.

            b) S contains a <routine invocation> whose subject routine is an
               SQL-invoked routine that is possibly non-deterministic.

            c) S generally contains a <query specification> or a
               <query expression> that is possibly non-deterministic.

            d) S generally contains a <value expression> that is possibly
               non-deterministic.

> Anybody know whether the SQL2003 text clarifies the intent at all?

I think it's fair to say that SQL:2003 is clearer on this subject, and
that PostgreSQL's IMMUTABLE is equivalent to SQL:2003's DETERMINISTIC. The
remaining question is whether NOT DETERMINISTIC could be introduced as
an alias for VOLATILE:

> I don't think it's very helpful to identify NOT DETERMINISTIC with
> VOLATILE.  As a counterexample, now() is NOT DETERMINISTIC, but it
> isn't VOLATILE.

But does this have any semantic significance? - I mean: It's still safe to
call a function including now() NOT DETERMINISTIC==VOLATILE; no unexpected
results should result from this, except - potentially - lower performance.
I think it's a common phenomenon that performance can sometimes be
increased by utilizing certain product-specific expressions instead of
the standards-defined ones.
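
As a minimal sketch of that trade-off, assuming two hypothetical wrapper functions (stamp_volatile and stamp_stable are illustrative names, not existing functions):

```sql
-- Declared VOLATILE: always a safe label. The planner must assume the
-- result can change on every call.
CREATE FUNCTION stamp_volatile() RETURNS timestamp with time zone
    AS 'SELECT now()' LANGUAGE SQL VOLATILE;

-- Declared STABLE: also correct for now(), and it additionally tells the
-- planner that the result does not change within a single statement.
CREATE FUNCTION stamp_stable() RETURNS timestamp with time zone
    AS 'SELECT now()' LANGUAGE SQL STABLE;

-- Within one statement both calls return the same value,
-- so this comparison yields true.
SELECT stamp_volatile() = stamp_stable();
```

Both declarations give the same answers; the difference is only in what the planner is permitted to assume, and how much that matters in practice depends on the planner and on how the function is used.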

-- 
Greetings from Troels Arvin, Copenhagen, Denmark




Re: DETERMINISTIC as synonym for IMMUTABLE

From: "Simon Riggs"

>Tom Lane wrote
> In any case, whether or not you think DETERMINISTIC means IMMUTABLE,

Tom, your knowledge of the confusing bits of the standard puts us all to
shame.

Troels did have a point, which was to do with standards conformance and
compatibility. The main point at issue is whether someone can run some ANSI
compliant SQL against PostgreSQL and have it work. That's a worthy goal.

AFAICS, your info shows that the standard's definition of DETERMINISTIC is
confusing and contradictory. Most people's interpretation would be that
DETERMINISTIC was the same as IMMUTABLE, so we should make the former a
synonym for the latter and document the possible difference of
interpretation. Seriously, if you can't put a blade of grass between them
then they're OK to be equated.

My understanding is that DETERMINISTIC in Oracle would work the same as
IMMUTABLE in PostgreSQL...

> I don't think it's very helpful to identify NOT DETERMINISTIC with
> VOLATILE.  As a counterexample, now() is NOT DETERMINISTIC, but it
> isn't VOLATILE.
>

You're spot on again with your info. NOT DETERMINISTIC means either STABLE
or VOLATILE in PostgreSQL terms, not just one of those.

IMHO we should allow the use of NOT DETERMINISTIC and document that, although
it doesn't mean the same thing as VOLATILE, we infer that meaning
because that is the mapping that is always correct. If the user wishes to
gain the possible performance advantages offered by STABLE, then they can
alter their code to do so. We're allowed to have performance enhancing
additions to the standard.

This is a similar situation to PostgreSQL's implementation of transaction
isolation levels. The implementation both follows the standard and is
transactionally correct; READ UNCOMMITTED doesn't work *exactly* as the
standard says that level should, but this is all clearly documented and
we are happy with that.

The standard ain't perfect, but we should get as close as possible and
document the difference - as long as there's no loss of correctness, which I
don't think is at issue here.

I'll submit a patch unless there is substantial disagreement.

Best Regards,

Simon Riggs



Re: DETERMINISTIC as synonym for IMMUTABLE

From: Tom Lane

"Simon Riggs" <simon@2ndquadrant.com> writes:
> AFAICS, your info shows that the standard's definition of DETERMINISTIC is
> confusing and contradictory. Most people's interpretation would be that
> DETERMINISTIC was the same as IMMUTABLE, so we should make the former a
> synonym for the latter and document the possible difference of
> interpretation. Seriously, if you can't put a blade of grass between them
> then they're OK to be equated.

The problem is that you *can* put a blade of grass between them, and
sooner or later we may wish to make the distinction.  In particular
I can think of optimization possibilities that rely on being able to
identify a function as being a pure function of the database state.
(This would mostly come into play if we ever try to support function
result caching; we don't now, but certainly it's been asked for often
enough.)  If we equate DETERMINISTIC with IMMUTABLE then we won't be
able to use DETERMINISTIC to describe cache-able functions.

Troels's later followup says that SQL2003 has redefined DETERMINISTIC
to in fact mean IMMUTABLE; if so I suppose we'll have to go with the
flow.  I have not looked closely at that version to see if I agree with
his reading.

        regards, tom lane