Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query - Mailing list pgsql-bugs

From Andrew Gierth
Subject Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Date
Msg-id 87sh3g4h8n.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> Marking the function parallel safe doesn't seem wrong to me. The
 >> non-parallel-safe part is that the input gets fed to it in different
 >> order in different workers. And I don't really think that to be the
 >> function's fault.

 Tom> So that basically opens the question of whether *any* window
 Tom> function calculation can safely be pushed down to parallel
 Tom> workers.

Grepping the spec for the phrase "possibly non-deterministic" is quite
enlightening. Leaving out non-determinisms caused by timezone or actual
volatility, leaving out cases of non-determinism that we'd call
"stable", and leaving out features like multisets that we don't support
at all, here's the list of interesting cases (comments after each quoted
paragraph are mine):

6.28 <value expression>

  d) An <array value constructor by query>.

i.e. ARRAY(select)

  o) An <aggregate function> that specifies MIN or MAX and that simply
     contains a <value expression> whose declared type is based on a
     character string type, user-defined type, or datetime with time
     zone type.

i.e. MIN(x) is non-deterministic if "x" can have distinguishable values
that compare equal. PG doesn't have that for text or timestamptz, unlike
the spec, but it does for citext or other user-defined types.

  q) An <array aggregate function>.

i.e. array_agg()

  u) A <window function> that specifies ROW_NUMBER, FIRST_VALUE,
     LAST_VALUE, NTH_VALUE, NTILE, LEAD, or LAG, or whose associated
     <window specification> specifies ROWS.

This covers those cases where window functions don't treat peer rows
together.

7.6 <table reference>

  27) A <table reference> is possibly non-deterministic if the simply
      contained <table primary> or <joined table> is possibly
      non-deterministic or if <sample clause> is specified.

i.e. TABLESAMPLE is non-deterministic

7.16 <query specification>

  a) The <set quantifier> DISTINCT is specified and one of the columns
     of T has a data type of character string, user-defined type, TIME
     WITH TIME ZONE, or TIMESTAMP WITH TIME ZONE.

  c) The <select list>, <having clause>, or <window clause> contains a
     reference to a column C of T that has a data type of character
     string, user-defined type, TIME WITH TIME ZONE, or TIMESTAMP WITH
     TIME ZONE, and the functional dependency G C, where G is the set
     consisting of the grouping columns of T, holds in T.

For both the above two cases, if distinguishable values of a type
compare equal, it's non-deterministic which gets into the result.

7.17 <query expression>

  a) The <query expression> contains a <result offset clause>.

  b) The <query expression> contains a <fetch first clause>.

  f) Both of the following are true:

    i) T contains a set operator UNION and ALL is not specified, or T
       contains either of the set operators EXCEPT or INTERSECT.

   ii) At least one of the following is true:

       1) The first or second operand contains a column that has a
          declared type of character string.

       2) The first or second operand contains a column that has a
          declared type of datetime with time zone.

       3) The first or second operand contains a column that has a
          declared type that is a user-defined type.

(I've left out the many clauses which just amount to "if $thing contains
something which is possibly non-deterministic then it is possibly
non-deterministic")

-- 
Andrew (irc:RhodiumToad)


pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #15326: keep getting "ERROR: type "earth" does not exist" whileit does exist
Next
From: Amit Kapila
Date:
Subject: Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query