Re: row() is [not] null infelicities - Mailing list pgsql-hackers

From Joe Conway
Subject Re: row() is [not] null infelicities
Date
Msg-id 44B46EA0.8090500@joeconway.com
Whole thread Raw
In response to Re: row() is [not] null infelicities  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> 
>>The SQL spec has some detailed discussion of some strange null behaviours.
> 
> BTW, Teodor Sigaev pointed out today that we are also doing array
> comparisons (array_eq, array_cmp) wrong.

Seems to me like at least array_eq is correct (from SQL2003):

4.10 Collection types
Let A1 and A2 be arrays of EDT. A1 and A2 are identical if and only if
A1 and A2 have the same cardinality n and if, for all i in the range 1
(one) ≤ i ≤ n, the element at ordinal position i in A1 is *identical* to
the element at ordinal position i in A2.

9.8 Determination of identical values
Function
Determine whether two instances of values are identical, that is to say, 
are occurrences of the same value.
General Rules
1) Let V1 and V2 be two values specified in an application of this 
Subclause.
NOTE 211 — This Subclause is invoked implicitly wherever the word 
identical is used of two values.
2) Case:
a) If V1 and V2 are both null, then V1 is *identical* to V2.


regression=# select array[1,null,3] = array[1,null,3]; ?column?
---------- t
(1 row)

> In the recent extension to make arrays support NULL entries, I had made these functions
> treat NULL as greater than all non-nulls, per btree sort order.
> But this seems wrong and also counter to spec: if an array comparison
> finds a NULL before determining its result, it should return NULL,
> same as a row comparison would do.  The problem with this is that it
> breaks btree indexing of array columns (... and I think btree indexing
> of rowtypes has a problem too ...).  btree wants to have a well-defined
> ordering of any two non-null values.  Ideas?

Interestingly, I see this in SQL2003:

9.12 Ordering operations
Function
Specify the prohibitions and restrictions by data type on operations 
that involve ordering of data.
[...]
3) The declared type of an operand of an ordering operation shall not be
LOB-ordered, array-ordered, multisetordered, reference-ordered,
UDT-EC-ordered, or UDT-NC-ordered.


4.1.4 Comparison and ordering
[...]
— T is a collection type and the element type of T is S-ordered.
[...]
The notion of S-ordered is applied in the following definitions:
[...]
— A type T is array-ordered if T is ARR-ordered, where ARR is the set of 
array types.

Does that say arrays can't be ordered? Or does it say that the ordering 
follows the say rules as the array element type? If it is the latter, 
aren't we already doing the right thing?

regression=# (select 1 union all select null union all select 3) order by 1; ?column?
----------        1        3

(3 rows)

regression=# (select array[1,1] union all select array[1,null] union all 
select array[1,3]) order by 1;  array
---------- {1,1} {1,3} {1,NULL}
(3 rows)


> A nearby issue is that the spec seems to want IS [NOT] DISTINCT FROM
> to drill down into array and row values, ie, comparing arrays with
> these functions needs to consider null entries as comparable instead
> of forcing a null result.  AFAICS this will require special-casing
> array and row types in IS [NOT] DISTINCT FROM ... anyone see a better
> way?

Yup, that's the way I read it too. Of course, that seems to work too:

regression=# select array[1,null,3] is distinct from array[1,null,3]; ?column?
---------- f
(1 row)

regression=# select array[1,null,3] is not distinct from array[1,null,3]; ?column?
---------- t
(1 row)

Hmm, did I miss a commit message since you posted this?

Joe


pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Three weeks left until feature freeze
Next
From: Greg Stark
Date:
Subject: Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()