Thread: WIP json generation enhancements

WIP json generation enhancements

From
Andrew Dunstan
Date:
Here is a WIP patch for enhancements to json generation.

First, there is the much_requested json_agg, which will aggregate rows
directly to json. So the following will now work:

     select json_agg(my_table) from mytable;
     select json_agg(q) from (<myquery here>) q;

One open question regarding this feature is whether this should return
NULL or '[]' for 0 rows. Currently it returns NULL but I could be
convinced to return '[]', and the change would be very small.

Next is to_json(), which will turn any value into json, so we're no
longer restricted to rows and arrays.

Non-builtin types are now searched for a cast to json, and if it exists
it is used instead of the type's text representation. I didn't add a
special type to look for a cast to, as was discussed before, as it
seemed a bit funky and unnecessary. It can easily be added, but I'm
still not convinced it's a good idea. Note that this is only done for
types that aren't builtin - we know how to turn all of those into json
without needing to look for a cast.

Along with this there is an hstore_to_json() function added to the
hstore module, and a cast from hstore to json that uses it. This
function treats every value in the hstore as a string. There is also a
function with the working title of hstore_to_json_loose() that does a
heuristic conversion that treats values of 't' and 'f' as booleans, and
strings that look like numbers as numbers unless they start with a
leading 0 followed by another digit (could be zip codes, phone numbers
etc.) The difference between these is illustrated here (notice that
quoted '"t"' becomes unquoted 'true' and quoted '"1"' becomes '1'):

    andrew=# select json_agg(q) from foo q;
                                 json_agg
    -----------------------------------------------------------------
      [{"a":"a","b":1,"h":{"c": "t", "d": null, "q": "1", "x": "y"}}]
    (1 row)

    andrew=# select json_agg(q) from (select a, b, hstore_to_json_loose(h) as h from foo) q;
                                 json_agg
    ----------------------------------------------------------------
      [{"a":"a","b":1,"h":{"c": true, "d": null, "q": 1, "x": "y"}}]
    (1 row)

Note: this patch will need a change in the oids used for the new functions if applied against git tip, as they have
beenovertaken by time. 


Comments welcome.


cheers

andrew




Attachment

Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/21/2012 03:16 PM, Andrew Dunstan wrote:
> Here is a WIP patch for enhancements to json generation.
>
> First, there is the much_requested json_agg, which will aggregate rows
> directly to json. So the following will now work:
>
>     select json_agg(my_table) from mytable;
>     select json_agg(q) from (<myquery here>) q;
>
> One open question regarding this feature is whether this should return
> NULL or '[]' for 0 rows. Currently it returns NULL but I could be
> convinced to return '[]', and the change would be very small.
>
> Next is to_json(), which will turn any value into json, so we're no
> longer restricted to rows and arrays.
>
> Non-builtin types are now searched for a cast to json, and if it
> exists it is used instead of the type's text representation. I didn't
> add a special type to look for a cast to, as was discussed before, as
> it seemed a bit funky and unnecessary. It can easily be added, but I'm
> still not convinced it's a good idea. Note that this is only done for
> types that aren't builtin - we know how to turn all of those into json
> without needing to look for a cast.
>
> Along with this there is an hstore_to_json() function added to the
> hstore module, and a cast from hstore to json that uses it. This
> function treats every value in the hstore as a string. There is also a
> function with the working title of hstore_to_json_loose() that does a
> heuristic conversion that treats values of 't' and 'f' as booleans,
> and strings that look like numbers as numbers unless they start with a
> leading 0 followed by another digit (could be zip codes, phone numbers
> etc.) The difference between these is illustrated here (notice that
> quoted '"t"' becomes unquoted 'true' and quoted '"1"' becomes '1'):
>
>    andrew=# select json_agg(q) from foo q;
>                                 json_agg
> -----------------------------------------------------------------
>      [{"a":"a","b":1,"h":{"c": "t", "d": null, "q": "1", "x": "y"}}]
>    (1 row)
>
>    andrew=# select json_agg(q) from (select a, b,
> hstore_to_json_loose(h) as h from foo) q;
>                                 json_agg
> ----------------------------------------------------------------
>      [{"a":"a","b":1,"h":{"c": true, "d": null, "q": 1, "x": "y"}}]
>    (1 row)
>
> Note: this patch will need a change in the oids used for the new
> functions if applied against git tip, as they have been overtaken by
> time.
>
>
> Comments welcome.
>
>
>

Updated patch that works with git tip and has regression tests.

cheers

andrew




Attachment

Re: WIP json generation enhancements

From
Dimitri Fontaine
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Here is a WIP patch for enhancements to json generation.
>
> First, there is the much_requested json_agg, which will aggregate rows
> directly to json. So the following will now work:
>
>     select json_agg(my_table) from mytable;
>     select json_agg(q) from (<myquery here>) q;

Awesome, thanks!

How do you handle the nesting of the source elements? I would expect a
variant of the aggregate that takes two input parameters, the datum and
the current nesting level.

Consider a tree table using parent_id and a recursive query to display
the tree. You typically handle the nesting with an accumulator and a
call to repeat() to prepend some spaces before the value columns. What
about passing that nesting level (integer) to the json_agg()?

Here's a worked out example:
   CREATE TABLE parent_child (       parent_id integer NOT NULL,       this_node_id integer NULL   );      INSERT INTO
parent_child(parent_id, this_node_id) VALUES (0, 1);   INSERT INTO parent_child (parent_id, this_node_id) VALUES (1,
2);  INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3);   INSERT INTO parent_child (parent_id,
this_node_id)VALUES (1, 4);   INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5);   INSERT INTO
parent_child(parent_id, this_node_id) VALUES (2, 6);   INSERT INTO parent_child (parent_id, this_node_id) VALUES (4,
7);  INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8);   INSERT INTO parent_child (parent_id,
this_node_id)VALUES (4, 9);   INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10);   WITH RECURSIVE
tree(id,level, parents) AS (       SELECT this_node_id as id, 0 as level, '{}'::int[] as parents         FROM
parent_child       WHERE parent_id = 0               UNION ALL              SELECT this_node_id as id, t.level + 1,
t.parents|| c.parent_id         FROM parent_child c         JOIN tree t ON t.id = c.parent_id   )   SELECT json_agg(id,
level)    FROM tree;
 

I've left the parents column in the query above as a debug facility, but
it's not needed in that case.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/22/2012 05:54 AM, Dimitri Fontaine wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Here is a WIP patch for enhancements to json generation.
>>
>> First, there is the much_requested json_agg, which will aggregate rows
>> directly to json. So the following will now work:
>>
>>      select json_agg(my_table) from mytable;
>>      select json_agg(q) from (<myquery here>) q;
> Awesome, thanks!
>
> How do you handle the nesting of the source elements? I would expect a
> variant of the aggregate that takes two input parameters, the datum and
> the current nesting level.
>
> Consider a tree table using parent_id and a recursive query to display
> the tree. You typically handle the nesting with an accumulator and a
> call to repeat() to prepend some spaces before the value columns. What
> about passing that nesting level (integer) to the json_agg()?
>
> Here's a worked out example:
>
>      CREATE TABLE parent_child (
>          parent_id integer NOT NULL,
>          this_node_id integer NULL
>      );
>      
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9);
>      INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10);
>     
>     
>      WITH RECURSIVE tree(id, level, parents) AS (
>          SELECT this_node_id as id, 0 as level, '{}'::int[] as parents
>            FROM parent_child
>           WHERE parent_id = 0
>           
>          UNION ALL
>          
>          SELECT this_node_id as id, t.level + 1, t.parents || c.parent_id
>            FROM parent_child c
>            JOIN tree t ON t.id = c.parent_id
>      )
>      SELECT json_agg(id, level)
>        FROM tree;
>
> I've left the parents column in the query above as a debug facility, but
> it's not needed in that case.
>

the function only takes a single argument and aggregates all the values 
into a json array. If the arguments are composites they will produce 
json objects.

People complained that to get a resultset as json you have to do in 9.2
    select array_to_json(array_agg(q)) ...

which is both a bit cumbersome and fairly inefficient. json_agg(q) is 
equivalent to the above expression but is both simpler and much more 
efficient.

If you want a tree structured object you'll need to construct it 
yourself - this function won't do the nesting for you. That's beyond its 
remit.

cheers

andrew



Re: WIP json generation enhancements

From
Robert Haas
Date:
On Wed, Nov 21, 2012 at 3:16 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Non-builtin types are now searched for a cast to json, and if it exists it
> is used instead of the type's text representation. I didn't add a special
> type to look for a cast to, as was discussed before, as it seemed a bit
> funky and unnecessary. It can easily be added, but I'm still not convinced
> it's a good idea. Note that this is only done for types that aren't builtin
> - we know how to turn all of those into json without needing to look for a
> cast.

The place where I fear this will cause problems is with non-core
text-like datatypes, such as citext.  For example, I wonder if
creating a cast from citext to json - which seems like a sensible
thing to want to do for other reasons - changes the semantics of this
function when applied to citext objects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/26/2012 10:55 AM, Robert Haas wrote:
> On Wed, Nov 21, 2012 at 3:16 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Non-builtin types are now searched for a cast to json, and if it exists it
>> is used instead of the type's text representation. I didn't add a special
>> type to look for a cast to, as was discussed before, as it seemed a bit
>> funky and unnecessary. It can easily be added, but I'm still not convinced
>> it's a good idea. Note that this is only done for types that aren't builtin
>> - we know how to turn all of those into json without needing to look for a
>> cast.
> The place where I fear this will cause problems is with non-core
> text-like datatypes, such as citext.  For example, I wonder if
> creating a cast from citext to json - which seems like a sensible
> thing to want to do for other reasons - changes the semantics of this
> function when applied to citext objects.
>


I don't understand why you would want to create such a cast. If the cast 
doesn't exist it will do exactly what it does now, i.e. use the type's 
output function and then json quote and escape it, which in the case of 
citext is the Right Thing (tm):
   andrew=# select to_json('foo"bar'::citext);      to_json   ------------     "foo\"bar"


cheers

andrew





Re: WIP json generation enhancements

From
Robert Haas
Date:
On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> I don't understand why you would want to create such a cast. If the cast
> doesn't exist it will do exactly what it does now, i.e. use the type's
> output function and then json quote and escape it, which in the case of
> citext is the Right Thing (tm):
>
>    andrew=# select to_json('foo"bar'::citext);
>       to_json
>    ------------
>      "foo\"bar"

I'm not sure either, but setting up a system where seemingly innocuous
actions can in fact have surprising and not-easily-fixable
consequences in other parts of the system doesn't seem like good
design to me.  Of course, maybe I'm misunderstanding what will happen;
I haven't actually tested it myself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: WIP json generation enhancements

From
Merlin Moncure
Date:
On Thu, Nov 22, 2012 at 4:54 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Here is a WIP patch for enhancements to json generation.
>>
>> First, there is the much_requested json_agg, which will aggregate rows
>> directly to json. So the following will now work:
>>
>>     select json_agg(my_table) from mytable;
>>     select json_agg(q) from (<myquery here>) q;
>
> Awesome, thanks!
>
> How do you handle the nesting of the source elements? I would expect a
> variant of the aggregate that takes two input parameters, the datum and
> the current nesting level.
>
> Consider a tree table using parent_id and a recursive query to display
> the tree. You typically handle the nesting with an accumulator and a
> call to repeat() to prepend some spaces before the value columns. What
> about passing that nesting level (integer) to the json_agg()?
>
> Here's a worked out example:
>
>     CREATE TABLE parent_child (
>         parent_id integer NOT NULL,
>         this_node_id integer NULL
>     );
>
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9);
>     INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10);
>
>
>     WITH RECURSIVE tree(id, level, parents) AS (
>         SELECT this_node_id as id, 0 as level, '{}'::int[] as parents
>           FROM parent_child
>          WHERE parent_id = 0
>
>         UNION ALL
>
>         SELECT this_node_id as id, t.level + 1, t.parents || c.parent_id
>           FROM parent_child c
>           JOIN tree t ON t.id = c.parent_id
>     )
>     SELECT json_agg(id, level)
>       FROM tree;
>
> I've left the parents column in the query above as a debug facility, but
> it's not needed in that case.

I don't think there is any way a json_agg() function could reasonably
do this.  The only possible dataset it could work on would be for
homogeneously typed array-like data which is not very interesting for
the broader case of nested json productions. I think the right way to
do it is to work out the precise structure in sql and do the
transformation directly on the type using the already existing
transformation functions.

That said, recursive structures are a pain presently because postgres
composite types are not allowed to be recursive.  So ISTM getting that
restriction relaxed is the way to go; then you build it in sql and
just fire whatever xxx_to_json happens to be the most
appropriate...then your example would be a snap.

merlin



Re: WIP json generation enhancements

From
Peter Eisentraut
Date:
On 11/21/12 3:16 PM, Andrew Dunstan wrote:
> One open question regarding this feature is whether this should return
> NULL or '[]' for 0 rows. Currently it returns NULL but I could be
> convinced to return '[]', and the change would be very small.

Although my intuition would be [], the existing concatenation-like
aggregates return null for no input rows, so this probably ought to be
consistent with those.



Re: WIP json generation enhancements

From
Hannu Krosing
Date:
On 11/26/2012 08:12 PM, Peter Eisentraut wrote:
> On 11/21/12 3:16 PM, Andrew Dunstan wrote:
>> One open question regarding this feature is whether this should return
>> NULL or '[]' for 0 rows. Currently it returns NULL but I could be
>> convinced to return '[]', and the change would be very small.
> Although my intuition would be [], the existing concatenation-like
> aggregates return null for no input rows, so this probably ought to be
> consistent with those.
>
In some previous mail Tom Lane claimed that by SQL standard
either an array of all NULLs or a record with all fields NULLs (I
don't remember which) is also considered NULL. If this is true,
then an empty array - which can be said to consist of nothing
but NULLs - should itself be NULL.

If this is so, than the existing behaviour of returning NULL in such
cases is what standard requires.

Hannu Krosing



Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/26/2012 02:46 PM, Hannu Krosing wrote:
> On 11/26/2012 08:12 PM, Peter Eisentraut wrote:
>> On 11/21/12 3:16 PM, Andrew Dunstan wrote:
>>> One open question regarding this feature is whether this should return
>>> NULL or '[]' for 0 rows. Currently it returns NULL but I could be
>>> convinced to return '[]', and the change would be very small.
>> Although my intuition would be [], the existing concatenation-like
>> aggregates return null for no input rows, so this probably ought to be
>> consistent with those.
>>
> In some previous mail Tom Lane claimed that by SQL standard
> either an array of all NULLs or a record with all fields NULLs (I
> don't remember which) is also considered NULL. If this is true,
> then an empty array - which can be said to consist of nothing
> but NULLs - should itself be NULL.
>
> If this is so, than the existing behaviour of returning NULL in such
> cases is what standard requires.
>
>


That would be more relevant if we were talking about postgres arrays, 
but the '[]' here would not be a postgres array - it would be a piece of 
json text.

But in any case, the consensus seems to be to return null, and on the 
principle of doing the least work required I'm happy to comply.

cheers

andrew




Re: WIP json generation enhancements: fk-tree-to-json()

From
Hannu Krosing
Date:
On 11/22/2012 11:54 AM, Dimitri Fontaine wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Here is a WIP patch for enhancements to json generation.
>>
>> First, there is the much_requested json_agg, which will aggregate rows
>> directly to json. So the following will now work:
>>
>>      select json_agg(my_table) from mytable;
>>      select json_agg(q) from (<myquery here>) q;
> Awesome, thanks!
>
> How do you handle the nesting of the source elements? I would expect a
> variant of the aggregate that takes two input parameters, the datum and
> the current nesting level.
>
> Consider a tree table using parent_id and a recursive query to display
> the tree. You typically handle the nesting with an accumulator and a
> call to repeat() to prepend some spaces before the value columns. What
> about passing that nesting level (integer) to the json_agg()?
It still would not produxe nesting, just a nicer format.

If you want real nesting, you may want a version of my pl/python function
row-with-all-dependents-by-foreign-key-to-json()

which outputs a table row and then recursively all rows from other (or the same) table which have a foreign key
relationshipto this row
 

I use it to backup removed objects.

I would love to have something similar as a built-in function, though 
the current version
has some limitations and lacks some checks, like check for FK loops.


Function follows:
-------------------------------------------------------
CREATE OR REPLACE FUNCTION record_to_json_with_detail(table_name text, 
pk_value int) RETURNS text AS $$

import json,re

def fk_info(table_name):    fkplan = plpy.prepare("""    SELECT conrelid::regclass as reftable,
pg_get_constraintdef(c.oid)as condef      FROM pg_constraint c     WHERE c.confrelid::regclass = $1::regclass       AND
c.contype= 'f'    """, ("text",))    cdefrx = re.compile('FOREIGN KEY [(](.*)[)] REFERENCES .*[(](.*)[)].*')    fkres =
plpy.execute(fkplan,(table_name,))    for row in fkres:        reffields, thisfields =
cdefrx.match(row['condef']).groups()       yield thisfields, row['reftable'],reffields
 

def select_from_table_by_col(table_name, col_name, col_value):    qplan = plpy.prepare('select * from %s where %s = $1'
%
 
(table_name, col_name), ('int',))    return plpy.execute(qplan, (col_value,))

def recursive_rowdict(table_name, row_dict):    rd = dict([(a,b) for (a,b) in row_dict.items() if b is not None]) # 
skip NULLs    rd['__row_class__'] = table_name    for id_col, ref_tab, ref_col in fk_info(table_name):        q2res =
select_from_table_by_col(ref_tab,
 
ref_col,row_dict[id_col])        if q2res:            try:                rd['__refs__::' + id_col] += 
[recursive_rowdict(ref_tab,row) for row in q2res]            except KeyError:                rd['__refs__::' + id_col]
=
 
[recursive_rowdict(ref_tab,row) for row in q2res]    return rd

q1res = select_from_table_by_col(table_name, 'id', pk_value)
return json.dumps(recursive_rowdict(table_name, q1res[0]), indent=3)
$$ LANGUAGE plpythonu;

create table test1(id serial primary key, selfkey int references test1, 
data text);
create table test2(id serial primary key, test1key int references test1, 
data text);

insert into test1 values(1,null,'top');
insert into test1 values(2,1,'lvl1');
insert into test2 values(1,1,'lvl1-2');
insert into test2 values(2,2,'lvl2-2');

select record_to_json_with_detail('test1',1);        record_to_json_with_detail
------------------------------------------- {    "__row_class__": "test1",    "data": "top",    "id": 1,
"__refs__::id":[       {          "__row_class__": "test1",          "selfkey": 1,          "data": "lvl1",
"id":2,          "__refs__::id": [             {                "__row_class__": "test2",                "test1key": 2,
              "data": "lvl2-2",                "id": 2             }          ]       },       {
"__row_class__":"test2",          "test1key": 1,          "data": "lvl1-2",          "id": 1       }    ] }
 
(1 row)

Time: 6.576 ms

---------------------------------------
Hannu Krosing




Re: WIP json generation enhancements

From
Tom Lane
Date:
Hannu Krosing <hannu@2ndQuadrant.com> writes:
> In some previous mail Tom Lane claimed that by SQL standard
> either an array of all NULLs or a record with all fields NULLs (I
> don't remember which) is also considered NULL. If this is true,
> then an empty array - which can be said to consist of nothing
> but NULLs - should itself be NULL.

What I think you're referring to is that the spec says that "foo IS
NULL" should return true if foo is a record containing only null fields.
That's a fairly narrow statement.  It does NOT say that NULL and
(NULL,NULL,...) are indistinguishable for all purposes; only that
this particular test doesn't distinguish them.  Also I don't think they
have the same statement for arrays.

The analogy to other aggregates is probably a better thing to argue
from.  On the other hand, I don't know anyone outside the SQL standards
committee who thinks it's actually a good idea that SUM() across no rows
returns null rather than zero.
        regards, tom lane



Re: WIP json generation enhancements : strange IS NULL behaviour

From
Hannu Krosing
Date:
On 11/26/2012 09:05 PM, Tom Lane wrote:
> Hannu Krosing <hannu@2ndQuadrant.com> writes:
>> In some previous mail Tom Lane claimed that by SQL standard
>> either an array of all NULLs or a record with all fields NULLs (I
>> don't remember which) is also considered NULL. If this is true,
>> then an empty array - which can be said to consist of nothing
>> but NULLs - should itself be NULL.
> What I think you're referring to is that the spec says that "foo IS
> NULL" should return true if foo is a record containing only null fields.
Is this requirement recursive ?

That is , should

ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL
also be true ?

Currently PostgreSQL does this kind of IS NULL for "simple" rows

hannu=# SELECT ROW(NULL, NULL) IS NULL; ?column?
---------- t
(1 row)

and also for first level row types

hannu=# SELECT ROW(NULL, ROW(NULL, NULL)) IS NULL; ?column?
---------- t
(1 row)

but then mysteriously stops working at third level

hannu=# SELECT ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL; ?column?
---------- f
(1 row)


> That's a fairly narrow statement.  It does NOT say that NULL and
> (NULL,NULL,...) are indistinguishable for all purposes; only that
> this particular test doesn't distinguish them.  Also I don't think they
> have the same statement for arrays.
>
> The analogy to other aggregates is probably a better thing to argue
> from.  On the other hand, I don't know anyone outside the SQL standards
> committee who thinks it's actually a good idea that SUM() across no rows
> returns null rather than zero.
>
Might be done in order to be in sync with other aggregates - for
example the "return NULL for no rows" behaviour makes perfect
sense for MIN(), AVG(), etc.

------------------------
Hannu






Re: WIP json generation enhancements : strange IS NULL behaviour

From
Tom Lane
Date:
Hannu Krosing <hannu@2ndQuadrant.com> writes:
> On 11/26/2012 09:05 PM, Tom Lane wrote:
>> The analogy to other aggregates is probably a better thing to argue
>> from.  On the other hand, I don't know anyone outside the SQL standards
>> committee who thinks it's actually a good idea that SUM() across no rows
>> returns null rather than zero.

> Might be done in order to be in sync with other aggregates - for
> example the "return NULL for no rows" behaviour makes perfect
> sense for MIN(), AVG(), etc.

Well, if they'd made COUNT() of no rows return null, then I'd agree that
they were pursuing consistency.  As it stands, it's neither consistent
nor very sensible.
        regards, tom lane



Re: WIP json generation enhancements

From
Robert Haas
Date:
On Mon, Nov 26, 2012 at 3:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The analogy to other aggregates is probably a better thing to argue
> from.  On the other hand, I don't know anyone outside the SQL standards
> committee who thinks it's actually a good idea that SUM() across no rows
> returns null rather than zero.

Me neither.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/26/2012 12:31 PM, Robert Haas wrote:
> On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I don't understand why you would want to create such a cast. If the cast
>> doesn't exist it will do exactly what it does now, i.e. use the type's
>> output function and then json quote and escape it, which in the case of
>> citext is the Right Thing (tm):
>>
>>     andrew=# select to_json('foo"bar'::citext);
>>        to_json
>>     ------------
>>       "foo\"bar"
> I'm not sure either, but setting up a system where seemingly innocuous
> actions can in fact have surprising and not-easily-fixable
> consequences in other parts of the system doesn't seem like good
> design to me.  Of course, maybe I'm misunderstanding what will happen;
> I haven't actually tested it myself.
>


I'm all for caution, but the argument here seems a bit nebulous. We 
could create a sort of auxiliary type, as has been previously suggested, 
that would be in all respects the same as the json type except that it 
would be the target of the casts that would be used in to_json() and 
friends. But, that's a darned ugly thing to have to do, so I'd want a 
good concrete reason for doing it. Right now I'm having a hard time 
envisioning a problem that could be caused by just using the 
straightforward solution that's in my latest patch.

cheers

andrew





Re: WIP json generation enhancements

From
Andrew Dunstan
Date:
On 11/27/2012 02:38 PM, Andrew Dunstan wrote:
>
> On 11/26/2012 12:31 PM, Robert Haas wrote:
>> On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan 
>> <andrew@dunslane.net> wrote:
>>> I don't understand why you would want to create such a cast. If the 
>>> cast
>>> doesn't exist it will do exactly what it does now, i.e. use the type's
>>> output function and then json quote and escape it, which in the case of
>>> citext is the Right Thing (tm):
>>>
>>>     andrew=# select to_json('foo"bar'::citext);
>>>        to_json
>>>     ------------
>>>       "foo\"bar"
>> I'm not sure either, but setting up a system where seemingly innocuous
>> actions can in fact have surprising and not-easily-fixable
>> consequences in other parts of the system doesn't seem like good
>> design to me.  Of course, maybe I'm misunderstanding what will happen;
>> I haven't actually tested it myself.
>>
>
>
> I'm all for caution, but the argument here seems a bit nebulous. We 
> could create a sort of auxiliary type, as has been previously 
> suggested, that would be in all respects the same as the json type 
> except that it would be the target of the casts that would be used in 
> to_json() and friends. But, that's a darned ugly thing to have to do, 
> so I'd want a good concrete reason for doing it. Right now I'm having 
> a hard time envisioning a problem that could be caused by just using 
> the straightforward solution that's in my latest patch.
>
>


So, are there any other opinions on this besides mine and Robert's? I'd 
like to move forward but I want to get this resolved first.

cheers

andrew




Re: WIP json generation enhancements

From
"David E. Wheeler"
Date:
On Nov 26, 2012, at 11:12 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

> Although my intuition would be [], the existing concatenation-like
> aggregates return null for no input rows, so this probably ought to be
> consistent with those.

This annoys me at times, but I wrap such calls in COALESCE() and forget about it. So I agree to keep it consistent with
otherarray-returning aggregate functions. 

Best,

David




Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Mon, Nov 26, 2012 at 09:29:19PM +0100, Hannu Krosing wrote:
> On 11/26/2012 09:05 PM, Tom Lane wrote:
> >Hannu Krosing <hannu@2ndQuadrant.com> writes:
> >>In some previous mail Tom Lane claimed that by SQL standard
> >>either an array of all NULLs or a record with all fields NULLs (I
> >>don't remember which) is also considered NULL. If this is true,
> >>then an empty array - which can be said to consist of nothing
> >>but NULLs - should itself be NULL.
> >What I think you're referring to is that the spec says that "foo IS
> >NULL" should return true if foo is a record containing only null fields.
> Is this requirement recursive ?
>
> That is , should
>
> ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL
> also be true ?
>
> Currently PostgreSQL does this kind of IS NULL for "simple" rows
>
> hannu=# SELECT ROW(NULL, NULL) IS NULL;
>  ?column?
> ----------
>  t
> (1 row)
>
> and also for first level row types
>
> hannu=# SELECT ROW(NULL, ROW(NULL, NULL)) IS NULL;
>  ?column?
> ----------
>  t
> (1 row)
>
> but then mysteriously stops working at third level
>
> hannu=# SELECT ROW(NULL, NULL, ROW(NULL, ROW(NULL, NULL))) IS NULL;
>  ?column?
> ----------
>  f
> (1 row)

I finally had time to look at this, and it is surprising.  I used
EXPLAIN VERBOSE to see what the optimizer was outputting:

    EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
                    QUERY PLAN
    ------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: true

    EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
                    QUERY PLAN
    ------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: (ROW(NULL::unknown) IS NULL)

    EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
                     QUERY PLAN
    ---------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: (ROW(ROW(NULL::unknown)) IS NULL)

The first test outputs a constant, 'true'.  The second test is
ROW(NULL::unknown) because the inner ROW(NULL) was converted to
NULL:unknown.  The third one, which returns false (wrong), happens
because you have ROW embedded in ROW, which the optimizer can't process,
and the executor can't either.

I developed the attached patch which properly recurses into ROW()
records checking for NULLs;  you can see it returns the right answer in
all cases (and constant folds too):

    test=> EXPLAIN VERBOSE SELECT ROW(null) IS NULL;
                    QUERY PLAN
    ------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: true
    (2 rows)

    test=> EXPLAIN VERBOSE SELECT ROW(ROW(null)) IS NULL;
                    QUERY PLAN
    ------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: true
    (2 rows)

    test=> EXPLAIN VERBOSE SELECT ROW(ROW(ROW(null))) IS NULL;
                    QUERY PLAN
    ------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=0)
-->       Output: true
    (2 rows)

You might think the problem is only with constants, but it extends to
column values too (non-patched server):

    CREATE TABLE test (x INT);
    CREATE TABLE

    INSERT INTO test VALUES (1), (NULL);
    INSERT 0 2

    SELECT ROW(x) IS NULL FROM test;
     ?column?
    ----------
     f
     t

    SELECT ROW(ROW(x)) IS NULL FROM test;
     ?column?
    ----------
     f
     t

    SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
     ?column?
    ----------
-->     f
-->     f

With the patch, that works too:

    SELECT ROW(ROW(ROW(x))) IS NULL FROM test;
     ?column?
    ----------
     f
     t

The optimizer seems like the right place to fix this, per my patch.  It
already flattens IS NULL tests into a series of AND clauses, and now by
recursing, it handles nested ROW values properly too.

This fix would be for head only.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: strange IS NULL behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I developed the attached patch which properly recurses into ROW()
> records checking for NULLs;  you can see it returns the right answer in
> all cases (and constant folds too):

My recollection of the previous discussion is that we didn't have
consensus on what the "right" behavior is, so I'm not sure you can just
assert that this patch is right.  In any case this is only touching the
tip of the iceberg.  If we intend that rows of nulls should be null,
then we have got issues with, for example, NOT NULL column constraint
checks, which don't have any such recursion built into them.  I think
the same is true for plpgsql variable NOT NULL restrictions, and there
are probably some other places.

> The optimizer seems like the right place to fix this, per my patch.

No, it isn't, or at least it's far from the only place.  If we're going
to change this, we would also want to change the behavior of tests on
RECORD values, which is something that would have to happen at runtime.
        regards, tom lane



Re: strange IS NULL behaviour

From
Alvaro Herrera
Date:
Tom Lane wrote:

> My recollection of the previous discussion is that we didn't have
> consensus on what the "right" behavior is, so I'm not sure you can just
> assert that this patch is right.  In any case this is only touching the
> tip of the iceberg.  If we intend that rows of nulls should be null,
> then we have got issues with, for example, NOT NULL column constraint
> checks, which don't have any such recursion built into them.

FWIW if changing the behavior of NOT NULL constraints is desired, I
still have the patch to catalogue them around, if anyone wants to play
around.  I haven't gotten around to finishing it up, yet :-(

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: strange IS NULL behaviour

From
Peter Eisentraut
Date:
On 7/4/13 5:06 PM, Alvaro Herrera wrote:
> FWIW if changing the behavior of NOT NULL constraints is desired, I
> still have the patch to catalogue them around, if anyone wants to play
> around.  I haven't gotten around to finishing it up, yet :-(

If your latest patch isn't publicly available, I'd like to see it.  I
might work on that later this year.





Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I developed the attached patch which properly recurses into ROW()
> > records checking for NULLs;  you can see it returns the right answer in
> > all cases (and constant folds too):
> 
> My recollection of the previous discussion is that we didn't have
> consensus on what the "right" behavior is, so I'm not sure you can just
> assert that this patch is right.  In any case this is only touching the
> tip of the iceberg.  If we intend that rows of nulls should be null,
> then we have got issues with, for example, NOT NULL column constraint
> checks, which don't have any such recursion built into them.  I think
> the same is true for plpgsql variable NOT NULL restrictions, and there
> are probably some other places.

Well we have three cases:
1  SELECT ROW(NULL) IS NULL;2  SELECT ROW(ROW(NULL)) IS NULL;3  SELECT ROW(ROW(ROW(NULL))) IS NULL;

I think we could have them all return false, or all true, or the first
one true, and the rest false.  What I don't think we can justify is 1
and 2 as true, and 3 false.

Can someone show how those others behave?  I don't know enough to test
it.

> > The optimizer seems like the right place to fix this, per my patch.
> 
> No, it isn't, or at least it's far from the only place.  If we're going
> to change this, we would also want to change the behavior of tests on
> RECORD values, which is something that would have to happen at runtime.

I checked RECORD and that behaves with recursion:
SELECT RECORD(NULL) IS NULL; ?column?---------- tSELECT RECORD(RECORD(NULL)) IS NULL; ?column?---------- tSELECT
RECORD(RECORD(RECORD(NULL)))IS NULL; ?column?---------- t
 

Am I missing something?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
>> No, it isn't, or at least it's far from the only place.  If we're going
>> to change this, we would also want to change the behavior of tests on
>> RECORD values, which is something that would have to happen at runtime.

> I checked RECORD and that behaves with recursion:

Apparently you don't even understand the problem.  All of these examples
you're showing are constants.  Try something like
declare r record;...select ... into r ...if (r is null) ...
        regards, tom lane



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Fri, Jul  5, 2013 at 11:03:56AM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
> >> No, it isn't, or at least it's far from the only place.  If we're going
> >> to change this, we would also want to change the behavior of tests on
> >> RECORD values, which is something that would have to happen at runtime.
> 
> > I checked RECORD and that behaves with recursion:
> 
> Apparently you don't even understand the problem.  All of these examples
> you're showing are constants.  Try something like
> 
>     declare r record;
>     ...
>     select ... into r ...
>     if (r is null) ...

Not aparently --- I already said I didn't understand the problem. 
Should I just mark this as a TODO?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Should I just mark this as a TODO?

I thought it was on the list already.
        regards, tom lane



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Fri, Jul  5, 2013 at 12:43:57PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Should I just mark this as a TODO?
> 
> I thought it was on the list already.

We only have:
Improve handling of NULLs in arrays 

and some pl/pgsql items regarding nulls.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Fri, Jul  5, 2013 at 11:03:56AM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
> >> No, it isn't, or at least it's far from the only place.  If we're going
> >> to change this, we would also want to change the behavior of tests on
> >> RECORD values, which is something that would have to happen at runtime.
> 
> > I checked RECORD and that behaves with recursion:
> 
> Apparently you don't even understand the problem.  All of these examples
> you're showing are constants.  Try something like
> 
>     declare r record;
>     ...
>     select ... into r ...
>     if (r is null) ...

OK, I created the following test on git head (without my patch), and the
results look correct:
DO LANGUAGE plpgsql $$DECLARE        r RECORD;BEGIN
DROP TABLE IF EXISTS test;CREATE TABLE test (x INT, y INT);
INSERT INTO test VALUES (1, NULL), (NULL, 1), (NULL, NULL);FOR r IN SELECT * FROM testLOOP        IF (r IS NULL)
THENRAISE NOTICE 'true';        ELSE RAISE NOTICE 'false';        END IF;END LOOP;END;$$;
 
NOTICE:  falseNOTICE:  falseNOTICE:  true

Am I missing something?

Is this an example of NOT NULL contraints not testing NULLs?
CREATE TABLE test3(x INT, y INT);CREATE TABLE test5(z test3 NOT NULL);
INSERT INTO test5 VALUES (ROW(NULL, NULL));
SELECT * FROM test5;  z----- (,)

Looks like I have to modify ExecEvalNullTest().  If I fix this, is it
going to cause problems with pg_upgraded databases now having values
that are no longer validated by the NOT NULL constraint?  

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Sun, Jul  7, 2013 at 01:04:05PM -0400, Bruce Momjian wrote:
> Looks like I have to modify ExecEvalNullTest().

Oops, I mean ExecConstraints().  This could be tricky.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:
> On 7/4/13 5:06 PM, Alvaro Herrera wrote:
> > FWIW if changing the behavior of NOT NULL constraints is desired, I
> > still have the patch to catalogue them around, if anyone wants to play
> > around.  I haven't gotten around to finishing it up, yet :-(
>
> If your latest patch isn't publicly available, I'd like to see it.  I
> might work on that later this year.

Here it is.  Note that it is quite incomplete: there are parts that are
not even close to compiling.  I think I was trying to figure out how to
determine whether a column was of a composite type.

I might get back to it eventually, so if you start working on it, do let
me know.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Fri, Jul  5, 2013 at 10:21:19AM -0400, Bruce Momjian wrote:
> On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > I developed the attached patch which properly recurses into ROW()
> > > records checking for NULLs;  you can see it returns the right answer in
> > > all cases (and constant folds too):
> >
> > My recollection of the previous discussion is that we didn't have
> > consensus on what the "right" behavior is, so I'm not sure you can just
> > assert that this patch is right.  In any case this is only touching the
> > tip of the iceberg.  If we intend that rows of nulls should be null,
> > then we have got issues with, for example, NOT NULL column constraint
> > checks, which don't have any such recursion built into them.  I think
> > the same is true for plpgsql variable NOT NULL restrictions, and there
> > are probably some other places.
>
> Well we have three cases:
>
>     1  SELECT ROW(NULL) IS NULL;
>     2  SELECT ROW(ROW(NULL)) IS NULL;
>     3  SELECT ROW(ROW(ROW(NULL))) IS NULL;
>
> I think we could have them all return false, or all true, or the first
> one true, and the rest false.  What I don't think we can justify is 1
> and 2 as true, and 3 false.

I have done some more research in this, and was able to verify Tom's
concern that PL/pgSQL's IS NULL doesn't recurse into ROW expressions:

    DO LANGUAGE plpgsql $$
            DECLARE
                    r RECORD;
            BEGIN

            SELECT NULL INTO r;
              IF (r IS NULL)
                    THEN RAISE NOTICE 'true';
                    ELSE RAISE NOTICE 'false';
                    END IF;
            END;
            $$;
    NOTICE:  true
    DO

In this test, SELECT NULL (which internally would produce SELECT
ROW(NULL)), returns TRUE, while SELECT ROW(NULL) and further nesting
returns false.

This has made me adjust my goal and change it so SELECT ROW(NULL) IS
NULL returns true, and any further nesting returns false.

Attached is a patch which accomplishes this, and a documentation update.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: strange IS NULL behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> This has made me adjust my goal and change it so SELECT ROW(NULL) IS
> NULL returns true, and any further nesting returns false.

AFAICS, the only good argument for breaking backwards compatibility here
is if you can convince people that the new behavior is more conformant to
the SQL spec.  Where's the chapter and verse that argues for this
interpretation?

And I will say once more that a patch that affects only the behavior of
eval_const_expressions can be rejected on its face.  That code has to be
kept in sync with the behavior of execQual.c, not just whacked around by
itself.  And then there are the NOT NULL constraint cases to worry about.
        regards, tom lane



Re: strange IS NULL behaviour

From
Tom Lane
Date:
I wrote:
> And I will say once more that a patch that affects only the behavior of
> eval_const_expressions can be rejected on its face.  That code has to be
> kept in sync with the behavior of execQual.c, not just whacked around by
> itself.  And then there are the NOT NULL constraint cases to worry about.

Hmm ... actually, it's already not in sync, because:

regression=# create table tt (x int);
CREATE TABLE
regression=# insert into tt values(null);
INSERT 0 1
regression=# select row(x) from tt;row 
-----()
(1 row)

regression=# select row(row(x)) from tt; row   
--------("()")
(1 row)

regression=# select row(row(row(x))) from tt;    row      
--------------("(""()"")")
(1 row)

There's certainly no excuse for this behaving differently from the cases
with a simple constant NULL.  So I'm a bit inclined to say that we should
rip out the special case in eval_const_expressions, not make it even less
self-consistent.  It's possible to argue that existing applications won't
be too sensitive to the behavior of the constant cases, but they surely
must be depending on the behavior in the non-constant cases.
        regards, tom lane



Re: strange IS NULL behaviour

From
Merlin Moncure
Date:
On Tue, Sep 3, 2013 at 8:32 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Fri, Jul  5, 2013 at 10:21:19AM -0400, Bruce Momjian wrote:
>> On Thu, Jul  4, 2013 at 04:29:20PM -0400, Tom Lane wrote:
>> > Bruce Momjian <bruce@momjian.us> writes:
>> > > I developed the attached patch which properly recurses into ROW()
>> > > records checking for NULLs;  you can see it returns the right answer in
>> > > all cases (and constant folds too):
>> >
>> > My recollection of the previous discussion is that we didn't have
>> > consensus on what the "right" behavior is, so I'm not sure you can just
>> > assert that this patch is right.  In any case this is only touching the
>> > tip of the iceberg.  If we intend that rows of nulls should be null,
>> > then we have got issues with, for example, NOT NULL column constraint
>> > checks, which don't have any such recursion built into them.  I think
>> > the same is true for plpgsql variable NOT NULL restrictions, and there
>> > are probably some other places.
>>
>> Well we have three cases:
>>
>>       1  SELECT ROW(NULL) IS NULL;
>>       2  SELECT ROW(ROW(NULL)) IS NULL;
>>       3  SELECT ROW(ROW(ROW(NULL))) IS NULL;
>>
>> I think we could have them all return false, or all true, or the first
>> one true, and the rest false.  What I don't think we can justify is 1
>> and 2 as true, and 3 false.
>
> I have done some more research in this, and was able to verify Tom's
> concern that PL/pgSQL's IS NULL doesn't recurse into ROW expressions:
>
>         DO LANGUAGE plpgsql $$
>                 DECLARE
>                         r RECORD;
>                 BEGIN
>
>                 SELECT NULL INTO r;
>                   IF (r IS NULL)
>                         THEN RAISE NOTICE 'true';
>                         ELSE RAISE NOTICE 'false';
>                         END IF;
>                 END;
>                 $$;
>         NOTICE:  true
>         DO
>
> In this test, SELECT NULL (which internally would produce SELECT
> ROW(NULL)), returns TRUE, while SELECT ROW(NULL) and further nesting
> returns false.
>
> This has made me adjust my goal and change it so SELECT ROW(NULL) IS
> NULL returns true, and any further nesting returns false.

It gets worse and worse.  The IS NULL operator is already pretty much
special cased -- in just about all other case concerning rowtypes (for
example coalesce) 'null containing rowtypes are *not* considered to be
null as the container itself has a null bit independent of it's
elements which is expressly contrary to the SQL standard.  This is
tragic; postgres's way of doing it (except with IS NULL) makes an
awful lot of sense to me but here we are.  I think before making any
behavior changes at all, we need to ask ourselves:

1. Are we willing to break compatibility in order to move to spec
compliant behavior?
2. and if so, what mountains do we have to cross to get there?

Your proposed change (implementation details aside) seems ok in the
sense that it doesn't seem to have a lot of obvious side effects but
the elephant in the room is #1; if the answer is 'no', then I'd say
the best course of action is to let things be.

merlin



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep  3, 2013 at 09:43:14PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > This has made me adjust my goal and change it so SELECT ROW(NULL) IS
> > NULL returns true, and any further nesting returns false.
> 
> AFAICS, the only good argument for breaking backwards compatibility here
> is if you can convince people that the new behavior is more conformant to
> the SQL spec.  Where's the chapter and verse that argues for this
> interpretation?

The SQL03 standard in section 8.7, table 14, says "degree 1: null" and
"degree > 1: all null".  Does that mean they are considering nested rows
as degree > 1, or is that the number of values in the row? A footnote
says:
For all R, "R IS NOT NULL" has the sameresult as "NOT R IS NULL" if and only if R is ofdegree 1.

which seems to support the idea that degree is the number of values,
meaning they don't discuss nesting.

> And I will say once more that a patch that affects only the behavior of
> eval_const_expressions can be rejected on its face.  That code has to be
> kept in sync with the behavior of execQual.c, not just whacked around by
> itself.  And then there are the NOT NULL constraint cases to worry about.

I thought execQual.c was already not recursing so I didn't see a need to
change that.

I could not figure out how to test a NOT NULL constraint for nesting.

What is driving my research here is that our current behavior is really
not documentable.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep  3, 2013 at 10:27:38PM -0400, Tom Lane wrote:
> I wrote:
> > And I will say once more that a patch that affects only the behavior of
> > eval_const_expressions can be rejected on its face.  That code has to be
> > kept in sync with the behavior of execQual.c, not just whacked around by
> > itself.  And then there are the NOT NULL constraint cases to worry about.
> 
> Hmm ... actually, it's already not in sync, because:
> 
> regression=# create table tt (x int);
> CREATE TABLE
> regression=# insert into tt values(null);
> INSERT 0 1
> regression=# select row(x) from tt;
>  row 
> -----
>  ()
> (1 row)
> 
> regression=# select row(row(x)) from tt;
>   row   
> --------
>  ("()")
> (1 row)
> 
> regression=# select row(row(row(x))) from tt;
>      row      
> --------------
>  ("(""()"")")
> (1 row)


Uh, I see the same output you show for a NULL constant:
SELECT ROW(NULL); row----- ()SELECT ROW(ROW(NULL));  row-------- ("()")SELECT ROW(ROW(ROW(NULL)));
row--------------("(""()"")")
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep  3, 2013 at 09:44:33PM -0500, Merlin Moncure wrote:
> It gets worse and worse.  The IS NULL operator is already pretty much
> special cased -- in just about all other case concerning rowtypes (for
> example coalesce) 'null containing rowtypes are *not* considered to be
> null as the container itself has a null bit independent of it's
> elements which is expressly contrary to the SQL standard.  This is
> tragic; postgres's way of doing it (except with IS NULL) makes an
> awful lot of sense to me but here we are.  I think before making any
> behavior changes at all, we need to ask ourselves:
> 
> 1. Are we willing to break compatibility in order to move to spec
> compliant behavior?
> 2. and if so, what mountains do we have to cross to get there?
> 
> Your proposed change (implementation details aside) seems ok in the
> sense that it doesn't seem to have a lot of obvious side effects but
> the elephant in the room is #1; if the answer is 'no', then I'd say
> the best course of action is to let things be.

Yes, these are the right questions.  If we leave things unchanged, can
we document the behavior we currently have?  What I _think_ we have now
is that IS NULL in queries checks two levels deep for nulls, but checks
only one level deep for PL/pgSQL and constraints.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep  3, 2013 at 09:32:44PM -0400, Bruce Momjian wrote:
> In this test, SELECT NULL (which internally would produce SELECT
> ROW(NULL)), returns TRUE, while SELECT ROW(NULL) and further nesting
> returns false.
> 
> This has made me adjust my goal and change it so SELECT ROW(NULL) IS
> NULL returns true, and any further nesting returns false.
> 
> Attached is a patch which accomplishes this, and a documentation update.

I have not heard any feedback on this patch, so I would like to apply it
to give us a nested ROW/IS NULL API we can document.  It would have to
be marked in the release notes as a backward incompatibility.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Robert Haas
Date:
On Wed, Sep 4, 2013 at 9:26 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Sep  3, 2013 at 09:32:44PM -0400, Bruce Momjian wrote:
>> In this test, SELECT NULL (which internally would produce SELECT
>> ROW(NULL)), returns TRUE, while SELECT ROW(NULL) and further nesting
>> returns false.
>>
>> This has made me adjust my goal and change it so SELECT ROW(NULL) IS
>> NULL returns true, and any further nesting returns false.
>>
>> Attached is a patch which accomplishes this, and a documentation update.
>
> I have not heard any feedback on this patch, so I would like to apply it
> to give us a nested ROW/IS NULL API we can document.  It would have to
> be marked in the release notes as a backward incompatibility.

I don't have time to look at this in detail right now, but I think
that's considerably premature.  I'm not convinced that we understand
all of the problems in this area are yet, let alone the solutions.
And I notice that you haven't substantively responded to some of Tom's
concerns.

So -1 from me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: strange IS NULL behaviour

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Sep 4, 2013 at 9:26 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> I have not heard any feedback on this patch, so I would like to apply it
>> to give us a nested ROW/IS NULL API we can document.  It would have to
>> be marked in the release notes as a backward incompatibility.

> I don't have time to look at this in detail right now, but I think
> that's considerably premature.  I'm not convinced that we understand
> all of the problems in this area are yet, let alone the solutions.
> And I notice that you haven't substantively responded to some of Tom's
> concerns.

In particular, I don't think it's a good idea to change
eval_const_expression's behavior in an incompatible way that simply
makes it differently inconsistent with other IS NULL code paths.
We should leave things alone until we have a full fix, otherwise we'll
just be breaking people's apps repeatedly.

I would also say that allowing eval_const_expression to drive what we
think the "right" behavior is is completely backwards, because it's
about the least performance-critical case.  You should be looking at
execQual.c first.
        regards, tom lane



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Thu, Sep  5, 2013 at 02:14:39PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Wed, Sep 4, 2013 at 9:26 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> I have not heard any feedback on this patch, so I would like to apply it
> >> to give us a nested ROW/IS NULL API we can document.  It would have to
> >> be marked in the release notes as a backward incompatibility.
>
> > I don't have time to look at this in detail right now, but I think
> > that's considerably premature.  I'm not convinced that we understand
> > all of the problems in this area are yet, let alone the solutions.
> > And I notice that you haven't substantively responded to some of Tom's
> > concerns.
>
> In particular, I don't think it's a good idea to change
> eval_const_expression's behavior in an incompatible way that simply
> makes it differently inconsistent with other IS NULL code paths.

Let me summarize what eval_const_expressions_mutator() does:  it takes a
ROW() IS NULL construct, and converts it to individual _value_ IS NULL
clauses so they can be better optimized.  It changes:

    ROW(x, ROW(y)) IS NULL

to

    x IS NULL AND ROW(y) IS NULL

By removing this one level of ROW construct, it causes ROW(ROW(NULL)) IS
NULL to return true, while more nested cases do not.  It also tries to
short-circuit the IS NULL clause if it finds a literal NULL constant
inside the row.

My patch uses this short-circuit logic to return false for IS NULL if it
finds an embedded ROW construct.  This makes the code match the code in
ExecEvalNullTest(), which already treats embedded ROW values as
non-null, e.g. it already treats ROW(ROW(x)) IS NULL as false.
Specifically, any code path that doesn't pass through
eval_const_expressions_mutator() and gets to execQual.c will return
false for this.

> We should leave things alone until we have a full fix, otherwise we'll
> just be breaking people's apps repeatedly.

Yes, we don't want that.

> I would also say that allowing eval_const_expression to drive what we
> think the "right" behavior is is completely backwards, because it's
> about the least performance-critical case.  You should be looking at
> execQual.c first.

I am making eval_const_expression match execQual.c, specifically
ExecEvalNullTest(), which calls heap_attisnull(), and looks at the NULL
indicator, which can't be true for an attribute containing a ROW value.

I am confused why others don't see what I am seeing.

Another possible fix would be to avoid the IS NULL value optimizer
expansion if a ROW construct is inside a ROW().  I have attached a patch
that does this for review.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
> Another possible fix would be to avoid the IS NULL value optimizer
> expansion if a ROW construct is inside a ROW().  I have attached a patch
> that does this for review.

Having received no replies, do people perfer this version of the patch
that just punts nested ROW IS NULL testing to execQual.c?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
>> Another possible fix would be to avoid the IS NULL value optimizer
>> expansion if a ROW construct is inside a ROW().  I have attached a patch
>> that does this for review.

> Having received no replies, do people perfer this version of the patch
> that just punts nested ROW IS NULL testing to execQual.c?

For some reason I read your previous message as saying you were willing to
wait for considered reviews this time.  If not, I'll just write a blanket
-1 for any version of this patch.

I don't think you've shown that this is more spec-compliant than what
we had before, nor that you've made all the relevant code (execQual,
eval_const_expressions, column NOT NULL constraints, plpgsql variable
NOT NULL constraints, maybe other places) mutually consistent.

I'm not a fan of incremental improvements in application-visible
semantics: if we change this repeatedly over several releases, that's an
application author's worst nightmare, because he'll have to try to work
with multiple different behaviors.  We need to change this *once* and get
it right.  You haven't proven that this is now right where it wasn't
before, and the patch is certainly not so obviously right that it should
go in without considered review.
        regards, tom lane



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
> >> Another possible fix would be to avoid the IS NULL value optimizer
> >> expansion if a ROW construct is inside a ROW().  I have attached a patch
> >> that does this for review.
> 
> > Having received no replies, do people perfer this version of the patch
> > that just punts nested ROW IS NULL testing to execQual.c?
> 
> For some reason I read your previous message as saying you were willing to
> wait for considered reviews this time.  If not, I'll just write a blanket
> -1 for any version of this patch.

Are you saying people will comment later?  I wasn't clear that was the
plan.  I can certainly wait.

> I don't think you've shown that this is more spec-compliant than what
> we had before, nor that you've made all the relevant code (execQual,
> eval_const_expressions, column NOT NULL constraints, plpgsql variable
> NOT NULL constraints, maybe other places) mutually consistent.

I believe all the other places (execQual, plpgsql variables) all treat
embedded row in rows as non-null, but I don't even know how to test all
the place.  Can someone do that?

> I'm not a fan of incremental improvements in application-visible
> semantics: if we change this repeatedly over several releases, that's an
> application author's worst nightmare, because he'll have to try to work
> with multiple different behaviors.  We need to change this *once* and get
> it right.  You haven't proven that this is now right where it wasn't
> before, and the patch is certainly not so obviously right that it should
> go in without considered review.

Yes, we have to be sure to get this right.  However, I am not able to
test all the places you have mentioned, so unless someone else finds
this important enough to work on, I will just document it as a TODO and
close it.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Andres Freund
Date:
On 2013-09-06 23:07:04 -0400, Bruce Momjian wrote:
> On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
> > >> Another possible fix would be to avoid the IS NULL value optimizer
> > >> expansion if a ROW construct is inside a ROW().  I have attached a patch
> > >> that does this for review.
> > 
> > > Having received no replies, do people perfer this version of the patch
> > > that just punts nested ROW IS NULL testing to execQual.c?
> > 
> > For some reason I read your previous message as saying you were willing to
> > wait for considered reviews this time.  If not, I'll just write a blanket
> > -1 for any version of this patch.
> 
> Are you saying people will comment later?  I wasn't clear that was the
> plan.  I can certainly wait.

You do realize mere mortals in the project frequently have to wait
*months* to get comments on their patches? Not getting any for less than
48h doesn't seem to be saying much.

Why don't you add the proposal to the commitfest?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Sat, Sep  7, 2013 at 07:42:55AM +0200, Andres Freund wrote:
> On 2013-09-06 23:07:04 -0400, Bruce Momjian wrote:
> > On Fri, Sep  6, 2013 at 11:00:24PM -0400, Tom Lane wrote:
> > > Bruce Momjian <bruce@momjian.us> writes:
> > > > On Thu, Sep  5, 2013 at 05:06:41PM -0400, Bruce Momjian wrote:
> > > >> Another possible fix would be to avoid the IS NULL value optimizer
> > > >> expansion if a ROW construct is inside a ROW().  I have attached a patch
> > > >> that does this for review.
> > > 
> > > > Having received no replies, do people perfer this version of the patch
> > > > that just punts nested ROW IS NULL testing to execQual.c?
> > > 
> > > For some reason I read your previous message as saying you were willing to
> > > wait for considered reviews this time.  If not, I'll just write a blanket
> > > -1 for any version of this patch.
> > 
> > Are you saying people will comment later?  I wasn't clear that was the
> > plan.  I can certainly wait.
> 
> You do realize mere mortals in the project frequently have to wait
> *months* to get comments on their patches? Not getting any for less than
> 48h doesn't seem to be saying much.

My original problem report was November, 2012:
http://www.postgresql.org/message-id/50B3D11F.20408@2ndQuadrant.com

and my patch to fix this was July 4.  Tom gave me a code snipped to test
PL/pgSQL's handling of record types.  I tested that and it looked ok,
but there are other place to test that I don't know about.

> Why don't you add the proposal to the commitfest?

This issue is so much larger than the patch's validity that I don't see
how that would work.

I obviously can't complete this, so I am adding a TODO item:
IS NULL testing of nested ROW() values is inconsistent

All my patches are in that thread in case someone who can complete this
item wants to take it over.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Sat, Sep  7, 2013 at 10:59:08AM -0400, Bruce Momjian wrote:
> My original problem report was November, 2012:
> 
>     http://www.postgresql.org/message-id/50B3D11F.20408@2ndQuadrant.com
> 
> and my patch to fix this was July 4.  Tom gave me a code snipped to test
> PL/pgSQL's handling of record types.  I tested that and it looked ok,
> but there are other place to test that I don't know about.

Correction --- my updates to Tom's suggestion was only posted September
4. Anyway, it is on the TODO list now.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Robert Haas
Date:
On Sat, Sep 7, 2013 at 10:59 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> Why don't you add the proposal to the commitfest?
>
> This issue is so much larger than the patch's validity that I don't see
> how that would work.

I hate to be rude here, but I think you're being ridiculous.  We have
a well-established procedure for getting patches reviewed around here,
and while it is not perfect, it mostly works.  If you try that
procedure and it doesn't work, then I think you have a right to
complain.  But to object, on the one hand, that people aren't going to
look at the patch, and then to refuse to add it to the tracking tool
that the project uses to ensure that patches get looked at, seems
patently unfair.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Mon, Sep  9, 2013 at 12:37:25PM -0400, Robert Haas wrote:
> On Sat, Sep 7, 2013 at 10:59 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >> Why don't you add the proposal to the commitfest?
> >
> > This issue is so much larger than the patch's validity that I don't see
> > how that would work.
> 
> I hate to be rude here, but I think you're being ridiculous.  We have
> a well-established procedure for getting patches reviewed around here,
> and while it is not perfect, it mostly works.  If you try that
> procedure and it doesn't work, then I think you have a right to
> complain.  But to object, on the one hand, that people aren't going to
> look at the patch, and then to refuse to add it to the tracking tool
> that the project uses to ensure that patches get looked at, seems
> patently unfair.

The problem is that I don't believe this patch is commit-ready ---
someone needs to research the IS NULL tests in all areas of our code to
see if they match this patch, and I can't do that.  Is that something a
reviewer is going to be willing to do?  I don't think I have ever seen a
commit-fest item that still required serious research outside the patch
area before committing.  I could ask just for feedback, but I have
already received enough feedback to know I can't get the patch to a
ready-enough state.

I think requiring commit-fest reviewers come to the same conclusion is
just making extra work for them.  Still, if you want me to add it to the
next commit-fest, please let me know.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Robert Haas
Date:
On Mon, Sep 9, 2013 at 3:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> The problem is that I don't believe this patch is commit-ready ---
> someone needs to research the IS NULL tests in all areas of our code to
> see if they match this patch, and I can't do that.  Is that something a
> reviewer is going to be willing to do?  I don't think I have ever seen a
> commit-fest item that still required serious research outside the patch
> area before committing.  I could ask just for feedback, but I have
> already received enough feedback to know I can't get the patch to a
> ready-enough state.

OK, well then there's probably not much point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep 10, 2013 at 08:45:14AM -0400, Robert Haas wrote:
> On Mon, Sep 9, 2013 at 3:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > The problem is that I don't believe this patch is commit-ready ---
> > someone needs to research the IS NULL tests in all areas of our code to
> > see if they match this patch, and I can't do that.  Is that something a
> > reviewer is going to be willing to do?  I don't think I have ever seen a
> > commit-fest item that still required serious research outside the patch
> > area before committing.  I could ask just for feedback, but I have
> > already received enough feedback to know I can't get the patch to a
> > ready-enough state.
> 
> OK, well then there's probably not much point.

FYI, I think these queries below prove that NOT NULL constraints do not
follow the single-depth ROW NULL inspection rule that PL/pgSQL follows,
and that my patch was trying to promote for queries:
CREATE TABLE test2(x test NOT NULL);CREATE TABLEINSERT INTO test2 VALUES (null);ERROR:  null value in column "x"
violatesnot-null constraintDETAIL:  Failing row contains (null).
 
-->    INSERT INTO test2 VALUES (row(null));INSERT 0 1

So, in summary, NOT NULL constraints don't inspect into ROW values for
NULLs, PL/pgSQL goes one level deep into ROW, and queries go two levels
deep. I am not sure what other areas need checking.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Merlin Moncure
Date:
On Tue, Sep 10, 2013 at 8:56 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Sep 10, 2013 at 08:45:14AM -0400, Robert Haas wrote:
>> On Mon, Sep 9, 2013 at 3:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> > The problem is that I don't believe this patch is commit-ready ---
>> > someone needs to research the IS NULL tests in all areas of our code to
>> > see if they match this patch, and I can't do that.  Is that something a
>> > reviewer is going to be willing to do?  I don't think I have ever seen a
>> > commit-fest item that still required serious research outside the patch
>> > area before committing.  I could ask just for feedback, but I have
>> > already received enough feedback to know I can't get the patch to a
>> > ready-enough state.
>>
>> OK, well then there's probably not much point.
>
> FYI, I think these queries below prove that NOT NULL constraints do not
> follow the single-depth ROW NULL inspection rule that PL/pgSQL follows,
> and that my patch was trying to promote for queries:
>
>         CREATE TABLE test2(x test NOT NULL);
>         CREATE TABLE
>         INSERT INTO test2 VALUES (null);
>         ERROR:  null value in column "x" violates not-null constraint
>         DETAIL:  Failing row contains (null).
> -->     INSERT INTO test2 VALUES (row(null));
>         INSERT 0 1
>
> So, in summary, NOT NULL constraints don't inspect into ROW values for
> NULLs, PL/pgSQL goes one level deep into ROW, and queries go two levels
> deep. I am not sure what other areas need checking.

Our composite null handling (as noted) is an absolute minefield of
issues.  Consider:

postgres=# select coalesce(row(null,null), row('no', 'bueno'));coalesce
----------(,)

postgres=# select case when row(null,null) is null then row('no', 'bueno') end;   case
------------(no,bueno)

It's just a mess.  So it bears repeating: do we or do we not want to
implement SQL standard composite null handing?  If so, you probably
have to hit all the targets.  If not, I'd either A: leave things alone
or B: remove the special case logic in IS NULL (so that it behaves as
coalesce() does) and document our divergence from the standard.  Point
being: B might actually be the best choice, but it should be
understood that we are not going in that direction before pushing
patches that go in the other direction.

merlin



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep 10, 2013 at 09:12:08AM -0500, Merlin Moncure wrote:
> > FYI, I think these queries below prove that NOT NULL constraints do not
> > follow the single-depth ROW NULL inspection rule that PL/pgSQL follows,
> > and that my patch was trying to promote for queries:
> >
> >         CREATE TABLE test2(x test NOT NULL);
> >         CREATE TABLE
> >         INSERT INTO test2 VALUES (null);
> >         ERROR:  null value in column "x" violates not-null constraint
> >         DETAIL:  Failing row contains (null).
> > -->     INSERT INTO test2 VALUES (row(null));
> >         INSERT 0 1
> >
> > So, in summary, NOT NULL constraints don't inspect into ROW values for
> > NULLs, PL/pgSQL goes one level deep into ROW, and queries go two levels
> > deep. I am not sure what other areas need checking.
> 
> Our composite null handling (as noted) is an absolute minefield of
> issues.  Consider:
> 
> postgres=# select coalesce(row(null,null), row('no', 'bueno'));
>  coalesce
> ----------
>  (,)
> 
> postgres=# select case when row(null,null) is null then row('no', 'bueno') end;
>     case
> ------------
>  (no,bueno)
> 
> It's just a mess.  So it bears repeating: do we or do we not want to

Wow, OK, more inconsistent places.  :-(

> implement SQL standard composite null handing?  If so, you probably

I am unclear if section 8.7, Null Predicate, in the SQL 2003 standard is
talking about just IS NULL, or all NULL tests.

> have to hit all the targets.  If not, I'd either A: leave things alone
> or B: remove the special case logic in IS NULL (so that it behaves as
> coalesce() does) and document our divergence from the standard.  Point
> being: B might actually be the best choice, but it should be
> understood that we are not going in that direction before pushing
> patches that go in the other direction.

I see.  So going one-level deep in the ROW NULL inspection is something
we do for IS NULL in queries (actually double-deep inspection)q, but it
was never consistently implemented across all NULL tests.


--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep 10, 2013 at 10:50:32AM -0400, Bruce Momjian wrote:
> > have to hit all the targets.  If not, I'd either A: leave things alone
> > or B: remove the special case logic in IS NULL (so that it behaves as
> > coalesce() does) and document our divergence from the standard.  Point
> > being: B might actually be the best choice, but it should be
> > understood that we are not going in that direction before pushing
> > patches that go in the other direction.
>
> I see.  So going one-level deep in the ROW NULL inspection is something
> we do for IS NULL in queries (actually double-deep inspection)q, but it
> was never consistently implemented across all NULL tests.

Using your examples and others I have collected, I have created an SQL
script which shows our inconsistent behavior, attached, and its output.

If we agree that a single-level NULL inspection of ROWS is the right
approach, it would seem we need my patch, and we need to fix coalesce()
and NOT NULL constraint testing?  Is that accurate?  Is there more
areas?

Nested RECORDS seem to collapse to a single level, so I don't think we
have to change the recursion there:

    SELECT RECORD(RECORD(RECORD(NULL)));
     record
    --------
     (null)

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

Re: strange IS NULL behaviour

From
Merlin Moncure
Date:
On Tue, Sep 10, 2013 at 1:54 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, Sep 10, 2013 at 10:50:32AM -0400, Bruce Momjian wrote:
>> > have to hit all the targets.  If not, I'd either A: leave things alone
>> > or B: remove the special case logic in IS NULL (so that it behaves as
>> > coalesce() does) and document our divergence from the standard.  Point
>> > being: B might actually be the best choice, but it should be
>> > understood that we are not going in that direction before pushing
>> > patches that go in the other direction.
>>
>> I see.  So going one-level deep in the ROW NULL inspection is something
>> we do for IS NULL in queries (actually double-deep inspection)q, but it
>> was never consistently implemented across all NULL tests.
>
> Using your examples and others I have collected, I have created an SQL
> script which shows our inconsistent behavior, attached, and its output.
>
> If we agree that a single-level NULL inspection of ROWS is the right
> approach, it would seem we need my patch, and we need to fix coalesce()
> and NOT NULL constraint testing?  Is that accurate?  Is there more
> areas?
>
> Nested RECORDS seem to collapse to a single level, so I don't think we
> have to change the recursion there:
>
>         SELECT RECORD(RECORD(RECORD(NULL)));
>          record
>         --------
>          (null)


Also consider:
STRICT
PQisNull
EXCEPT/UNION
WHERE NOT IN (see how it plays with 'null gotcha')
etc

(but I don't think your lens should be focused on 'record recursion'
-- there is a deeper problem for which that is just a particular
symptom)

merlin



Re: strange IS NULL behaviour

From
Kevin Grittner
Date:
Bruce Momjian <bruce@momjian.us> wrote:

> FYI, I think these queries below prove that NOT NULL constraints do not
> follow the single-depth ROW NULL inspection rule that PL/pgSQL follows,
> and that my patch was trying to promote for queries:
>
>     CREATE TABLE test2(x test NOT NULL);
>     CREATE TABLE
>     INSERT INTO test2 VALUES (null);
>     ERROR:  null value in column "x" violates not-null constraint
>     DETAIL:  Failing row contains (null).
> -->    INSERT INTO test2 VALUES (row(null));
>     INSERT 0 1

If I remember correctly, the standard wants a NOT NULL constraint
on a column with a composite type to behave the same as

  CHECK (col IS DISTINCT FROM NULL)

... which is consistent with the behavior you show.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: strange IS NULL behaviour

From
Bruce Momjian
Date:
On Tue, Sep 10, 2013 at 12:48:08PM -0700, Kevin Grittner wrote:
> Bruce Momjian <bruce@momjian.us> wrote:
> 
> > FYI, I think these queries below prove that NOT NULL constraints do not
> > follow the single-depth ROW NULL inspection rule that PL/pgSQL follows,
> > and that my patch was trying to promote for queries:
> >
> >     CREATE TABLE test2(x test NOT NULL);
> >     CREATE TABLE
> >     INSERT INTO test2 VALUES (null);
> >     ERROR:  null value in column "x" violates not-null constraint
> >     DETAIL:  Failing row contains (null).
> > -->    INSERT INTO test2 VALUES (row(null));
> >     INSERT 0 1
> 
> If I remember correctly, the standard wants a NOT NULL constraint
> on a column with a composite type to behave the same as
> 
>   CHECK (col IS DISTINCT FROM NULL)
> 
> ... which is consistent with the behavior you show.

Is IS DISTINCT FROM correct though?
SELECT ROW(NULL) IS DISTINCT FROM NULL; ?column?---------- t(1 row)

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



Re: strange IS NULL behaviour

From
Kevin Grittner
Date:
Bruce Momjian <bruce@momjian.us> wrote:

> Is IS DISTINCT FROM correct though?
>
>     SELECT ROW(NULL) IS DISTINCT FROM NULL;
>     ?column?
>     ----------
>     t
>     (1 row)

My recollection from previous discussions is that this is what is
required by the standard.  ROW(NULL) IS NULL, but it is DISTINCT
FROM NULL.  The IS NULL predicate, when applied to a row or record
is meant to indicate whether that row or record *contains only NULL
elements*, and IS NOT NULL is meant to indicate that a row or
record *contains only NOT NULL elements*.  So this is all as
required:

test=# create table x (c1 int, c2 int);
CREATE TABLE
test=# insert into x values (1, 1), (2, null), (null, 3), (null, null);
INSERT 0 4
test=# select * from x where x is not null;
 c1 | c2
----+----
  1 |  1
(1 row)

test=# select * from x where x is null;
 c1 | c2
----+----
    |   
(1 row)

test=# select * from x where not x is null;
 c1 | c2
----+----
  1 |  1
  2 |   
    |  3
(3 rows)

test=# select * from x where not x is not null;
 c1 | c2
----+----
  2 |   
    |  3
    |   
(3 rows)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company