Re: Document NULL - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Document NULL
Date
Msg-id CAKFQuwZoeWrM0WuH2io+MoiFKtQjp+Y+8gO1PUDRs8iEzwj16w@mail.gmail.com
Whole thread Raw
In response to Re: Document NULL  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Document NULL
List pgsql-hackers
Thank you for the review.  Changes noted below will be part of v5.

On Fri, Nov 22, 2024 at 12:00 PM Jeff Davis <pgsql@j-davis.com> wrote:

One idea is to have a brief guidance section to help users know how to
use nulls in their data model effectively. For instance, if you allow
nulls for middle_name to mean "no middle name", then you have to be
careful when concatenating it as part of a larger string (otherwise it
will make the entire result null).

I haven't explicitly included such an example but have expanded in this direction a bit.

Using COALESCE() can be a good
strategy here.


I have now mentioned coalesce and nullif.

   <para>
    When dealing with null values it is often useful to explicitly to convert
    data to and from a null value given a known non-null representation
    (e.g., the empty string, the numbers 0 or 1, or boolean false).
    The <link>COALESCE</link> and <link>NULLIF</link> functions are useful
    for this purpose.
   </para>

 
2.

It would be helpful to go through a combined example that shows how
these varous behaviors interact.

I have not done this.  This is already a large patch and this kind of example doesn't seem like our norm.  I'm not opposed to more content like this but for now would leave considering it as something an interested party can propose once this goes in.
 

3. "...more formally, the Law of the Excluded Middle does not hold:
i.e., p OR NOT(p) != true; for all p."

Switching to formal language here is confusing (and wrong, I think). I
suggest rewording and I don't think you need formal language here:

Agreed.  This isn't the place for that presentation and material.

   <para>
    The presence of null values in the system results in three-valued logic.
    In conventional two-valued (binary) logic every outcome is either true or false.
    In three-valued logic the concept of unknown, represented using the null value, is
    also an outcome.  This results in falsifying the common-sense notion
    that "p OR NOT p" is always true.
   </para>


4. COUNT() with no input is a special case that returns zero, and I
think that's worth mentioning somewhere.


I added a parenthetical to the following sentence to address this point:

When executing an aggregate or window function the state tracking component
   (which may be initialized to a non-null value, e.g., 0 for the count function)
   will remain unchanged even if the underlying processing
   function returns a null value, whether from being defined strict
   or it simply returns a null value upon execution.

I'm hesitant to add an example for it though...the implication of the note seems sufficiently clear - if there are zero rows providing non-null inputs to an aggregate its concept of initialized non-null value will be returned.  Since count doesn't have an input function to check the only way to see zero such rows is if the underlying thing being counted is empty.

David J.

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Unmark gen_random_uuid() function leakproof
Next
From: Nathan Bossart
Date:
Subject: Re: unused-but-set-variable warning on REL_13_STABLE