Re: Missing update of all_hasnulls in BRIN opclasses - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Missing update of all_hasnulls in BRIN opclasses
Date
Msg-id 20230303103219.ttf42jeozpkzewci@alvherre.pgsql
Whole thread Raw
In response to Re: Missing update of all_hasnulls in BRIN opclasses  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Missing update of all_hasnulls in BRIN opclasses  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
Thanks for doing all this.  (Do I understand correctly that this patch
is not in the commitfest?)

I think my mental model for this was that "allnulls" meant that either
there are no values for the column in question or that the values were
all nulls (For minmax without NULL handling, which is where this all
started, these two things are essentially the same: the range is not to
be returned.  So this became a bug the instant I added handling for NULL
values.)  I failed to realize that these were two different things, and
this is likely the origin of all these troubles.

What do you think of using the unused bit in BrinTuple->bt_info to
denote a range that contains no heap tuples?  This also means we need it
in BrinMemTuple, I think we can do this:

@@ -44,6 +44,7 @@ typedef struct BrinValues
 typedef struct BrinMemTuple
 {
     bool        bt_placeholder; /* this is a placeholder tuple */
+    bool        bt_empty_range;    /* range has no tuples */
     BlockNumber bt_blkno;        /* heap blkno that the tuple is for */
     MemoryContext bt_context;    /* memcxt holding the bt_columns values */
     /* output arrays for brin_deform_tuple: */
@@ -69,7 +70,7 @@ typedef struct BrinTuple
      *
      * 7th (high) bit: has nulls
      * 6th bit: is placeholder tuple
-     * 5th bit: unused
+     * 5th bit: range has no tuples
      * 4-0 bit: offset of data
      * ---------------
      */
@@ -82,7 +83,7 @@ typedef struct BrinTuple
  * bt_info manipulation macros
  */
 #define BRIN_OFFSET_MASK        0x1F
-/* bit 0x20 is not used at present */
+#define BRIN_EMPTY_RANGE        0x20
 #define BRIN_PLACEHOLDER_MASK    0x40
 #define BRIN_NULLS_MASK            0x80
 
(Note that bt_empty_range uses a hole in the struct, so there's no ABI
change.)

This is BRIN-tuple-level, not column-level, so conceptually it seems
more appropriate.  (In the case where both are empty in union_tuples, we
can return without entering the per-attribute loop at all, though I
admit it's not a very interesting case.)  This approach avoids having to
invent the strange combination of all+has to mean empty.


On 2023-Feb-24, Tomas Vondra wrote:

> I wonder what's the best
> way to test this in an automated way - it's very dependent on timing of
> the concurrent updated. For example we need to do something like this:
> 
>     T1: run pg_summarize_range() until it inserts the placeholder tuple
>     T2: do an insert into the page range (updates placeholder)
>     T1: continue pg_summarize_range() to merge into the placeholder
> 
> But there are no convenient ways to do this, I think. I had to check the
> various cases using breakpoints in gdb etc.

Yeah, I struggled with this during initial development but in the end
did nothing.  I think we would need to introduce some new framework,
perhaps Korotkov stop-events stuff at 
https://postgr.es/m/CAPpHfdsTeb+hBT5=qxghjNG_cHcJLDaNQ9sdy9vNwBF2E2PuZA@mail.gmail.com
which seemed to me a good fit -- we would add a stop point after the
placeholder tuple is inserted.

> I'm not very happy with the union_tuples() changes - it's quite verbose,
> perhaps a bit too verbose. We have to check for empty ranges first, and
> then various combinations of allnulls/hasnulls flags for both BRIN
> tuples. There are 9 combinations, and the current code just checks them
> one by one - I was getting repeatedly confused by the original code, but
> maybe it's too much.

I think it's okay.  I tried to make it more compact (by saying "these
two combinations here are case 2, and these two other are case 4", and
keeping each of the other combinations a separate case; so there are
really 7 cases).  But that doesn't make it any easier to follow, on the
contrary it was more convoluted.  I think a dozen extra lines of source
is not a problem.

> The alternative is to apply the same fix to every BRIN_PROCNUM_UNION
> opclass procedure out there. I guess doing that for minmax+inclusion is
> not a huge deal, but what about external opclasses? And without the fix
> the indexes are effectively broken. Fixing this outside in brin.c (in
> the union procedure) fixes this for every opclass procedure, without any
> actual limitation of functinality (14+ does that anyway).

About the hypothetical question, you could as well ask what about
unicorns.  I have never seen any hint that any external opclass exist.
I am all for maintaining compatibility, but I think this concern is
overblown for BRIN.  Anyway, I think your proposed fix is better than
changing individual 'union' support procs, so it doesn't matter.

As far as I understood, you're now worried that there will be an
incompatibility because we will fail to call the 'union' procedure in
cases where we previously called it?  In other words, you fear that some
hypothetical opclass was handling the NULL values in some way that's
incompatible with this?  I haven't thought terribly hard about this, but
I can't see a way for this to cause incompatibilities.

> But maybe someone thinks this is a bad idea and we should do something
> else in the backbranches?

I think the new handling of NULLs in commit 72ccf55cb99c ("Move IS [NOT]
NULL handling from BRIN support functions") is better than what was
there before, so I don't object to backpatching it now that we know it's
necessary to fix a bug, and also we have field experience that the
approach is solid.

The attached patch is just a pointer to comments that I think need light
edition.  There's also a typo "bot" (for "both") in a comment that I
think would go away if you accept my suggestion to store 'empty' at the
tuple level.  Note that I worked with the REL_14_STABLE sources, because
for some reason I thought that that was the newest that needed
backpatching of 72ccf55cb99c, but now that I'm finishing this email I
realize that I should have used 13 instead /facepalm

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas    / desprovistas, por cierto
 de blandos atenuantes"                          (Patricio Vogel)

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: cataloguing NOT NULL constraints
Next
From: Amit Kapila
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher