Thread: suspicious valgrind reports about radixtree/tidstore on arm64

suspicious valgrind reports about radixtree/tidstore on arm64

From
Tomas Vondra
Date:
Hi,

I tried running master under valgrind on 64-bit ARM (rpi5 running debian
12.5), and I got some suspicous reports, all related to the radixtree
code used by tidstore. I'm used to valgrind on arm sometimes reporting
harmless issues, but this seems like it might be an actual issue.

I'm attaching a snippet with a couple example reports. I can provide the
complete report, but AFAIK it's all just repetitions of these cases. If
needed, I can probably provide access to the rpi5 machine.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I tried running master under valgrind on 64-bit ARM (rpi5 running debian
> 12.5), and I got some suspicous reports, all related to the radixtree
> code used by tidstore.

What's the test scenario that triggers this?

            regards, tom lane



Re: suspicious valgrind reports about radixtree/tidstore on arm64

From
Tomas Vondra
Date:

On 6/19/24 17:11, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> I tried running master under valgrind on 64-bit ARM (rpi5 running debian
>> 12.5), and I got some suspicous reports, all related to the radixtree
>> code used by tidstore.
> 
> What's the test scenario that triggers this?
> 

I haven't investigated that yet, I just ran "make check".


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> On 6/19/24 17:11, Tom Lane wrote:
>> What's the test scenario that triggers this?

> I haven't investigated that yet, I just ran "make check".

I've reproduced what looks like about the same thing on
my Pi 4 using Fedora 38: just run "make installcheck-parallel"
under valgrind, and kaboom.  Definitely needs investigation.

            regards, tom lane

==00:00:04:08.988 16548== Memcheck, a memory error detector
==00:00:04:08.988 16548== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==00:00:04:08.988 16548== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==00:00:04:08.988 16548== Command: postgres -F
==00:00:04:08.988 16548== Parent PID: 16278
==00:00:04:08.988 16548==
==00:00:04:13.280 16548== Conditional jump or move depends on uninitialised value(s)
==00:00:04:13.280 16548==    at 0x4AD2CC: local_ts_node_16_search_eq (radixtree.h:1025)
==00:00:04:13.280 16548==    by 0x4AD2CC: local_ts_node_search (radixtree.h:1057)
==00:00:04:13.280 16548==    by 0x4AF173: local_ts_get_slot_recursive (radixtree.h:1667)
==00:00:04:13.280 16548==    by 0x4AF173: local_ts_set (radixtree.h:1744)
==00:00:04:13.280 16548==    by 0x4AF173: TidStoreSetBlockOffsets (tidstore.c:427)
==00:00:04:13.280 16548==    by 0x4FCEEF: dead_items_add (vacuumlazy.c:2892)
==00:00:04:13.280 16548==    by 0x4FE5EF: lazy_scan_prune (vacuumlazy.c:1500)
==00:00:04:13.280 16548==    by 0x4FE5EF: lazy_scan_heap (vacuumlazy.c:975)
==00:00:04:13.280 16548==    by 0x4FE5EF: heap_vacuum_rel (vacuumlazy.c:497)
==00:00:04:13.280 16548==    by 0x681527: table_relation_vacuum (tableam.h:1720)
==00:00:04:13.280 16548==    by 0x681527: vacuum_rel (vacuum.c:2210)
==00:00:04:13.281 16548==    by 0x682957: vacuum (vacuum.c:622)
==00:00:04:13.281 16548==    by 0x7B7E77: autovacuum_do_vac_analyze (autovacuum.c:3100)
==00:00:04:13.281 16548==    by 0x7B7E77: do_autovacuum (autovacuum.c:2417)
==00:00:04:13.281 16548==    by 0x7B830B: AutoVacWorkerMain (autovacuum.c:1569)
==00:00:04:13.281 16548==    by 0x7BB937: postmaster_child_launch (launch_backend.c:265)
==00:00:04:13.281 16548==    by 0x7BD64F: StartChildProcess (postmaster.c:3928)
==00:00:04:13.281 16548==    by 0x7BF9FB: StartAutovacuumWorker (postmaster.c:3997)
==00:00:04:13.281 16548==    by 0x7BF9FB: process_pm_pmsignal (postmaster.c:3809)
==00:00:04:13.281 16548==    by 0x7BF9FB: ServerLoop.isra.0 (postmaster.c:1667)
==00:00:04:13.281 16548==    by 0x7C0EBF: PostmasterMain (postmaster.c:1372)
==00:00:04:13.281 16548==
==00:00:04:13.673 16548== Conditional jump or move depends on uninitialised value(s)
==00:00:04:13.673 16548==    at 0x4AD2CC: local_ts_node_16_search_eq (radixtree.h:1025)
==00:00:04:13.673 16548==    by 0x4AD2CC: local_ts_node_search (radixtree.h:1057)
==00:00:04:13.673 16548==    by 0x4AFD4F: local_ts_find (radixtree.h:1111)
==00:00:04:13.673 16548==    by 0x4AFD4F: TidStoreIsMember (tidstore.c:443)
==00:00:04:13.673 16548==    by 0x5110D7: btvacuumpage (nbtree.c:1235)
==00:00:04:13.673 16548==    by 0x51171B: btvacuumscan (nbtree.c:1023)
==00:00:04:13.673 16548==    by 0x51185B: btbulkdelete (nbtree.c:824)
==00:00:04:13.673 16548==    by 0x683D3B: vac_bulkdel_one_index (vacuum.c:2498)
==00:00:04:13.673 16548==    by 0x4FDA9B: lazy_vacuum_one_index (vacuumlazy.c:2443)
==00:00:04:13.673 16548==    by 0x4FDA9B: lazy_vacuum_all_indexes (vacuumlazy.c:2026)
==00:00:04:13.673 16548==    by 0x4FDA9B: lazy_vacuum (vacuumlazy.c:1944)
==00:00:04:13.673 16548==    by 0x4FE7F3: lazy_scan_heap (vacuumlazy.c:1050)
==00:00:04:13.674 16548==    by 0x4FE7F3: heap_vacuum_rel (vacuumlazy.c:497)
==00:00:04:13.674 16548==    by 0x681527: table_relation_vacuum (tableam.h:1720)
==00:00:04:13.674 16548==    by 0x681527: vacuum_rel (vacuum.c:2210)
==00:00:04:13.674 16548==    by 0x682957: vacuum (vacuum.c:622)
==00:00:04:13.674 16548==    by 0x7B7E77: autovacuum_do_vac_analyze (autovacuum.c:3100)
==00:00:04:13.674 16548==    by 0x7B7E77: do_autovacuum (autovacuum.c:2417)
==00:00:04:13.674 16548==    by 0x7B830B: AutoVacWorkerMain (autovacuum.c:1569)
==00:00:04:13.674 16548==
==00:00:04:13.681 16548== Conditional jump or move depends on uninitialised value(s)
==00:00:04:13.681 16548==    at 0x4AD2CC: local_ts_node_16_search_eq (radixtree.h:1025)
==00:00:04:13.681 16548==    by 0x4AD2CC: local_ts_node_search (radixtree.h:1057)
==00:00:04:13.681 16548==    by 0x4AFD4F: local_ts_find (radixtree.h:1111)
==00:00:04:13.681 16548==    by 0x4AFD4F: TidStoreIsMember (tidstore.c:443)
==00:00:04:13.681 16548==    by 0x51126B: btreevacuumposting (nbtree.c:1405)
==00:00:04:13.681 16548==    by 0x51126B: btvacuumpage (nbtree.c:1249)
==00:00:04:13.681 16548==    by 0x51171B: btvacuumscan (nbtree.c:1023)
==00:00:04:13.681 16548==    by 0x51185B: btbulkdelete (nbtree.c:824)
==00:00:04:13.681 16548==    by 0x683D3B: vac_bulkdel_one_index (vacuum.c:2498)
==00:00:04:13.681 16548==    by 0x4FDA9B: lazy_vacuum_one_index (vacuumlazy.c:2443)
==00:00:04:13.681 16548==    by 0x4FDA9B: lazy_vacuum_all_indexes (vacuumlazy.c:2026)
==00:00:04:13.681 16548==    by 0x4FDA9B: lazy_vacuum (vacuumlazy.c:1944)
==00:00:04:13.681 16548==    by 0x4FE7F3: lazy_scan_heap (vacuumlazy.c:1050)
==00:00:04:13.681 16548==    by 0x4FE7F3: heap_vacuum_rel (vacuumlazy.c:497)
==00:00:04:13.681 16548==    by 0x681527: table_relation_vacuum (tableam.h:1720)
==00:00:04:13.681 16548==    by 0x681527: vacuum_rel (vacuum.c:2210)
==00:00:04:13.681 16548==    by 0x682957: vacuum (vacuum.c:622)
==00:00:04:13.681 16548==    by 0x7B7E77: autovacuum_do_vac_analyze (autovacuum.c:3100)
==00:00:04:13.681 16548==    by 0x7B7E77: do_autovacuum (autovacuum.c:2417)
==00:00:04:13.681 16548==    by 0x7B830B: AutoVacWorkerMain (autovacuum.c:1569)
==00:00:04:13.681 16548==
==00:00:04:35.444 16548==
==00:00:04:35.445 16548== HEAP SUMMARY:
==00:00:04:35.445 16548==     in use at exit: 1,936,233 bytes in 285 blocks
==00:00:04:35.445 16548==   total heap usage: 59,846 allocs, 1,316 frees, 84,811,772 bytes allocated
==00:00:04:35.445 16548==
==00:00:04:37.660 16548== LEAK SUMMARY:
==00:00:04:37.660 16548==    definitely lost: 31,979 bytes in 234 blocks
==00:00:04:37.660 16548==    indirectly lost: 14,436 bytes in 48 blocks
==00:00:04:37.660 16548==      possibly lost: 371,045 bytes in 1,075 blocks
==00:00:04:37.660 16548==    still reachable: 586,591 bytes in 1,410 blocks
==00:00:04:37.660 16548==         suppressed: 0 bytes in 0 blocks
==00:00:04:37.660 16548== Rerun with --leak-check=full to see details of leaked memory
==00:00:04:37.660 16548==
==00:00:04:37.660 16548== Use --track-origins=yes to see where uninitialised values come from
==00:00:04:37.660 16548== For lists of detected and suppressed errors, rerun with: -s
==00:00:04:37.660 16548== ERROR SUMMARY: 7781 errors from 3 contexts (suppressed: 0 from 0)

I wrote:
> I've reproduced what looks like about the same thing on
> my Pi 4 using Fedora 38: just run "make installcheck-parallel"
> under valgrind, and kaboom.  Definitely needs investigation.

The problem appears to be that RT_ALLOC_NODE doesn't bother to
initialize the chunks[] array when making a RT_NODE_16 node.
If we fill fewer than RT_FANOUT_16_MAX of the chunks[] entries,
then when RT_NODE_16_SEARCH_EQ applies vector operations that
read the entire array, it's operating partially on uninitialized
data.  Now, that's actually OK because of the "mask off invalid
entries" step, but aarch64 valgrind complains anyway.

I hypothesize that the reason we're not seeing equivalent failures
on x86_64 is one of

1. x86_64 valgrind is stupider than aarch64, and fails to track that
the contents of the SIMD registers are only partially defined.

2. x86_64 valgrind is smarter than aarch64, and is able to see
that the "mask off invalid entries" step removes all the
potentially-uninitialized bits.

The first attached patch, "radixtree-fix-minimal.patch", is enough
to stop the aarch64 valgrind failure for me.  However, I think
that the coding here is pretty penny-wise and pound-foolish,
and that what we really ought to do is the second patch,
"radixtree-fix-proposed.patch".  I do not believe that asking
memset to zero the three-byte RT_NODE structure produces code
that's either shorter or faster than having it zero 8 bytes
(as for RT_NODE_4) or having it do that and then separately
zero some more stuff (as for the larger node types).  Leaving
RT_NODE_4's chunks[] array uninitialized is going to bite us
someday, too, even if it doesn't right now.  So I think we
ought to just zero the whole fixed-size part of the nodes,
which is what radixtree-fix-proposed.patch does.

(The RT_NODE_48 case could be optimized a bit if we cared
to swap the order of its slot_idxs[] and isset[] arrays;
then the initial zeroing could just go up to slot_idxs[].
I don't know if there's any reason why the current order
is preferable.)

            regards, tom lane

diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h
index 338e1d741d..e21f7be3f9 100644
--- a/src/include/lib/radixtree.h
+++ b/src/include/lib/radixtree.h
@@ -849,8 +849,14 @@ RT_ALLOC_NODE(RT_RADIX_TREE * tree, const uint8 kind, const RT_SIZE_CLASS size_c
     switch (kind)
     {
         case RT_NODE_KIND_4:
-        case RT_NODE_KIND_16:
             break;
+        case RT_NODE_KIND_16:
+            {
+                RT_NODE_16 *n16 = (RT_NODE_16 *) node;
+
+                memset(n16->chunks, 0, sizeof(n16->chunks));
+                break;
+            }
         case RT_NODE_KIND_48:
             {
                 RT_NODE_48 *n48 = (RT_NODE_48 *) node;
diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h
index 338e1d741d..4510f7c4cd 100644
--- a/src/include/lib/radixtree.h
+++ b/src/include/lib/radixtree.h
@@ -845,27 +845,25 @@ RT_ALLOC_NODE(RT_RADIX_TREE * tree, const uint8 kind, const RT_SIZE_CLASS size_c

     /* initialize contents */

-    memset(node, 0, sizeof(RT_NODE));
     switch (kind)
     {
         case RT_NODE_KIND_4:
+            memset(node, 0, offsetof(RT_NODE_4, children));
+            break;
         case RT_NODE_KIND_16:
+            memset(node, 0, offsetof(RT_NODE_16, children));
             break;
         case RT_NODE_KIND_48:
             {
                 RT_NODE_48 *n48 = (RT_NODE_48 *) node;

-                memset(n48->isset, 0, sizeof(n48->isset));
+                memset(n48, 0, offsetof(RT_NODE_48, children));
                 memset(n48->slot_idxs, RT_INVALID_SLOT_IDX, sizeof(n48->slot_idxs));
                 break;
             }
         case RT_NODE_KIND_256:
-            {
-                RT_NODE_256 *n256 = (RT_NODE_256 *) node;
-
-                memset(n256->isset, 0, sizeof(n256->isset));
-                break;
-            }
+            memset(node, 0, offsetof(RT_NODE_256, children));
+            break;
         default:
             pg_unreachable();
     }

I wrote:
> I hypothesize that the reason we're not seeing equivalent failures
> on x86_64 is one of

> 1. x86_64 valgrind is stupider than aarch64, and fails to track that
> the contents of the SIMD registers are only partially defined.

> 2. x86_64 valgrind is smarter than aarch64, and is able to see
> that the "mask off invalid entries" step removes all the
> potentially-uninitialized bits.

Side note: it struck me that this could also be a valgrind version
skew issue.  But the machine I'm seeing the failure on is running
valgrind-3.22.0-1.fc38.aarch64, which is the same upstream version
as valgrind-3.22.0-2.el8.x86_64, where I don't see it.  So apparently
not.  (There is a 3.23.0 out recently, but its release notes don't
mention anything that seems related.)

            regards, tom lane



I wrote:
> I hypothesize that the reason we're not seeing equivalent failures
> on x86_64 is one of

> 1. x86_64 valgrind is stupider than aarch64, and fails to track that
> the contents of the SIMD registers are only partially defined.

> 2. x86_64 valgrind is smarter than aarch64, and is able to see
> that the "mask off invalid entries" step removes all the
> potentially-uninitialized bits.

Hah: it's the second case.  If I patch radixtree.h as attached,
then x86_64 valgrind complains about

==00:00:00:32.759 247596== Conditional jump or move depends on uninitialised value(s)
==00:00:00:32.759 247596==    at 0x52F668: local_ts_node_16_search_eq (radixtree.h:1018)

showing that it knows that the result of vector8_highbit_mask is
only partly defined.  Kind of odd though that aarch64 valgrind
is getting the hard part right and not the easy(?) part.

            regards, tom lane

diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h
index 338e1d741d..267ec6de03 100644
--- a/src/include/lib/radixtree.h
+++ b/src/include/lib/radixtree.h
@@ -1015,12 +1015,15 @@ RT_NODE_16_SEARCH_EQ(RT_NODE_16 * node, uint8 chunk)
     /* convert comparison to a bitfield */
     bitfield = vector8_highbit_mask(cmp1) | (vector8_highbit_mask(cmp2) << sizeof(Vector8));

-    /* mask off invalid entries */
-    bitfield &= ((UINT64CONST(1) << count) - 1);
-
-    /* convert bitfield to index by counting trailing zeros */
     if (bitfield)
-        slot_simd = &node->children[pg_rightmost_one_pos32(bitfield)];
+    {
+        /* mask off invalid entries */
+        bitfield &= ((UINT64CONST(1) << count) - 1);
+
+        /* convert bitfield to index by counting trailing zeros */
+        if (bitfield)
+            slot_simd = &node->children[pg_rightmost_one_pos32(bitfield)];
+    }

     Assert(slot_simd == slot);
     return slot_simd;

Re: suspicious valgrind reports about radixtree/tidstore on arm64

From
Masahiko Sawada
Date:
Hi,

On Thu, Jun 20, 2024 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > I've reproduced what looks like about the same thing on
> > my Pi 4 using Fedora 38: just run "make installcheck-parallel"
> > under valgrind, and kaboom.  Definitely needs investigation.
>
> The problem appears to be that RT_ALLOC_NODE doesn't bother to
> initialize the chunks[] array when making a RT_NODE_16 node.
> If we fill fewer than RT_FANOUT_16_MAX of the chunks[] entries,
> then when RT_NODE_16_SEARCH_EQ applies vector operations that
> read the entire array, it's operating partially on uninitialized
> data.  Now, that's actually OK because of the "mask off invalid
> entries" step, but aarch64 valgrind complains anyway.
>
> I hypothesize that the reason we're not seeing equivalent failures
> on x86_64 is one of
>
> 1. x86_64 valgrind is stupider than aarch64, and fails to track that
> the contents of the SIMD registers are only partially defined.
>
> 2. x86_64 valgrind is smarter than aarch64, and is able to see
> that the "mask off invalid entries" step removes all the
> potentially-uninitialized bits.
>
> The first attached patch, "radixtree-fix-minimal.patch", is enough
> to stop the aarch64 valgrind failure for me.  However, I think
> that the coding here is pretty penny-wise and pound-foolish,
> and that what we really ought to do is the second patch,
> "radixtree-fix-proposed.patch".  I do not believe that asking
> memset to zero the three-byte RT_NODE structure produces code
> that's either shorter or faster than having it zero 8 bytes
> (as for RT_NODE_4) or having it do that and then separately
> zero some more stuff (as for the larger node types).  Leaving
> RT_NODE_4's chunks[] array uninitialized is going to bite us
> someday, too, even if it doesn't right now.  So I think we
> ought to just zero the whole fixed-size part of the nodes,
> which is what radixtree-fix-proposed.patch does.

I agree with radixtree-fix-proposed.patch. Even if we zero more fields
in the node it would not add noticeable overheads.

>
> (The RT_NODE_48 case could be optimized a bit if we cared
> to swap the order of its slot_idxs[] and isset[] arrays;
> then the initial zeroing could just go up to slot_idxs[].
> I don't know if there's any reason why the current order
> is preferable.)

IIUC there is no particular reason for the current order in RT_NODE_48.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: suspicious valgrind reports about radixtree/tidstore on arm64

From
Michael Paquier
Date:
On Thu, Jun 20, 2024 at 10:11:37AM +0900, Masahiko Sawada wrote:
> I agree with radixtree-fix-proposed.patch. Even if we zero more fields
> in the node it would not add noticeable overheads.

This needs to be tracked as an open item, so I have added one now.
--
Michael

Attachment
On Thu, Jun 20, 2024 at 8:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Thu, Jun 20, 2024 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >

> I agree with radixtree-fix-proposed.patch. Even if we zero more fields
> in the node it would not add noticeable overheads.

+1 in general, although I'm slightly concerned about this part:

> > (The RT_NODE_48 case could be optimized a bit if we cared
> > to swap the order of its slot_idxs[] and isset[] arrays;
> > then the initial zeroing could just go up to slot_idxs[].

- memset(n48->isset, 0, sizeof(n48->isset));
+ memset(n48, 0, offsetof(RT_NODE_48, children));
  memset(n48->slot_idxs, RT_INVALID_SLOT_IDX, sizeof(n48->slot_idxs));

I was a bit surprised that neither gcc 14 nor clang 18 can figure out
that they can skip zeroing the slot index array since it's later
filled in with "invalid index", so they actually zero out 272 bytes
before re-initializing 256 of those bytes. It may not matter in
practice, but it's also not free, and trivial to avoid.

> > I don't know if there's any reason why the current order
> > is preferable.)
>
> IIUC there is no particular reason for the current order in RT_NODE_48.

Yeah. I found that simply swapping them enables clang to avoid
double-initialization, but gcc still can't figure it out and must be
told to stop at slot_idxs[]. I'd prefer to do it that way and document
that slot_idxs is purposefully the last member of the fixed part of
the struct. If that's agreeable I'll commit it that way tomorrow
unless someone beats me to it.



John Naylor <johncnaylorls@gmail.com> writes:
> On Thu, Jun 20, 2024 at 8:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> IIUC there is no particular reason for the current order in RT_NODE_48.

> Yeah. I found that simply swapping them enables clang to avoid
> double-initialization, but gcc still can't figure it out and must be
> told to stop at slot_idxs[]. I'd prefer to do it that way and document
> that slot_idxs is purposefully the last member of the fixed part of
> the struct.

WFM.

> If that's agreeable I'll commit it that way tomorrow
> unless someone beats me to it.

I was going to push it, but feel free.

            regards, tom lane



Re: suspicious valgrind reports about radixtree/tidstore on arm64

From
Ranier Vilela
Date:
Em qua., 19 de jun. de 2024 às 20:52, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
I wrote:
> I hypothesize that the reason we're not seeing equivalent failures
> on x86_64 is one of

> 1. x86_64 valgrind is stupider than aarch64, and fails to track that
> the contents of the SIMD registers are only partially defined.

> 2. x86_64 valgrind is smarter than aarch64, and is able to see
> that the "mask off invalid entries" step removes all the
> potentially-uninitialized bits.

Hah: it's the second case.  If I patch radixtree.h as attached,
then x86_64 valgrind complains about

==00:00:00:32.759 247596== Conditional jump or move depends on uninitialised value(s)
==00:00:00:32.759 247596==    at 0x52F668: local_ts_node_16_search_eq (radixtree.h:1018)

showing that it knows that the result of vector8_highbit_mask is
only partly defined.
I wouldn't be surprised if *RT_NODE_16_GET_INSERTPOS* (src/include/lib/radixtree.h),
does not suffer from the same problem?
Even with Assert trying to protect.

Does the fix not apply here too?

best regards,
Ranier Vilela

Re: suspicious valgrind reports about radixtree/tidstore on arm64

From
Masahiko Sawada
Date:
On Thu, Jun 20, 2024 at 4:58 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Jun 20, 2024 at 8:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > On Thu, Jun 20, 2024 at 7:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > I don't know if there's any reason why the current order
> > > is preferable.)
> >
> > IIUC there is no particular reason for the current order in RT_NODE_48.
>
> Yeah. I found that simply swapping them enables clang to avoid
> double-initialization, but gcc still can't figure it out and must be
> told to stop at slot_idxs[]. I'd prefer to do it that way and document
> that slot_idxs is purposefully the last member of the fixed part of
> the struct.

+1

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Ranier Vilela <ranier.vf@gmail.com> writes:
> Em qua., 19 de jun. de 2024 às 20:52, Tom Lane <tgl@sss.pgh.pa.us> escreveu:
>> Hah: it's the second case.  If I patch radixtree.h as attached,
>> then x86_64 valgrind complains about
>> ==00:00:00:32.759 247596== Conditional jump or move depends on
>> uninitialised value(s)
>> ==00:00:00:32.759 247596==    at 0x52F668: local_ts_node_16_search_eq
>> (radixtree.h:1018)
>> showing that it knows that the result of vector8_highbit_mask is
>> only partly defined.

> I wouldn't be surprised if *RT_NODE_16_GET_INSERTPOS*
> (src/include/lib/radixtree.h),
> does not suffer from the same problem?

Dunno, I only saw valgrind complaints in local_ts_node_16_search_eq,
and Tomas reported the same.

It seems moderately likely to me that this is a bug in aarch64
valgrind.  Still, if it is that then people will have to deal with it
for awhile yet.  It won't cost us anything meaningful to work around
it (he says, without having done actual measurements...)

            regards, tom lane



On Thu, Jun 20, 2024 at 2:58 PM John Naylor <johncnaylorls@gmail.com> wrote:
> the struct. If that's agreeable I'll commit it that way tomorrow
> unless someone beats me to it.

Pushed. I'll clear the open item once all buildfarm members have reported in.



John Naylor <johncnaylorls@gmail.com> writes:
> Pushed. I'll clear the open item once all buildfarm members have reported in.

Just to confirm, my raspberry pi 4 got through "make
installcheck-parallel" under valgrind after this commit.

            regards, tom lane