Home > mailing lists

Thread: casting operand to proper type in BlockIdGetBlockNumber

casting operand to proper type in BlockIdGetBlockNumber

From

Zhihong Yu

Date:

03 March 2022, 15:34:08

Hi,

In test output, I saw:

src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by 16 places cannot be represented in type 'int'

I think this was due to the left shift in BlockIdGetBlockNumber not properly casting its operand.

Please see the proposed change in patch.

Thanks

Attachment

get-block-number.patch

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 15:44:13

Zhihong Yu <zyu@yugabyte.com> writes:
> In test output, I saw:
> src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by
> 16 places cannot be represented in type 'int'

What compiler is that?

            regards, tom lane

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Zhihong Yu

Date:

03 March 2022, 15:57:27

On Thu, Mar 3, 2022 at 7:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Zhihong Yu <zyu@yugabyte.com> writes:
> In test output, I saw:
> src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by
> 16 places cannot be represented in type 'int'

What compiler is that?

regards, tom lane

Hi,

Jenkins build is alma8-clang12-asan

So it is clang12 on alma.

Cheers

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 16:24:28

Zhihong Yu <zyu@yugabyte.com> writes:
> On Thu, Mar 3, 2022 at 7:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Zhihong Yu <zyu@yugabyte.com> writes:
>>> In test output, I saw:
>>> src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by
>>> 16 places cannot be represented in type 'int'

> Jenkins build is alma8-clang12-asan

Oh, I misread this as a compile-time warning, but it must be from ASAN.
Was the test case one of your own, or just our normal regression tests?

(I think the code is indeed incorrect, but I'm wondering why this hasn't
been reported before.  It's been like that for a long time.)

            regards, tom lane

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Zhihong Yu

Date:

03 March 2022, 16:34:33

On Thu, Mar 3, 2022 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Zhihong Yu <zyu@yugabyte.com> writes:
> On Thu, Mar 3, 2022 at 7:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Zhihong Yu <zyu@yugabyte.com> writes:
>>> In test output, I saw:
>>> src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by
>>> 16 places cannot be represented in type 'int'

> Jenkins build is alma8-clang12-asan

Oh, I misread this as a compile-time warning, but it must be from ASAN.
Was the test case one of your own, or just our normal regression tests?

(I think the code is indeed incorrect, but I'm wondering why this hasn't
been reported before. It's been like that for a long time.)

regards, tom lane

Hi,

The Jenkins test is ported from contrib/postgres_fdw/sql/postgres_fdw.sql - so theoretically PG would see the same error for clang12 on Alma.

Here were a few lines prior to the sanitizer complaint:

ts1|pid123867|:30045 2022-03-02 01:47:57.098 UTC [124161] STATEMENT: CREATE TRIGGER trig_row_before
ts1|pid123867|:30045 BEFORE INSERT OR UPDATE OR DELETE ON rem1
ts1|pid123867|:30045 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');
ts1|pid123867|:30045 2022-03-02 01:47:57.106 UTC [124161] ERROR: function trigger_data() does not exist
ts1|pid123867|:30045 2022-03-02 01:47:57.106 UTC [124161] STATEMENT: CREATE TRIGGER trig_row_after
ts1|pid123867|:30045 AFTER INSERT OR UPDATE OR DELETE ON rem1

ts1|pid123867|:30045 FOR EACH ROW EXECUTE PROCEDURE trigger_data(23,'skidoo');

I think the ASAN build on Alma is able to detect errors such as this.

Cheers

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 17:13:40

Zhihong Yu <zyu@yugabyte.com> writes:
> On Thu, Mar 3, 2022 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, I misread this as a compile-time warning, but it must be from ASAN.
>> Was the test case one of your own, or just our normal regression tests?

> The Jenkins test is ported from contrib/postgres_fdw/sql/postgres_fdw.sql -
> so theoretically PG would see the same error for clang12 on Alma.

Hmph.  I tried enabling -fsanitize=undefined here, and I get some
complaints about passing null pointers to memcmp and the like, but
nothing about this shift (tested with clang 12.0.1 on RHEL8 as well
as clang 13.0.0 on Fedora 35).  What compiler switches are being
used exactly?

            regards, tom lane

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Zhihong Yu

Date:

03 March 2022, 17:28:57

On Thu, Mar 3, 2022 at 9:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Zhihong Yu <zyu@yugabyte.com> writes:
> On Thu, Mar 3, 2022 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, I misread this as a compile-time warning, but it must be from ASAN.
>> Was the test case one of your own, or just our normal regression tests?

> The Jenkins test is ported from contrib/postgres_fdw/sql/postgres_fdw.sql -
> so theoretically PG would see the same error for clang12 on Alma.

Hmph. I tried enabling -fsanitize=undefined here, and I get some
complaints about passing null pointers to memcmp and the like, but
nothing about this shift (tested with clang 12.0.1 on RHEL8 as well
as clang 13.0.0 on Fedora 35). What compiler switches are being
used exactly?

regards, tom lane

Hi,

This is from (internal Jenkins) build log:

CMAKE_C_FLAGS -Werror -fno-strict-aliasing -Wall -msse4.2 -Winvalid-pch -pthread -DBOOST_BIND_NO_PLACEHOLDERS -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -DROCKSDB_PLATFORM_POSIX -DBOOST_ERROR_CODE_HEADER_ONLY -march=ivybridge -mcx16 -DYB_COMPILER_TYPE=clang12 -DYB_COMPILER_VERSION=12.0.1 -DROCKSDB_LIB_IO_POSIX -DSNAPPY -DLZ4 -DZLIB -mno-avx -mno-bmi -mno-bmi2 -mno-fma -D__STDC_FORMAT_MACROS -Wno-deprecated-declarations -DGFLAGS=gflags -Werror=enum-compare -Werror=switch -Werror=return-type -Werror=string-plus-int -Werror=return-stack-address -Werror=implicit-fallthrough -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -Wthread-safety-analysis -Wshorten-64-to-32 -ggdb -O1 -fno-omit-frame-pointer -DFASTDEBUG -Wno-ambiguous-member-template -Wimplicit-fallthrough -Qunused-arguments -stdlib=libc++ -D_GLIBCXX_EXTERN_TEMPLATE=0 -nostdinc++ -stdlib=libc++ -D_GLIBCXX_EXTERN_TEMPLATE=0 -nostdinc++ -shared-libasan -fsanitize=address -DADDRESS_SANITIZER -fsanitize=undefined -fno-sanitize-recover=all -fno-sanitize=alignment,vptr -fsanitize-recover=float-cast-overflow -fsanitize-blacklist=... -fPIC

I would suggest trying out the build on Alma Linux.

FYI

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Andres Freund

Date:

03 March 2022, 17:29:37

Hi,

On 2022-03-03 12:13:40 -0500, Tom Lane wrote:
> Zhihong Yu <zyu@yugabyte.com> writes:
> > On Thu, Mar 3, 2022 at 8:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Oh, I misread this as a compile-time warning, but it must be from ASAN.
> >> Was the test case one of your own, or just our normal regression tests?
> 
> > The Jenkins test is ported from contrib/postgres_fdw/sql/postgres_fdw.sql -
> > so theoretically PG would see the same error for clang12 on Alma.
> 
> Hmph.  I tried enabling -fsanitize=undefined here, and I get some
> complaints about passing null pointers to memcmp and the like, but
> nothing about this shift (tested with clang 12.0.1 on RHEL8 as well
> as clang 13.0.0 on Fedora 35).

We should fix these passing-null-pointer cases...


> What compiler switches are being used exactly?

FWIW, I've successfully used:
-Og -fsanitize=alignment,undefined -fno-sanitize=nonnull-attribute -fno-sanitize=float-cast-overflow
-fno-sanitize-recover=all

Need to manually add -ldl, because -fsanitize breaks our dl test (it uses
dlopen, but not dlsym). Was planning to submit a fix for that...

Greetings,

Andres Freund

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 17:45:22

Andres Freund <andres@anarazel.de> writes:
> On 2022-03-03 12:13:40 -0500, Tom Lane wrote:
>> Hmph.  I tried enabling -fsanitize=undefined here, and I get some
>> complaints about passing null pointers to memcmp and the like, but
>> nothing about this shift (tested with clang 12.0.1 on RHEL8 as well
>> as clang 13.0.0 on Fedora 35).

> We should fix these passing-null-pointer cases...

Yeah, working on that now.  But I'm pretty confused about why I can't
duplicate this shift complaint.  Alma is a Red Hat clone no?  Why
doesn't its compiler act the same as RHEL8's?

> Need to manually add -ldl, because -fsanitize breaks our dl test (it uses
> dlopen, but not dlsym). Was planning to submit a fix for that...

Hmm ... didn't get through check-world yet, but I don't see that
so far.

            regards, tom lane

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Andres Freund

Date:

03 March 2022, 17:50:22

Hi,

On 2022-03-03 12:45:22 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-03-03 12:13:40 -0500, Tom Lane wrote:
> >> Hmph.  I tried enabling -fsanitize=undefined here, and I get some
> >> complaints about passing null pointers to memcmp and the like, but
> >> nothing about this shift (tested with clang 12.0.1 on RHEL8 as well
> >> as clang 13.0.0 on Fedora 35).
> 
> > We should fix these passing-null-pointer cases...
> 
> Yeah, working on that now.  But I'm pretty confused about why I can't
> duplicate this shift complaint.  Alma is a Red Hat clone no?  Why
> doesn't its compiler act the same as RHEL8's?

I didn't see that either. It could be a question of building with full
optimizations / asserts vs without?


> > Need to manually add -ldl, because -fsanitize breaks our dl test (it uses
> > dlopen, but not dlsym). Was planning to submit a fix for that...
> 
> Hmm ... didn't get through check-world yet, but I don't see that
> so far.

Oh, for me it doesn't even build. Perhaps one of the dependencies injects it
as well?

Greetings,

Andres Freund

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 19:00:14

I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> We should fix these passing-null-pointer cases...

> Yeah, working on that now.

The attached is enough to get through check-world with
"-fsanitize=undefined" using RHEL8's clang 12.0.1.
Most of it is the same old null-pointer-with-zero-count
business, but the change in numeric.c is a different
issue: "ln(-1.0)" ends up computing log10(0), which
produces -Inf, and then tries to assign that to an integer.
We don't actually care about the garbage result in that case,
so it's only a sanitizer complaint not a live bug.

I'm not sure whether to back-patch --- looking through the
git logs, it seems we've back-patched some fixes like these
and not others.  Thoughts?

In any case, if we're going to take this seriously it seems
like we need a buildfarm machine or two testing this option.

            regards, tom lane

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 59d43e2ba9..4e6aeba315 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -328,7 +328,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
     /*
      * copy the scan key, if appropriate
      */
-    if (key != NULL)
+    if (key != NULL && scan->rs_base.rs_nkeys > 0)
         memcpy(scan->rs_base.rs_key, key, scan->rs_base.rs_nkeys * sizeof(ScanKeyData));

     /*
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index ceadac70d5..ff0b8a688d 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -1564,8 +1564,8 @@ HeapTupleHeaderIsOnlyLocked(HeapTupleHeader tuple)
 static bool
 TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
 {
-    return bsearch(&xid, xip, num,
-                   sizeof(TransactionId), xidComparator) != NULL;
+    return num > 0 &&
+        bsearch(&xid, xip, num, sizeof(TransactionId), xidComparator) != NULL;
 }

 /*
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index de787c3d37..3d9088a704 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -297,8 +297,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
     if (all_xact_same_page && xid == MyProc->xid &&
         nsubxids <= THRESHOLD_SUBTRANS_CLOG_OPT &&
         nsubxids == MyProc->subxidStatus.count &&
-        memcmp(subxids, MyProc->subxids.xids,
-               nsubxids * sizeof(TransactionId)) == 0)
+        (nsubxids == 0 ||
+         memcmp(subxids, MyProc->subxids.xids,
+                nsubxids * sizeof(TransactionId)) == 0))
     {
         /*
          * If we can immediately acquire XactSLRULock, we update the status of
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index adf763a8ea..8964ddf3eb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5353,8 +5353,9 @@ SerializeTransactionState(Size maxsize, char *start_address)
     {
         if (FullTransactionIdIsValid(s->fullTransactionId))
             workspace[i++] = XidFromFullTransactionId(s->fullTransactionId);
-        memcpy(&workspace[i], s->childXids,
-               s->nChildXids * sizeof(TransactionId));
+        if (s->nChildXids > 0)
+            memcpy(&workspace[i], s->childXids,
+                   s->nChildXids * sizeof(TransactionId));
         i += s->nChildXids;
     }
     Assert(i == nxids);
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 45b0dfc062..603cf9b0fa 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -773,8 +773,11 @@ shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)

         /* Copy as much as we can. */
         Assert(mqh->mqh_partial_bytes + rb <= nbytes);
-        memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata, rb);
-        mqh->mqh_partial_bytes += rb;
+        if (rb > 0)
+        {
+            memcpy(&mqh->mqh_buffer[mqh->mqh_partial_bytes], rawdata, rb);
+            mqh->mqh_partial_bytes += rb;
+        }

         /*
          * Update count of bytes that can be consumed, accounting for
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 975d7dcf47..45547f6ae7 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -10048,12 +10048,20 @@ exp_var(const NumericVar *arg, NumericVar *result, int rscale)
  *
  * Essentially, we're approximating log10(abs(ln(var))).  This is used to
  * determine the appropriate rscale when computing natural logarithms.
+ *
+ * Note: many callers call this before range-checking the input.  Therefore,
+ * we must be robust against values that are invalid to apply ln() to.
+ * We don't wish to throw an error here, so just return zero in such cases.
  */
 static int
 estimate_ln_dweight(const NumericVar *var)
 {
     int            ln_dweight;

+    /* Caller should fail on ln(negative), but for the moment return zero */
+    if (var->sign != NUMERIC_POS)
+        return 0;
+
     if (cmp_var(var, &const_zero_point_nine) >= 0 &&
         cmp_var(var, &const_one_point_one) <= 0)
     {
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index a0b81bf154..a0be0c411a 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -536,12 +536,14 @@ SetTransactionSnapshot(Snapshot sourcesnap, VirtualTransactionId *sourcevxid,
     CurrentSnapshot->xmax = sourcesnap->xmax;
     CurrentSnapshot->xcnt = sourcesnap->xcnt;
     Assert(sourcesnap->xcnt <= GetMaxSnapshotXidCount());
-    memcpy(CurrentSnapshot->xip, sourcesnap->xip,
-           sourcesnap->xcnt * sizeof(TransactionId));
+    if (sourcesnap->xcnt > 0)
+        memcpy(CurrentSnapshot->xip, sourcesnap->xip,
+               sourcesnap->xcnt * sizeof(TransactionId));
     CurrentSnapshot->subxcnt = sourcesnap->subxcnt;
     Assert(sourcesnap->subxcnt <= GetMaxSnapshotSubxidCount());
-    memcpy(CurrentSnapshot->subxip, sourcesnap->subxip,
-           sourcesnap->subxcnt * sizeof(TransactionId));
+    if (sourcesnap->subxcnt > 0)
+        memcpy(CurrentSnapshot->subxip, sourcesnap->subxip,
+               sourcesnap->subxcnt * sizeof(TransactionId));
     CurrentSnapshot->suboverflowed = sourcesnap->suboverflowed;
     CurrentSnapshot->takenDuringRecovery = sourcesnap->takenDuringRecovery;
     /* NB: curcid should NOT be copied, it's a local matter */
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index 2c8e58ebf5..dcdb2e0d0c 100644
--- a/src/fe_utils/print.c
+++ b/src/fe_utils/print.c
@@ -966,7 +966,8 @@ print_aligned_text(const printTableContent *cont, FILE *fout, bool is_pager)

             more_col_wrapping = col_count;
             curr_nl_line = 0;
-            memset(header_done, false, col_count * sizeof(bool));
+            if (col_count > 0)
+                memset(header_done, false, col_count * sizeof(bool));
             while (more_col_wrapping)
             {
                 if (opt_border == 2)

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Andres Freund

Date:

03 March 2022, 19:46:00

Hi,

On 2022-03-03 14:00:14 -0500, Tom Lane wrote:
> The attached is enough to get through check-world with
> "-fsanitize=undefined" using RHEL8's clang 12.0.1.

Cool.

> I'm not sure whether to back-patch --- looking through the
> git logs, it seems we've back-patched some fixes like these
> and not others.  Thoughts?

It'd be easier to run a BF animal if we fixed it everywhere.

> In any case, if we're going to take this seriously it seems like we need a
> buildfarm machine or two testing this option.

I was planning to add it to the CI runs, just didn't have energy to fix the
failures yet. But you just did (although I think there might be failure or two
more on new-ish debians).

For the buildfarm, I could enable it on flaviventris? That runs an
experimental gcc, without optimization (whereas serinus runs with
optimization). Which seems reasonable to combine with sanitizers?

For CI I compared the cost of the different sanitizers. It looks like
alignment sanitizer is almost free, undefined is pretty cheap, and address
sanitizer is pretty expensive (but still much cheaper than valgrind).

Greetings,

Andres Freund

PS: Hm, seems mylodon died a while ago... Need to check what's up with that.

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 20:31:51

Andres Freund <andres@anarazel.de> writes:
> On 2022-03-03 14:00:14 -0500, Tom Lane wrote:
>> I'm not sure whether to back-patch --- looking through the
>> git logs, it seems we've back-patched some fixes like these
>> and not others.  Thoughts?

> It'd be easier to run a BF animal if we fixed it everywhere.

Fair enough, will BP.

>> In any case, if we're going to take this seriously it seems like we need a
>> buildfarm machine or two testing this option.

> For the buildfarm, I could enable it on flaviventris? That runs an
> experimental gcc, without optimization (whereas serinus runs with
> optimization). Which seems reasonable to combine with sanitizers?

Dunno.  I already found out that my Mac laptop (w/ clang 13) detects
the numeric.c problem but not any of the other ones.  The messages
on RHEL8 cite where the system headers declare memcmp and friends
with "attribute nonnull", so I'm betting that Apple's headers lack
that annotation.

I also tried adding the various -m switches shown in Zhihong's
CFLAGS setting, but that still didn't repro the Alma warning
for me.

So it definitely seems like it's *real* system dependent which of
these warnings you get :-(.

            regards, tom lane

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Andres Freund

Date:

03 March 2022, 21:11:27

Hi,

On 2022-03-03 15:31:51 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-03-03 14:00:14 -0500, Tom Lane wrote:
> > For the buildfarm, I could enable it on flaviventris? That runs an
> > experimental gcc, without optimization (whereas serinus runs with
> > optimization). Which seems reasonable to combine with sanitizers?
> 
> Dunno.  I already found out that my Mac laptop (w/ clang 13) detects
> the numeric.c problem but not any of the other ones.  The messages
> on RHEL8 cite where the system headers declare memcmp and friends
> with "attribute nonnull", so I'm betting that Apple's headers lack
> that annotation.

The sanitizers are documented to work best on linux... As flaviventris runs
linux, so I'm not sure what your concern is?

I think basically newer glibc versions have more annotations, so ubsan will
have more things to fail against. So it'd be good to have a fairly regularly
updated OS.

> I also tried adding the various -m switches shown in Zhihong's
> CFLAGS setting, but that still didn't repro the Alma warning
> for me.

The compilation flags make it look like it's from a run of yugabyte's fork,
rather than plain postgres.

The message says:
src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by 16 places cannot be represented in type
'int'

Afaics that means bi_hi is 65535. So either we're dealing with a very large
relation or BlockIdGetBlockNumber() is getting passed InvalidBlockNumber?

It might be enough to do something like
SELECT * FROM pg_class WHERE ctid = '(65535, 17)';
to trigger the problem?

Greetings,

Andres Freund

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Zhihong Yu

Date:

03 March 2022, 21:21:39

On Thu, Mar 3, 2022 at 1:11 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2022-03-03 15:31:51 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-03-03 14:00:14 -0500, Tom Lane wrote:
> > For the buildfarm, I could enable it on flaviventris? That runs an
> > experimental gcc, without optimization (whereas serinus runs with
> > optimization). Which seems reasonable to combine with sanitizers?
>
> Dunno. I already found out that my Mac laptop (w/ clang 13) detects
> the numeric.c problem but not any of the other ones. The messages
> on RHEL8 cite where the system headers declare memcmp and friends
> with "attribute nonnull", so I'm betting that Apple's headers lack
> that annotation.

The sanitizers are documented to work best on linux... As flaviventris runs
linux, so I'm not sure what your concern is?

I think basically newer glibc versions have more annotations, so ubsan will
have more things to fail against. So it'd be good to have a fairly regularly
updated OS.

> I also tried adding the various -m switches shown in Zhihong's
> CFLAGS setting, but that still didn't repro the Alma warning
> for me.

The compilation flags make it look like it's from a run of yugabyte's fork,
rather than plain postgres.

Hi,

I should mention that, the PG subtree in yugabyte is currently aligned with PG 11.

There have been backports from PG 12, but code related to tid.c and block.h, etc is the same with upstream PG.

The fdw tests are backported from PG as well.

The message says:
src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by 16 places cannot be represented in type 'int'

Afaics that means bi_hi is 65535. So either we're dealing with a very large
relation or BlockIdGetBlockNumber() is getting passed InvalidBlockNumber?

It might be enough to do something like
SELECT * FROM pg_class WHERE ctid = '(65535, 17)';
to trigger the problem?

The above syntax is not currently supported in yugabyte.

FYI

Re: casting operand to proper type in BlockIdGetBlockNumber

From

Tom Lane

Date:

03 March 2022, 21:45:14

Andres Freund <andres@anarazel.de> writes:
> The message says:
> src/backend/utils/adt/tid.c:112:16: runtime error: left shift of 65535 by 16 places cannot be represented in type
'int'

> Afaics that means bi_hi is 65535. So either we're dealing with a very large
> relation or BlockIdGetBlockNumber() is getting passed InvalidBlockNumber?

Presumably the latter, since we surely aren't using any terabyte-size
relations in our tests.

> It might be enough to do something like
> SELECT * FROM pg_class WHERE ctid = '(65535, 17)';
> to trigger the problem?

I tried to provoke it with cases like

# select '(-1,0)'::tid;
      tid
----------------
 (4294967295,0)
(1 row)

# select '(4000000000,1)'::tid;
      tid
----------------
 (4000000000,1)
(1 row)

without success.

On a nearby topic, I see that tidin's overflow checks are somewhere
between sloppy and nonexistent:

# select '(40000000000,1)'::tid;
      tid
----------------
 (1345294336,1)
(1 row)

I think I'll fix that while I'm looking at it ... but it still
doesn't explain why no complaint in tidout.

            regards, tom lane