Re: pg_verify_checksums and -fno-strict-aliasing - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_verify_checksums and -fno-strict-aliasing
Date
Msg-id 7502.1535670135@sss.pgh.pa.us
Whole thread Raw
In response to Re: pg_verify_checksums and -fno-strict-aliasing  (Andres Freund <andres@anarazel.de>)
Responses Re: pg_verify_checksums and -fno-strict-aliasing
Re: pg_verify_checksums and -fno-strict-aliasing
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2018-08-30 18:11:40 -0400, Tom Lane wrote:
>> I suspect people will complain about the added cost of doing that.

> I think the compiler will just optimize it away.

Maybe.  In any case, the attached version avoids any need for memcpy
and is, I think, cleaner code anyhow.  It fixes the problem for me
with Fedora's gcc, though I'm not sure that it'd be enough to get rid
of the warning Michael sees (I wonder what compiler he's using).

>> BTW, not to mention the elephant in the room, but: is it *really* OK
>> that pg_checksum_page scribbles on the page buffer, even temporarily?
>> It's certainly quite horrid that there aren't large warnings about
>> that in the function's API comment.

> It certainly should be warned about.  Practically I don't think it's a
> problem, because we pretty much always operate on a copy of the page
> when writing out, as otherwise concurrently set hint bits would be
> troublesome.

The write end of things is not a problem IMO, since presumably the caller
would be about to overwrite the checksum field anyway.  The issue is for
reading and/or verifying, where it's much less obvious that clobbering
the page image is safe.

            regards, tom lane

diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index 64d7622..a49d27f 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -107,6 +107,13 @@
 /* prime multiplier of FNV-1a hash */
 #define FNV_PRIME 16777619

+/* Use a union so that this code is valid under strict aliasing */
+typedef union
+{
+    PageHeaderData phdr;
+    uint32        data[BLCKSZ / (sizeof(uint32) * N_SUMS)][N_SUMS];
+} PGChecksummablePage;
+
 /*
  * Base offsets to initialize each of the parallel FNV hashes into a
  * different initial state.
@@ -132,28 +139,27 @@ do { \
 } while (0)

 /*
- * Block checksum algorithm.  The data argument must be aligned on a 4-byte
- * boundary.
+ * Block checksum algorithm.  The page must be adequately aligned
+ * (at least on 4-byte boundary).
  */
 static uint32
-pg_checksum_block(char *data, uint32 size)
+pg_checksum_block(const PGChecksummablePage *page)
 {
     uint32        sums[N_SUMS];
-    uint32        (*dataArr)[N_SUMS] = (uint32 (*)[N_SUMS]) data;
     uint32        result = 0;
     uint32        i,
                 j;

     /* ensure that the size is compatible with the algorithm */
-    Assert((size % (sizeof(uint32) * N_SUMS)) == 0);
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);

     /* initialize partial checksums to their corresponding offsets */
     memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));

     /* main checksum calculation */
-    for (i = 0; i < size / sizeof(uint32) / N_SUMS; i++)
+    for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
         for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], dataArr[i][j]);
+            CHECKSUM_COMP(sums[j], page->data[i][j]);

     /* finally add in two rounds of zeroes for additional mixing */
     for (i = 0; i < 2; i++)
@@ -168,8 +174,10 @@ pg_checksum_block(char *data, uint32 size)
 }

 /*
- * Compute the checksum for a Postgres page.  The page must be aligned on a
- * 4-byte boundary.
+ * Compute the checksum for a Postgres page.
+ *
+ * The page must be adequately aligned (at least on a 4-byte boundary).
+ * Beware also that the checksum field of the page is transiently zeroed.
  *
  * The checksum includes the block number (to detect the case where a page is
  * somehow moved to a different location), the page header (excluding the
@@ -178,12 +186,12 @@ pg_checksum_block(char *data, uint32 size)
 uint16
 pg_checksum_page(char *page, BlockNumber blkno)
 {
-    PageHeader    phdr = (PageHeader) page;
+    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
     uint16        save_checksum;
     uint32        checksum;

     /* We only calculate the checksum for properly-initialized pages */
-    Assert(!PageIsNew(page));
+    Assert(!PageIsNew(&cpage->phdr));

     /*
      * Save pd_checksum and temporarily set it to zero, so that the checksum
@@ -191,10 +199,10 @@ pg_checksum_page(char *page, BlockNumber blkno)
      * Restore it after, because actually updating the checksum is NOT part of
      * the API of this function.
      */
-    save_checksum = phdr->pd_checksum;
-    phdr->pd_checksum = 0;
-    checksum = pg_checksum_block(page, BLCKSZ);
-    phdr->pd_checksum = save_checksum;
+    save_checksum = cpage->phdr.pd_checksum;
+    cpage->phdr.pd_checksum = 0;
+    checksum = pg_checksum_block(cpage);
+    cpage->phdr.pd_checksum = save_checksum;

     /* Mix in the block number to detect transposed pages */
     checksum ^= blkno;

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_verify_checksums and -fno-strict-aliasing
Next
From: Andres Freund
Date:
Subject: Re: pg_verify_checksums and -fno-strict-aliasing