On 2018-08-30 18:11:40 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-08-30 14:46:06 -0700, Andres Freund wrote:
> >> One way to fix it would be to memcpy in/out the modified PageHeader, or
> >> just do offset math and memcpy to that offset.
>
> > It took me a bit to reproduce the issue (due to sheer stupidity on my
> > part: no, changing the flags passed to gcc to link pg_verify_checksums
> > doesn't do the trick), but the above indeed fixes the issue for me.
>
> I suspect people will complain about the added cost of doing that.

I think the compiler will just optimize it away. But if we're concerned
we could write it as

    memcpy(&save_checksum, page + offsetof(PageHeaderData, pd_checksum), sizeof(save_checksum));
    memset(page + offsetof(PageHeaderData, pd_checksum), 0, sizeof(save_checksum));
    checksum = pg_checksum_block(page, BLCKSZ);
    memcpy(page + offsetof(PageHeaderData, pd_checksum), &save_checksum, sizeof(save_checksum));

which works, but is still not exceedingly pretty :/. The code generated
looks reasonable:

194         memcpy(&save_checksum, page + offsetof(PageHeaderData, pd_checksum), sizeof(save_checksum));
   0x00000000000035d0 <+0>:     push   %r12
   0x00000000000035d2 <+2>:     xor    %eax,%eax
   0x00000000000035d4 <+4>:     movzwl 0x8(%rdi),%r12d

195         memset(page + offsetof(PageHeaderData, pd_checksum), 0, sizeof(save_checksum));
   0x00000000000035d9 <+9>:     push   %rbp
   0x00000000000035da <+10>:    mov    %ax,0x8(%rdi)

(the pushes are just interspersed stuff, yay latency-aware instruction
scheduling)

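
For anyone who wants to play with the save/zero/restore pattern outside
the tree, here is a minimal standalone sketch. Everything named demo_* or
Demo* is hypothetical: DemoPageHeader only mimics the position of
pd_checksum (offset 8, matching the 0x8(%rdi) in the listing above), and
demo_checksum_block() is a toy FNV-style loop, not the real
pg_checksum_block().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192

/* Hypothetical stand-in for PageHeaderData; only the field offset matters. */
typedef struct DemoPageHeader
{
    uint64_t pd_lsn;        /* 8 bytes, so pd_checksum sits at offset 8 */
    uint16_t pd_checksum;
    uint16_t pd_flags;
} DemoPageHeader;

/* Toy byte-wise checksum (assumption: NOT the real pg_checksum_block()). */
static uint32_t
demo_checksum_block(const char *page, size_t size)
{
    uint32_t sum = 2166136261u;
    size_t   i;

    for (i = 0; i < size; i++)
        sum = (sum ^ (unsigned char) page[i]) * 16777619u;
    return sum;
}

/*
 * Checksum the page with the stored checksum field treated as zero.
 * The field is only touched via memcpy()/memset() on the char buffer,
 * so no differently-typed pointer is ever dereferenced.
 * NB: this temporarily modifies the caller's buffer.
 */
static uint32_t
demo_checksum_page(char *page)
{
    char    *field = page + offsetof(DemoPageHeader, pd_checksum);
    uint16_t save_checksum;
    uint32_t checksum;

    assert(page != NULL);

    memcpy(&save_checksum, field, sizeof(save_checksum));   /* save */
    memset(field, 0, sizeof(save_checksum));                /* zero */
    checksum = demo_checksum_block(page, BLCKSZ);
    memcpy(field, &save_checksum, sizeof(save_checksum));   /* restore */

    return checksum;
}
```

Calling demo_checksum_page() on a BLCKSZ-sized char buffer should leave
the buffer byte-for-byte unchanged afterwards, and the result should not
depend on whatever was stored in the checksum field.
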
> I've been AFK all afternoon, but what I was intending to try next was
> the union approach, specifically union'ing PageHeaderData with the uint32
> array representation needed by pg_checksum_block(). That might also
> end up giving us code less unreadable than this:
>
> uint32 (*dataArr)[N_SUMS] = (uint32 (*)[N_SUMS]) data;

Hm.

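
To make the union idea concrete, a rough sketch of what I understand is
being proposed, not a worked-out patch. All Demo*/demo_* names are made
up, N_SUMS = 32 is an assumed value, and the word-wise loop is a toy
stand-in for pg_checksum_block(). The point is only that both views of
the page are members of one union, so the header and the uint32 array
are accessed through a type the compiler knows can overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192
#define N_SUMS 32           /* assumed value, for illustration only */

/* Hypothetical stand-in for PageHeaderData. */
typedef struct DemoUnionHeader
{
    uint64_t pd_lsn;
    uint16_t pd_checksum;
    uint16_t pd_flags;
} DemoUnionHeader;

/* One page, viewable either as a header or as rows of N_SUMS words. */
typedef union DemoChecksumPage
{
    DemoUnionHeader phdr;
    uint32_t        data[BLCKSZ / (sizeof(uint32_t) * N_SUMS)][N_SUMS];
} DemoChecksumPage;

/* Toy word-wise checksum (assumption: NOT the real algorithm). */
static uint32_t
demo_checksum_words(const DemoChecksumPage *page)
{
    uint32_t sum = 2166136261u;
    size_t   i, j;

    for (i = 0; i < sizeof(page->data) / sizeof(page->data[0]); i++)
        for (j = 0; j < N_SUMS; j++)
            sum = (sum ^ page->data[i][j]) * 16777619u;
    return sum;
}

/* Save, zero and restore pd_checksum through the union's header member. */
static uint32_t
demo_checksum_page_union(DemoChecksumPage *page)
{
    uint16_t save_checksum;
    uint32_t checksum;

    assert(page != NULL);

    save_checksum = page->phdr.pd_checksum;
    page->phdr.pd_checksum = 0;
    checksum = demo_checksum_words(page);
    page->phdr.pd_checksum = save_checksum;

    return checksum;
}
```

This only helps if callers actually hand in a DemoChecksumPage (or
something suitably aligned that is cast to it in one place); it also
does nothing about the function scribbling on the buffer.
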
> BTW, not to mention the elephant in the room, but: is it *really* OK
> that pg_checksum_page scribbles on the page buffer, even temporarily?
> It's certainly quite horrid that there aren't large warnings about
> that in the function's API comment.

It certainly should be warned about. In practice I don't think it's a
problem, because we pretty much always operate on a copy of the page
when writing it out; otherwise concurrently set hint bits would be
troublesome.

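
As an illustration of the "operate on a copy" point, a small sketch of a
checksum helper that never writes to the caller's buffer at all. The
names are hypothetical, DEMO_CHECKSUM_OFFSET = 8 is assumed from the
disassembly above, and the checksum loop is again a toy, not
pg_checksum_block(). It trades an extra BLCKSZ memcpy for a const
input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192
#define DEMO_CHECKSUM_OFFSET 8      /* assumed offset of pd_checksum */

/* Toy byte-wise checksum (assumption: NOT the real algorithm). */
static uint32_t
copy_demo_checksum_block(const char *page, size_t size)
{
    uint32_t sum = 2166136261u;
    size_t   i;

    for (i = 0; i < size; i++)
        sum = (sum ^ (unsigned char) page[i]) * 16777619u;
    return sum;
}

/*
 * Checksum a page without modifying it: copy it to a private buffer,
 * zero the checksum field in the copy, and checksum the copy. Other
 * readers of shared_page never see the field temporarily zeroed.
 */
static uint32_t
copy_demo_checksum_page(const char *shared_page)
{
    char local[BLCKSZ];

    assert(shared_page != NULL);

    memcpy(local, shared_page, BLCKSZ);
    memset(local + DEMO_CHECKSUM_OFFSET, 0, sizeof(uint16_t));
    return copy_demo_checksum_block(local, BLCKSZ);
}
```
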
Greetings,
Andres Freund