Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: Minimal logical decoding on standbys |
Date | |
Msg-id | 1f2ad9fd-53ff-74cc-a602-72d45423bcab@gmail.com Whole thread Raw |
In response to | Re: Minimal logical decoding on standbys (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Minimal logical decoding on standbys
("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
|
List | pgsql-hackers |
Hi, On 12/20/22 10:41 PM, Robert Haas wrote: > On Tue, Dec 20, 2022 at 3:39 PM Robert Haas <robertmhaas@gmail.com> wrote: >> I think this might be the only WAL record type where there's a >> problem, but I haven't fully confirmed that yet. > > It's not. GIST has the same issue. The same test case demonstrates the > problem there, if you substitute this test script for > kpt_hash_setup.sql and possibly also run it for somewhat longer. One > might think that this wouldn't be a problem, because the comments for > gistxlogDelete say this: > > /* > * In payload of blk 0 : todelete OffsetNumbers > */ > > But it's not in the payload of blk 0. It follows the main payload. Oh right, nice catch! Indeed, we can see in gistRedoDeleteRecord(): " todelete = (OffsetNumber *) ((char *) xldata + SizeOfGistxlogDelete); " > > This is the reverse of xl_heap_freeze_page, which claims that freeze > plans and offset numbers follow, but they don't: they're in the data > for block 0. oh right, we can see in heap_xlog_freeze_page(): " plans = (xl_heap_freeze_plan *) XLogRecGetBlockData(record, 0, NULL); offsets = (OffsetNumber *) ((char *) plans + (xlrec->nplans * sizeof(xl_heap_freeze_plan))); " > xl_btree_delete is also wrong, claiming that the data > follows when it's really attached to block 0. oh right, we can see in btree_xlog_delete(): " char *ptr = XLogRecGetBlockData(record, 0, NULL); page = (Page) BufferGetPage(buffer); if (xlrec->nupdated > 0) { OffsetNumber *updatedoffsets; xl_btree_update *updates; updatedoffsets = (OffsetNumber *) (ptr + xlrec->ndeleted * sizeof(OffsetNumber)); updates = (xl_btree_update *) ((char *) updatedoffsets + xlrec->nupdated * sizeof(OffsetNumber)); " > I guess whatever else we > do here, we should fix the comments. > Agree, please find attached a patch proposal doing so. > Bottom line is that I think the two cases that have alignment issues > as coded are xl_hash_vacuum_one_page and gistxlogDelete. Everything > else is OK, as far as I can tell right now. > Thanks a lot for the repro(s) and explanations! That's very useful/helpful. Based on your discovery about the wrong comments above, I'm now tempted to fix those 2 alignment issues by using a FLEXIBLE_ARRAY_MEMBER within those structs (as you proposed in [1]) (as that should also prevent any possible wrong comments about where the array is located). What do you think? [1]: https://www.postgresql.org/message-id/CA%2BTgmoaVcu_mbxbH%3DEccvKG6u8%2BMdQf9zx98uAL9zsStFwrYsQ%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: