Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Minimal logical decoding on standbys
Date
Msg-id 1f2ad9fd-53ff-74cc-a602-72d45423bcab@gmail.com
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Minimal logical decoding on standbys  ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
Hi,

On 12/20/22 10:41 PM, Robert Haas wrote:
> On Tue, Dec 20, 2022 at 3:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> I think this might be the only WAL record type where there's a
>> problem, but I haven't fully confirmed that yet.
> 
> It's not. GIST has the same issue. The same test case demonstrates the
> problem there, if you substitute this test script for
> kpt_hash_setup.sql and possibly also run it for somewhat longer. One
> might think that this wouldn't be a problem, because the comments for
> gistxlogDelete say this:
> 
>       /*
>        * In payload of blk 0 : todelete OffsetNumbers
>        */
> 
> But it's not in the payload of blk 0. It follows the main payload.

Oh right, nice catch!

Indeed, we can see in gistRedoDeleteRecord():

"
todelete = (OffsetNumber *) ((char *) xldata + SizeOfGistxlogDelete);
"

> 
> This is the reverse of xl_heap_freeze_page, which claims that freeze
> plans and offset numbers follow, but they don't: they're in the data
> for block 0. 

oh right, we can see in heap_xlog_freeze_page():

"
        plans = (xl_heap_freeze_plan *) XLogRecGetBlockData(record, 0, NULL);
        offsets = (OffsetNumber *) ((char *) plans +
                                    (xlrec->nplans *
                                     sizeof(xl_heap_freeze_plan)));
"


> xl_btree_delete is also wrong, claiming that the data
> follows when it's really attached to block 0. 


oh right, we can see in btree_xlog_delete():

"
        char       *ptr = XLogRecGetBlockData(record, 0, NULL);

        page = (Page) BufferGetPage(buffer);

        if (xlrec->nupdated > 0)
        {
            OffsetNumber *updatedoffsets;
            xl_btree_update *updates;

            updatedoffsets = (OffsetNumber *)
                (ptr + xlrec->ndeleted * sizeof(OffsetNumber));
            updates = (xl_btree_update *) ((char *) updatedoffsets +
                                           xlrec->nupdated *
                                           sizeof(OffsetNumber));
"



> I guess whatever else we
> do here, we should fix the comments.
> 

Agree, please find attached a patch proposal doing so.


> Bottom line is that I think the two cases that have alignment issues
> as coded are xl_hash_vacuum_one_page and gistxlogDelete. Everything
> else is OK, as far as I can tell right now.
> 

Thanks a lot for the repro(s) and explanations! That's very useful/helpful.

Based on your discovery about the wrong comments above, I'm now tempted to fix those 2 alignment issues
by using a FLEXIBLE_ARRAY_MEMBER within those structs (as you proposed in [1]) (as that should also prevent
any possible wrong comments about where the array is located).

What do you think?

[1]: https://www.postgresql.org/message-id/CA%2BTgmoaVcu_mbxbH%3DEccvKG6u8%2BMdQf9zx98uAL9zsStFwrYsQ%40mail.gmail.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: pgsql: Doc: Explain about Column List feature.
Next
From: Dag Lem
Date:
Subject: Re: daitch_mokotoff module