Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash
Date
Msg-id afa12c6b-8c0f-a4cf-7a2c-50c73285965f@amazon.com
Whole thread Raw
In response to Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On 8/12/21 1:00 PM, Amit Kapila wrote:
> On Tue, Aug 10, 2021 at 5:30 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>> Hi Amit,
>>
>> On 8/9/21 1:12 PM, Amit Kapila wrote:
>>> On Mon, Aug 9, 2021 at 3:37 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>>>> Hi Amit,
>>>>
>>>> On 8/9/21 10:37 AM, Amit Kapila wrote:
>>>>> On Fri, Jul 9, 2021 at 12:22 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>>>>>> Please find enclosed a patch proposal to:
>>>>>>
>>>>>> * Avoid the failed assertion on current master and generate the error message instead (should the code reach
thatstage).
 
>>>>>> * Reset the toast_hash in case of relation rewrite with toast (so that the logical decoding in the above repro
isworking).
 
>>>>>>
>>>>> I think instead of resetting toast_hash for this case why don't we set
>>>>> 'relrewrite' for toast tables as well during rewrite? If we do that
>>>>> then we will simply skip assembling toast chunks for the toast table.
>>>> Thanks for looking at it!
>>>>
>>>> I do agree, that would be even better than the current patch approach:
>>>> I'll work on it.
>>>>
>>>>> In make_new_heap(), we are calling NewHeapCreateToastTable() to create
>>>>> toast table where we can pass additional information (probably
>>>>> 'toastid'), if required to set 'relrewrite'. Additionally, let's add a
>>>>> test case if possible for this.
>>>> + 1 for the test case, it will be added in the next version of the patch.
>>>>
>>> Thanks, please see, if you can prepare patches for the back-branches as well.
>> Please find attached the new version that:
>>
>> - sets "relwrewrite" for the toast.
>>
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -3861,6 +3861,10 @@ RenameRelationInternal(Oid myrelid, const char
> *newrelname, bool is_internal, bo
>    */
>    namestrcpy(&(relform->relname), newrelname);
>
> + /* reset relrewrite for toast */
> + if (relform->relkind == RELKIND_TOASTVALUE)
> + relform->relrewrite = InvalidOid;
> +
>
> I find this change quite ad-hoc. I think this API is quite generic to
> make such a change. I see two ways for this (a) pass a new bool flag
> (reset_toast_rewrite) in this API and then make this change, (b) in
> the specific place where we need this, change relrewrite separately
> via a new API.
>
> I would prefer (b) in the ideal case, but I understand it would be an
> additional cost, so maybe (a) is also okay. What do you people think?

I would prefer a) because:

- b) would need to update the exact same tuple one more time (means 
doing more or less the same work: open relation, search for the tuple, 
update the tuple...)

- a) would still give the ability for someone reading the code to 
understand where the relrewrite reset is needed (as opposed to the way 
the patch is currently written)

- finish_heap_swap() with swap_toast_by_content set to false, is the 
only place where we currently need to reset explicitly relrewrite (so 
that we would have the new API produced by b) being called only at that 
place).

- That means that b) would be only for code readability but at the price 
of extra cost.


That said, I think we can go with a) and rethink about b) later on if 
there is a need of this new API in other places for other reasons.

Thanks

Bertrand




pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Next Steps with Hash Indexes
Next
From: Greg Nancarrow
Date:
Subject: Re: Skipping logical replication transactions on subscriber side