Re: [BUG]Update Toast data failure in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [BUG]Update Toast data failure in logical replication
Msg-id CAA4eK1+VGApXZ5sEyn-3O7nos+Jx_cGAUbukU=khyDZCreM9MA@mail.gmail.com
In response to RE: [BUG]Update Toast data failure in logical replication  ("tanghy.fnst@fujitsu.com" <tanghy.fnst@fujitsu.com>)
List pgsql-hackers
On Wed, Feb 9, 2022 at 11:08 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Tue, Feb 8, 2022 3:18 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote:
> > > Right, and it is getting changed. We are just printing the first 200
> > > characters (by using SQL [1]) from the decoded tuple so what is shown
> > > in the results is the initial 200 bytes.
> >
> > Ah, I knew I must have been missing something.
> >
> >
> > > The complete decoded data after the patch is as follows:
> >
> > Hm. I think we should change the way the strings are shortened - otherwise we
> > don't really verify much in that test. Perhaps we could just replace the long
> > repetitive strings with something shorter in the output?
> >
> > E.g. using something like regexp_replace(data,
> > '(1234567890|9876543210){200}', '\1{200}','g')
> > inside the substr().
> >
> > Wonder if we should deduplicate the number of different toasted strings in the
> > file to something that'd allow us to have a single "redact_toast" function or
> > such. There are too many different ones to have a reasonably simple redaction
> > function right now. But that's perhaps better done separately.
> >
>
> I tried to make the output shorter using your suggestion, with the following SQL;
> please see the attached patch, which is based on the v8 patch [1].
>
> SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}','g'), 1, 200) FROM
> pg_logical_slot_get_changes('regression_slot',NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>
> Note that some strings are still longer than 200 characters even after being
> shortened, so they can't be shown entirely.
>
> e.g.
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1
> toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[te
>
> The entire string is:
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1
> toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}'
>
> Maybe it's better to change the substr length to 250 to show the entire string, or we
> can do it as a separate HEAD-only improvement where we can deduplicate some of the
> other long strings as well. Thoughts?
>

I think it is better to do this as a separate HEAD-only improvement, as
it can affect other tests' results. We can also try to deduplicate some
of the other long strings used in the toast.sql file along with it.
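
For illustration only (this is not part of the patch), the length
arithmetic behind the 200-versus-250 question can be sketched in Python.
The sample line is the one quoted above; the helper name redact_toast is
hypothetical, and the sketch assumes each toasted value is the 10-digit
block repeated exactly 200 times, as the {200} quantifier implies.

```python
import re

# Decoded change written out in full, before redaction. Assumption: each
# toasted value is the 10-digit block repeated 200 times.
raw = (
    "table public.toasted_key: UPDATE: old-key: "
    "toasted_key[text]:'" + "1234567890" * 200 + "' "
    "new-tuple: id[integer]:1 "
    "toasted_key[text]:unchanged-toast-datum "
    "toasted_col1[text]:unchanged-toast-datum "
    "toasted_col2[text]:'" + "9876543210" * 200 + "'"
)


def redact_toast(data: str) -> str:
    """Hypothetical helper mirroring the quoted regexp_replace() call."""
    # Collapse each run of 200 repeated blocks into "<block>{200}".
    return re.sub(r"(1234567890|9876543210){200}", r"\1{200}", data)


redacted = redact_toast(raw)

print(len(redacted))    # length of the redacted line
print(redacted[:200])   # what substr(..., 1, 200) would keep
print(redacted[:250])   # what substr(..., 1, 250) would keep
```

Under these assumptions the redacted line is 221 characters long, so a
200-character substr cuts it off inside the last column name, while 250
would show it whole.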

-- 
With Regards,
Amit Kapila.


