On Wed, Feb 9, 2022 at 11:08 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Tue, Feb 8, 2022 3:18 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2022-02-07 08:44:00 +0530, Amit Kapila wrote:
> > > Right, and it is getting changed. We are just printing the first 200
> > > characters (by using SQL [1]) from the decoded tuple so what is shown
> > > in the results is the initial 200 bytes.
> >
> > Ah, I knew I must have been missing something.
> >
> >
> > > The complete decoded data after the patch is as follows:
> >
> > Hm. I think we should change the way the strings are shortened - otherwise we
> > don't really verify much in that test. Perhaps we could just replace the long
> > repetitive strings with something shorter in the output?
> >
> > E.g. using something like regexp_replace(data,
> > '(1234567890|9876543210){200}', '\1{200}','g')
> > inside the substr().
> >
> > Wonder if we should deduplicate the number of different toasted strings in the
> > file to something that'd allow us to have a single "redact_toast" function or
> > such. There are too many different ones to have a reasonably simple redaction
> > function right now. But that's perhaps better done separately.
> >
>
> I tried to make the output shorter using your suggestion, with SQL like the
> following. Please see the attached patch, which is based on the v8 patch [1].
>
> SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}', '\1{200}','g'), 1, 200) FROM
> pg_logical_slot_get_changes('regression_slot',NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
>
> Note that some strings are still longer than 200 characters even after being
> shortened, so they can't be shown in full.
>
> e.g.
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[te
>
> The entire string is:
> table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}' new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}'
>
> Maybe it's better to change the substr length to 250 to show the entire string, or we
> can do it as a separate HEAD-only improvement where we can deduplicate some of the
> other long strings as well. Thoughts?
>
I think it is better to do this as a separate HEAD-only improvement, as
it can affect other tests' results. We can also try to deduplicate some
of the other long strings used in the toast.sql file along with it.
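
For anyone who wants to try the redaction without setting up a slot, here is
a minimal sketch. It assumes a literal built with repeat() as a stand-in for
one decoded change; the "sample" CTE and its contents are hypothetical and
are not taken from the test output.

```sql
-- Hypothetical stand-in for one decoded change. The real test reads its
-- rows from pg_logical_slot_get_changes() rather than from a literal.
WITH sample(data) AS (
    SELECT 'toasted_col2[text]:''' || repeat('9876543210', 200) || ''''
)
SELECT length(data) AS raw_length,
       -- Collapse the 200-fold repetition to the pattern followed by a
       -- literal "{200}", then apply the same 200-character cutoff as
       -- in the proposed test query.
       substr(regexp_replace(data,
                             '(1234567890|9876543210){200}',
                             '\1{200}', 'g'),
              1, 200) AS redacted
FROM sample;
```

With this input the redacted column should come out as
toasted_col2[text]:'9876543210{200}', which is well under the 200-character
cutoff, whereas rows with several columns can still exceed it as noted above.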
--
With Regards,
Amit Kapila.