Thread: Re: Delay the variable initialization in get_rel_sync_entry

Re: Delay the variable initialization in get_rel_sync_entry

From
"Euler Taveira"
Date:
On Wed, Dec 22, 2021, at 10:11 AM, houzj.fnst@fujitsu.com wrote:
When reviewing some logical replication patches. I noticed that in
function get_rel_sync_entry() we always invoke get_rel_relispartition() 
and get_rel_relkind() at the beginning which could cause unnecessary
cache access.

---
get_rel_sync_entry(PGOutputData *data, Oid relid)
{
RelationSyncEntry *entry;
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
---

The extra cost could sometimes be noticeable because get_rel_sync_entry is a
hot function which is executed for each change. And the 'am_partition' and
'relkind' are necessary only when we need to rebuild the RelationSyncEntry.

Here is the perf result for the case when inserted large amounts of data into a
un-published table in which case the cost is noticeable.

--12.83%--pgoutput_change
    |--11.84%--get_rel_sync_entry
    |--4.76%--get_rel_relispartition
|--4.70%--get_rel_relkind
Good catch. WFM. Deferring variable initialization close to its first use is
good practice.


--
Euler Taveira

Re: Delay the variable initialization in get_rel_sync_entry

From
Michael Paquier
Date:
On Thu, Dec 23, 2021 at 12:54:41PM -0300, Euler Taveira wrote:
> On Wed, Dec 22, 2021, at 10:11 AM, houzj.fnst@fujitsu.com wrote:
>> The extra cost could sometimes be noticeable because get_rel_sync_entry is a
>> hot function which is executed for each change. And the 'am_partition' and
>> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry.
>>
>> Here is the perf result for the case when inserted large amounts of data into a
>> un-published table in which case the cost is noticeable.
>>
>> --12.83%--pgoutput_change
>>     |--11.84%--get_rel_sync_entry
>>     |--4.76%--get_rel_relispartition
>> |--4.70%--get_rel_relkind

How does the perf balance change once you apply the patch?  Do we have
anything else that stands out?  Getting rid of this bottleneck is fine
by itself, but I am wondering if there are more things to worry about
or not.

> Good catch. WFM. Deferring variable initialization close to its first use is
> good practice.

Yeah, it is usually a good practice to have the declaration within
the code block that uses it rather than piling everything at the
beginning of the function.  Being able to see that in profiles is
annoying, and the change is simple, so I'd like to backpatch it.

This is a period of vacations for a lot of people, so I'll wait until
the beginning-ish of January before doing anything.
--
Michael

Attachment

RE: Delay the variable initialization in get_rel_sync_entry

From
"houzj.fnst@fujitsu.com"
Date:
On Friday, December 24, 2021 8:13 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Dec 23, 2021 at 12:54:41PM -0300, Euler Taveira wrote:
> > On Wed, Dec 22, 2021, at 10:11 AM, houzj.fnst@fujitsu.com wrote:
> >> The extra cost could sometimes be noticeable because get_rel_sync_entry is
> a
> >> hot function which is executed for each change. And the 'am_partition' and
> >> 'relkind' are necessary only when we need to rebuild the RelationSyncEntry.
> >>
> >> Here is the perf result for the case when inserted large amounts of data into
> a
> >> un-published table in which case the cost is noticeable.
> >>
> >> --12.83%--pgoutput_change
> >>     |--11.84%--get_rel_sync_entry
> >>     |--4.76%--get_rel_relispartition
> >> |--4.70%--get_rel_relkind
>
> How does the perf balance change once you apply the patch?  Do we have
> anything else that stands out?  Getting rid of this bottleneck is fine
> by itself, but I am wondering if there are more things to worry about
> or not.

Thanks for the response.
Here is the perf result of pgoutput_change after applying the patch.
I didn't notice something else that stand out.

        |--2.99%--pgoutput_change
            |--1.80%--get_rel_sync_entry
                    |--1.56%--hash_search

Also attach complete profiles.

> > Good catch. WFM. Deferring variable initialization close to its first use is
> > good practice.
>
> Yeah, it is usually a good practice to have the declaration within
> the code block that uses it rather than piling everything at the
> beginning of the function.  Being able to see that in profiles is
> annoying, and the change is simple, so I'd like to backpatch it.

+1

> This is a period of vacations for a lot of people, so I'll wait until
> the beginning-ish of January before doing anything.

Thanks, added it to CF.
https://commitfest.postgresql.org/36/3471/

Best regards,
Hou zj

Attachment

Re: Delay the variable initialization in get_rel_sync_entry

From
Michael Paquier
Date:
On Fri, Dec 24, 2021 at 01:27:26PM +0000, houzj.fnst@fujitsu.com wrote:
> Here is the perf result of pgoutput_change after applying the patch.
> I didn't notice something else that stand out.
>
>         |--2.99%--pgoutput_change
>             |--1.80%--get_rel_sync_entry
>                     |--1.56%--hash_search
>
> Also attach complete profiles.

Thanks.  I have also done my own set of measurements, and the
difference is noticeable in the profiles I looked at.  So, applied
down to 13.
--
Michael

Attachment

RE: Delay the variable initialization in get_rel_sync_entry

From
"houzj.fnst@fujitsu.com"
Date:
On Wednesday, January 5, 2022 9:31 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Dec 24, 2021 at 01:27:26PM +0000, houzj.fnst@fujitsu.com wrote:
> > Here is the perf result of pgoutput_change after applying the patch.
> > I didn't notice something else that stand out.
> >
> >         |--2.99%--pgoutput_change
> >             |--1.80%--get_rel_sync_entry
> >                     |--1.56%--hash_search
> >
> > Also attach complete profiles.
>
> Thanks.  I have also done my own set of measurements, and the difference is
> noticeable in the profiles I looked at.  So, applied down to 13.
Thanks for pushing!

Best regards,
Hou zj



Re: Delay the variable initialization in get_rel_sync_entry

From
Amit Langote
Date:
On Wed, Jan 5, 2022 at 10:31 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Dec 24, 2021 at 01:27:26PM +0000, houzj.fnst@fujitsu.com wrote:
> > Here is the perf result of pgoutput_change after applying the patch.
> > I didn't notice something else that stand out.
> >
> >         |--2.99%--pgoutput_change
> >             |--1.80%--get_rel_sync_entry
> >                     |--1.56%--hash_search
> >
> > Also attach complete profiles.
>
> Thanks.  I have also done my own set of measurements, and the
> difference is noticeable in the profiles I looked at.  So, applied
> down to 13.

Thanks.  I agree the variables were being defined in the wrong place
before this patch.

-- 
Amit Langote
EDB: http://www.enterprisedb.com