Re: Gather performance analysis - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Gather performance analysis
Date
Msg-id CAFiTN-tohveb8YhfR=bs1=8=bcMWABuGr8ZHWXFC9FsyFRv1Kw@mail.gmail.com
In response to Re: Gather performance analysis  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Gather performance analysis  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Sep 26, 2021 at 11:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sat, Sep 25, 2021 at 2:18 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 9/24/21 7:08 PM, Robert Haas wrote:
> > > On Fri, Sep 24, 2021 at 3:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >> Tomas, can you share your test script, I would like to repeat the same
> > >> test in my environment and with different batching sizes.
>
> For now I have tested for 1M and 10M rows, shared buffers=16GB, for
> now tested with default batching 1/4th of the queue size and I can see
> the performance gain is huge. Time taken with the patch is in the
> range of 37-90% compared to the master.  Please refer to the attached
> file for more detailed results.  I could not see any regression that
> Tomas saw, still I am planning to repeat it with different batch
> sizes.

I have now tested with different batch sizes: 16k (which is the same
as 1/4 of the queue size with a 64k queue), 8k, 4k, and 2k.
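
For reference, the tested batch sizes can be written as fractions of the
queue. This is only arithmetic, not code from the patch; it assumes "64k"
means 65536 bytes, and the names QUEUE_SIZE and batch_fraction are
made up for illustration.

```python
from fractions import Fraction

# Assumption: "64k queue size" means 64 * 1024 = 65536 bytes.
QUEUE_SIZE = 64 * 1024


def batch_fraction(batch_bytes, queue_bytes=QUEUE_SIZE):
    """Return the batch size as an exact fraction of the queue size."""
    return Fraction(batch_bytes, queue_bytes)


# The four batch sizes mentioned in this mail, in bytes.
batch_sizes = [16 * 1024, 8 * 1024, 4 * 1024, 2 * 1024]
fractions = {b: batch_fraction(b) for b in batch_sizes}

for b, f in fractions.items():
    print(f"{b // 1024}k batch = {f} of the queue")
```

So the 16k case and the "1/4 of the queue size" default are the same
configuration only while the queue stays at 64k.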

In the attached sheet I have compared:
1. head vs patch (1/4 queue size): execution time is reduced to 37% to
90% of head; this is the same as in the old sheet.
2. patch (1/4 queue size) vs patch (8k batch): mostly the same, but the
8k batch size is slower in some cases.
3. patch (1/4 queue size) vs patch (4k batch): mostly the same, but the
4k batch size is slower in some cases (even slower than the 8k batch
size).
4. patch (1/4 queue size) vs patch (2k batch): the 2k batch size is
significantly slower.
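
To make the percentages in item 1 unambiguous: they are the patched
execution time expressed relative to head, so lower is better. A minimal
sketch follows; the timings in it are invented to hit the two ends of the
reported range and are not taken from the attached sheet, and
relative_time_pct is a hypothetical helper name.

```python
def relative_time_pct(head_ms, patch_ms):
    """Patched execution time as a percentage of the head execution time."""
    if head_ms <= 0:
        raise ValueError("head_ms must be positive")
    return 100.0 * patch_ms / head_ms


# Hypothetical timings (milliseconds), NOT measured results.
hypothetical_runs = [
    ("best case", 1000.0, 370.0),
    ("worst case", 1000.0, 900.0),
]

for label, head_ms, patch_ms in hypothetical_runs:
    pct = relative_time_pct(head_ms, patch_ms)
    print(f"{label}: patch takes {pct:.0f}% of head time")
```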

With these results, 1/4 of the queue size seems to be the winner, and I
think we might go for that value. However, someone might argue that a 4k
batch size is optimal, because it is only marginally slower and with it
we would have to worry less about increased latency in some worst
case.
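
The tradeoff between batch size and latency can be illustrated with a toy
model. It assumes the sender holds written bytes back and only makes them
visible to the receiver once the pending amount reaches the batch size;
that is a simplification for illustration, not the patch's actual code.
It also ignores the receiver and the queue filling up, and the tuple
count and tuple width are arbitrary.

```python
def simulate_sender(num_tuples, tuple_bytes, batch_bytes):
    """Toy model: count flushes and the largest amount of held-back data.

    Returns (flushes, max_pending_bytes). A "flush" stands for making the
    pending bytes visible to the receiver.
    """
    pending = 0
    flushes = 0
    max_pending = 0
    for _ in range(num_tuples):
        pending += tuple_bytes
        max_pending = max(max_pending, pending)
        if pending >= batch_bytes:
            flushes += 1
            pending = 0
    if pending > 0:
        flushes += 1  # final flush for whatever is left over
    return flushes, max_pending


# Arbitrary workload: 1024 tuples of 64 bytes each.
for batch in (16 * 1024, 8 * 1024, 4 * 1024, 2 * 1024):
    flushes, max_pending = simulate_sender(1024, 64, batch)
    print(f"{batch // 1024}k batch: {flushes} flushes, "
          f"up to {max_pending} bytes held back")
```

In this model a larger batch means fewer flushes but more data held back
before the receiver can see it, which is the latency concern above.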

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
