Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Msg-id 20191022172210.26bdiv44vwvunrh3@development
In response to Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Oct 22, 2019 at 11:01:48AM +0530, Dilip Kumar wrote:
>On Tue, Oct 22, 2019 at 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Oct 3, 2019 at 1:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> >
>> > I have attempted to test the performance of (Stream + Spill) vs
>> > (Stream + BGW pool), and I can see a gain similar to what Alexey had
>> > shown[1].
>> >
>> > In addition to this, I have rebased the latest patchset [2] without
>> > the two-phase logical decoding patch set.
>> >
>> > Test results:
>> > I have repeated the same test as Alexey[1] for 1kk and 3kk data, and
>> > here are my results:
>> >
>> > Stream + Spill
>> > N      Time on master (sec)   Total xact time (sec)
>> > 1kk     6                     21
>> > 3kk    18                     55
>> >
>> > Stream + BGW pool
>> > N      Time on master (sec)   Total xact time (sec)
>> > 1kk     6                     13
>> > 3kk    19                     35
>> >
>>
>> I think the test results for the master are missing.
>
>Yeah, at that time I was planning to compare spill vs bgworker.
>
>> Also, how about
>> running these tests over a network (meaning master and subscriber are
>> not on the same machine)?
>
>Yeah, we should do that; it will show the merit of streaming
>in-progress transactions.
>

While I agree it's an interesting feature, I think we need to stop
adding more stuff to this patch series. It's already complex enough, so
piling on even more (unnecessary) stuff is a distraction and will make
it harder to get anything committed. Typical "scope creep".

I think the current behavior (spill to file) is sufficient for v0 and
can be improved later - that's fine. I don't think we need to bother
with comparisons to master very much, because while it might be a bit
slower in some cases, you can always disable streaming (so if there's a
regression for your workload, you can undo that).

>> In general, yours and Alexey's test results
>> show that there is merit in having workers apply such transactions.
>> OTOH, as noted above [1], we are also worried about the performance
>> of Rollbacks if we follow that approach.  I am not sure how much we
>> need to worry about Rollbacks if commits are faster, but can we think
>> of recording the changes in memory and only writing to a file if the
>> changes are above a certain threshold?  I think that might help save
>> I/O in many cases.  If we do that, I am not sure how much
>> additional workers can help, but they might still help.  I think we
>> need to do some tests and experiments to figure out what the best
>> approach is.  What do you think?
>I agree with the point.  I think we might need to make some small
>changes and run tests to see what the best method could be to handle the
>streamed changes at the subscriber end.
>
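
As an aside, the "keep changes in memory, write to a file only above a
threshold" idea quoted above can be sketched roughly like this. It is only
an illustration in Python, not what the patch does: the SpillBuffer class,
the threshold_bytes parameter and the length-prefixed file format are all
made up for the example.

```python
import struct
import tempfile


class SpillBuffer:
    """Hold changes in memory; spill to a temp file past a byte threshold.

    Hypothetical sketch: names and record format are assumptions,
    not taken from the patch.
    """

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # hypothetical tunable
        self.in_memory = []      # changes not yet written to disk
        self.memory_bytes = 0
        self.spill_file = None   # created lazily on first spill

    @property
    def spilled(self):
        return self.spill_file is not None

    def add(self, change: bytes):
        self.in_memory.append(change)
        self.memory_bytes += len(change)
        if self.memory_bytes > self.threshold_bytes:
            self._spill()

    def _spill(self):
        if self.spill_file is None:
            self.spill_file = tempfile.TemporaryFile()
        for change in self.in_memory:
            # Length-prefixed records so they can be read back in order.
            self.spill_file.write(struct.pack("<I", len(change)))
            self.spill_file.write(change)
        self.in_memory.clear()
        self.memory_bytes = 0

    def changes(self):
        """Yield all changes in arrival order (spilled ones first)."""
        if self.spill_file is not None:
            self.spill_file.seek(0)
            while True:
                header = self.spill_file.read(4)
                if not header:
                    break
                (length,) = struct.unpack("<I", header)
                yield self.spill_file.read(length)
            self.spill_file.seek(0, 2)  # back to the end for later appends
        yield from self.in_memory

    def discard(self):
        """Rollback: drop everything; no file I/O if nothing was spilled."""
        self.in_memory.clear()
        self.memory_bytes = 0
        if self.spill_file is not None:
            self.spill_file.close()
            self.spill_file = None
```

With something like this, a small transaction that aborts never touches
the disk, while a large one behaves much like the spill-to-file case.
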
>>
>> Tomas, Alexey, do you have any thoughts on this matter?  I think it is
>> important that we figure out the way to proceed in this patch.
>>
>> [1] - https://www.postgresql.org/message-id/b25ce80e-f536-78c8-d5c8-a5df3e230785%40postgrespro.ru
>>
>

I think the patch should do the simplest thing possible, i.e. what it
does today. Otherwise we'll never get it committed.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


