Re: 040_pg_createsubscriber.pl is slow and unstable (was Re: speed up a logical replica setup) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: 040_pg_createsubscriber.pl is slow and unstable (was Re: speed up a logical replica setup)
Msg-id 2871796.1722284312@sss.pgh.pa.us
In response to 040_pg_createsubscriber.pl is slow and unstable (was Re: speed up a logical replica setup)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Jun 30, 2024 at 2:40 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... However, I added a new open item about how the
>> 040_pg_createsubscriber.pl test is slow and still unstable.

> But that said, I see no commits in the commit history which purport to
> improve performance, so I guess the performance is probably still not
> what you want, though I am not clear on the details.

My concern is described at [1]:

>> I have a different but possibly-related complaint: why is
>> 040_pg_createsubscriber.pl so miserably slow?  On my machine it
>> runs for a bit over 19 seconds, which seems completely out of line
>> (for comparison, 010_pg_basebackup.pl takes 6 seconds, and the
>> other test scripts in this directory take much less).  It looks
>> like most of the blame falls on this step:
>>
>> [12:47:22.292](14.534s) ok 28 - run pg_createsubscriber on node S
>>
>> AFAICS the amount of data being replicated is completely trivial,
>> so that it doesn't make any sense for this to take so long --- and
>> if it does, that suggests that this tool will be impossibly slow
>> for production use.  But I suspect there is a logic flaw causing
>> this.  Speculating wildly, perhaps that is related to the failure
>> Alexander spotted?

The followup discussion in that thread made it sound like there's
some fairly fundamental deficiency in how wait_for_end_recovery()
detects end-of-recovery.  I'm not too conversant with the details
though, and it's possible that pg_createsubscriber is just falling
foul of a pre-existing infelicity.

If the problem can be correctly described as "pg_createsubscriber
takes 10 seconds or so to detect end-of-stream", then it's probably
only an annoyance for testing and not something that would be fatal
in the real world.  I'm not quite sure if that's accurate, though.
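
(Illustration, not part of the original message.) The distinction drawn
above, between a wait that is merely slow to notice completion and one that
is logically broken, can be made concrete with a minimal sketch of a
polling wait. This is not pg_createsubscriber's code: wait_until,
FakeClock, detection_delay_ms, and all intervals are hypothetical names and
values chosen for the example, and a simulated clock stands in for a real
server. It shows only that a polling loop notices a condition up to one
poll interval after it becomes true.

```python
def wait_until(predicate, poll_ms, timeout_ms, clock, sleep):
    """Poll predicate() until it is true or timeout_ms has elapsed.

    Returns the elapsed milliseconds at detection, or None on timeout.
    Hypothetical helper: it models a generic polling wait only.
    """
    start = clock()
    while True:
        if predicate():
            return clock() - start
        if clock() - start >= timeout_ms:
            return None
        sleep(poll_ms)


class FakeClock:
    """Simulated clock in integer milliseconds, so the sketch runs instantly."""

    def __init__(self):
        self.now_ms = 0

    def time(self):
        return self.now_ms

    def sleep(self, ms):
        self.now_ms += ms


def detection_delay_ms(ready_at_ms, poll_ms, timeout_ms=60_000):
    """How long a poller takes to notice a condition that becomes true
    at ready_at_ms (an assumed stand-in for "recovery has ended")."""
    clock = FakeClock()
    return wait_until(
        lambda: clock.time() >= ready_at_ms,
        poll_ms,
        timeout_ms,
        clock.time,
        clock.sleep,
    )


if __name__ == "__main__":
    for poll_ms in (1000, 100):
        print(poll_ms, detection_delay_ms(ready_at_ms=200, poll_ms=poll_ms))
```

In this model a condition that becomes true at 200 ms is noticed after
1000 ms with a 1-second poll and after 200 ms with a 100 ms poll. The sketch
does not say which situation applies to the 14.5-second step quoted above;
it only shows the kind of latency that polling granularity alone can
introduce.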

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/2377319.1719766794%40sss.pgh.pa.us#bba9f5ee0efc73151cc521a6bd5182ed


