Re: [HACKERS] some review comments on logical rep code - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: [HACKERS] some review comments on logical rep code
Date
Msg-id CAHGQGwGJrPO++XM4mFENAwpy1eGXKsGdguYv43GUgLgU-x8nTQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] some review comments on logical rep code  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: [HACKERS] some review comments on logical rep code  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
List pgsql-hackers
On Thu, Apr 27, 2017 at 1:28 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 3:47 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> At Wed, 26 Apr 2017 14:31:12 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoDMy8a6UwMrRh8pigQbDC+JAOQ4m_tXT41VRP4SYp23=w@mail.gmail.com>
>>> On Wed, Apr 26, 2017 at 12:35 PM, Petr Jelinek
>>> <petr.jelinek@2ndquadrant.com> wrote:
>>> > On 26/04/17 01:01, Fujii Masao wrote:
>>> >>>> However this is overkill for small gain and false wakeup of the
>>> >>>> launcher is not so harmful (probably we can live with that), so
>>> >>>> we do nothing here for this issue.
>>> >>>
>>> >>> I agree this as a whole. But I think that the main issue here is
>>> >>> not false wakeups, but 'possible delay of launching new workers
>>> >>> by 3 minutes at most' (this is centainly a kind of false wakeups,
>>> >>> though). We can live with this failure when using two-paase
>>> >>> commmit, but I think it shouldn't happen silently.
>>> >>>
>>> >>>
>>> >>> How about providing AtPrepare_ApplyLauncher(void) like the
>>> >>> follows and calling it in PrepareTransaction?
>>> >>
>>> >> Or we should apply the attached patch and handle the 2PC case properly?
>>> >> I was thinking that it's overkill more than necessary, but that seems not true
>>> >> as far as I implement that.
>>> >>
>>> > Looks like it does not even increase size of the 2pc file, +1 for this.
>>>
>>> In my honest opinion, I didn't have a big will that we should handle
>>> even two-phase commit case, because this case is very rare (I could
>>> not image such case) and doesn't mean to lead a harmful result such as
>>> crash of server and returning inconsistent result. it just delays the
>>> launching worker for at most 3 minutes. We also can deal with this for
>>> example by making maximum nap time of apply launcher user-configurable
>>> and document it.
>>> But if we can deal with it by minimum changes like attached your patch I agree.
>>
>> This change looks reasonable to me, +1 from me to this.
>>
>> The patch reads on_commit_launcher_wakeup directly then updates
>> it via ApplyALuncherWakeupAtCommit() but it's too much to add a
>> function for the sake of this.
>
> OK, so what about the attached patch? I replaced all the calls to
> ApplyLauncherWakeupAtCommit() with the code "on_commit_launcher_wakeup = true".

BTW, while I was reading the code to implement the patch that
I posted upthread, I found that the following condition doesn't
work as expected. This is because "last_start_time" is always 0.

        /* Limit the start retry to once a wal_retrieve_retry_interval */
        if (TimestampDifferenceExceeds(last_start_time, now,
          wal_retrieve_retry_interval))

"last_start_time" is set to "now" when the launcher starts up new
worker. But "last_start_time" is defined and always initialized with 0
at the beginning of the main loop in ApplyLauncherMain(), so
the above condition becomes always true. This is obviously a bug.
Attached patch fixes this issue.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Dropping a partitioned table takes too long
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] scram and \password