Re: Review for GetWALAvailability() - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Review for GetWALAvailability()
Date
Msg-id ee477f3c-9bff-c9d3-25b0-4e21aafc5e5d@oss.nttdata.com
Whole thread Raw
In response to Re: Review for GetWALAvailability()  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Review for GetWALAvailability()
Re: Review for GetWALAvailability()
List pgsql-hackers

On 2020/06/16 14:00, Kyotaro Horiguchi wrote:
> At Tue, 16 Jun 2020 01:46:21 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>> In short, it is known behavior but it was judged as useless to prevent
>>> that.
>>> That can happen when checkpointer removes up to the segment that is
>>> being read by walsender.  I think that that doesn't happen (or
>>> happenswithin a narrow time window?) for physical replication but
>>> happenes for logical replication.
>>> While development, I once added walsender a code to exit for that
>>> reason, but finally it is moved to InvalidateObsoleteReplicationSlots
>>> as a bit defferent function.
>>
>> BTW, I read the code of InvalidateObsoleteReplicationSlots() and
>> probably
>> found some issues in it.
>>
>> 1. Each cycle of the "for" loop in
>> InvalidateObsoleteReplicationSlots()
>>      emits the log message  "terminating walsender ...". This means that
>>      if it takes more than 10ms for walsender to exit after it's signaled,
>>      the second and subsequent cycles would happen and output the same
>>      log message several times. IMO that log message should be output
>>      only once.
> 
> Sounds reasonable.
> 
>> 2. InvalidateObsoleteReplicationSlots() uses the loop to scan
>> replication                            
>>      slots array and uses the "for" loop in each scan. Also it calls
>>      ReplicationSlotAcquire() for each "for" loop cycle, and
>>      ReplicationSlotAcquire() uses another loop to scan replication slots
>>      array. I don't think this is good design.
>>
>>      ISTM that we can get rid of ReplicationSlotAcquire()'s loop because
>>      InvalidateObsoleteReplicationSlots() already know the index of the
>>      slot
>>      that we want to find. The attached patch does that. Thought?
> 
> The inner loop is expected to run at most several times per
> checkpoint, which won't be a serious problem. However, it is better if
> we can get rid of that in a reasonable way.
> 
> The attached patch changes the behavior for SAB_Block. Before the
> patch, it rescans from the first slot for the same name, but with the
> patch it just rechecks the same slot.  The only caller of the function
> with SAB_Block is ReplicationSlotDrop and I don't come up with a case
> where another slot with the same name is created at different place
> before the condition variable fires. But I'm not sure the change is
> completely safe.

Yes, that change might not be safe. So I'm thinking another approach to
fix the issues.

> Maybe some assertion is needed?
> 
>> 3. There is a corner case where the termination of walsender cleans up
>>      the temporary replication slot while
>>      InvalidateObsoleteReplicationSlots()
>>      is sleeping on ConditionVariableTimedSleep(). In this case,
>>      ReplicationSlotAcquire() is called in the subsequent cycle of the
>>      "for"
>>      loop, cannot find the slot and then emits ERROR message. This leads
>>      to the failure of checkpoint by the checkpointer.
> 
> Agreed.
> 
>>      To avoid this case, if SAB_Inquire is specified,
>>      ReplicationSlotAcquire()
>>      should return the special value instead of emitting ERROR even when
>>      it cannot find the slot. Also InvalidateObsoleteReplicationSlots()
>>      should
>>      handle that special returned value.
> 
> I thought the same thing hearing that.

While reading InvalidateObsoleteReplicationSlots() code, I found another issue.

            ereport(LOG,
                    (errmsg("terminating walsender %d because replication slot \"%s\" is too far behind",
                            wspid, NameStr(slotname))));
            (void) kill(wspid, SIGTERM);

wspid indicates the PID of process using the slot. That process can be
a backend, for example, executing pg_replication_slot_advance().
So "walsender" in the above log message is not always correct.



            int            wspid = ReplicationSlotAcquire(NameStr(slotname),
                                                       SAB_Inquire);

Why do we need to call ReplicationSlotAcquire() here and mark the slot as
used by the checkpointer? Isn't it enough to check directly the slot's
active_pid, instead?

Maybe ReplicationSlotAcquire() is necessary because
ReplicationSlotRelease() is called later? If so, why do we need to call
ReplicationSlotRelease()? ISTM that we don't need to do that if the slot's
active_pid is zero. No?

If my understanding is right, I'd like to propose the attached patch.
It introduces DeactivateReplicationSlot() and replace the "for" loop
in InvalidateObsoleteReplicationSlots() with it. ReplicationSlotAcquire()
and ReplicationSlotRelease() are no longer called there.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: snowball ASCII stemmer configuration
Next
From: Tom Lane
Date:
Subject: Re: slow get_actual_variable_range with long running transactions