Re: Review for GetWALAvailability() - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Review for GetWALAvailability()
Date
Msg-id 20200617.100121.2112248049559508701.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Review for GetWALAvailability()  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
At Wed, 17 Jun 2020 00:46:38 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> >> 2. InvalidateObsoleteReplicationSlots() uses the loop to scan
> >> replication                            
> >>      slots array and uses the "for" loop in each scan. Also it calls
> >>      ReplicationSlotAcquire() for each "for" loop cycle, and
> >>      ReplicationSlotAcquire() uses another loop to scan replication slots
> >>      array. I don't think this is good design.
> >>
> >>      ISTM that we can get rid of ReplicationSlotAcquire()'s loop because
> >>      InvalidateObsoleteReplicationSlots() already know the index of the
> >>      slot
> >>      that we want to find. The attached patch does that. Thought?
> > The inner loop is expected to run at most several times per
> > checkpoint, which won't be a serious problem. However, it is better if
> > we can get rid of that in a reasonable way.
> > The attached patch changes the behavior for SAB_Block. Before the
> > patch, it rescans from the first slot for the same name, but with the
> > patch it just rechecks the same slot.  The only caller of the function
> > with SAB_Block is ReplicationSlotDrop and I don't come up with a case
> > where another slot with the same name is created at different place
> > before the condition variable fires. But I'm not sure the change is
> > completely safe.
> 
> Yes, that change might not be safe. So I'm thinking another approach
> to
> fix the issues.
> 
> > Maybe some assertion is needed?
> > 
> >> 3. There is a corner case where the termination of walsender cleans up
> >>      the temporary replication slot while
> >>      InvalidateObsoleteReplicationSlots()
> >>      is sleeping on ConditionVariableTimedSleep(). In this case,
> >>      ReplicationSlotAcquire() is called in the subsequent cycle of the
> >>      "for"
> >>      loop, cannot find the slot and then emits ERROR message. This leads
> >>      to the failure of checkpoint by the checkpointer.
> > Agreed.
> > 
> >>      To avoid this case, if SAB_Inquire is specified,
> >>      ReplicationSlotAcquire()
> >>      should return the special value instead of emitting ERROR even when
> >>      it cannot find the slot. Also InvalidateObsoleteReplicationSlots()
> >>      should
> >>      handle that special returned value.
> > I thought the same thing hearing that.
> 
> While reading InvalidateObsoleteReplicationSlots() code, I found
> another issue.
> 
>             ereport(LOG,
>                     (errmsg("terminating walsender %d
>                     because replication slot \"%s\" is too
>                     far behind",
>                             wspid,
>                             NameStr(slotname))));
>             (void) kill(wspid, SIGTERM);
> 
> wspid indicates the PID of process using the slot. That process can be
> a backend, for example, executing pg_replication_slot_advance().
> So "walsender" in the above log message is not always correct.

Agreed.

> 
>             int wspid = ReplicationSlotAcquire(NameStr(slotname),
>                                                        SAB_Inquire);
> 
> Why do we need to call ReplicationSlotAcquire() here and mark the slot
> as
> used by the checkpointer? Isn't it enough to check directly the slot's
> active_pid, instead?
> Maybe ReplicationSlotAcquire() is necessary because
> ReplicationSlotRelease() is called later? If so, why do we need to
> call
> ReplicationSlotRelease()? ISTM that we don't need to do that if the
> slot's
> active_pid is zero. No?

My understanding of the reason is that we update a slot value here.
The restriction allows the owner of a slot to assume that all the slot
values don't voluntarily change.

slot.h:104
| * - Individual fields are protected by mutex where only the backend owning
| * the slot is authorized to update the fields from its own slot.  The
| * backend owning the slot does not need to take this lock when reading its
| * own fields, while concurrent backends not owning this slot should take the
| * lock when reading this slot's data.

> If my understanding is right, I'd like to propose the attached patch.
> It introduces DeactivateReplicationSlot() and replace the "for" loop
> in InvalidateObsoleteReplicationSlots() with
> it. ReplicationSlotAcquire()
> and ReplicationSlotRelease() are no longer called there.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: David Zhang
Date:
Subject: Add an option to allow run regression test for individual module onWindows build
Next
From: Melanie Plageman
Date:
Subject: Re: Improve planner cost estimations for alternative subplans