On 2020/06/18 11:44, Kyotaro Horiguchi wrote:
> At Wed, 17 Jun 2020 20:13:01 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>>> ReplicationSlotAcquireInternal. I think we should call
>>> ConditionVariablePrepareToSleep before the sorrounding for statement
>>> block.
>>
>> OK, so what about the attached patch? I added
>> ConditionVariablePrepareToSleep()
>> just before entering the "for" loop in
>> InvalidateObsoleteReplicationSlots().
>
> Thanks.
Thanks for the review!
>
> ReplicationSlotAcquireInternal:
> + * If *slot == NULL, search for the slot with the given name.
>
> '*' seems needless here.
Fixed.
Also I added "Only one of slot and name can be specified." into
the comments of ReplicationSlotAcquireInternal().
> The patch moves ConditionVariablePrepareToSleep. We need to call the
> function before looking into active_pid as originally commented.
> Since it is not protected by ReplicationSlotControLock, just before
> releasing the lock is not correct.
>
> The attached on top of the v3 fixes that.
Yes, you're right. I merged your 0001.patch into mine.
+ if (behavior != SAB_Inquire)
+ ConditionVariablePrepareToSleep(&s->active_cv);
+ else if (behavior != SAB_Inquire)
Isn't "behavior == SAB_Block" condition better here?
I changed the patch that way.
The attached is the updated version of the patch.
I also merged Alvaro's patch into this.
> + s = (slot == NULL) ? SearchNamedReplicationSlot(name) : slot;
> + if (s == NULL || !s->in_use || strcmp(name, NameStr(s->data.name)) != 0)
>
> The conditions in the second line is needed for the case slot is
> given, but it is already done in SearchNamedReplicationSlot if slot is
> not given. I would like something like the following instead, but I
> don't insist on it.
Yes, I got rid of strcmp() check, but left is_use check as it is.
I like that because it's simpler.
> ReplicationSlot *s = NULL;
> ...
> if (!slot)
> s = SearchNamedReplicationSlot(name);
> else if(s->in_use && strcmp(name, NameStr(s->data.name)))
> s = slot;
>
>
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("replication slot \"%s\" does not exist", name)));
>
> The error message is not right when the given slot doesn't match the
> given name.
This doesn't happen after applying Alvaro's patch.
BTW, using "name" here is not valid because it may be NULL.
So I added the following code and used "slot_name" in log messages.
+ slot_name = name ? name : NameStr(slot->data.name);
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION