Re: Add WALRCV_CONNECTING state to walreceiver - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Add WALRCV_CONNECTING state to walreceiver
Date
Msg-id CABPTF7V2GFKwzeHSFKLQpdu9bU1qBh30X1rxgygSKdVhZnwTcw@mail.gmail.com
Whole thread Raw
In response to Re: Add WALRCV_CONNECTING state to walreceiver  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: Add WALRCV_CONNECTING state to walreceiver
List pgsql-hackers
Hi Chao,

Thanks for looking into this.

>
> Hi Xuneng,
>
> I just reviewed the patch. It splits the current STREAMING state into CONNECTING and STREAMING, where CONNECTING
representsthe connection establishing phase, which makes sense. The patch is solid, I have just a few small comments: 
>
> 1 - 0001
> ```
> +         <para>
> +          <literal>stopped</literal>: WAL receiver process is not running.
> +          This state is normally not visible because the view returns no row
> +          when the WAL receiver is not running.
> +         </para>
> ```
>
> “Is not running” appears twice in this short paragraph, looks redundant. Suggestion:
> ```
> <literal>stopped</literal>: WAL receiver process is not running.
> This state is normally not visible because the view returns no row in this case.
> ```
>
> 2 - 0002
> ```
> +       WALRCV_CONNECTING,                      /* connecting to primary, replication not yet
> +                                                                * started */
> ```
>
> I think “primary” is inaccurate. A connection destination could be a primary, or the other standby, so maybe change
“primary”to “upstream server”. 

Yeah, "upstream server" is more precise. There are multiple
user-facing log messages in this file that also use “primary”. I’m
wondering whether we should update those as well.

>
>
> 3 - 0002
> ```
>      * Mark walreceiver as running in shared memory.
> ```
>
> I think we should update this comment, changing “running” to “connecting”. There is not a “running” state, before
thispatch, “streaming” can roughly equal to “running”, after this patch, “running” is kinda unclear. 
>

It has been changed.

> 4 - 0002
> ```
> +                       /* Connection established, switch to streaming state */
> +                       SpinLockAcquire(&walrcv->mutex);
> +                       Assert(walrcv->walRcvState == WALRCV_CONNECTING);
> +                       walrcv->walRcvState = WALRCV_STREAMING;
> +                       SpinLockRelease(&walrcv->mutex);
> ```
>
> I don’t feel good with this Assert(). Looking at ShutdownWalRcv(), the startup process might set the state to
STOPPING,so there is a race. 
>
> Before this patch, this is no state change here, thus rest logic can handle STOPPING. After this patch, if the race
occurs,in dev mode, the Assert is fired; in production mode, STOPPING is overwritten by STREAMING, which is wrong. 
>
> So, instead of Assert(), I think we should check if current state is CONNECTING, if not, it should not proceed.

It makes sense to me. Please check the updated patch.

--
Best,
Xuneng

Attachment

pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Flush some statistics within running transactions
Next
From: Xuneng Zhou
Date:
Subject: Re: Add WALRCV_CONNECTING state to walreceiver