Re: Coding in WalSndWaitForWal - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Coding in WalSndWaitForWal
Date
Msg-id 20191112192716.emrqs2afuefunw6v@alap3.anarazel.de
Whole thread Raw
In response to Re: Coding in WalSndWaitForWal  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Coding in WalSndWaitForWal
Re: Coding in WalSndWaitForWal
Re: Coding in WalSndWaitForWal
List pgsql-hackers
Hi,

On 2019-11-11 13:53:40 -0300, Alvaro Herrera wrote:
> On 2019-Nov-11, Amit Kapila wrote:
> 
> > On Mon, Nov 11, 2019 at 7:53 AM Michael Paquier <michael@paquier.xyz> wrote:
> 
> > > So your suggestion would be to call GetFlushRecPtr() before the first
> > > check on RecentFlushPtr and before entering the loop?
> > 
> > No.  What I meant was to keep the current code as-is and have an
> > additional check on RecentFlushPtr before entering the loop.
> 
> I noticed that the "return" at the bottom of the function does a
> SetLatch(), but the other returns do not.  Isn't that a bug?

I don't think it is - We never reset the latch in that case. I don't see
what we'd gain from setting it explicitly, other than unnecessarily
performing more work?


>      /*
>       * Fast path to avoid acquiring the spinlock in case we already know we
>       * have enough WAL available. This is particularly interesting if we're
>       * far behind.
>       */
>      if (RecentFlushPtr != InvalidXLogRecPtr &&
>          loc <= RecentFlushPtr)
> +    {
> +        SetLatch(MyLatch);
>          return RecentFlushPtr;
> +    }

I.e. let's not do this.


>      /* Get a more recent flush pointer. */
>      if (!RecoveryInProgress())
>          RecentFlushPtr = GetFlushRecPtr();
>      else
>          RecentFlushPtr = GetXLogReplayRecPtr(NULL);
>  
> +    if (loc <= RecentFlushPtr)
> +    {
> +        SetLatch(MyLatch);
> +        return RecentFlushPtr;
> +    }

Hm. I'm doubtful this is a good idea - it essentially means we'd not
check for interrupts, protocol replies, etc. for an unbounded amount of
time. Whereas the existing fast-path does so for a bounded - although
not necessarily short - amount of time.

It seems to me it'd be better to just remove the "get a more recent
flush pointer" block - it doesn't seem to currently surve a meaningful
purpose.


>      for (;;)
>      {
>          long        sleeptime;
>  
>          /* Clear any already-pending wakeups */
>          ResetLatch(MyLatch);
>  
> @@ -2267,15 +2276,14 @@ WalSndLoop(WalSndSendDataCallback send_data)
>  
>              /* Sleep until something happens or we time out */
>              (void) WaitLatchOrSocket(MyLatch, wakeEvents,
>                                       MyProcPort->sock, sleeptime,
>                                       WAIT_EVENT_WAL_SENDER_MAIN);
>          }
>      }
> -    return;
>  }

Having dug into history, the reason this exists is that there used to be
the following below the return:

-
-send_failure:
-
-    /*
-     * Get here on send failure.  Clean up and exit.
-     *
-     * Reset whereToSendOutput to prevent ereport from attempting to send any
-     * more messages to the standby.
-     */
-    if (whereToSendOutput == DestRemote)
-        whereToSendOutput = DestNone;
-
-    proc_exit(0);
-    abort();                    /* keep the compiler quiet */

but when 5a991ef8692ed (Allow logical decoding via the walsender
interface) moved the shutdown code into its own function,
WalSndShutdown(), we left the returns in place.


We still have the curious
    proc_exit(0);
    abort();                    /* keep the compiler quiet */

pattern in WalSndShutdown() - wouldn't the right approach to instead
proc_exit() with pg_attribute_noreturn()?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Next
From: Andres Freund
Date:
Subject: Re: Proposal: Make use of C99 designated initialisers fornulls/values arrays