Re: Nicer error when connecting to standby with hot_standby=off - Mailing list pgsql-hackers

From James Coleman
Subject Re: Nicer error when connecting to standby with hot_standby=off
Date
Msg-id CAAaqYe8fqjHrCaW8p5BXZ9NjSPeExqxTD9KWHfJUnkbLro2gMA@mail.gmail.com
In response to Re: Nicer error when connecting to standby with hot_standby=off  (Andres Freund <andres@anarazel.de>)
Responses Re: Nicer error when connecting to standby with hot_standby=off  (David Zhang <david.zhang@highgo.ca>)
List pgsql-hackers
On Mon, Mar 9, 2020 at 8:06 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-03-09 18:40:32 -0400, James Coleman wrote:
> > On Mon, Mar 9, 2020 at 6:28 PM Andres Freund <andres@anarazel.de> wrote:
> > > > I wanted to get some initial feedback on the idea before writing a patch:
> > > > does that seem like a reasonable change? Is it actually plausible to
> > > > distinguish between this state and "still recovering" (i.e., when starting
> > > > up a hot standby but initial recovery hasn't completed so it legitimately
> > > > can't accept connections yet)? If so, should we include the possibility if
> > > > hot_standby isn't on, just in case?
> > >
> > > Yes, it is feasible to distinguish those cases. And we should, if we're
> > > going to change things around.
> >
> > I'll look into this hopefully soon, but it's helpful to know that it's
> > possible. Is it basically along the lines of checking to see if the
> > LSN is past the minimum recovery point?
>
> No, I don't think that's the right approach. IIRC the startup process
> (i.e. the one doing the WAL replay) signals postmaster once consistency
> has been achieved. So you can just use that state.

I've taken that approach in the attached patch (I'd expected to wait
until later to work on this...but it seemed pretty small so I ended up
hacking on it this evening).
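
For illustration only (this is not the attached patch), here is a
minimal self-contained sketch of the distinction discussed above. It
assumes the postmaster tracks whether the startup process has signaled
consistency. Every name in it (StandbyState, ConnDecision,
classify_connection, rejection_detail) and the message wording are
hypothetical, not PostgreSQL's actual identifiers or strings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the postmaster would consult. */
typedef struct StandbyState
{
    bool in_recovery;          /* server is still replaying WAL */
    bool hot_standby_enabled;  /* value of the hot_standby setting */
    bool consistency_reached;  /* startup process has signaled consistency */
} StandbyState;

typedef enum ConnDecision
{
    CONN_OK,                   /* connection can be accepted */
    CONN_HOT_STANDBY_OFF,      /* in recovery, and hot_standby is off */
    CONN_NOT_YET_CONSISTENT    /* hot standby, but not yet consistent */
} ConnDecision;

/* Decide whether a new connection is acceptable, and if not, why. */
ConnDecision
classify_connection(const StandbyState *s)
{
    if (!s->in_recovery)
        return CONN_OK;
    /* Checked first: with hot_standby off, waiting will never help. */
    if (!s->hot_standby_enabled)
        return CONN_HOT_STANDBY_OFF;
    if (!s->consistency_reached)
        return CONN_NOT_YET_CONSISTENT;
    return CONN_OK;
}

/* Illustrative detail text per rejection reason; NULL when accepted. */
const char *
rejection_detail(ConnDecision d)
{
    switch (d)
    {
        case CONN_HOT_STANDBY_OFF:
            return "hot_standby is off on this standby";
        case CONN_NOT_YET_CONSISTENT:
            return "recovery has not yet reached a consistent state";
        case CONN_OK:
            break;
    }
    return NULL;
}
```

The point of the ordering is that the hot_standby=off case is
permanent until the configuration changes, while the not-yet-consistent
case resolves on its own, so the two deserve different messages.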

I haven't included tests. I tried intentionally breaking the
existing behavior (returning no error when hot_standby=off), but
running make check-world (including the TAP tests) didn't produce any
failures. I can look into that more deeply at some point, but if you
happen to know where we test similar things, I'd be happy to
hear it.

One other question: how is error message translation handled? I
haven't added entries to the relevant files, but also I'm obviously
not qualified to write them.

Thanks,
James

