Thread: BUG #6619: Misleading output from slave when host is not running

BUG #6619: Misleading output from slave when host is not running

From
petteri.raty@aalto.fi
Date:
The following bug has been logged on the website:

Bug reference:      6619
Logged by:          Petteri Räty
Email address:      petteri.raty@aalto.fi
PostgreSQL version: 9.1.3
Operating system:   Gentoo Linux
Description:

I set up a hot standby master and slave following the instructions at:

http://michael.otacoo.com/postgresql-2/postgres-9-1-setup-a-synchronous-stand-by-server-in-5-minutes/

I left archive mode off.

When I started the slave without the master running I got the following
output:

$ postgres -D gsd-replica/
LOG:  database system was interrupted while in recovery at log time
2012-04-25 12:01:33 UTC
HINT:  If this has occurred more than once some data might be corrupted and
you might need to choose an earlier recovery target.
LOG:  entering standby mode
WARNING:  WAL was generated with wal_level=minimal, data may be missing
HINT:  This happens if you temporarily set wal_level=minimal without taking
a new base backup.
FATAL:  hot standby is not possible because wal_level was not set to
"hot_standby" on the master server
HINT:  Either set wal_level to "hot_standby" on the master, or turn off
hot_standby here.
LOG:  startup process (PID 28761) exited with exit code 1
LOG:  aborting startup due to startup process failure

The error message above on the FATAL line is wrong (or at least misleading).
The real problem should be that it can't connect to the master. The
wal_level on the master is hot_standby (captured after I started it):


=# SHOW wal_level;
  wal_level  
-------------
 hot_standby
(1 row)

Re: BUG #6619: Misleading output from slave when host is not running

From
Simon Riggs
Date:
On Fri, Apr 27, 2012 at 8:47 AM,  <petteri.raty@aalto.fi> wrote:

> LOG:  entering standby mode
> WARNING:  WAL was generated with wal_level=minimal, data may be missing
> HINT:  This happens if you temporarily set wal_level=minimal without taking
> a new base backup.
> FATAL:  hot standby is not possible because wal_level was not set to
> "hot_standby" on the master server
> HINT:  Either set wal_level to "hot_standby" on the master, or turn off
> hot_standby here.
> LOG:  startup process (PID 28761) exited with exit code 1
> LOG:  aborting startup due to startup process failure
>
> The error message above on the FATAL line is wrong (or at least misleading).
> The real problem should be that it can't connect to the master. The
> wal_level on the master is hot_standby (captured after I started it):

The HINT that we should simply set something on the master is a little
misleading with respect to timing. However, if the master and the
standby aren't even connected and you know that, how did you expect
there to be a causal link between the setting on the master and the
state of the standby?

What do you suggest the messages say?

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #6619: Misleading output from slave when host is not running

From
Robert Haas
Date:
On Fri, Apr 27, 2012 at 3:47 AM,  <petteri.raty@aalto.fi> wrote:
> When I started the slave without the master running I got the following
> output:
>
> $ postgres -D gsd-replica/
> LOG:  database system was interrupted while in recovery at log time
> 2012-04-25 12:01:33 UTC
> HINT:  If this has occurred more than once some data might be corrupted and
> you might need to choose an earlier recovery target.
> LOG:  entering standby mode
> WARNING:  WAL was generated with wal_level=minimal, data may be missing
> HINT:  This happens if you temporarily set wal_level=minimal without taking
> a new base backup.
> FATAL:  hot standby is not possible because wal_level was not set to
> "hot_standby" on the master server
> HINT:  Either set wal_level to "hot_standby" on the master, or turn off
> hot_standby here.
> LOG:  startup process (PID 28761) exited with exit code 1
> LOG:  aborting startup due to startup process failure
>
> The error message above on the FATAL line is wrong (or at least misleading).

I think it's trying to tell you that you had wal_level=minimal
configured on the master *at the time you took the base backup*.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6619: Misleading output from slave when host is not running

From
Petteri Räty
Date:
On 27.04.2012 17:16, Simon Riggs wrote:
> On Fri, Apr 27, 2012 at 8:47 AM,  <petteri.raty@aalto.fi> wrote:
>
>> LOG:  entering standby mode
>> WARNING:  WAL was generated with wal_level=minimal, data may be missing
>> HINT:  This happens if you temporarily set wal_level=minimal without taking
>> a new base backup.
>> FATAL:  hot standby is not possible because wal_level was not set to
>> "hot_standby" on the master server
>> HINT:  Either set wal_level to "hot_standby" on the master, or turn off
>> hot_standby here.
>> LOG:  startup process (PID 28761) exited with exit code 1
>> LOG:  aborting startup due to startup process failure
>>
>> The error message above on the FATAL line is wrong (or at least misleading).
>> The real problem should be that it can't connect to the master. The
>> wal_level on the master is hot_standby (captured after I started it):
>
> The HINT that we should simply set something on the master is a little
> misleading with respect to timing. However, if the master and the
> standby aren't even connected and you know that, how did you expect
> there to be a causal link between the setting on the master and the
> state of the standby?
>

I started investigating after seeing that it didn't start up and found
that the master had a firewall preventing connections to the port
where I had set up postgres to listen.

>
> What do you suggest the messages say?
>

If the slave had no way to connect to the master then how can the slave
tell how "hot_standby" is configured there? I am expecting the message
to tell me that it can't connect to the master.

Regards,
Petteri
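
The firewall problem described in the last message can be checked separately from the standby's startup log. The following is a minimal sketch, not part of the original thread: it only tests whether a TCP connection to the master's port can be opened. The function name `can_reach` and the host and port values are hypothetical placeholders. It says nothing about wal_level, which Robert Haas's reply attributes to the master's setting at the time the base backup was taken.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; it checks network reachability
    only and does not speak the PostgreSQL protocol.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run from the standby host, a call such as `can_reach("master.example.org", 5432)` (placeholder address and port) returning False would point to a network or firewall problem rather than a configuration problem.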