Thread: Wrong HINT during database recovery when occur a minimal wal.

Hello hackers,

When I do PITR with an unusual step, I get this FATAL:

2021-01-15 15:02:52.364 CST [14958] FATAL:  hot standby is not possible because wal_level was not set to "replica" or higher on the primary server
2021-01-15 15:02:52.364 CST [14958] HINT:  Either set wal_level to "replica" on the primary, or turn off hot_standby here.

The unusual step is that I changed wal_level to minimal after taking the base backup.

My question is: what does [set wal_level to "replica" on the primary] in the HINT
mean? I can't think of a case where setting wal_level resolves this FATAL; I can
only resolve it by turning off hot_standby.

Do you think we can make this code change?
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6300,7 +6300,7 @@ CheckRequiredParameterValues(void)
  if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
  ereport(ERROR,
  (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
- errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
+ errhint("You should turn off hot_standby here.")));


---
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: Wrong HINT during database recovery when occur a minimal wal.

From
"lchch1990@sina.cn"
Date:

Sorry, I don't know why it was displayed in the wrong format; I'll try to correct it.
-----

When I do PITR with an unusual step, I get this FATAL:

2021-01-15 15:02:52.364 CST [14958] FATAL:  hot standby is not possible because wal_level was not set to "replica" or higher on the primary server
2021-01-15 15:02:52.364 CST [14958] HINT:  Either set wal_level to "replica" on the primary, or turn off hot_standby here.

The unusual step is that I changed wal_level to minimal after taking the base backup.

My question is: what does [set wal_level to "replica" on the primary] in the HINT
mean? I can't think of a case where setting wal_level resolves this FATAL; I can
only resolve it by turning off hot_standby.

Do you think we can make this code change?
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6300,7 +6300,7 @@ CheckRequiredParameterValues(void)
  if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
  ereport(ERROR,
  (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
-  errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
+  errhint("You should turn off hot_standby here.")));


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: Wrong HINT during database recovery when occur a minimal wal.

From
Kyotaro Horiguchi
Date:
At Fri, 15 Jan 2021 15:32:58 +0800, "lchch1990@sina.cn" <lchch1990@sina.cn> wrote in 
> 
> Sorry, I don't know why it was displayed in the wrong format; I'll try to correct it.
> -----
> 
> When I do PITR with an unusual step, I get this FATAL:
> 
> 2021-01-15 15:02:52.364 CST [14958] FATAL:  hot standby is not possible because wal_level was not set to "replica" or higher on the primary server
> 2021-01-15 15:02:52.364 CST [14958] HINT:  Either set wal_level to "replica" on the primary, or turn off hot_standby here.
> 
> The unusual step is that I changed wal_level to minimal after taking the base backup.

Mmm. Maybe something's missing. If you took the base-backup using
pg_basebackup, that means max_wal_senders > 0 on the primary. If you
lowered wal_level in the backup (or replica) and then started it, you
would get something like this.

| FATAL: WAL streaming (max_wal_senders > 0) requires wal_level "replica" or "logical".

If you changed max_wal_senders to zero, you would get the following instead.

| FATAL:  hot standby is not possible because max_wal_senders = 0 is a lower setting than on the primary server (its value was 2)

So I couldn't reproduce the situation.
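
The two failures described above can be sketched as a small C model. This is only an illustration under assumptions: `WalLevel`, `NodeSettings`, and `startup_failure` are hypothetical names rather than PostgreSQL's own definitions, and the order of the checks simply follows the two messages quoted in this mail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for illustration; not PostgreSQL's actual types. */
typedef enum { WAL_MINIMAL, WAL_REPLICA, WAL_LOGICAL } WalLevel;

typedef struct
{
    WalLevel wal_level;        /* wal_level configured on the node being started */
    int      max_wal_senders;  /* max_wal_senders configured on that node */
    int      hot_standby;      /* nonzero when hot_standby = on */
} NodeSettings;

/*
 * Writes the FATAL text this model expects into buf (an empty string when
 * neither check fires) and returns buf.
 */
static const char *
startup_failure(const NodeSettings *node, int primary_max_wal_senders,
                char *buf, size_t buflen)
{
    assert(node != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (node->max_wal_senders > 0 && node->wal_level < WAL_REPLICA)
        /* Case 1: wal_level lowered while max_wal_senders is still > 0. */
        snprintf(buf, buflen,
                 "WAL streaming (max_wal_senders > 0) requires wal_level "
                 "\"replica\" or \"logical\"");
    else if (node->hot_standby &&
             node->max_wal_senders < primary_max_wal_senders)
        /* Case 2: max_wal_senders set lower than the primary's value. */
        snprintf(buf, buflen,
                 "hot standby is not possible because max_wal_senders = %d "
                 "is a lower setting than on the primary server "
                 "(its value was %d)",
                 node->max_wal_senders, primary_max_wal_senders);

    return buf;
}
```

In this model, a node with wal_level lowered to minimal and max_wal_senders = 2 hits the first message; setting max_wal_senders to 0 with hot_standby on hits the second.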

Anyways..

> My question is: what does [set wal_level to "replica" on the primary] in the HINT
> mean? I can't think of a case where setting wal_level resolves this FATAL; I can
> only resolve it by turning off hot_standby.
> 
> Do you think we can make this code change?
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -6300,7 +6300,7 @@ CheckRequiredParameterValues(void)
>   if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
>   ereport(ERROR,
>   (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
> -  errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
> +  errhint("You should turn off hot_standby here.")));

Since it's obvious that the change in a primary cannot be propagated by
taking a backup or starting replication, the first sentence reads to
me as "you should retake a base-backup from a primary where wal_level
is replica or higher". So *I* don't think it needs a fix.

Any thoughts?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Wrong HINT during database recovery when occur a minimal wal.

From
"lchch1990@sina.cn"
Date:

>Mmm. Maybe something's missing. If you took the base-backup using
>pg_basebackup, that means max_wal_senders > 0 on the primary. If you
>lowered wal_level in the backup (or replica) then started it, You
>would get something like this.
>| FATAL: WAL streaming (max_wal_senders > 0) requires wal_level "replica" or "logical".
>If you changed max_wal_senders to zero, you would get the following instead.
>| FATAL:  hot standby is not possible because max_wal_senders = 0 is a lower setting than on the primary server (its value was 2)

Then turn hot_standby off and continue with the lowered wal_level.
After that, do recovery from the base backup, and you will see the FATAL.

>So I couldn't reproduce the situation.
>Anyways.
 
>> My question is: what does [set wal_level to "replica" on the primary] in the HINT
>> mean? I can't think of a case where setting wal_level resolves this FATAL; I can
>> only resolve it by turning off hot_standby.
>>
>> Do you think we can make this code change?
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -6300,7 +6300,7 @@ CheckRequiredParameterValues(void)
>>   if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
>>   ereport(ERROR,
>>   (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
>> -  errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
>> +  errhint("You should turn off hot_standby here.")));
 
>Since it's obvious that the change in a primary cannot be propagated by
>taking a backup or starting replication, the first sentence reads to
>me as "you should retake a base-backup from a primary where wal_level
>is replica or higher". So *I* don't think it needs a fix.

I think this HINT is meant to guide users in finishing this recovery, and the first
suggestion is invalid in my opinion.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: Wrong HINT during database recovery when occur a minimal wal.

From
Kyotaro Horiguchi
Date:
At Fri, 15 Jan 2021 17:04:19 +0800, "lchch1990@sina.cn" <lchch1990@sina.cn> wrote in 
> 
> >Mmm. Maybe something's missing. If you took the base-backup using
> >pg_basebackup, that means max_wal_senders > 0 on the primary. If you
> >lowered wal_level in the backup (or replica) then started it, You
> >would get something like this.
> >| FATAL: WAL streaming (max_wal_senders > 0) requires wal_level "replica" or "logical".
> >If you changed max_wal_senders to zero, you would get the following instead.
> >| FATAL:  hot standby is not possible because max_wal_senders = 0 is a lower setting than on the primary server (its value was 2)
> Then turn hot_standby off and continue with the lowered wal_level.
> After that, do recovery from the base backup, and you will see the FATAL.
> 
> >So I couldn't reproduce the situation.
> >Anyways.
>  
> >> My question is: what does [set wal_level to "replica" on the primary] in the HINT
> >> mean? I can't think of a case where setting wal_level resolves this FATAL; I can
> >> only resolve it by turning off hot_standby.
> >>
> >> Do you think we can make this code change?
> >> --- a/src/backend/access/transam/xlog.c
> >> +++ b/src/backend/access/transam/xlog.c
> >> @@ -6300,7 +6300,7 @@ CheckRequiredParameterValues(void)
> >>   if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
> >>   ereport(ERROR,
> >>   (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
> >> -  errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));
> >> +  errhint("You should turn off hot_standby here.")));
>  
> >Since it's obvious that the change in a primary cannot be propagated by
> >taking a backup or starting replication, the first sentence reads to
> >me as "you should retake a base-backup from a primary where wal_level
> >is replica or higher". So *I* don't think it needs a fix.
> I think this HINT is meant to guide users in finishing this recovery, and the first
> suggestion is invalid in my opinion.

I think it's also important to suggest to the users how they can turn
on hot_standby on their standby.  So, perhaps-a-bit-verbose hint would
be like this.

"Either start this standby from base backup taken after setting
wal_level to \"replica\" on the primary, or turn off hot_standby
here."

Does this make sense?
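
As a minimal sketch of where such a hint would apply, assuming simplified stand-ins for the real check: `SketchWalLevel`, `wal_level_hint`, and `check_hint_sketch` are hypothetical names, and the function only models the condition under discussion (hot_standby is on and the wal_level recorded from the primary is below "replica"). It is not the code in xlog.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the wal_level values; not PostgreSQL's enum. */
typedef enum
{
    SKETCH_WAL_MINIMAL,
    SKETCH_WAL_REPLICA,
    SKETCH_WAL_LOGICAL
} SketchWalLevel;

/*
 * Returns the more verbose hint when hot standby cannot work with the
 * wal_level recorded from the primary, or NULL when no hint is needed.
 */
static const char *
wal_level_hint(SketchWalLevel recorded_wal_level, int hot_standby_enabled)
{
    if (hot_standby_enabled && recorded_wal_level < SKETCH_WAL_REPLICA)
        return "Either start this standby from base backup taken after "
               "setting wal_level to \"replica\" on the primary, or turn "
               "off hot_standby here.";
    return NULL;
}

/* Usage example: both remedies named in the hint make the condition go away. */
static void
check_hint_sketch(void)
{
    /* Base backup taken after wal_level was raised on the primary. */
    assert(wal_level_hint(SKETCH_WAL_REPLICA, 1) == NULL);
    /* hot_standby turned off on the standby. */
    assert(wal_level_hint(SKETCH_WAL_MINIMAL, 0) == NULL);
}
```

The two asserts in `check_hint_sketch` correspond to the two halves of the hint, which is the reason for keeping both in the message.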

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



>I think it's also important to suggest to the users how they can turn
>on hot_standby on their standby. So, perhaps-a-bit-verbose hint would
>be like this.
>"Either start this standby from base backup taken after setting
>wal_level to \"replica\" on the primary, or turn off hot_standby
>here."
>Does this make sense?

Can you help me understand how [setting wal_level to \"replica\"] helps
with this startup from the base backup?

Do you mean setting wal_level on the base backup, or on the database from
which we take the base backup?