Re: BUG? Slave don't reconnect to the master - Mailing list pgsql-general

From Jehan-Guillaume de Rorthais
Subject Re: BUG? Slave don't reconnect to the master
Msg-id 20200903133915.1850e955@firost
In response to Re: BUG? Slave don't reconnect to the master  (Олег Самойлов <splarv@ya.ru>)
Responses Re: BUG? Slave don't reconnect to the master
List pgsql-general
Hi,

Sorry for the late answer, I've been busy.

On Mon, 24 Aug 2020 18:45:42 +0300
Олег Самойлов <splarv@ya.ru> wrote:

> > On 21 Aug 2020, at 17:26, Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
> > wrote:
> >
> > On Thu, 20 Aug 2020 15:16:10 +0300
> > Based on setup per node, you can probably add
> > 'synchronous_commit=remote_write' in the common conf.
>
> Nope. I set 'synchronous_commit=remote_write' only for 3 and 4 node clusters.
> [...]

Then I suppose your previous message had an error as it shows three
nodes tuchanka3a, tuchanka3b and tuchanka3c (no 4th node), all with remote_write
in krogan3.conf. But anyway.

> >> [...]
> >> pacemaker config, specific for this cluster:
> >> [...]
> >
> > why did you add "monitor interval=15"? No harm, but it is redundant with
> > "monitor interval=16 role=Master" and "monitor interval=17 role=Slave".
>
> I can't remember clearly. :) Look what happens without it.
>
> + pcs -f configured_cib.xml resource create krogan2DB ocf:heartbeat:pgsqlms
> bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan2
> recovery_template=/var/lib/pgsql/krogan2.paf meta master notify=true
> resource-stickiness=10
> Warning: changing a monitor operation interval from 15 to 16 to make the
> operation unique
> Warning: changing a monitor operation interval from 16 to 17 to make the
> operation unique

Something is fishy here. This command lacks op monitor settings. Pacemaker
doesn't add any default monitor operation with a default interval if you don't
give one at resource creation.

If you create such a resource with no monitoring, the cluster will start/stop
it when needed, but will NOT check for its health. See:

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-resource-monitoring.html

> So trivial monitor always exists by default with interval 15.

nope.

> My real command
> pcs -f configured_cib.xml resource create krogan2DB ocf:heartbeat:pgsqlms
> bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan2
> recovery_template=/var/lib/pgsql/krogan2.paf op monitor interval=15
> timeout=10 monitor interval=16 role=Master timeout=15 monitor interval=17
> role=Slave timeout=10 meta master notify=true resource-stickiness=10
>
> Looked like I needed to add all this to change "timeout" parameter for the
> monitor operations and I needed for interval parameter to point on the
> specific monitor operation.

OK, I understand now. If you want to edit an existing resource, use "pcs
resource update". Make sure to read the pcs manual about how to use it to
edit/remove/add operations on a resource.
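
For instance, something along these lines should change the timeout of the
Master monitor without re-creating the resource. This is an untested sketch:
the resource name, CIB file name and values are simply taken from your command
above, and "run" is a hypothetical dry-run helper that only prints the command
unless RUN=1 is set. Check the pcs manual for how "update" matches an existing
operation before relying on it.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Edit the Master monitor operation of the existing resource in the
# offline CIB file (names and values copied from the quoted command).
run pcs -f configured_cib.xml resource update krogan2DB \
    op monitor interval=16 role=Master timeout=15
```

Run it once as is to review the command, then with RUN=1 to apply it.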

> Looked like the default timeout 10 was not enough for the "master".

It's written in the PAF doc. See:
https://clusterlabs.github.io/PAF/configuration.html#resource-agent-actions

Do not hesitate to report or submit some enhancements to the doc if needed.

> [...]
> >> [...]
> >> 10:24:55.906 LOG:  entering standby mode
> >> 10:24:55.908 LOG:  redo starts at 0/15000028
> >> 10:24:55.909 LOG:  consistent recovery state reached at 0/15002300
> >> 10:24:55.910 LOG:  database system is ready to accept read only connections
> >> 10:24:55.928 LOG:  started streaming WAL from primary at 0/16000000 on tl 3
> >> 10:26:37.308 FATAL:  terminating walreceiver due to timeout
> >
> > Timeout because of SIGSTOP on primary here.
>
> Sure
>
> >> 10:26:37.308 LOG:  invalid record length at 0/1600C4D8: wanted 24, got 0
> >> 10:30:55.965 LOG:  received promote request
> >
> > Promotion from Pacemaker here.
>
> Yep
>
> > What happened during more than 4 minutes between the timeout and the
> > promotion?
>
> It's one of the problem, which you may improve. :) The pacemaker reaction is
> the longest for STOP signal test, usually near 5 minutes. The pacemaker tried
> to make different things (for instance "demote") and wait for different
> timeouts.

Oh, understood, obviously, I should have thought about that. Well, I will not
be able to improve anything here. You might want to adjust the various operation
timeouts and lower the migration-threshold.
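
As a rough, untested sketch of what I mean: the resource name is borrowed from
your krogan2 command (adapt it to the cluster you are testing), the values are
placeholders and not recommendations, and "run" is a hypothetical dry-run
helper that only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Placeholder value: move the resource away after fewer local failures.
run pcs resource meta krogan2DB migration-threshold=2

# Placeholder value: shorten how long Pacemaker waits for a demote.
run pcs resource update krogan2DB op demote timeout=60s
```

Keep the timeouts compatible with what the PAF documentation asks for,
otherwise the cluster may give up on actions that would have succeeded.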

> >> 10:30:55.965 FATAL:  terminating walreceiver process due to administrator cmd
> >> 10:30:55.966 LOG:  redo done at 0/1600C4B0
> >> 10:30:55.966 LOG:  last completed transaction was at log time 10:25:38.76429
> >> 10:30:55.968 LOG:  selected new timeline ID: 4
> >> 10:30:56.001 LOG:  archive recovery complete
> >> 10:30:56.005 LOG:  database system is ready to accept connections
> >
> >> The slave with didn't reconnected replication, tuchanka3c. Also I separated
> >> logs copied from the old master by a blank line:
> >>
> >> [...]
> >>
> >> 10:20:25.168 LOG:  database system was interrupted; last known up at 10:20:19
> >> 10:20:25.180 LOG:  entering standby mode
> >> 10:20:25.181 LOG:  redo starts at 0/11000098
> >> 10:20:25.183 LOG:  consistent recovery state reached at 0/11000A68
> >> 10:20:25.183 LOG:  database system is ready to accept read only connections
> >> 10:20:25.193 LOG:  started streaming WAL from primary at 0/12000000 on tl 3
> >> 10:25:05.370 LOG:  could not send data to client: Connection reset by peer
> >> 10:26:38.655 FATAL:  terminating walreceiver due to timeout
> >> 10:26:38.655 LOG:  record with incorrect prev-link 0/1200C4B0 at
> >> 0/1600C4D8
> >
> > This message appear before the effective promotion of tuchanka3b. Do you
> > have logs about what happen *after* the promotion?
>
> This is end of the slave log. Nothing. Just absent replication.

This is unusual. Could you log some more details about the replication
connection attempts to your PostgreSQL logs? Set log_replication_commands and
lower log_min_messages to one of the debug levels (debug1 to debug5).
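
Something like the following could do it. This is an untested sketch, assuming
psql connects locally as a superuser with your default connection settings;
debug1 is only a starting point, and "run" is a hypothetical dry-run helper
that only prints the command unless RUN=1 is set. Run it on the nodes you want
to observe.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Log every replication command received by the walsenders.
run psql -X -c "ALTER SYSTEM SET log_replication_commands = on"

# Make the server log more verbose (debug1 is the least noisy debug level).
run psql -X -c "ALTER SYSTEM SET log_min_messages = debug1"

# Both settings only need a configuration reload, not a restart.
run psql -X -c "SELECT pg_reload_conf()"
```

Remember to reset both settings afterwards, as the debug levels are verbose.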

> > Reading at this error, it seems like record at 0/1600C4D8 references the
> > previous one in WAL 0/12000000. So the file referenced as 0/16 have either
> > corrupted data or was 0/12 being recycled, but not zeroed correctly, as v11
> > always do no matter what (no wal_init_zero there).
>
> Okey, may be in v12 it will be fixed.

No, I don't think so. This is not related to a bug in v11 vs. v12, but v12
might behave a bit differently depending on the value of wal_init_zero.

> > That's why I'm wondering how you built your standbys, from scratch?
>
> By special scripts. :) This project already on GitHub and I am waiting for
> the final solution of my boss to open it. And it will take some time to
> translate README to English. After this I'll link the repository here.

I'll give it a look and try to reproduce if I find some time.


Regards,


