Re: Replication with Patroni not working after killing secondary and starting again - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: Replication with Patroni not working after killing secondary and starting again
Msg-id 20220427200537.u2d3gapmwnwjcblx@hjp.at
In response to Replication with Patroni not working after killing secondary and starting again  (Zb B <zbig.poland@gmail.com>)
Responses Re: Replication with Patroni not working after killing secondary and starting again  (Zb B <zbig.poland@gmail.com>)
List pgsql-general
On 2022-04-27 15:27:34 +0200, Zb B wrote:
> Hi,
> I am new to Patroni and PostgreSQL. We have set up a cluster with etcd (3
> nodes), Patroni (2 nodes) and PostgreSQL (2 nodes) with replication from
> primary to secondary.

Pretty much the setup we have.

> Seemed to be working fine and we started some tests. One of the tests
> gave us unsatisfactory results. Specifically when we start a long
> transaction with multiple inserts (we use remote Java app for that)
> and during execution of this transaction we kill the secondary
> database by using the following:
[...]
>  the database starts on secondary after a while but the replication from the
> primary is not working anymore.
[...]
> Thus my questions:
> 1) Is it normal that replication stops working if we kill secondary postgres
> and start it again using patroni? Do we need to do any additional steps except
> the commands above that start patroni?

No.

When the secondary starts up it should continue replicating from where
it stopped. However, it can only do this if the necessary information is
still available. If WAL files have been deleted in the meantime, it
can't replay them. There should be error messages in your logs
indicating what went wrong.
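
To make the "necessary information" a bit more concrete, here is a rough
Python sketch that maps the LSN a standby stopped at to the WAL segment
file it will ask the primary for next. It assumes the default 16 MiB
segment size and timeline 1, and it ignores any WAL archive or
restore_command you may have. The function names are made up for
illustration and are not part of PostgreSQL or Patroni.

```python
# Sketch only: assumes the default 16 MiB wal_segment_size.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_for_lsn(lsn, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Return the name of the WAL segment file containing the given LSN.

    An LSN is written as two hex numbers, "high/low", forming a 64-bit
    byte position in the WAL stream.
    """
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)
    segment_number = position // segment_size
    # Number of segments that fit into one 4 GiB "log id".
    segments_per_id = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_id,
        segment_number % segments_per_id,
    )


def standby_can_catch_up(standby_lsn, primary_wal_files, timeline=1):
    """True if the segment the standby needs next is still on the primary.

    primary_wal_files is a collection of file names as found in pg_wal
    (hypothetical input; how you list them is up to you).
    """
    return wal_file_for_lsn(standby_lsn, timeline) in primary_wal_files
```

If the file name computed for the standby's last replayed LSN is no
longer in pg_wal on the primary (and not in an archive either), the
standby cannot resume streaming and needs to be reinitialized.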

> 2) Is it normal that patroni is not started again automatically after we kill
> it and postgres on secondary?

Depends on your system setup. Did you tell systemd to automatically
restart patroni? Patroni obviously can't do anything by itself after
it's been killed.

> 3) Or there is something wrong with our setup and the replication should be
> recovered automatically after we kill the secondary and start the patroni again
> on secondary?

I assume (but you really haven't given us enough information, so I could
be wrong) that you haven't configured a replication slot and that the
primary doesn't keep enough WAL segments to cover the downtime on its own.
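
As a back-of-the-envelope check for the "enough WAL segments" part, the
following Python sketch estimates how many segments the primary would
have to retain (via wal_keep_segments, or wal_keep_size from PostgreSQL
13 on) to bridge a given downtime without a replication slot. The WAL
generation rate is something you would have to measure yourself; the
numbers and function names here are purely illustrative, and 16 MiB
segments are assumed.

```python
import math

# Sketch only: assumes the default 16 MiB wal_segment_size.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segments_needed(wal_bytes_per_second, downtime_seconds,
                    segment_size=WAL_SEGMENT_SIZE):
    """Segments written by the primary while the standby is down."""
    return math.ceil(wal_bytes_per_second * downtime_seconds / segment_size)


def survives_downtime(retained_segments, wal_bytes_per_second,
                      downtime_seconds):
    """True if the retained segments cover the WAL written during downtime."""
    needed = segments_needed(wal_bytes_per_second, downtime_seconds)
    return retained_segments >= needed
```

A long transaction with many inserts can write WAL quickly, so a
retention setting that is fine for normal operation may fall short
during a test like yours.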

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
