AW: AW: AW: broken backup trail in case of quickly patroni switchback and forth - Mailing list pgsql-general

From Zwettler Markus (OIZ)
Subject AW: AW: AW: broken backup trail in case of quickly patroni switchback and forth
Date
Msg-id 9984e05af67d48bdaf730a934e63c513@zuerich.ch
In response to Re: AW: AW: broken backup trail in case of quickly patroni switchback and forth (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: AW: AW: AW: broken backup trail in case of quickly patroni switchback and forth ("Brad Nicholson" <bradn@ca.ibm.com>)
List pgsql-general
3)
Patroni only performs failovers, even when the primary is shut down regularly. A failover is a promotion of the standby
plus an automatic reinstatement (pg_rewind or pg_basebackup) of the former primary.
 

Time: role site 1 - role site 2
====================
12:00h: primary - standby
=> Some clients committed some transactions; primary stopped => failover to standby
12:05h: standby - primary
=> Some clients connected + committed some transactions; primary stopped => failover to standby
12:10h: primary - standby
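
To see whether a sequence like the one above left a hole in the archived WAL trail, something along these lines could be run against a listing of the archive directory. It is only a rough sketch, not something we run in production: it assumes the standard 24-hex-digit WAL segment names (timeline, log id, segment) and the default 16 MB segment size, and the function names are made up for illustration. Each failover starts a new timeline, so a timeline with no segments at all between two archived timelines would correspond to the 12:05h to 12:10h window.

```python
import re
from collections import defaultdict

# WAL segment names are 24 hex digits: timeline (8), log id (8), segment (8).
WAL_NAME = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")
# Assumption: default 16 MB segments, i.e. 256 segments per log id.
SEGMENTS_PER_LOG = 0x100


def parse_wal_name(name):
    """Return (timeline, absolute segment number), or None for non-segment files."""
    match = WAL_NAME.match(name)
    if match is None:
        return None  # e.g. .history or .backup files
    timeline, log_id, segment = (int(part, 16) for part in match.groups())
    return timeline, log_id * SEGMENTS_PER_LOG + segment


def find_gaps(file_names):
    """Map each timeline to the segments missing between its first and last archived one."""
    per_timeline = defaultdict(set)
    for name in file_names:
        parsed = parse_wal_name(name)
        if parsed is not None:
            per_timeline[parsed[0]].add(parsed[1])
    gaps = {}
    for timeline, segments in per_timeline.items():
        expected = set(range(min(segments), max(segments) + 1))
        missing = sorted(expected - segments)
        if missing:
            gaps[timeline] = missing
    return gaps


def missing_timelines(file_names):
    """Return timelines that have no archived segment but lie between archived timelines."""
    seen = {p[0] for p in map(parse_wal_name, file_names) if p is not None}
    if not seen:
        return []
    return [t for t in range(min(seen), max(seen) + 1) if t not in seen]
```

It would be called with something like `os.listdir("/pgxlog_archive/pcl_l702")`. Note that it cannot tell whether the last segments of a timeline are missing, only holes in the middle and timelines that are absent entirely.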



Patroni.yml)
$ cat pcl_l702.yml
scope: pcl_l702
name: pcl_l702@tstm49003
namespace: /patroni/

log:
  level: DEBUG
  dir: /opt/app/patroni/etc/log/
  file_num: 10
  file_size: 104857600

restapi:
  listen: tstm49003.tstglobal.tst.loc:8010
  connect_address: tstm49003.tstglobal.tst.loc:8010

etcd:
  hosts: etcdlab01.tstglobal.tst.loc:2379,etcdlab02.tstglobal.tst.loc:2379,etcdlab03.tstglobal.tst.loc:2379,etcdlab04.tstglobal.tst.loc:2379,etcdlab05.tstglobal.tst.loc:2379
  username: patroni
  password: censored

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
      use_slots: true

  # NO BOOTSTRAPPING USED
  method: do_not_bootstrap
  do_not_bootstrap:
    command: /bin/false

postgresql:
  authentication:
    replication:
      username: repadmin
      password: censored
    superuser:
      username: patroni
      password: censored
  callbacks:
    on_reload: /opt/app/patroni/etc/callback_patroni.sh
    on_restart: /opt/app/patroni/etc/callback_patroni.sh
    on_role_change: /opt/app/patroni/etc/callback_patroni.sh
    on_start: /opt/app/patroni/etc/callback_patroni.sh
    on_stop: /opt/app/patroni/etc/callback_patroni.sh
  connect_address: tstm49003.tstglobal.tst.loc:5436
  database: pcl_l702
  data_dir: /pgdata/pcl_l702
  bin_dir: /usr/pgsql-9.6/bin
  listen: localhost,tstm49003.tstglobal.tst.loc,pcl_l702.tstglobal.tst.loc:5436
  pgpass: /home/postgres/.pgpass_patroni
  recovery_conf:
    restore_command: cp /pgxlog_archive/pcl_l702/%f %p
  parameters:
    hot_standby_feedback: on
    wal_keep_segments: 64
  use_pg_rewind: true

watchdog:
  mode: automatic
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false




-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, 7 November 2019 17:06
To: Zwettler Markus (OIZ) <Markus.Zwettler@zuerich.ch>; pgsql-general@lists.postgresql.org
Subject: Re: AW: AW: broken backup trail in case of quickly patroni switchback and forth

On 11/7/19 7:47 AM, Zwettler Markus (OIZ) wrote:

I am heading out the door so I will not have time to look at below until later. For those that get a chance before
then, it would be nice to have the Patroni conf file information also. The Patroni information may answer the question,
but in case it does not, what actually is a failover in 3) below?
 

> 1) 9.6
> 
> 
> 
> 2)
> $ cat postgresql.conf
> # Do not edit this file manually!
> # It will be overwritten by Patroni!
> include 'postgresql.base.conf'
> 
> cluster_name = 'pcl_l702'
> hot_standby = 'on'
> hot_standby_feedback = 'True'
> listen_addresses = 'localhost,tstm49003.tstglobal.tst.loc,pcl_l702.tstglobal.tst.loc'
> max_connections = '100'
> max_locks_per_transaction = '64'
> max_prepared_transactions = '0'
> max_replication_slots = '10'
> max_wal_senders = '10'
> max_worker_processes = '8'
> port = '5436'
> track_commit_timestamp = 'off'
> wal_keep_segments = '8'
> wal_level = 'replica'
> wal_log_hints = 'on'
> hba_file = '/pgdata/pcl_l702/pg_hba.conf'
> ident_file = '/pgdata/pcl_l702/pg_ident.conf'
> $
> $
> $
> $ cat postgresql.base.conf
> datestyle = 'iso, mdy'
> default_text_search_config = 'pg_catalog.english'
> dynamic_shared_memory_type = posix
> lc_messages = 'en_US.UTF-8'
> lc_monetary = 'de_CH.UTF-8'
> lc_numeric = 'de_CH.UTF-8'
> lc_time = 'de_CH.UTF-8'
> logging_collector = on
> log_directory = 'pg_log'
> log_rotation_age = 1d
> log_rotation_size = 0
> log_timezone = 'Europe/Vaduz'
> log_truncate_on_rotation = on
> max_connections = 100
> timezone = 'Europe/Vaduz'
> archive_command = 'test ! -f /tmp/pg_archive_backup_running_on_pcl_l702* && rsync --checksum %p /pgxlog_archive/pcl_l702/%f'
> archive_mode = on
> archive_timeout = 1800
> cluster_name = pcl_l702
> cron.database_name = 'pdb_l72_oiz'
> # effective_cache_size
> listen_addresses = '*'
> log_connections = on
> log_destination = 'stderr, csvlog'
> log_disconnections = on
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_line_prefix = '%t : %h=>%u@%d : %p-%c-%v : %e '
> log_statement = 'ddl'
> max_wal_senders = 5
> port = 5436
> shared_buffers = 512MB
> shared_preload_libraries = 'auto_explain, pg_stat_statements, pg_cron, pg_statsinfo'
> wal_buffers = 16MB
> wal_compression = on
> wal_level = replica
> # work_mem
> 
> 
> 
> 3)
> 12:00h: primary - standby
> => Some clients committed some transactions; Failover
> 12:05h: standby - primary
> => Some clients connected + committed some transactions; Failover
> 12:10h: primary - standby
> 
> 
> 
> 
> 
> On 11/7/19 7:18 AM, Zwettler Markus (OIZ) wrote:
>> I already asked the Patroni folks. They told me this is not related 
>> to Patroni but Postgresql. ;-)
> 
> Hard to say without more information:
> 
> 1) Postgres version
> 
> 2) Setup/config info
> 
> 3) Details of what happened between 12:00 and 12:10
> 
>>
>> - Markus
>>
>>
>>
>> On 11/7/19 5:52 AM, Zwettler Markus (OIZ) wrote:
>>> we are using Patroni for management of our Postgres standby databases.
>>>
>>> we take our (wal) backups on the primary side based on intervals and thresholds.
>>> our archived wal's are written to a local wal directory first and moved to tape afterwards.
>>>
>>> we got a case where Patroni switched back and forth sides quickly, e.g.:
>>> 12:00h: primary - standby
>>> 12:05h: standby - primary
>>> 12:10h: primary - standby
>>>
>>> we realised that we will not have a wal backup of the WAL files generated between 12:05h and 12:10h in this scenario.
>>>
>>> how can we make sure that the whole wal sequence trail will be backed up? any idea?
>>
>> Probably best to ask the Patroni folks:
>>
>> https://github.com/zalando/patroni#community
>>
>>>
>>> - Markus
>>>
>>>
>>
>>
> 
> 


--
Adrian Klaver
adrian.klaver@aklaver.com
