Re: very long secondary->primary switch time - Mailing list pgsql-general

From Tomas Pospisek
Subject Re: very long secondary->primary switch time
Msg-id 3cc70134-f869-05f7-f970-30583fa859b6@sourcepole.ch
In response to Re: very long secondary->primary switch time  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi Tom, hi list participants,

thanks a lot for replying, Tom.

On 27.04.21 22:16, Tom Lane wrote:
> Tomas Pospisek <tpo2@sourcepole.ch> writes:
>> I maintain a PostgreSQL cluster that does failover via Patroni. The
>> problem is that after a failover it takes the secondary too long
>> (about 35 minutes) to come up and answer queries. The log of the
>> secondary looks like this:
> 
>> 04:00:29.777 [9679] LOG:  received promote request
>> 04:00:29.780 [9693] FATAL:  terminating walreceiver process due to
>> administrator command
>> 04:00:29.780 [9679] LOG:  invalid record length at 320/B95A1EE0: wanted
>> 24, got 0
>> 04:00:29.783 [9679] LOG:  redo done at 320/B95A1EA8
>> 04:00:29.783 [9679] LOG:  last completed transaction was at log time
>> 2021-03-03 03:57:46.466342+01
> 
>> 04:35:00.982 [9679] LOG:  selected new timeline ID: 15
>> 04:35:01.404 [9679] LOG:  archive recovery complete
>> 04:35:02.337 [9662] LOG:  database system is ready to accept connections
> 
>> The cluster is "fairly large" with thousands of DBs (sic!) and ~1TB of data.
> 
> Hm.  WAL replay is already done at the "redo done" entry.  There is a
> checkpoint after that, I believe, and there may be some effort to search
> for dead files as well.  Still, if your I/O subsystem is better than
> a wet noodle, 35 minutes is a long time to finish that.
> 
> One thing I'm not sure about is whether we try to do the checkpoint
> at maximum speed.  If you have set GUC options to throttle checkpoint
> I/O hard, that could perhaps explain this.

I didn't customize the PostgreSQL settings much and am mostly running 
what upstream provides. These are my explicit settings (everything else 
is at the defaults):

     allow_system_table_mods: 'off'
     autovacuum: 'off'
     checkpoint_completion_target: '0.7'
     effective_cache_size: 16 GB
     maintenance_work_mem: 1 GB
     max_connections: 200
     max_wal_size: 16 GB
     shared_buffers: 8 GB
     shared_preload_libraries: pg_stat_statements
     track_activities: 'on'
     track_activity_query_size: 32 kB
     track_functions: all
     wal_keep_segments: 100
     work_mem: 64 MB
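
To double-check the throttling hypothesis on my side, I plan to inspect 
the checkpoint-related settings directly on the secondary. A rough 
sketch (the list of GUCs below is just my own pick, not exhaustive):

     # inspect checkpoint-related settings on the secondary
     psql -At -c "SELECT name, setting, unit
                  FROM pg_settings
                  WHERE name IN ('checkpoint_completion_target',
                                 'checkpoint_timeout',
                                 'checkpoint_flush_after',
                                 'max_wal_size',
                                 'log_checkpoints');"

Turning on log_checkpoints should also make the post-promotion 
checkpoint show up in the log with its timing, which would confirm or 
rule out the checkpoint as the culprit.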

> You could possibly learn more by strace'ing the startup process to
> see what it's doing.

Will do, thanks. However, I'm dreading the next failover downtime :-(
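
For the record, this is roughly how I intend to attach strace to the 
startup process during the next promotion (the process title pattern 
and the exact strace options are from memory, so treat it as a sketch):

     # find the startup/recovery backend of the secondary
     pgrep -af 'postgres.*startup'
     # attach during promotion and summarize where the time goes
     sudo strace -f -c -p <startup_pid>
     # alternatively, log individual syscalls with timings for later analysis
     sudo strace -tt -T -p <startup_pid> -o /tmp/startup.strace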

> Also, what PG version is that exactly?

12.6-1.pgdg20.04+1

Thankful for any help or pointers regarding the long promotion time,
*t


