Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
Date | |
Msg-id | 20220319001556.atuqbmafvc73pa2k@alap3.anarazel.de Whole thread Raw |
In response to | Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) (Nitin Jadhav <nitinjadhavpostgres@gmail.com>) |
Responses |
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Size of pg_rewrite (Was: Report checkpoint progress with pg_stat_progress_checkpoint) Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs) |
List | pgsql-hackers |
Hi, This is a long thread, sorry for asking if this has been asked before. On 2022-03-08 20:25:28 +0530, Nitin Jadhav wrote: > * Sort buffers that need to be written to reduce the likelihood of random > @@ -2129,6 +2132,8 @@ BufferSync(int flags) > bufHdr = GetBufferDescriptor(buf_id); > > num_processed++; > + pgstat_progress_update_param(PROGRESS_CHECKPOINT_BUFFERS_PROCESSED, > + num_processed); > > /* > * We don't need to acquire the lock here, because we're only looking > @@ -2149,6 +2154,8 @@ BufferSync(int flags) > TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id); > PendingCheckpointerStats.m_buf_written_checkpoints++; > num_written++; > + pgstat_progress_update_param(PROGRESS_CHECKPOINT_BUFFERS_WRITTEN, > + num_written); > } > } Have you measured the performance effects of this? On fast storage with large shared_buffers I've seen these loops in profiles. It's probably fine, but it'd be good to verify that. > @@ -1897,6 +1897,112 @@ pg_stat_progress_basebackup| SELECT s.pid, > s.param4 AS tablespaces_total, > s.param5 AS tablespaces_streamed > FROM pg_stat_get_progress_info('BASEBACKUP'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6,param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19,param20); > +pg_stat_progress_checkpoint| SELECT s.pid, > + CASE s.param1 > + WHEN 1 THEN 'checkpoint'::text > + WHEN 2 THEN 'restartpoint'::text > + ELSE NULL::text > + END AS type, > + ((((((( > + CASE > + WHEN ((s.param2 & (1)::bigint) > 0) THEN 'shutdown '::text > + ELSE ''::text > + END || > + CASE > + WHEN ((s.param2 & (2)::bigint) > 0) THEN 'end-of-recovery '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (4)::bigint) > 0) THEN 'immediate '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (8)::bigint) > 0) THEN 'force '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (16)::bigint) > 0) THEN 'flush-all '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (32)::bigint) > 0) THEN 'wait '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (128)::bigint) > 0) THEN 'wal '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param2 & (256)::bigint) > 0) THEN 'time '::text > + ELSE ''::text > + END) AS flags, > + ((((((( > + CASE > + WHEN ((s.param3 & (1)::bigint) > 0) THEN 'shutdown '::text > + ELSE ''::text > + END || > + CASE > + WHEN ((s.param3 & (2)::bigint) > 0) THEN 'end-of-recovery '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (4)::bigint) > 0) THEN 'immediate '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (8)::bigint) > 0) THEN 'force '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (16)::bigint) > 0) THEN 'flush-all '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (32)::bigint) > 0) THEN 'wait '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (128)::bigint) > 0) THEN 'wal '::text > + ELSE ''::text > + END) || > + CASE > + WHEN ((s.param3 & (256)::bigint) > 0) THEN 'time '::text > + ELSE ''::text > + END) AS next_flags, > + ('0/0'::pg_lsn + ( > + CASE > + WHEN (s.param4 < 0) THEN pow((2)::numeric, (64)::numeric) > + ELSE (0)::numeric > + END + (s.param4)::numeric)) AS start_lsn, > + to_timestamp(((946684800)::double precision + ((s.param5)::double precision / (1000000)::double precision))) AS start_time, > + CASE s.param6 > + WHEN 1 THEN 'initializing'::text > + WHEN 2 THEN 'getting virtual transaction IDs'::text > + WHEN 3 THEN 'checkpointing replication slots'::text > + WHEN 4 THEN 'checkpointing logical replication snapshot files'::text > + WHEN 5 THEN 'checkpointing logical rewrite mapping files'::text > + WHEN 6 THEN 'checkpointing replication origin'::text > + WHEN 7 THEN 'checkpointing commit log pages'::text > + WHEN 8 THEN 'checkpointing commit time stamp pages'::text > + WHEN 9 THEN 'checkpointing subtransaction pages'::text > + WHEN 10 THEN 'checkpointing multixact pages'::text > + WHEN 11 THEN 'checkpointing predicate lock pages'::text > + WHEN 12 THEN 'checkpointing buffers'::text > + WHEN 13 THEN 'processing file sync requests'::text > + WHEN 14 THEN 'performing two phase checkpoint'::text > + WHEN 15 THEN 'performing post checkpoint cleanup'::text > + WHEN 16 THEN 'invalidating replication slots'::text > + WHEN 17 THEN 'recycling old WAL files'::text > + WHEN 18 THEN 'truncating subtransactions'::text > + WHEN 19 THEN 'finalizing'::text > + ELSE NULL::text > + END AS phase, > + s.param7 AS buffers_total, > + s.param8 AS buffers_processed, > + s.param9 AS buffers_written, > + s.param10 AS files_total, > + s.param11 AS files_synced > + FROM pg_stat_get_progress_info('CHECKPOINT'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6,param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19,param20); > pg_stat_progress_cluster| SELECT s.pid, > s.datid, > d.datname, This view is depressingly complicated. Added up the view definitions for the already existing pg_stat_progress* views add up to a measurable part of the size of an empty database: postgres[1160866][1]=# SELECT sum(octet_length(ev_action)), SUM(pg_column_size(ev_action)) FROM pg_rewrite WHERE ev_class::regclass::textLIKE '%progress%'; ┌───────┬───────┐ │ sum │ sum │ ├───────┼───────┤ │ 97410 │ 19786 │ └───────┴───────┘ (1 row) and this view looks to be a good bit more complicated than the existing pg_stat_progress* views. Indeed: template1[1165473][1]=# SELECT ev_class::regclass, length(ev_action), pg_column_size(ev_action) FROM pg_rewrite WHERE ev_class::regclass::textLIKE '%progress%' ORDER BY length(ev_action) DESC; ┌───────────────────────────────┬────────┬────────────────┐ │ ev_class │ length │ pg_column_size │ ├───────────────────────────────┼────────┼────────────────┤ │ pg_stat_progress_checkpoint │ 43290 │ 5409 │ │ pg_stat_progress_create_index │ 23293 │ 4177 │ │ pg_stat_progress_cluster │ 18390 │ 3704 │ │ pg_stat_progress_analyze │ 16121 │ 3339 │ │ pg_stat_progress_vacuum │ 16076 │ 3392 │ │ pg_stat_progress_copy │ 15124 │ 3080 │ │ pg_stat_progress_basebackup │ 8406 │ 2094 │ └───────────────────────────────┴────────┴────────────────┘ (7 rows) pg_rewrite without pg_stat_progress_checkpoint: 745472, with: 753664 pg_rewrite is the second biggest relation in an empty database already... template1[1164827][1]=# SELECT relname, pg_total_relation_size(oid) FROM pg_class WHERE relkind = 'r' ORDER BY 2 DESC LIMIT5; ┌────────────────┬────────────────────────┐ │ relname │ pg_total_relation_size │ ├────────────────┼────────────────────────┤ │ pg_proc │ 1212416 │ │ pg_rewrite │ 745472 │ │ pg_attribute │ 704512 │ │ pg_description │ 630784 │ │ pg_collation │ 409600 │ └────────────────┴────────────────────────┘ (5 rows) Greetings, Andres Freund
pgsql-hackers by date: