Thread: High System CPU Usage on Selects Seemingly Caused By Vacuum of Same Table


From: Joshua Banton
Date:
Hello,

We are experiencing a high system CPU usage issue with our database that we can't quite explain.

Description of the issue:
The database instance CPU suddenly increases to over 90%, and the majority of it, as reported by RDS Enhanced Monitoring, is system CPU rather than CPU used by Postgres itself. IOPS usage barely changes; there is plenty of headroom.

The issue mostly manifests near the end of the "scanning heap" phase of vacuuming one of our largest tables, which we'll call table1. RDS Performance Insights reports that selects on table1 start to wait on CPU, where previously they didn't even show up in the top 25 queries by wait. It doesn't always happen, but it is more likely when there is a larger-than-usual number of selects on table1.


Example Timeline:
00:00 vacuum starts on table1 (at least 10 million dead tuples as reported by pg_stat_user_tables)
02:00 database CPU p90 12%, RDS Performance Insights CPU waits 11%, IOPS under 100k, table1 not in top 25 waits in RDS Performance Insights, vacuum reports 6GB out of 3TB remaining to scan (see the progress query sketch below)
02:01 database CPU p90 98%, RDS Performance Insights CPU waits 73%, IOPS under 100k, table1 selects now #1, #2, and #6 in RDS Performance Insights, making up the majority of the CPU waits
02:42 vacuum on table1 finishes, database CPU p90 11%, RDS Performance Insights CPU waits 12%, IOPS under 100k, table1 not in top 25 waits in RDS Performance Insights
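
For reference, the scan-progress figures above can be read from pg_stat_progress_vacuum; a query along these lines (assuming the default 8kB block size) shows how much of the heap is left to scan:

  -- Heap-scan progress of any running vacuum; 8192 is the default block size.
  SELECT p.pid,
         c.relname,
         p.phase,
         pg_size_pretty((p.heap_blks_total - p.heap_blks_scanned) * 8192) AS remaining_to_scan,
         pg_size_pretty(p.heap_blks_total * 8192) AS total_to_scan
  FROM pg_stat_progress_vacuum p
  JOIN pg_class c ON c.oid = p.relid;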


Installation Details:
  AWS RDS Managed Instance
  Postgres Version: PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-17), 64-bit
  Instance Class: r7i.48xlarge
  vCPU: 192
  RAM: 1536 GB
  Storage Type: Provisioned IOPS SSD (io2), 180000 IOPS
  Application Language/Framework: Ruby on Rails
  Pooler: PgBouncer 1.17.0
 
Config:
|name                                       |current_setting                        
|-------------------------------------------|---------------------------------------
|autovacuum_analyze_scale_factor            |0.005                                  
|autovacuum_freeze_max_age                  |500000000                              
|autovacuum_max_workers                     |24                                    
|autovacuum_naptime                         |15s                                    
|autovacuum_vacuum_cost_delay               |1ms                                    
|autovacuum_vacuum_cost_limit               |2000                                  
|autovacuum_vacuum_scale_factor             |0.01                                  
|autovacuum_work_mem                        |48756174kB                            
|checkpoint_completion_target               |0.9                                    
|checkpoint_timeout                         |1h                                    
|default_toast_compression                  |lz4                                    
|effective_cache_size                       |780098792kB                            
|enable_presorted_aggregate                 |on                                    
|fsync                                      |on                                    
|full_page_writes                           |on                                    
|gss_accept_delegation                      |off                                    
|hot_standby                                |off                                    
|huge_pages                                 |on                                    
|icu_validation_level                       |error                                  
|idle_in_transaction_session_timeout        |30s                                    
|ignore_invalid_pages                       |off                                    
|jit                                        |off                                    
|lo_compat_privileges                       |off                                    
|log_autovacuum_min_duration                |0                                      
|log_checkpoints                            |on                                    
|log_connections                            |on                                    
|log_disconnections                         |on                                    
|log_min_duration_statement                 |10ms                                  
|log_rotation_age                           |1h                                    
|log_temp_files                             |0                                      
|log_timezone                               |UTC                                    
|log_truncate_on_rotation                   |off                                    
|logging_collector                          |on                                    
|maintenance_work_mem                       |25577009kB                            
|max_connections                            |5000                                  
|max_locks_per_transaction                  |64                                    
|max_logical_replication_workers            |10                                    
|max_parallel_apply_workers_per_subscription|2                                      
|max_parallel_workers                       |96                                    
|max_prepared_transactions                  |0                                      
|max_replication_slots                      |10                                    
|max_slot_wal_keep_size                     |400000MB                              
|max_stack_depth                            |6MB                                    
|max_standby_archive_delay                  |-1                                    
|max_standby_streaming_delay                |2147483647ms                          
|max_wal_senders                            |25                                    
|max_wal_size                               |200000MB                              
|max_worker_processes                       |20                                    
|min_wal_size                               |128MB                                  
|random_page_cost                           |1.05                                  
|recovery_init_sync_method                  |syncfs                                
|recovery_prefetch                          |off                                    
|remove_temp_files_after_crash              |off                                    
|reserved_connections                       |2                                      
|scram_iterations                           |4096                                  
|send_abort_for_crash                       |off                                    
|send_abort_for_kill                        |off                                    
|session_preload_libraries                  |                                      
|shared_buffers                             |390049392kB                            
|shared_preload_libraries                   |rdsutils,pg_tle,pg_stat_statements    
|synchronous_commit                         |on                                    
|track_activity_query_size                  |4kB                                    
|track_functions                            |pl                                    
|track_io_timing                            |on                                    
|track_wal_io_timing                        |on                                    
|vacuum_buffer_usage_limit                  |399410kB                              
|vacuum_cost_page_miss                      |5                                      
|vacuum_failsafe_age                        |1200000000                            
|vacuum_multixact_failsafe_age              |1200000000                            
|wal_compression                            |lz4                                    
|wal_keep_size                              |400000MB                              
|wal_level                                  |logical                                
|wal_receiver_create_temp_slot              |off                                    
|wal_receiver_timeout                       |15min                                  
|wal_sender_timeout                         |2min                                  
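
For reference, a listing in this shape can be pulled from pg_settings with something like the following (here filtered to anything not at its built-in default; the exact filter is a matter of taste):

  -- Sketch: settings that differ from their built-in defaults.
  SELECT name, current_setting(name) AS current_setting
  FROM pg_settings
  WHERE source <> 'default'
  ORDER BY name;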

Large Table Stats:
table1:
    rows: 2,303,386,400
    disk space: 3.5T
    rel size: 3T
    FSM size: 758 MB
    visibility map size: 94 MB
   
table2:
    rows: 1,919,360,500
    disk space: 1.6T
    rel size: 758G
    FSM size: 189 MB
    visibility map size: 24 MB
   
table3:
    rows: 2,412,903,000
    disk space: 741G
    rel size: 387G
    FSM size: 96 MB
    visibility map size: 12 MB
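
These per-table numbers can be reproduced with a query roughly like the following (reltuples is the planner's estimate; the FSM and visibility map are separate relation forks):

  -- Approximate row counts plus total, main-fork, FSM, and visibility-map sizes.
  SELECT c.relname,
         c.reltuples::bigint                             AS approx_rows,
         pg_size_pretty(pg_total_relation_size(c.oid))   AS disk_space,
         pg_size_pretty(pg_relation_size(c.oid, 'main')) AS rel_size,
         pg_size_pretty(pg_relation_size(c.oid, 'fsm'))  AS fsm_size,
         pg_size_pretty(pg_relation_size(c.oid, 'vm'))   AS vm_size
  FROM pg_class c
  WHERE c.relname IN ('table1', 'table2', 'table3');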


Any suggestions or insights greatly appreciated!

Joshua Banton
On Fri, Jan 31, 2025 at 5:28 PM Joshua Banton <bantonj@gmail.com> wrote:
> The issue mostly manifests near the end of the "scanning heap" phase of vacuuming one of our largest tables, which
> we'll call table1. RDS Performance Insights reports that selects on table1 start to wait on CPU, where previously
> they didn't even show up in the top 25 queries by wait. It doesn't always happen, but it is more likely when there
> is a larger-than-usual number of selects on table1.

Does this database also have many tables? As in thousands of tables?

I am reminded of this issue:

https://www.postgresql.org/message-id/flat/da3205c4-5b07-a65c-6c26-a293c6464fdb%40postgrespro.ru

I've heard of this happening when an aggressive VACUUM updates
relfrozenxid on a larger table.
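
If you want to check whether those vacuums are the aggressive kind, comparing table1's relfrozenxid age against the freeze thresholds is a quick start; something along these lines:

  -- How close is table1 to vacuum_freeze_table_age / autovacuum_freeze_max_age?
  SELECT relname,
         age(relfrozenxid) AS xid_age,
         current_setting('vacuum_freeze_table_age')::int   AS freeze_table_age,
         current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
  FROM pg_class
  WHERE relname = 'table1';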

--
Peter Geoghegan



There are not that many tables: 175.

There are a decent number of transactions on the database, 35k a second during the day. I ran a VACUUM VERBOSE on table1 today and got this output about the pages frozen:

frozen: 2813436 pages from table (0.70% of total) had 7028123 tuples frozen
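
For completeness, this was roughly the command, and the dead-tuple figure in my original timeline comes from pg_stat_user_tables:

  -- Roughly what produced the "frozen:" line above:
  VACUUM (VERBOSE) table1;

  -- Dead-tuple counts as reported by pg_stat_user_tables:
  SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname = 'table1';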



On Fri, Jan 31, 2025 at 5:46 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Jan 31, 2025 at 5:28 PM Joshua Banton <bantonj@gmail.com> wrote:
> > The issue mostly manifests near the end of the "scanning heap" phase of vacuuming one of our largest tables, which we'll call table1. RDS Performance Insights reports that selects on table1 start to wait on CPU, where previously they didn't even show up in the top 25 queries by wait. It doesn't always happen, but it is more likely when there is a larger-than-usual number of selects on table1.
>
> Does this database also have many tables? As in thousands of tables?
>
> I am reminded of this issue:
>
> https://www.postgresql.org/message-id/flat/da3205c4-5b07-a65c-6c26-a293c6464fdb%40postgrespro.ru
>
> I've heard of this happening when an aggressive VACUUM updates
> relfrozenxid on a larger table.
>
> --
> Peter Geoghegan