When Update balloons memory - Mailing list pgsql-general

From Klaudie Willis
Subject When Update balloons memory
Date
Msg-id wJYzIcLjhFnsbCK8sy4lZZAojx7m04FwjsGF4WbrKj0gj2wLpaycQSEN1mtBUlhSAQGRvEdxZTwWRNV39UevSNWg8-h-tvWAL61pV4IuhW4=@protonmail.com
Whole thread Raw
Responses Re: When Update balloons memory  (Francisco Olarte <folarte@peoplecall.com>)
Re: When Update balloons memory  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
About the system:
Ubuntu 20.04, 64GB ram, 16GB shared buffer, 500 MB working mem, Postgresql 14.1

Core issue:
The following statement below, when not divided up into chunks, but run across all 800M rows, did trigger an OOM-kill from the OS.
I have looked into it by kernel logs as well as postgresql logs. The postgresql just says it was killed, and the OS killed it due to the fact that all mem including swap was exhausted.
Looking at TOP while updating, I can see the RSS column of a single postgresql process (the connection I assume), just grow and grow until it chokes the system.

Statement:
Update table alfa
set x = beta.x
from beta where beta.id=alpha.id and x <> beta.x

alpha is a wide table (40 columns), partitioned into 5 equally partitions by year. Total row count 800M rows
beta is a 10 column 40M rows table.
the updated field x is non-indexed varchar; the id fields are indexed.
there are no triggers

I am well aware that huge updates have general issues, like locking the table etc, and it is perhaps discouraged.  And I did solve it by batching it in 1M and 1M rows.
However, my curiosity still remains of what is really happening here.  Why do Postgresql run out of memory? Exactly what is it storing in that memory?  I am aware of the work_mem danger, but that is not what is happening here.  I can replicate this with 32MB work mem as well; This is a low connection database.
Any help is appreciated.

Klaudie

track_activity_query_size = 4096                                                 
synchronous_commit = off                                                         
full_page_writes = off                                                           
#wal_compression = on                                                            
wal_level = minimal                                                              
max_wal_senders = 0                                                              
                                                                                 
log_min_duration_statement = 1000                                                
idle_in_transaction_session_timeout = '300s'    # in milliseconds, 0 is disabled 
tcp_keepalives_idle = '300s'                                                     
max_connections = 50                                                             
shared_buffers = 16GB                                                            
effective_cache_size = 48GB                                                      
maintenance_work_mem = 2GB                                                       
checkpoint_completion_target = 0.9                                               
min_wal_size = 4GB                                                               
max_wal_size = 16GB                                                              
#wal_buffers = 16MB                                                              
default_statistics_target = 1000                                                 
random_page_cost = 1.1                                                           
effective_io_concurrency = 200                                                   
work_mem = 1000MB                                                                
max_worker_processes = 8                                                         
max_parallel_workers = 8                                                         
max_parallel_workers_per_gather = 4                                              
max_parallel_maintenance_workers = 4                                             
cpu_tuple_cost = 0.03
                                                


pgsql-general by date:

Previous
From: Avi Weinberg
Date:
Subject: RE: Are Foreign Key Disabled During Logical Replication Initial Sync?
Next
From: Wicher
Date:
Subject: Re: Advice on using materialized views