BUG #18283: vacuum full use a large amount of memory (may cause OOM) - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18283: vacuum full use a large amount of memory (may cause OOM)
Date
Msg-id 18283-72dcaf4228caf47e@postgresql.org
Whole thread Raw
Responses Re: BUG #18283: vacuum full use a large amount of memory (may cause OOM)  (Yang Zhu <fairyfar@msn.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18283
Logged by:          Zhu Yang
Email address:      fairyfar@msn.com
PostgreSQL version: 16.1
Operating system:   Red Hat Enterprise Linux Server 7.6 (Maipo)
Description:

Under certain conditions, a vacuum full will use a lot of memory. The memory
usage is out of control, and an OOM may occur.

Step to reproduce the behavior:

Step 1. Create a new session("Sess 1"), then execute the following query and
keep the connection:
```sql
-- create table and insert data for testing.
create table t1 (
    c1 character varying(100),
    c2 character varying(100),
    c3 character varying(100),
    c4 character varying(100),
    c5 character varying(100),
    c6 character varying(100)
);

create function randomtext(len int) returns text as $$
select string_agg(md5(random()::text),'') from generate_series(1,$1/32)
$$ language sql;

insert into t1 select
randomtext(34),randomtext(34),randomtext(34),randomtext(34),randomtext(34),'RST'
from generate_series(1,1000000);
```

Step 2. Create another session("Sess 2"). Start a long transaction, then
create any table, and keep the connection:

```sql
BEGIN;
create table t2(a int);
```

Step 3. Go back to "Sess 1" and continue:

```sql
-- Get the PID of the current backend
postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
          10511
(1 row)

-- Can update multiple times
update t1 set c2=randomtext(34) where k1='RST';
update t1 set c2=randomtext(34) where k1='RST';

vacuum full t1;
```

Step 4. Create a new bash terminal and view the memory usage of PID 10511
during the execution of 'vacumm full':

```bash
[yz@bogon ~]$ top -p 10511

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
COMMAND
10511 yz        20   0 1853340   1.6g 141488 D  15.7 21.4   0:14.92
postgres
```

You can observe that 'vacumm full' memory usage (VIRT & RES) is very high,
with the peak memory usage increasing with each Step 3 execution.

After analysis, the MemoryContext that consumes significant memory is "Table
rewrite", and the stack is:

```
#0  0x000000000050f700 in heap_copytuple ()
#1  0x000000000054f452 in rewrite_heap_tuple ()
#2  0x000000000054844f in reform_and_rewrite_tuple.isra.0 ()
#3  0x00000000005488e0 in heapam_relation_copy_for_cluster ()
#4  0x0000000000616760 in copy_table_data ()
#5  0x0000000000617846 in cluster_rel ()
#6  0x0000000000676973 in vacuum_rel ()
#7  0x0000000000677b9c in vacuum ()
#8  0x00000000006782dc in ExecVacuum ()
#9  0x0000000000808859 in standard_ProcessUtility ()
#10 0x0000000000806f5f in PortalRunUtility ()
#11 0x000000000080708b in PortalRunMulti ()
#12 0x000000000080755d in PortalRun ()
#13 0x0000000000803b28 in exec_simple_query ()
...
```

The code that causes the problem is in
src/backend/access/heap/rewriteheap.c:

```c
void
rewrite_heap_tuple(RewriteState state,
                   HeapTuple old_tuple, HeapTuple new_tuple)
{
    ...
    if (!((old_tuple->t_data->t_infomask & HEAP_XMAX_INVALID) ||
          HeapTupleHeaderIsOnlyLocked(old_tuple->t_data)) &&
        !HeapTupleHeaderIndicatesMovedPartitions(old_tuple->t_data) &&
        !(ItemPointerEquals(&(old_tuple->t_self),
                            &(old_tuple->t_data->t_ctid))))
    {
        // If the code executes inside this block, the allocated memory will not
be freed until the query ends.
        ...
```


pgsql-bugs by date:

Previous
From: Richard Guo
Date:
Subject: Re: Postgres 16.1 - Bug: cache entry already complete
Next
From: 노영은
Date:
Subject: Re: BUG #18279: Duplicate key violation and Deadlock when using ON CONFLICT/DO UPDATE with multiple unique indexes