Re: maintenance_work_mem = 64kB doesn't work for vacuum - Mailing list pgsql-hackers

From David Rowley
Subject Re: maintenance_work_mem = 64kB doesn't work for vacuum
Msg-id CAApHDvps_sLPtBVZLyi--bmcjDNwqfg2eApQk9muYG-UrEi_nA@mail.gmail.com
In response to Re: maintenance_work_mem = 64kB doesn't work for vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: maintenance_work_mem = 64kB doesn't work for vacuum
Re: maintenance_work_mem = 64kB doesn't work for vacuum
List pgsql-hackers

On Mon, 10 Mar 2025 at 17:22, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Regarding that patch, we need to note that the lpdead_items is a
> counter that is not reset in the entire vacuum. Therefore, with
> maintenance_work_mem = 64kB, once we collect at least one lpdead item,
> we perform a cycle of index vacuuming and heap vacuuming for every
> subsequent block even if they don't have a lpdead item. I think we
> should use vacrel->dead_items_info->num_items instead.

OK, I didn't study the code enough to realise that. My patch was only
intended as an indication of what I thought. Please feel free to
proceed with your own patch using the correct field.
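
To illustrate the point in the quoted text, here is a minimal toy model
in C. It is not PostgreSQL source: ToyVacuum, count_cycles() and the
per-block arrays are hypothetical names, and the model simply assumes
that the memory limit is treated as exceeded on every block, as it would
be with a very small maintenance_work_mem. It compares gating the
index/heap vacuum cycle on a cumulative counter (like lpdead_items)
with gating it on a counter that is reset after each cycle (like
num_items).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the two counters; not PostgreSQL source. */
typedef struct ToyVacuum
{
    long lpdead_items;  /* cumulative, never reset during the vacuum */
    long num_items;     /* currently stored items, reset after each cycle */
} ToyVacuum;

/*
 * Count index/heap vacuum cycles over nblocks heap blocks, assuming the
 * memory limit is considered exceeded on every block. A cycle is started
 * before a block only if the chosen counter is greater than zero.
 */
static int
count_cycles(const int *dead_per_block, size_t nblocks, bool use_num_items)
{
    ToyVacuum v = {0, 0};
    int cycles = 0;

    for (size_t blk = 0; blk < nblocks; blk++)
    {
        long counter = use_num_items ? v.num_items : v.lpdead_items;

        if (counter > 0)
        {
            cycles++;
            v.num_items = 0;    /* stored items are consumed by the cycle */
        }

        /* "Scan" the block and record its dead items in both counters. */
        v.lpdead_items += dead_per_block[blk];
        v.num_items += dead_per_block[blk];
    }

    /* One final cycle for anything still stored after the last block. */
    if (v.num_items > 0)
        cycles++;

    return cycles;
}
```

With dead items only in the first of five blocks, the cumulative
counter triggers a cycle before each of the four later blocks, while
the resettable counter triggers a single cycle.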

When playing with parallel vacuum, I also wondered if there should be
some heuristic that, when maintenance_work_mem is set far too low,
avoids parallel vacuum unless the user specifically asked for it in
the command.

Take the following case as an example:
set maintenance_work_mem=64;
create table aa(a int primary key, b int unique);
insert into aa select a,a from generate_series(1,1000000) a;
delete from aa;

-- try a vacuum with no parallelism
vacuum (verbose, parallel 0) aa;

system usage: CPU: user: 0.53 s, system: 0.00 s, elapsed: 0.57 s

If I did the following instead:

vacuum (verbose) aa;

The vacuum goes parallel and takes a very long time because it
launches a parallel worker for each page's worth of tuples. I see the
following message 4425 times:

INFO:  launched 1 parallel vacuum worker for index vacuuming (planned: 1)

The vacuum takes about 30 seconds to complete:

system usage: CPU: user: 14.00 s, system: 0.81 s, elapsed: 30.86 s

Shouldn't the code in parallel_vacuum_compute_workers() try to pick a
good value for the number of workers, based on the available memory
and the table size, when the user does not explicitly specify how many
workers they want?
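
As a rough sketch of what such a heuristic might look like, the
following C function estimates how many index vacuum cycles the
available memory would force and declines to go parallel when that
number is large, unless the user asked for workers explicitly.
Everything here is an assumption made for illustration:
toy_choose_workers(), its parameters and the cutoff
TOY_MAX_CYCLES_FOR_PARALLEL are hypothetical and do not reflect the
actual signature or logic of parallel_vacuum_compute_workers().

```c
#include <stdint.h>

/*
 * Hypothetical cutoff: beyond this many index vacuum cycles, worker
 * launch overhead is assumed to dominate the work done per cycle.
 */
#define TOY_MAX_CYCLES_FOR_PARALLEL 16

/*
 * Toy heuristic, not PostgreSQL source.
 *
 * nrequested      > 0 if the user gave an explicit PARALLEL n, else 0
 * nindexes        number of indexes eligible for parallel vacuum
 * max_workers     upper bound on workers allowed by configuration
 * est_dead_items  estimated number of dead items in the table
 * items_per_cycle dead items that fit in memory before a cycle is forced
 */
static int
toy_choose_workers(int nrequested, int nindexes, int max_workers,
                   uint64_t est_dead_items, uint64_t items_per_cycle)
{
    /* Assume the leader process handles one index itself. */
    int workers = nindexes - 1;

    if (workers <= 0 || max_workers <= 0)
        return 0;

    if (nrequested > 0)
    {
        /* Explicit request: honor it, bounded by the available indexes. */
        if (nrequested < workers)
            workers = nrequested;
    }
    else
    {
        uint64_t cycles;

        if (items_per_cycle == 0)
            return 0;

        /* Ceiling division: cycles needed to process all dead items. */
        cycles = (est_dead_items + items_per_cycle - 1) / items_per_cycle;

        /* Too many tiny cycles: stay serial. */
        if (cycles > TOY_MAX_CYCLES_FOR_PARALLEL)
            return 0;
    }

    return workers < max_workers ? workers : max_workers;
}
```

For a table with two indexes and 1,000,000 dead items, and assuming
only a few hundred items fit in memory per cycle, this sketch returns 0
workers for a plain VACUUM but still returns 1 when the user requests
PARALLEL 1.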

David


