Re: Vacuum Questions - Mailing list pgsql-novice

From Laurenz Albe
Subject Re: Vacuum Questions
Date
Msg-id 2568733f3acd9d6f65216acb486669ebbcdf7f6b.camel@cybertec.at
In response to Vacuum Questions  (Leo <leo1969@gmail.com>)
List pgsql-novice
On Fri, 2025-05-02 at 09:50 -0400, Leo wrote:
> I have been working with AWS PostgreSQL RDS for a few years, but I am still not very experienced
> when it comes to performance issues.  Plus, RDS is slightly different from pure PostgreSQL.
>
> I am trying to comprehend exactly how vacuum works.
>
> Here is what I am trying to do.
>
> I am purging old records from a table (500 million rows, but I am doing it in sets of
> 50,000,000 with a smaller loop of 100,000).  That works just fine.
>
> Because of the amount of data/rows deleted, I disabled autovacuum for this table
> (I want to have control over vacuum; autovacuum does not complete anyway due to the
> timeout, sizing, etc. settings that I do not want to change system-wide).  I will put
> autovacuum back once I am done, of course.
>
> The issue is when I start vacuuming.  This table has 4 indexes and a PK that I worry about.
> The PK takes about 30 minutes to vacuum and two of the indexes take about an hour each.
> The problem comes in for the other 2 indexes - they take 12+ hours each to vacuum:
>
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  vacuuming "public.pc_workflowlog"
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  launched 4 parallel vacuum workers for index vacuuming (planned: 4)
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  scanned index "pc_workflowlog_pk" to remove 50000000 row versions
> DETAIL:  CPU: user: 191.03 s, system: 12.43 s, elapsed: 1711.22 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  scanned index "workflowlo_u_publicid_g6uqp9lkn6e8" to remove 50000000 row versions
> DETAIL:  CPU: user: 325.75 s, system: 19.75 s, elapsed: 2674.24 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  scanned index "workflowlo_n_workflow_2tc9k2hdtry9v" to remove 50000000 row versions
> DETAIL:  CPU: user: 312.17 s, system: 16.94 s, elapsed: 3097.88 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  scanned index "workflowlo_n_frozenseti_2kjkbjgf3c6ro" to remove 50000000 row versions
> DETAIL:  CPU: user: 41187.70 s, system: 216.14 s, elapsed: 42749.36 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  scanned index "workflowlo_n_userid_14kqw6qdsnndw" to remove 50000000 row versions
> DETAIL:  CPU: user: 41280.66 s, system: 216.67 s, elapsed: 42832.16 s
> psql:/home/backup/leo/fws_vacuum.sql:6: INFO:  "pc_workflowlog": removed 50000000 row versions in 1129870 pages
>
> I've increased max_parallel_maintenance_workers to 8 for the session, and it used 4 parallel
> workers (one for each index, I assume) to handle it, and the two indexes were done in about an
> hour.  What I am trying to figure out is how to force the other two large indexes to be vacuumed
> in parallel - a few workers working on a single index.  It seems it should be possible; the
> index size is large enough to qualify, but I have not been able to figure it out yet.  Most of
> the parameters are at default values.
>
> What am I missing?

I didn't look at the code, but I believe that a single index won't be scanned by more than a
single process.
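
For what it's worth, the PARALLEL option of VACUUM (and "max_parallel_maintenance_workers")
only determines how many indexes can be processed at the same time; each index is still handled
by a single process.  A sketch of what I mean (untested, the numbers are just examples):

    -- untested sketch: PARALLEL caps the number of parallel workers,
    -- but each index still gets at most one process
    SET max_parallel_maintenance_workers = 8;   -- session-level cap, as in your test
    VACUUM (VERBOSE, PARALLEL 4) public.pc_workflowlog;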

> I have a few other questions.  Does vacuum time depend on the number of dead rows only and
> the size of the table, or does the entire storage allocation (including dead tuples) also
> affect it?

I am not certain what you mean.
If you set "maintenance_work_mem" high enough, VACUUM may be able to get rid of all dead
tuples in a single pass over the indexes, which is very good for performance.  But that still
falls under "the time depends on the number of dead tuples".

> Would it make more sense to drop the two large indexes, purge, vacuum, and recreate the
> indexes afterwards (I know it needs to be tested)?

If you delete the majority of the rows, that may be a good move.
But then, maybe the best approach would be to create a new copy of the table, copy all the
"surviving" rows there, create indexes on that copy, drop the original table and replace it
with the copy.
That would avoid bloat, which is otherwise unavoidable if you delete a big part of the table.
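
In rough outline, something like this (untested sketch; the WHERE condition, the primary key
columns and the index definitions are placeholders you'd have to fill in, and you'd also have
to take care of any foreign keys, views, triggers and privileges that reference the table):

    BEGIN;
    CREATE TABLE pc_workflowlog_new
        (LIKE public.pc_workflowlog INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
    INSERT INTO pc_workflowlog_new
       SELECT * FROM public.pc_workflowlog
       WHERE created_at >= current_date - 365;  -- placeholder; use your real "keep" condition
    -- recreate the primary key and the four indexes here, for example:
    -- ALTER TABLE pc_workflowlog_new ADD CONSTRAINT pc_workflowlog_pk PRIMARY KEY (...);
    DROP TABLE public.pc_workflowlog;
    ALTER TABLE pc_workflowlog_new RENAME TO pc_workflowlog;
    COMMIT;

You'd have to make sure that nothing modifies the table while this runs.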

> Lastly, is it better to delete all the rows (500 mil) instead of doing it in smaller
> batches, and vacuum only once?

That will most likely be faster, and the bloat wouldn't be any worse.
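
That is, something like the following (sketch only; "created_at" and the cutoff stand in for
your real purge condition):

    ALTER TABLE public.pc_workflowlog SET (autovacuum_enabled = off);
    DELETE FROM public.pc_workflowlog
    WHERE created_at < current_date - 365;      -- stand-in for your real purge condition
    VACUUM (VERBOSE, PARALLEL 4) public.pc_workflowlog;
    ALTER TABLE public.pc_workflowlog RESET (autovacuum_enabled);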

Yours,
Laurenz Albe


