BUG #16444: Vacuum writes the table twice - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #16444: Vacuum writes the table twice
Date
Msg-id 16444-4c2f66fed04080b8@postgresql.org
Whole thread Raw
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16444
Logged by:          Kurt Roeckx
Email address:      kurt@roeckx.be
PostgreSQL version: 12.3
Operating system:   Debian
Description:

I had a large query fail, where most of the data wasn't commited, and a very
small amount was. So I decided to run vacuum on the table. The size of the
table is:
         table_name          | table_size | indexes_size | total_size
-----------------------------+------------+--------------+------------
 "public"."ct_entry_chain"   | 27 GB      | 39 GB        | 66 GB

The table looks like:
                                Table "public.ct_entry_chain"
     Column     |  Type  | Collation | Nullable |                  Default
----------------+--------+-----------+----------+--------------------------------------------
 id             | bigint |           | not null |
nextval('ct_entry_chain_id_seq'::regclass)
 ct_entry_id    | bigint |           | not null |
 certificate_id | bigint |           | not null |
Indexes:
    "ct_entry_chain_pkey" PRIMARY KEY, btree (id)
    "ct_entry_chain_ct_entry_id_certificate_id_key" UNIQUE CONSTRAINT, btree
(ct_entry_id, certificate_id)
Foreign-key constraints:
    "ct_entry_chain_certificate_id_fkey" FOREIGN KEY (certificate_id)
REFERENCES raw_certificates(id)
    "ct_entry_chain_ct_entry_id_fkey" FOREIGN KEY (ct_entry_id) REFERENCES
ct_entry(id)

This has resulted in 205 GB being read from disk, and 93 GB being written.
While I only expects 66 GB to be rewritten.

Looking at the verbose output of the vacuum, it seems that it needed to scan
the indexes 4 times. The amount of data read is about what I expect.

Looking with strace what happens, it seems that the first time it reads the
data, it also writes everything back. I assume it marks the rows as dead.
Then when indexes are cleaned up, it reads+writes the whole table again, to
actually vacuum it. Can that first write be avoided?


pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #16443: Too much memory usage on insert query
Next
From: Kurt Roeckx
Date:
Subject: Re: BUG #16443: Too much memory usage on insert query