The following bug has been logged on the website:
Bug reference: 16444
Logged by: Kurt Roeckx
Email address: kurt@roeckx.be
PostgreSQL version: 12.3
Operating system: Debian
Description:
I had a large query fail, where most of the data wasn't committed and only a
very small amount was. So I decided to run VACUUM on the table. The size of
the table is:
         table_name          | table_size | indexes_size | total_size
-----------------------------+------------+--------------+------------
 "public"."ct_entry_chain"   | 27 GB      | 39 GB        | 66 GB
The table looks like:
                           Table "public.ct_entry_chain"
     Column     |  Type  | Collation | Nullable |                  Default
----------------+--------+-----------+----------+--------------------------------------------
 id             | bigint |           | not null | nextval('ct_entry_chain_id_seq'::regclass)
 ct_entry_id    | bigint |           | not null |
 certificate_id | bigint |           | not null |
Indexes:
    "ct_entry_chain_pkey" PRIMARY KEY, btree (id)
    "ct_entry_chain_ct_entry_id_certificate_id_key" UNIQUE CONSTRAINT, btree (ct_entry_id, certificate_id)
Foreign-key constraints:
    "ct_entry_chain_certificate_id_fkey" FOREIGN KEY (certificate_id) REFERENCES raw_certificates(id)
    "ct_entry_chain_ct_entry_id_fkey" FOREIGN KEY (ct_entry_id) REFERENCES ct_entry(id)
This resulted in 205 GB being read from disk and 93 GB being written, while I
expected only about 66 GB to be rewritten.
Looking at the verbose output of the vacuum, it seems it needed to scan the
indexes 4 times; given that, the amount of data read is about what I would expect.
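(For context, the number of index scans appears to be driven by
maintenance_work_mem: in PostgreSQL 12, VACUUM collects dead-tuple pointers,
6 bytes each, until that memory fills, then makes one full pass over every
index per batch. A rough sketch of that arithmetic; the 40 million dead-tuple
figure below is purely hypothetical, not a measured value:)

```python
import math

TID_BYTES = 6  # size of one dead-tuple pointer (ItemPointerData) in PG 12

def index_scan_passes(dead_tuples, maintenance_work_mem_bytes):
    """Estimate how many times VACUUM must scan every index:
    one full pass per batch of dead-tuple TIDs that fits in memory."""
    tids_per_batch = maintenance_work_mem_bytes // TID_BYTES
    return math.ceil(dead_tuples / tids_per_batch)

# With the default maintenance_work_mem = 64MB, one batch holds ~11.2M TIDs,
# so 4 index passes would imply roughly 34-45 million dead tuples.
default_mwm = 64 * 1024 * 1024
print(index_scan_passes(40_000_000, default_mwm))  # -> 4
```

(So raising maintenance_work_mem before the VACUUM would presumably have
reduced this to a single index pass, cutting the reads considerably.)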
Watching with strace, it seems that the first time it reads the data, it also
writes everything back; I assume it is marking the rows as dead. Then, once
the indexes are cleaned up, it reads and writes the whole table again to
actually vacuum it. Can that first write be avoided?
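(For what it's worth, the observed I/O is roughly consistent with two full
passes over the heap plus four passes over the indexes. A back-of-envelope
check using the sizes above, assuming each pass touches the whole relation,
which is only a crude model:)

```python
heap_gb = 27      # table size from the size query above
index_gb = 39     # combined index size
index_passes = 4  # from the VACUUM VERBOSE output

# Reads: heap scanned twice (mark dead, then remove) + indexes scanned 4 times.
read_gb = 2 * heap_gb + index_passes * index_gb
# Writes: heap dirtied on both passes; each index rewritten once in total.
written_gb = 2 * heap_gb + index_gb

print(read_gb, written_gb)  # -> 210 93, close to the observed 205 GB / 93 GB
```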