Re: question on visibility map - Mailing list pgsql-hackers

From Jakub Wartak
Subject Re: question on visibility map
Date
Msg-id CAKZiRmwD3LNaD+OYqrwrr55tH1ZUCLjEXOr8fbytoHiW9axx1w@mail.gmail.com
Whole thread
In response to question on visibility map  (Fabrice Chapuis <commeilmeplait@gmail.com>)
List pgsql-hackers
On Fri, Apr 24, 2026 at 10:13 AM Fabrice Chapuis
<commeilmeplait@gmail.com> wrote:
>
> Hi,
>
> In this test I found that not all pages are not marked as all_visible after vacuuming, any explanation?
>
> CREATE TABLE test (
>     id SERIAL PRIMARY KEY,
>     name TEXT,
>     created_at TIMESTAMP DEFAULT now()
> );
>
> INSERT INTO test (name)
> SELECT
>     'name_' || g
> FROM generate_series(1, 100000) AS g;
> CREATE TABLE
> INSERT 0 100000
> cpr [2720498]=#
> cpr [2720498]=#
> cpr [2720498]=# table test limit 10;
> +----+---------+----------------------------+
> | id |  name   |         created_at         |
> +----+---------+----------------------------+
> |  1 | name_1  | 2026-04-24 09:35:46.561014 |
> |  2 | name_2  | 2026-04-24 09:35:46.561014 |
> |  3 | name_3  | 2026-04-24 09:35:46.561014 |
> |  4 | name_4  | 2026-04-24 09:35:46.561014 |
> |  5 | name_5  | 2026-04-24 09:35:46.561014 |
> |  6 | name_6  | 2026-04-24 09:35:46.561014 |
> |  7 | name_7  | 2026-04-24 09:35:46.561014 |
> |  8 | name_8  | 2026-04-24 09:35:46.561014 |
> |  9 | name_9  | 2026-04-24 09:35:46.561014 |
> | 10 | name_10 | 2026-04-24 09:35:46.561014 |
> +----+---------+----------------------------+
> (10 rows)
>
> # ALTER TABLE test SET (autovacuum_enabled = false);
> ALTER TABLE
> # SELECT reloptions
> FROM pg_class
> WHERE relname = 'test';
> +----------------------------+
> |         reloptions         |
> +----------------------------+
> | {autovacuum_enabled=false} |
> +----------------------------+
> (1 row)
>
> # update test set name = 'name_x' where id = 1;
> UPDATE 1
>
> # create extension pg_visibility;
> CREATE EXTENSION
> # SELECT
>     c.relpages AS total_pages,
>     s.all_visible,
>     s.all_frozen,
>     round(100.0 * s.all_visible / NULLIF(c.relpages, 0), 1) AS pct_visible,
>     round(100.0 * s.all_frozen / NULLIF(c.relpages, 0), 1) AS pct_frozen
> FROM pg_class c
> CROSS JOIN LATERAL pg_visibility_map_summary(c.oid) s
> WHERE c.relname = 'test';
> +-------------+-------------+------------+-------------+------------+
> | total_pages | all_visible | all_frozen | pct_visible | pct_frozen |
> +-------------+-------------+------------+-------------+------------+
> |         637 |         635 |          0 |        99.7 |        0.0 |
> +-------------+-------------+------------+-------------+------------+
> (1 row)
>
> vacuum test;
> VACUUM
> # SELECT
>     c.relpages AS total_pages,
>     s.all_visible,
>     s.all_frozen,
>     round(100.0 * s.all_visible / NULLIF(c.relpages, 0), 1) AS pct_visible,
>     round(100.0 * s.all_frozen / NULLIF(c.relpages, 0), 1) AS pct_frozen
> FROM pg_class c
> CROSS JOIN LATERAL pg_visibility_map_summary(c.oid) s
> WHERE c.relname = 'test';
> +-------------+-------------+------------+-------------+------------+
> | total_pages | all_visible | all_frozen | pct_visible | pct_frozen |
> +-------------+-------------+------------+-------------+------------+
> |         637 |         636 |          0 |        99.8 |        0.0 |
> +-------------+-------------+------------+-------------+------------+
> (1 row)
>


Hi Fabrice,

with both recent 12 and 13.23 I'm was getting as the result:

 total_pages | all_visible | all_frozen | pct_visible | pct_frozen
-------------+-------------+------------+-------------+------------
         637 |         637 |          0 |       100.0 |        0.0

but starting with 14+, I've started getting those 1-2 not visible
blocks too at the begging:

 total_pages | all_visible | all_frozen | pct_visible | pct_frozen
-------------+-------------+------------+-------------+------------
         637 |         636 |        636 |        99.8 |       99.8

and with select * from pg_visibility_map('test'::regclass) where blkno
<= 2 and all_visible='f';

 blkno | all_visible | all_frozen
-------+-------------+------------
     0 | f           | f

so it is always block #0 where Your update landed. After VACUUM FULL +
VACUUM however it back to 100% again. Now, re-trying Your's test case
(to get it again to 636/637) to inspect it further we apparently have
this situation on 14..master:

CREATE EXTENSION pageinspect;
SELECT lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_ctid,
t_infomask, t_infomask2 FROM heap_page_items(get_raw_page('test', 0))

 lp  | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_ctid  |
t_infomask | t_infomask2
-----+--------+----------+--------+--------+--------+---------+------------+-------------
   1 |      0 |        3 |      0 |        |        |         |
    |
   2 |   8144 |        1 |     48 |    753 |      0 | (0,2)   |
2306 |           3
   3 |   8096 |        1 |     48 |    753 |      0 | (0,3)   |
2306 |           3
   4 |   8048 |        1 |     48 |    753 |      0 | (0,4)   |
2306 |           3
   5 |   8000 |        1 |     48 |    753 |      0 | (0,5)   |
2306 |           3
   6 |   7952 |        1 |     48 |    753 |      0 | (0,6)   |
2306 |           3

Note lp_flags==3, so with 14+ we started zero-ing out the block
and putting it at the end of relation when looking just at the
result of the testcase:

SELECT blkno, count(*) AS item_count FROM  generate_series(0, (SELECT
pg_relation_size('test') / 8192) - 1) AS blkno, LATERAL
heap_page_items(get_raw_page('test', blkno::int)) GROUP BY 1 ORDER BY
1;
[..]
   634 |        157
   635 |        157
   636 |        149 // this has 148 -> 149 LPs now, due to that single UPDATE
                       appending it there

So with blkno=636 being larger by one LP, you can new row @ 149 LP:

 lp  | lp_off | lp_flags | lp_len | t_xmin | t_xmax |  t_ctid   |
t_infomask | t_infomask2
-----+--------+----------+--------+--------+--------+-----------+------------+-------------
[..]
 146 |   1184 |        1 |     48 |    799 |      0 | (636,146) |
 2306 |           3
 147 |   1136 |        1 |     48 |    799 |      0 | (636,147) |
 2306 |           3
 148 |   1088 |        1 |     48 |    799 |      0 | (636,148) |
 2306 |           3
 149 |   1040 |        1 |     48 |    801 |      0 | (636,149) |
10498 |           3

So on 14+, I think UPDATE just zeroed out blkno#0 lp_1 (and later it had set
lp_flags == 3 == LP_DEAD (!) flag was set there by VACUUM), and that UPDATE
put new record at the end of table, but VACUUM somewhat is not able to remove
that LP_DEAD unless indexes are guranteed to be cleaned up from stale entries.

If you take a look on 14+ then can force it do so
`VACUUM (INDEX_CLEANUP true) test;` and it will have 100% pct_visible set
afterwards. See more for default toggle 'INDEX_CLEANUP  AUTO' which says
`the default is AUTO, which allows VACUUM to skip index vacuuming when
appropriate` and this was introduced in 14.0 as part of`Allow vacuum to
skip index vacuuming when the number of removable index entries is
insignificant`. So the LP_DEAD is just sign that VACUUM in Your's
testcase did not clean indexes (unless forced to do so). That feature
references two commits [0][1] and first one even says in the commitmsg:

"We only skip index vacuuming when 2% or less of the table's pages have
one or more LP_DEAD items -- bypassing index vacuuming as an
optimization must not noticeably impede setting bits in the visibility
map."

Hope that helps.

-J.

[0] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=5100010ee
[1] - https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3499df0de



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Use-after-free issue in postgres_fdw
Next
From: Vlad Lesin
Date:
Subject: [PATCH] Fix ProcKill lock-group vs procLatch recycle race