The following bug has been logged on the website:
Bug reference: 17787
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.5
Operating system: Ubuntu
Description:
Hi,
On one of production system our in-house index bloat monitoring started
behave weird.
After some research I found that the value of pg_class.reltuples for FTS GIN
index could have two different values with 2 order magnitude difference
between them depends of what had been performed last on the table
(auto)vacuum (with index cleanup) or (auto)analyze.
Now what I have on production:
analyze verbose processor_callbacks;
select count(*) from processor_callbacks;
count
----------
33439573
select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.3456344e+07
So far seems ok... now after:
VACUUM (verbose, INDEX_CLEANUP ON) "public"."processor_callbacks";
the database produce:
select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.5565809e+09
After analyze verbose processor_callbacks; value return to original
value:
analyze processor_callbacks;
select reltuples from pg_class where relname='payload_index';
reltuples
---------------
3.3622728e+07
Every time I run analyze on the table - I have 3.3e+07 reltupes for this
index,
every time I run VACUUM (INDEX_CLEANUP ON) on the table - I have 3.5e+09
reltupes for this index (e.g. issue absolutely repeatable).
The table (and index) structure:
\d+ processor_callbacks
Table
"public.processor_callbacks"
Column | Type | Collation | Nullable |
Default | Storage | Compression | Stats target |
Description
-----------------+-----------------------------+-----------+----------+----------------------------------+----------+-------------+--------------+-------------
id | integer | | not null |
generated by default as identity | plain | | |
...
payload | json | | not null |
| extended | | |
Indexes:
"processor_callbacks_pkey" PRIMARY KEY, btree (id)
"payload_index" gin (to_tsvector('english'::regconfig, payload ->>
'params'::text))
...
I think first value better represents realty, but second on could have sense
as well (depend of how to define "tupes" in GIN index I suspect),
however both of them cannot be correct in same time.
While testing on small sample of 100rows I wasn't able to reproduce that
behavior, but it easily reproducible with vacuum full:
create table test as select payload from processor_callbacks order by id
desc limit 100;
create index test_gin on test USING GIN (to_tsvector('english'::regconfig,
payload ->> 'params'::text));
analyze test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
100
vacuum FULL test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
15834
analyze test;
select reltuples from pg_class where relname='test_gin';
reltuples
-----------
100
And so on...
PS:
I suspect that the issue could be related to the following discussion:
https://www.postgresql.org/message-id/flat/CAD21AoAA7%2BETUJo%3Dj2L8KAdKF8Q9_5uqwNx6H8rucFm6aRZSBA%40mail.gmail.com#2f06043f394a835c993ebf23a2af1183
Kind Regards,
Maxim