Re: New criteria for autovacuum - Mailing list pgsql-hackers
From | Aleksander Alekseev |
---|---|
Subject | Re: New criteria for autovacuum |
Date | |
Msg-id | CAJ7c6TPZf5EfLFohUyZXqmqFKOLhMnamZL+CFZuuuGSrcNm=Hw@mail.gmail.com Whole thread Raw |
In response to | New criteria for autovacuum (Konstantin Knizhnik <knizhnik@garret.ru>) |
Responses |
Re: New criteria for autovacuum
Re: New criteria for autovacuum |
List | pgsql-hackers |
Hi Konstantin, > But the problem can be quite easily reproduced. We can just populate table with some data with some other transaction withassigned XID active. > Then explicitly vacuum this tables or wait until autovacuum does it. > At this moment table has no more dead or inserted tuples so autovacuum will not be called for it. But heap pages of thistable are still not marked as all-visible. > And will never be marked as all-visible unless table is updated or is explicitly vacuumed. I decided to experiment with the scenario you are describing. For those who likes to have exact steps to reproduce the issue, as I do, here they are: Session 1: ``` CREATE TABLE humidity( ts TIMESTAMP NOT NULL, city TEXT NOT NULL, humidity INT NOT NULL); CREATE INDEX humidity_idx ON humidity (ts, city) INCLUDE (humidity); INSERT INTO humidity (ts, city, humidity) SELECT ts + (INTERVAL '60 minutes' * random()), city, 100*random() FROM generate_series('2010-01-01' :: TIMESTAMP, '2020-12-31', '1 hour') AS ts, unnest(array['Moscow', 'Berlin']) AS city; ``` Session 2: ``` BEGIN; INSERT INTO humidity (ts, city, humidity) SELECT ts + (INTERVAL '60 minutes' * random()), city, 100*random() FROM generate_series('2022-01-01' :: TIMESTAMP, '2025-12-31', '1 hour') AS ts, unnest(array['Moscow', 'Berlin']) AS city; -- no COMMIT ``` Session 1: ``` VACUUM humidity; ``` Session 2: ``` COMMIT; ``` Session 1: ``` EXPLAIN (ANALYZE, BUFFERS ON) SELECT humidity FROM humidity WHERE ts >= '2022-01-01' AND ts < '2022-05-02' AND city = 'Moscow'; ``` The result is: ``` Index Only Scan using humidity_idx on humidity (cost=0.42..58.75 rows=67 width=4) (actual time=0.060..7.490 rows=2904.00 loops=1) Index Cond: ((ts >= '2022-01-01 00:00:00'::timestamp without time zone) AND (ts < '2022-05-02 00:00:00'::timestamp without time zone) AND (city = 'M oscow'::text)) Heap Fetches: 2904 Index Searches: 1 Buffers: shared hit=123 Planning: Buffers: shared hit=10 Planning Time: 0.840 ms Execution Time: 7.964 ms ``` ... and it is claimed that autovacuum will never be triggered in order to set hint bits, assuming we never modify the table again. I have mixed feelings about addressing this. Although this behavior is somewhat counterintuitive, if the user has a read-only lookup table he/she can always execute VACUUM manually. In order to relieve him of this unbearable burden we are going to need to introduce some overhead that will affect all the users (not to mention people maintaining the code). This would be convenient for sure but I'm not entirely sure if it's worth it. Thoughts? -- Best regards, Aleksander Alekseev
pgsql-hackers by date: