Thread: incoherent dead tuples between pg_stat_user_tables and pgstattuple?
Hello, running 11.5 I've some misconception about pgstatuple: it seems it lags behind normal statistics. I've almost double a table with an update. The result is: testdb=# select n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname = 'foo'; -[ RECORD 1 ]----+------------------------------ n_live_tup | 5021619 n_dead_tup | 5000000 last_autovacuum | 2019-09-27 12:05:36.072376+02 last_autoanalyze | 2019-09-27 12:06:16.538413+02 but pgstatuple provides a quite different output, saying the table is not full of dead rows: testdb=# SELECT * FROM pgstattuple( 'foo' ); -[ RECORD 1 ]------+---------- table_len | 930897920 tuple_count | 5000000 tuple_len | 439595972 tuple_percent | 47.22 dead_tuple_count | 88 dead_tuple_len | 7744 dead_tuple_percent | 0 free_space | 447716392 free_percent | 48.1 Even after running an ANALYZE, pgstattuple provides the same results. After a VACUUM FULL ANALYZE the world is as I would expect it to be: testdb=# select n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname = 'foo'; -[ RECORD 1 ]----+------------------------------ n_live_tup | 5000117 n_dead_tup | 0 last_autovacuum | 2019-09-27 12:05:36.072376+02 last_autoanalyze | 2019-09-27 12:06:16.538413+02 Time: 16,159 ms testdb=# SELECT * FROM pgstattuple( 'foo' ); -[ RECORD 1 ]------+---------- table_len | 465453056 tuple_count | 5000000 tuple_len | 439595972 tuple_percent | 94.44 dead_tuple_count | 0 dead_tuple_len | 0 dead_tuple_percent | 0 free_space | 3870144 free_percent | 0.83 Am I missing something? Thanks, Luca
Luca Ferrari <fluca1978@gmail.com> writes: > running 11.5 I've some misconception about pgstatuple: it seems it > lags behind normal statistics. n_live_tup/n_dead_tup should not be thought to be better than approximations. Most operations adjust them only incrementally, and messages to the stats collector can get dropped entirely under heavy load, causing deltas to go missing. There are also race conditions involved in some update scenarios. > Even after running an ANALYZE, pgstattuple provides the same results. > After a VACUUM FULL ANALYZE the world is as I would expect it to be: VACUUM FULL is one of very few operations that reset those counters to guaranteed-correct values (and I'm not sure the guarantee is ironclad even in that case). It's very hard to improve on this without giving up the desirable ability to have concurrent table updates. If you really want an accurate row count, COUNT(*) or pgstattuples will give you a more reliable estimate ... at much higher cost, of course. regards, tom lane
On Fri, Sep 27, 2019 at 3:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > n_live_tup/n_dead_tup should not be thought to be better than > approximations. Most operations adjust them only incrementally, > and messages to the stats collector can get dropped entirely > under heavy load, causing deltas to go missing. There are > also race conditions involved in some update scenarios. > Are you suggesting tha autovacuum made free space available so that pg_stat_user_tables lags behind? Because I don't see last_autovacuum changing in the mantime. Luca
Luca Ferrari <fluca1978@gmail.com> writes: > On Fri, Sep 27, 2019 at 3:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> n_live_tup/n_dead_tup should not be thought to be better than >> approximations. Most operations adjust them only incrementally, >> and messages to the stats collector can get dropped entirely >> under heavy load, causing deltas to go missing. There are >> also race conditions involved in some update scenarios. > Are you suggesting tha autovacuum made free space available so that > pg_stat_user_tables lags behind? Because I don't see last_autovacuum > changing in the mantime. I'm not suggesting any specific scenario, because you haven't presented any evidence as to when those counts became off. I'm just saying that there are plenty of possible mechanisms for them to become so. regards, tom lane
On Fri, Sep 27, 2019 at 4:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I'm not suggesting any specific scenario, because you haven't presented > any evidence as to when those counts became off. I'm just saying that > there are plenty of possible mechanisms for them to become so. So far autovacuum seems the main reason for pg_stattuple to be _ahead_ of pg_stats (I stand correct). In fact, turning off autovacuum the pg_stattuple shows the exact result as I would have expected, and probably the pg_stats I was referring to with n_dead_tup > 0 was due to a previous UPDATE ran against the same table. Apparently this produces always the same results: testdb=# CREATE TABLE foo( bar text ); CREATE TABLE testdb=# INSERT INTO foo SELECT generate_series( 1, 5000000 )::text; INSERT 0 5000000 testdb=# ANALYZE foo; ANALYZE testdb=# show autovacuum; -[ RECORD 1 ]--- autovacuum | off testdb=# select current_timestamp; -[ RECORD 1 ]-----+------------------------------ current_timestamp | 2019-09-27 20:44:39.277217+02 Time: 0,839 ms testdb=# select n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname = 'foo'; -[ RECORD 1 ]----+------------------------------ n_live_tup | 5000000 n_dead_tup | 0 last_autovacuum | 2019-09-27 20:29:18.571858+02 last_autoanalyze | 2019-09-27 20:29:40.889657+02 testdb=# UPDATE foo SET bar = lower( bar ); UPDATE 5000000 testdb=# select n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze from pg_stat_user_tables where relname = 'foo'; -[ RECORD 1 ]----+------------------------------ n_live_tup | 5000000 n_dead_tup | 5000000 last_autovacuum | 2019-09-27 20:29:18.571858+02 last_autoanalyze | 2019-09-27 20:29:40.889657+02 testdb=# SELECT * FROM pgstattuple( 'foo' ); -[ RECORD 1 ]------+---------- table_len | 362479616 tuple_count | 5000000 tuple_len | 158888896 tuple_percent | 43.83 dead_tuple_count | 5000000 dead_tuple_len | 158888896 dead_tuple_percent | 43.83 free_space | 1240672 free_percent | 0.34