Thread: Does pg_stat_get_live_tuples() matter?
If a table has 1.8M rows in reality, yet pg_stat_get_live_tuples() returns 1.8K, does it matter to the performance of the database, as long as pg_class.reltuples is the right order of magnitude?
The query planner seems to use the accurate estimate, but what about the autovacuum daemon? Or anything else?
In short, is pg_stat_get_live_tuples() (and the views that invoke it) used by any built-in tooling, or is it just there for the user to do monitoring?
Thanks,
Sherrylyn
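For concreteness, the three numbers in question can be compared side by
side with a query along these lines (my_table is a placeholder for a real
table name):

    -- stats-collector count vs. planner estimate vs. actual row count
    SELECT pg_stat_get_live_tuples('my_table'::regclass) AS n_live_tup,
           (SELECT reltuples FROM pg_class
             WHERE oid = 'my_table'::regclass)           AS reltuples,
           (SELECT count(*) FROM my_table)               AS actual_rows;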
On 4/9/19 9:45 AM, Sherrylyn Branchaw wrote:
> If a table has 1.8M rows in reality, yet pg_stat_get_live_tuples()
> returns 1.8K, does it matter to the performance of the database, as long
> as pg_class.reltuples is the right order of magnitude?

What version of Postgres?

How are you getting at the 1.8M number?

> The query planner seems to use the accurate estimate, but what about the
> autovacuum daemon? Or anything else?
>
> In short, is pg_stat_get_live_tuples() (and the views that invoke it)
> used by any built-in tooling, or is it just there for the user to do
> monitoring?
>
> Thanks,
> Sherrylyn

--
Adrian Klaver
adrian.klaver@aklaver.com
> What version of Postgres?
Sorry, can't believe I forgot to include that! Postgres 9.6.9.
> How are you getting at the 1.8M number?
SELECT COUNT(*)
Sherrylyn Branchaw <sbranchaw@gmail.com> writes:
> If a table has 1.8M rows in reality, yet pg_stat_get_live_tuples() returns
> 1.8K, does it matter to the performance of the database, as long as
> pg_class.reltuples is the right order of magnitude?

Hmmm ... what was in reltuples again?

Looking at the current code, it looks like vacuum or autovacuum should set
reltuples and n_live_tuples to the same thing. n_live_tuples gets adjusted
incrementally by subsequent transactions, and it wouldn't be too surprising
if it drifted away from reality, but it's a bit hard to believe that it
could get to be off by 1000X. Have you suppressed autovacuum on this table?

We have fooled around with the logic that maintains these numbers, so maybe
it was different in 9.6.9.

Anyway, to answer your question, I don't see anything in the current core
code that pays attention to n_live_tuples. reltuples definitely does matter
to the planner, and some of the sibling counters like n_dead_tuples drive
autovacuum, but nothing is examining n_live_tuples AFAICS.

			regards, tom lane
> Hmmm ... what was in reltuples again?
Reltuples had the correct number, and the query plans were showing the correct estimates.
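The planner-side half of this is easy to verify: the rows figure in a plain
EXPLAIN reflects the reltuples-based estimate, not n_live_tup (my_table
again being a placeholder):

    -- The rows= figure in the top plan node should be near 1.8M, not 1.8K
    EXPLAIN SELECT * FROM my_table;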
> it's a bit hard to believe that it could get to be off by 1000X. Have you suppressed autovacuum on this table?
No, but here are some things I've observed:
1) Any time pg_stat_reset() gets run, pg_stat_user_tables.n_live_tup gets reset to 0. pg_class.reltuples is untouched. (Observations 1-3 are sketched in SQL after this list.)
2) If tuples get inserted or deleted after pg_stat_reset(), pg_stat_user_tables.n_live_tup will match pg_stat_user_tables.n_tup_ins - pg_stat_user_tables.n_tup_del up until the next analyze or autoanalyze.
3) Once the next (auto)analyze happens, pg_stat_user_tables.n_live_tup will be updated to match pg_class.reltuples.
4) If a table is very large, it may be a while before autovacuum_analyze_scale_factor is reached. We have ours set to 0.01, which on a 1.8M-row table works out to about 18K changed rows before autoanalyze triggers, an order of magnitude more than 1.8K. (I would like to tune our settings more, but I have a list of priorities from management on which this isn't high.)
5) Given 1-4, pg_stat_user_tables.n_live_tup may spend quite a long time matching pg_stat_user_tables.n_tup_ins - pg_stat_user_tables.n_tup_del instead of pg_class.reltuples. For example, if a table has 1.8 million rows, and you insert 5 and delete 4 after a stats reset, n_live_tup will report that the table has 1 tuple.
6) Failovers, at least in Aurora, apparently cause pg_stat_reset() to be run, at least judging by the timestamp I'm seeing in pg_stat_bgwriter.stat_reset. We haven't done a failover in the data center in a while, and it's less trivial for me to test there atm, so I'm not certain whether open-source Postgres failovers also reset statistics.
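A minimal sketch of observations 1-3, assuming a table named my_table and a
role allowed to run pg_stat_reset() (note that counter updates reach the
stats collector with a slight delay):

    SELECT pg_stat_reset();              -- observation 1: n_live_tup -> 0,
                                         -- pg_class.reltuples untouched
    INSERT INTO my_table DEFAULT VALUES; -- assumes all columns have defaults
    SELECT n_live_tup, n_tup_ins, n_tup_del
      FROM pg_stat_user_tables
     WHERE relname = 'my_table';         -- observation 2: n_live_tup = 1
                                         -- (= n_tup_ins - n_tup_del)
    ANALYZE my_table;                    -- observation 3: n_live_tup resyncs
                                         -- to pg_class.reltuples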
> I don't see anything in the current core code that pays attention to n_live_tuples. reltuples definitely does matter to the planner, and some of the sibling counters like n_dead_tuples drive autovacuum, but nothing is examining n_live_tuples AFAICS.
That's what I thought, but I wanted to make sure I wasn't missing anything obvious. Thanks!
> some of the sibling counters like n_dead_tuples drive autovacuum
So that's interesting. I knew that, but hadn't thought about the implications. If pg_stat_reset() is executed by failovers, and a failover happens just before a table is ready to be vacuumed--say it has 0.009 * reltuples dead tuples (I'm simplifying the formula here)--then n_dead_tup gets reset to 0 and the counting starts all over again. Regular failovers could thus increase bloat by delaying the autovacuum daemon's recognition that a table needs to be vacuumed, am I right?
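For reference, the condition being simplified above is, per the PostgreSQL
documentation:

    vacuum threshold = autovacuum_vacuum_threshold
                     + autovacuum_vacuum_scale_factor * pg_class.reltuples

Autovacuum vacuums a table once n_dead_tup exceeds that threshold; the
analogous analyze trigger uses autovacuum_analyze_threshold and
autovacuum_analyze_scale_factor against the number of tuples changed since
the last analyze. Since n_dead_tup lives in the in-memory statistics, losing
it does push the next autovacuum out, exactly as described.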
Is it recommended practice to manually VACUUM ANALYZE the whole database after a failover? Or is resetting stats after a failover just an Aurora thing? I'm sorry I'm asking the latter question instead of testing, but I've been ordered not to spend time on improving our vacuuming and statistics until 5 other large projects are done, and I'm spending a minimal amount of time anyway just to see how badly frequent failovers might be affecting us and if there's any action we need to take.
Thanks,
Sherrylyn
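Should a manual refresh after a failover turn out to be warranted, the
database-wide forms are simply:

    ANALYZE;          -- repopulates planner statistics and reltuples,
                      -- and (per observation 3) resyncs n_live_tup
    VACUUM (ANALYZE); -- heavier: also removes dead tuples and updates
                      -- the dead-tuple counters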
> 6) Failovers, at least in Aurora, apparently cause pg_stat_reset() to be
> run, at least judging by the timestamp I'm seeing in
> pg_stat_bgwriter.stat_reset. We haven't done a failover in the data
> center in a while, and it's less trivial for me to test there atm, so I'm
> not certain whether open-source Postgres failovers also reset statistics.

Just wanted to chime in here - on this version of Aurora, there's nothing
special or different from open source community PostgreSQL that I know of.
I believe that a failover in RDS MAZ PostgreSQL or in your own data center
with repmgr would have the same characteristics. Actually nothing is
calling pg_stat_reset() in any of these cases - it's just that these
statistics are maintained only in memory and not replicated through the
WAL stream. (Aurora is slightly different with shared storage between the
nodes, but the same principle applies.) So after a failover, you're on a
new physical machine which needs to start over fresh on populating the
memory structures.

> Is it recommended practice to manually VACUUM ANALYZE the whole database
> after a failover? Or is resetting stats after a failover just an Aurora
> thing? I'm sorry I'm asking the latter question instead of testing, but
> I've been ordered not to spend time on improving our vacuuming and
> statistics until 5 other large projects are done, and I'm spending a
> minimal amount of time anyway just to see how badly frequent failovers
> might be affecting us and if there's any action we need to take.

If you're seeing frequent failovers - whether in your own data center or
somebody else's - I think you should investigate it. I don't think you
should accept that from your own staff or from any vendor.

That said, I don't think the impact here is really that severe on bloat or
statistics; any frequently modified table will still get vacuumed - albeit
with one slightly longer interval due to a failover. It won't exceed double
the normal interval unless there are multiple failovers in quick
succession. And a table that isn't getting changed wasn't going to get
vacuumed anyway.

-Jeremy
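As a footnote to the stat_reset timestamp mentioned above, the reset time
is visible both cluster-wide and per database:

    SELECT stats_reset FROM pg_stat_bgwriter;
    SELECT datname, stats_reset
      FROM pg_stat_database
     WHERE datname = current_database();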