Thread: Identifying diskspace leakage
I am trying to identify tables with significant diskspace "leakage" due to in appropriately low max_fsm_pages settings. I can see the results of VACUUM ANALYZE VERBOSE output counts of tuples and unused tuples, and understand that (1 - (tuples/unused)) is the amount of diskspace available to be reclaimed with a VACUUM FULL or dump/reload. Is there a way to identify the numbers of unused tuples without performing a VACUUM? Is it stored in a system table anywhere? Other ideas on how to identify disk bloat short of forcing downtime? TIA.
On Fri, 2004-05-14 at 10:10, Ed L. wrote: > I am trying to identify tables with significant diskspace "leakage" due to > in appropriately low max_fsm_pages settings. I can see the results of > VACUUM ANALYZE VERBOSE output counts of tuples and unused tuples, and > understand that (1 - (tuples/unused)) is the amount of diskspace available > to be reclaimed with a VACUUM FULL or dump/reload. > > Is there a way to identify the numbers of unused tuples without performing a > VACUUM? Is it stored in a system table anywhere? Other ideas on how to > identify disk bloat short of forcing downtime? You can calculate the number of bytes per row, multiply by the number of live tuples (count(1) from table), and subtract that from the actual # of bytes in the on-disk representation. The difference is wasted space. -jwb
On Friday May 14 2004 11:47, Jeffrey W. Baker wrote: > > Is there a way to identify the numbers of unused tuples without > > performing a VACUUM? Is it stored in a system table anywhere? Other > > ideas on how to identify disk bloat short of forcing downtime? > > You can calculate the number of bytes per row, multiply by the number of > live tuples (count(1) from table), and subtract that from the actual # > of bytes in the on-disk representation. The difference is wasted space. That works, but with umpteen clusters to manage, I'm really hoping for a SQL-based check so it can be done remotely and non-interactively. Maybe it is too much to keep track of, but it would be cool if VACUUM updated a system table with the same info it spits out during verbose mode. That would be very helpful in auto-identifying leakage and also a recent case where the cpu:real time ratio during vacuum went thru the roof due to I/O overload from leakage.
Here's an attempt at a query to estimate diskspace leakage. This leakage might occur when max_fsm_pages and/or max_fsm_relations are set too low. Not sure which of the two approaches below (leak1 or leak2) is more accurate? Is there a better way via SQL? The query uses the 'dbsize' project from contrib. Dbsize has a function called relation_size() which performs a 'stat' to get actual disk usage for a database and/or table. I use the column pg_class.reltuples instead of actually counting rows because I suspect that would essentially flush our OS cache of useful pages, degrading performance. This query assumes you're keeping stats updated. SELECT c.relname, SUM(s.avg_width) as width, CAST(c.reltuples as BIGINT) AS tuples, CAST(SUM(s.avg_width) * c.reltuples/1048576 AS INTEGER) AS tupdu, c.relpages AS pages, CAST(c.relpages * 8192 / 1048576 AS INTEGER) AS pgdu, relation_size(s.tablename)/1048576 AS reldu, CAST((relation_size(s.tablename) - SUM(s.avg_width) * c.reltuples)/1048576 AS INTEGER) AS leak1, CAST((relation_size(s.tablename) - c.relpages * 8192) / 1048576 AS INTEGER) AS leak2 FROM pg_stats s, pg_class c WHERE c.relname NOT LIKE 'pg_%' AND c.relname = s.tablename GROUP BY c.oid, s.tablename, c.reltuples, c.relpages, pgdu ORDER BY tupdu; relname | width | tuples | tupdu | pages | pgdu | reldu | leak1 | leak2 ---------------+-------+---------+-------+--------+------+-------+-------+------- table_1766485 | 27 | 198 | 0 | 12 | 0 | 0 | 0 | 0 table_1766443 | 186 | 0 | 0 | 9317 | 72 | 72 | 73 | 0 table_1766439 | 83 | 0 | 0 | 10 | 0 | 0 | 0 | 0 table_1766435 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766437 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766421 | 23 | 2 | 0 | 1 | 0 | 0 | 0 | 0 table_1766451 | 30 | 189822 | 5 | 1754 | 13 | 13 | 8 | 0 table_1766396 | 48 | 278781 | 13 | 3185 | 24 | 24 | 12 | 0 table_1766391 | 74 | 200826 | 14 | 3271 | 25 | 25 | 11 | 0 table_1766446 | 36 | 504594 | 17 | 4881 | 38 | 38 | 21 | 0 table_1766426 | 149 | 2241719 | 319 | 55555 | 434 | 434 | 116 | 0 table_1766456 | 888 | 390657 | 331 | 637949 | 887 | 4983 | 4653 | 4096 table_1766399 | 596 | 732708 | 416 | 41876 | 327 | 327 | -89 | 0 (13 rows) The basic column definitions are: tupdu(MB) = avg_width * reltuples pgdu(MB) = relpages * 8K/page reldu(MB) = relation_size(tablename) (src/contrib/dbsize) leak1 = reldu - tupdu leak2 = reldu - pgdu Not sure how we ended up with a couple of cases where the number of mb on disk was less than the estimated size; maybe we had some deletions after the last update of pg_stats?
[reposting...original seems to have been lost in ether...] Here's an attempt at a query to estimate diskspace leakage. This leakage might occur when max_fsm_pages and/or max_fsm_relations are set too low. Not sure which of the two approaches below (leak1 or leak2) is more accurate? Is there a better way via SQL? The query uses the 'dbsize' project from contrib. Dbsize has a function called relation_size() which performs a 'stat' to get actual disk usage for a database and/or table. I use the column pg_class.reltuples instead of actually counting rows because I suspect that would essentially flush our OS cache of useful pages, degrading performance. This query assumes you're keeping stats updated. SELECT c.relname, SUM(s.avg_width) as width, CAST(c.reltuples as BIGINT) AS tuples, CAST(SUM(s.avg_width) * c.reltuples/1048576 AS INTEGER) AS tupdu, c.relpages AS pages, CAST(c.relpages * 8192 / 1048576 AS INTEGER) AS pgdu, relation_size(s.tablename)/1048576 AS reldu, CAST((relation_size(s.tablename) - SUM(s.avg_width) * c.reltuples)/1048576 AS INTEGER) AS leak1, CAST((relation_size(s.tablename) - c.relpages * 8192) / 1048576 AS INTEGER) AS leak2 FROM pg_stats s, pg_class c WHERE c.relname NOT LIKE 'pg_%' AND c.relname = s.tablename GROUP BY c.oid, s.tablename, c.reltuples, c.relpages, pgdu ORDER BY tupdu; relname | width | tuples | tupdu | pages | pgdu | reldu | leak1 | leak2 ---------------+-------+---------+-------+--------+------+-------+-------+------- table_1766485 | 27 | 198 | 0 | 12 | 0 | 0 | 0 | 0 table_1766443 | 186 | 0 | 0 | 9317 | 72 | 72 | 73 | 0 table_1766439 | 83 | 0 | 0 | 10 | 0 | 0 | 0 | 0 table_1766435 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766437 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766421 | 23 | 2 | 0 | 1 | 0 | 0 | 0 | 0 table_1766451 | 30 | 189822 | 5 | 1754 | 13 | 13 | 8 | 0 table_1766396 | 48 | 278781 | 13 | 3185 | 24 | 24 | 12 | 0 table_1766391 | 74 | 200826 | 14 | 3271 | 25 | 25 | 11 | 0 table_1766446 | 36 | 504594 | 17 | 4881 | 38 | 38 | 21 | 0 table_1766426 | 149 | 2241719 | 319 | 55555 | 434 | 434 | 116 | 0 table_1766456 | 888 | 390657 | 331 | 637949 | 887 | 4983 | 4653 | 4096 table_1766399 | 596 | 732708 | 416 | 41876 | 327 | 327 | -89 | 0 (13 rows) The basic column definitions are: tupdu(MB) = avg_width * reltuples pgdu(MB) = relpages * 8K/page reldu(MB) = relation_size(tablename) (src/contrib/dbsize) leak1 = reldu - tupdu leak2 = reldu - pgdu Not sure how we ended up with a couple of cases where the number of mb on disk was less than the estimated size; maybe we had some deletions after the last update of pg_stats?