Re: Identifying diskspace leakage - Mailing list pgsql-general
From | Ed L. |
---|---|
Subject | Re: Identifying diskspace leakage |
Date | |
Msg-id | 200405171214.11626.pgsql@bluepolka.net Whole thread Raw |
In response to | Re: Identifying diskspace leakage ("Ed L." <pgsql@bluepolka.net>) |
List | pgsql-general |
[reposting...original seems to have been lost in ether...] Here's an attempt at a query to estimate diskspace leakage. This leakage might occur when max_fsm_pages and/or max_fsm_relations are set too low. Not sure which of the two approaches below (leak1 or leak2) is more accurate? Is there a better way via SQL? The query uses the 'dbsize' project from contrib. Dbsize has a function called relation_size() which performs a 'stat' to get actual disk usage for a database and/or table. I use the column pg_class.reltuples instead of actually counting rows because I suspect that would essentially flush our OS cache of useful pages, degrading performance. This query assumes you're keeping stats updated. SELECT c.relname, SUM(s.avg_width) as width, CAST(c.reltuples as BIGINT) AS tuples, CAST(SUM(s.avg_width) * c.reltuples/1048576 AS INTEGER) AS tupdu, c.relpages AS pages, CAST(c.relpages * 8192 / 1048576 AS INTEGER) AS pgdu, relation_size(s.tablename)/1048576 AS reldu, CAST((relation_size(s.tablename) - SUM(s.avg_width) * c.reltuples)/1048576 AS INTEGER) AS leak1, CAST((relation_size(s.tablename) - c.relpages * 8192) / 1048576 AS INTEGER) AS leak2 FROM pg_stats s, pg_class c WHERE c.relname NOT LIKE 'pg_%' AND c.relname = s.tablename GROUP BY c.oid, s.tablename, c.reltuples, c.relpages, pgdu ORDER BY tupdu; relname | width | tuples | tupdu | pages | pgdu | reldu | leak1 | leak2 ---------------+-------+---------+-------+--------+------+-------+-------+------- table_1766485 | 27 | 198 | 0 | 12 | 0 | 0 | 0 | 0 table_1766443 | 186 | 0 | 0 | 9317 | 72 | 72 | 73 | 0 table_1766439 | 83 | 0 | 0 | 10 | 0 | 0 | 0 | 0 table_1766435 | 27 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766437 | 30 | 0 | 0 | 0 | 0 | 0 | 0 | 0 table_1766421 | 23 | 2 | 0 | 1 | 0 | 0 | 0 | 0 table_1766451 | 30 | 189822 | 5 | 1754 | 13 | 13 | 8 | 0 table_1766396 | 48 | 278781 | 13 | 3185 | 24 | 24 | 12 | 0 table_1766391 | 74 | 200826 | 14 | 3271 | 25 | 25 | 11 | 0 table_1766446 | 36 | 504594 | 17 | 4881 | 38 | 38 | 21 | 0 table_1766426 | 149 | 2241719 | 319 | 55555 | 434 | 434 | 116 | 0 table_1766456 | 888 | 390657 | 331 | 637949 | 887 | 4983 | 4653 | 4096 table_1766399 | 596 | 732708 | 416 | 41876 | 327 | 327 | -89 | 0 (13 rows) The basic column definitions are: tupdu(MB) = avg_width * reltuples pgdu(MB) = relpages * 8K/page reldu(MB) = relation_size(tablename) (src/contrib/dbsize) leak1 = reldu - tupdu leak2 = reldu - pgdu Not sure how we ended up with a couple of cases where the number of mb on disk was less than the estimated size; maybe we had some deletions after the last update of pg_stats?
pgsql-general by date: