Re: Identifying diskspace leakage - Mailing list pgsql-general

From Ed L.
Subject Re: Identifying diskspace leakage
Date
Msg-id 200405171214.11626.pgsql@bluepolka.net
Whole thread Raw
In response to Re: Identifying diskspace leakage  ("Ed L." <pgsql@bluepolka.net>)
List pgsql-general
[reposting...original seems to have been lost in ether...]

Here's an attempt at a query to estimate diskspace leakage.  This
leakage might occur when max_fsm_pages and/or max_fsm_relations are
set too low.  Not sure which of the two approaches below (leak1 or
leak2) is more accurate?  Is there a better way via SQL?

The query uses the 'dbsize' project from contrib.  Dbsize has a
function called relation_size() which performs a 'stat' to get
actual disk usage for a database and/or table.  I use the column
pg_class.reltuples instead of actually counting rows because I
suspect that would essentially flush our OS cache of useful pages,
degrading performance.  This query assumes you're keeping stats
updated.

SELECT c.relname,
       SUM(s.avg_width) as width,
       CAST(c.reltuples as BIGINT) AS tuples,
       CAST(SUM(s.avg_width) * c.reltuples/1048576 AS INTEGER) AS tupdu,
       c.relpages AS pages,
       CAST(c.relpages * 8192 / 1048576 AS INTEGER) AS pgdu,
       relation_size(s.tablename)/1048576 AS reldu,
       CAST((relation_size(s.tablename)
           - SUM(s.avg_width) * c.reltuples)/1048576 AS INTEGER) AS leak1,
       CAST((relation_size(s.tablename)
           - c.relpages * 8192) / 1048576 AS INTEGER) AS leak2
FROM pg_stats s, pg_class c
WHERE c.relname NOT LIKE 'pg_%'
  AND c.relname = s.tablename
GROUP BY c.oid, s.tablename, c.reltuples, c.relpages, pgdu
ORDER BY tupdu;

    relname    | width | tuples  | tupdu | pages  | pgdu | reldu | leak1 |
leak2
---------------+-------+---------+-------+--------+------+-------+-------+-------
 table_1766485 |    27 |     198 |     0 |     12 |    0 |     0 |     0 |
0
 table_1766443 |   186 |       0 |     0 |   9317 |   72 |    72 |    73 |
0
 table_1766439 |    83 |       0 |     0 |     10 |    0 |     0 |     0 |
0
 table_1766435 |    27 |       0 |     0 |      0 |    0 |     0 |     0 |
0
 table_1766437 |    30 |       0 |     0 |      0 |    0 |     0 |     0 |
0
 table_1766421 |    23 |       2 |     0 |      1 |    0 |     0 |     0 |
0
 table_1766451 |    30 |  189822 |     5 |   1754 |   13 |    13 |     8 |
0
 table_1766396 |    48 |  278781 |    13 |   3185 |   24 |    24 |    12 |
0
 table_1766391 |    74 |  200826 |    14 |   3271 |   25 |    25 |    11 |
0
 table_1766446 |    36 |  504594 |    17 |   4881 |   38 |    38 |    21 |
0
 table_1766426 |   149 | 2241719 |   319 |  55555 |  434 |   434 |   116 |
0
 table_1766456 |   888 |  390657 |   331 | 637949 |  887 |  4983 |  4653 |
4096
 table_1766399 |   596 |  732708 |   416 |  41876 |  327 |   327 |   -89 |
0
(13 rows)

The basic column definitions are:

    tupdu(MB) = avg_width * reltuples
    pgdu(MB) = relpages * 8K/page
    reldu(MB) = relation_size(tablename) (src/contrib/dbsize)
    leak1 = reldu - tupdu
    leak2 = reldu - pgdu

Not sure how we ended up with a couple of cases where the number of
mb on disk was less than the estimated size; maybe we had some
deletions after the last update of pg_stats?



pgsql-general by date:

Previous
From: Jeff
Date:
Subject: Re: dbmirror
Next
From: Holger Marzen
Date:
Subject: Re: error messages (not valid postgresql link resoucre)