Hi,
So, we have this situation, where there is a cluster with 5 smallish
databases:
$ select oid, pg_database_size(oid) from pg_database;
  oid  | pg_database_size
-------+------------------
     1 |          6752440
 12035 |          6760632
 16428 |      59779475640
 16427 |        294947000
 12030 |          6455812
(5 rows)
But database 16428 has quite a lot of objects (query run while connected to it):
$ select count(*) from pg_class;
  count
---------
 1032761
(1 row)
This is reflected in the stats files:
# ls -l $( su -c "psql -qAtX -c 'show stats_temp_directory'" - postgres )
total 127452
-rw------- 1 postgres postgres 4230 Sep 9 12:02 db_0.stat
-rw------- 1 postgres postgres 20792 Sep 9 12:02 db_12035.stat
-rw------- 1 postgres postgres 30932 Sep 9 12:02 db_16427.stat
-rw------- 1 postgres postgres 130413431 Sep 9 12:03 db_16428.stat
-rw------- 1 postgres postgres 20792 Sep 9 12:02 db_1.stat
-rw------- 1 postgres postgres 1026 Sep 9 12:03 global.stat
This directory is on tmpfs (ramdisk).
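For context, such a setup typically looks something like this (mount point,
size, and paths here are assumptions for illustration, not taken from the
cluster above):

```shell
# /etc/fstab -- a small ramdisk dedicated to the stats temp files:
#   tmpfs  /var/run/pg_stats_tmp  tmpfs  size=256M,uid=postgres,gid=postgres,mode=0700  0 0
# postgresql.conf:
#   stats_temp_directory = '/var/run/pg_stats_tmp'
```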
And getting any kind of stats takes non-trivial time:
$ explain analyze SELECT sum(xact_commit+xact_rollback) as txs FROM pg_stat_database;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Aggregate (cost=1.12..1.13 rows=1 width=4) (actual time=460.917..460.918 rows=1 loops=1)
-> Seq Scan on pg_database d (cost=0.00..1.06 rows=6 width=4) (actual time=0.003..0.015 rows=5 loops=1)
Total runtime: 460.946 ms
(3 rows)
This is repeatable, and a quick strace shows that, when dealing with stats, Pg
apparently has to read each stats file in full, parse it, and only then return
results.
Is there anything that could be done, aside from dropping 90% of the objects,
to make stats-related queries faster?
Best regards,
depesz
--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/