Thread: huge pgstat.stat file on PostgreSQL 8.3.24
Hi group,
We’ve found a huge pgstat.stat file on our production DB boxes; the size is over 100MB. Autovacuum is enabled. So my questions would be:
1. What’s a reasonable size of pgstat.stat file, can it be estimated?
2. What’s the safest way to reduce the file size to alleviate the IO impact on disk?
3. If we need to drop all statistics, would an “analyze DB” command be enough to eliminate the performance impact on queries?
Thanks,
Suya
Hello
The size of the stat file is related to the size of the database objects in the database. Depending on the PostgreSQL version, this file can be one per database cluster or one per database (from 9.3).
These statistics can be reset by calling pg_stat_reset():
http://www.postgresql.org/docs/9.2/static/monitoring-stats.html
Autovacuum on large stat files has significant overhead - this can be improved by using a newer PostgreSQL (9.3) and by moving the stats directory to a ramdisk, i.e. by setting stats_temp_directory to a directory on a ramdisk (tmpfs on Linux).
Regards
Pavel
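As a minimal sketch of what the reset looks like in practice (the reset requires superuser privileges; the follow-up query is only there to show the counters starting from zero again):

  -- reset the cumulative activity counters for the current database (superuser only)
  SELECT pg_stat_reset();

  -- the runtime-statistics views now count from zero again
  SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd
    FROM pg_stat_user_tables
   ORDER BY relname
   LIMIT 5;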
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: Thursday, June 19, 2014 3:28 PM
To: Huang, Suya
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] huge pgstat.stat file on PostgreSQL 8.3.24
Hello
The size of the stat file is related to the size of the database objects in the database. Depending on the PostgreSQL version, this file can be one per database cluster or one per database (from 9.3).
These statistics can be reset by calling pg_stat_reset(): http://www.postgresql.org/docs/9.2/static/monitoring-stats.html
Autovacuum on large stat files has significant overhead - this can be improved by using a newer PostgreSQL (9.3) and by moving the stats directory to a ramdisk, i.e. by setting stats_temp_directory to a directory on a ramdisk (tmpfs on Linux).
Regards
Pavel
2014-06-19 6:38 GMT+02:00 Huang, Suya <Suya.Huang@au.experian.com>:
Hi group,
We’ve found a huge pgstat.stat file on our production DB boxes; the size is over 100MB. Autovacuum is enabled. So my questions would be:
1. What’s a reasonable size of pgstat.stat file, can it be estimated?
2. What’s the safest way to reduce the file size to alleviate the IO impact on disk?
3. If we need to drop all statistics, would an “analyze DB” command be enough to eliminate the performance impact on queries?
Thanks,
Suya
Hi Pavel,
our version is 8.3.24, not 9.3. I also want to know the impact on the application caused by running pg_stat_reset - can that be mitigated by running an analyze database command?
Thanks,
Suya
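For context, the “analyze DB” command being discussed is simply ANALYZE run against the whole database; a minimal sketch (the table name in the second form is illustrative):

  -- re-gather planner statistics for every table in the current database
  ANALYZE;

  -- or for a single table, with progress output (table name is illustrative)
  ANALYZE VERBOSE my_table;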
On 19 June 2014, 7:35, Huang, Suya wrote:
> our version is 8.3.24, not 9.3. I also want to know the impact on the application caused by running pg_stat_reset - can that be mitigated by running an analyze database command?
Hi,
I really doubt you're on 8.3.24. The last version in the 8.3 branch is 8.3.23.
Running pg_stat_reset has no impact on planning queries. There are two kinds of statistics - those used for planning are stored within the database, not in the pgstat.stat file, and are not influenced by pg_stat_reset. The stats in pgstat.stat are 'runtime stats' used for monitoring etc., so you may see some disruption in your monitoring system. The ANALYZE command has nothing to do with the stats in pgstat.stat.
However, if you really have a pgstat.stat file this large, resetting it is only a temporary solution - it will grow back, possibly pretty quickly, depending on how often you access the objects.
Another option is to move the file to a tmpfs (ramdisk) partition. That eliminates the IO overhead, but it consumes more CPU (because the file still needs to be processed, and IO is no longer the bottleneck).
The other thing is that you should really start thinking about upgrading to a supported version. 8.3 has not received updates for more than a year (and won't).
Tomas
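As a sketch of the tmpfs suggestion: the stats_temp_directory setting exists from PostgreSQL 8.4 onward (so it would not help an 8.3 server directly), and the mount point shown is purely illustrative:

  -- in postgresql.conf, point the temporary stats file at a tmpfs mount, e.g.:
  --   stats_temp_directory = '/var/run/pgstat_tmp'
  -- then reload the configuration and confirm the new location:
  SELECT pg_reload_conf();
  SHOW stats_temp_directory;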
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: Thursday, June 19, 2014 3:41 PM
To: Huang, Suya
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] huge pgstat.stat file on PostgreSQL 8.3.24
2014-06-19 7:35 GMT+02:00 Huang, Suya <Suya.Huang@au.experian.com>:
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: Thursday, June 19, 2014 3:28 PM
To: Huang, Suya
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] huge pgstat.stat file on PostgreSQL 8.3.24
Hello
The size of the stat file is related to the size of the database objects in the database. Depending on the PostgreSQL version, this file can be one per database cluster or one per database (from 9.3).
These statistics can be reset by calling pg_stat_reset(): http://www.postgresql.org/docs/9.2/static/monitoring-stats.html
Autovacuum on large stat files has significant overhead - this can be improved by using a newer PostgreSQL (9.3) and by moving the stats directory to a ramdisk, i.e. by setting stats_temp_directory to a directory on a ramdisk (tmpfs on Linux).
Regards
Pavel
2014-06-19 6:38 GMT+02:00 Huang, Suya <Suya.Huang@au.experian.com>:
Hi group,
We’ve found a huge pgstat.stat file on our production DB boxes; the size is over 100MB. Autovacuum is enabled. So my questions would be:
1. What’s a reasonable size of pgstat.stat file, can it be estimated?
2. What’s the safest way to reduce the file size to alleviate the IO impact on disk?
3. If we need to drop all statistics, would an “analyze DB” command be enough to eliminate the performance impact on queries?
Thanks,
Suya
Hi Pavel,
our version is 8.3.24, not 9.3. I also want to know the impact on the application caused by running pg_stat_reset - can that be mitigated by running an analyze database command?
your version is too old - you can try resetting the statistics. The ANALYZE statement should not have a significant impact on these runtime statistics.
Pavel
Attention: PostgreSQL 8.3 is unsupported now
Thanks,
Suya
Thanks Pavel. To be more clear, what does "pg_stat_reset" really reset? The documentation says "Reset all statistics counters for the current database to zero (requires superuser privileges)". I thought it would reset all statistics of all tables/indexes, which is why I am thinking of re-running analyze on the database to gather statistics - because if tables/indexes don't have statistics, the query plans would be affected, which is not a good thing on a production box... I'm not so sure I understand the "runtime statistics" you mentioned here.
Thanks,
Suya
On 20 June 2014, 5:33, Pavel Stehule wrote:
> you are right - anyway, you can clean the contents of this directory - but if your database has a lot of database objects, your stat file will be back to its original size very quickly
>
> Pavel
No, he's not right.
Suya, as I wrote in my previous message, there are two kinds of statistics in PostgreSQL:
a) data distribution statistics
- histograms, MCV lists, number of distinct values, ...
- stored in regular tables
- used for planning
- collected by ANALYZE
- not influenced by pg_stat_reset() at all
b) runtime statistics
- number of scans for table/index, rows fetched from table/index, ...
- track activity within the database
- stored in the pgstat.stat file (or per-db files in the recent releases)
- used for monitoring, not for planning
- removed by pg_stat_reset()
So running pg_stat_reset will not hurt planning at all.
regards
Tomas
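To make the distinction concrete, a minimal sketch (the table name is illustrative): ANALYZE rebuilds the data distribution statistics exposed in pg_stats, while pg_stat_reset() only zeroes the activity counters in views such as pg_stat_user_tables:

  -- (a) data distribution statistics: written by ANALYZE, read by the planner
  ANALYZE my_table;
  SELECT attname, n_distinct, most_common_vals
    FROM pg_stats
   WHERE tablename = 'my_table';

  -- (b) runtime statistics: activity counters, zeroed by pg_stat_reset()
  SELECT seq_scan, idx_scan, n_tup_ins
    FROM pg_stat_user_tables
   WHERE relname = 'my_table';
  SELECT pg_stat_reset();  -- (a) stays intact, (b) goes back to zero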
-----Original Message-----
From: Tomas Vondra [mailto:tv@fuzzy.cz]
Sent: Friday, June 20, 2014 8:14 PM
To: Pavel Stehule
Cc: Huang, Suya; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] huge pgstat.stat file on PostgreSQL 8.3.24
Hi Tomas,
You're right, my DB version is 8.3.11 - I remembered the wrong version... We've got a new project using the latest version 9.3.4, and the old DB will be decommissioned in the future, so that's why the management people don't want to spend resources on upgrading, QA, etc.
I still have a question about why the file would become so big - is that related to the number of objects I have in the database?
Thanks again for your clear explanation of the two different kinds of statistics in PostgreSQL; it really helped a lot! I'm wondering if this should also be in the documentation, as it really confuses people... :)
Thanks,
Suya
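On that last question: the stats collector keeps roughly one entry per table and index that has seen activity, so the file size does scale with the number of objects. A rough way to gauge how many relations the current database could contribute (a sketch, counting ordinary tables, indexes and TOAST tables):

  SELECT count(*) AS tracked_relations
    FROM pg_class
   WHERE relkind IN ('r', 'i', 't');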