Thread: WARNING: pgstat wait timeout
Hi all!
First of all, thank you for this awesome database! We have been using PG on Linux
in our projects for 2 years now and are very happy with the performance and observability
features of the database.
Recently we deployed PG 9.0.5-x64 on Windows Server 2008 R2 and noticed very strange behavior:
some time after the DB is started, it begins logging many 'pgstat wait timeout' messages.
We monitor the IO, and it never goes higher than 10MB/s, whereas the total throughput of DB disks is ~200MB/s.
I wouldn't worry about it, but performance degrades soon after the messages start showing up.
After some digging around I found that the reason is that $PGDATA/pg_stat_tmp/pgstat.stat stops updating.
The workaround for me was to SIGHUP the DB -- pgstat.stat starts updating and the warning stops.
Is this expected behavior?
If this is a bug, what information would be helpful to include in the bug report?
Thanks!
Tair Sabirgaliev
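
The symptom described above (pgstat.stat no longer being rewritten) can be watched for by checking the file's modification time. The following is only a minimal sketch: the function names and the 10-second threshold are assumptions made for illustration, not anything PostgreSQL defines. As far as I know, the file is rewritten only when some backend or autovacuum asks for fresh statistics, so a completely idle server can look stale without being broken.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Return the number of seconds since `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stale(path, threshold_s=10.0, now=None):
    """Return True if `path` was not rewritten within `threshold_s` seconds.

    The 10-second default is an arbitrary illustrative value (assumption).
    """
    return seconds_since_update(path, now) > threshold_s
```

Called as `is_stale(os.path.join(pgdata, "pg_stat_tmp", "pgstat.stat"))`, where `pgdata` is the data directory, this could be polled from a monitoring job and compared against the times at which the warning appears in the server log.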

On Thu, 3 Nov 2011 00:05:58 +0600
Таир Сабыргалиев <tair.sabirgaliev@bee.kz> wrote:

> We monitor the IO, and it never goes higher than 10MB/s, whereas the total
> throughput of DB disks is ~200MB/s.

This isn't normal: it should be around 190MB/s.

...

> If this is a bug what information will be helpful in this particular
> situation to include in the bug-report?

This is a w$ feature.

--
When a girl marries she exchanges the attentions of many men for the
inattentions of one.
        -- Helen Rowland

Sorry for replying to my own message! I'm a complete novice, not only with PG
but with mailing lists too.

> On Thu, 3 Nov 2011 00:05:58 +0600
> Таир Сабыргалиев <tair(dot)sabirgaliev(at)bee(dot)kz> wrote:
>
>> We monitor the IO, and it never goes higher than 10MB/s, whereas the total
>> throughput of DB disks is ~200MB/s.
>
> This isn't normal: it should be around 190MB/s.

Do you mean that my real throughput is actually lower than what I've measured?
Anyway, I don't think the warning is a result of too high IO.

> ...
>> If this is a bug what information will be helpful in this particular
>> situation to include in the bug-report?
>
> This is a w$ feature.

On Fri, 4 Nov 2011 16:49:02 +0600
Tair Sabirgaliev <tair.sabirgaliev@bee.kz> wrote:

> Sorry for replying to my own message! I'm very novice not only in PG
> but in using
> mailing-lists also..

Everybody needs a beginning :)

> >> We monitor the IO, and it never goes higher than 10MB/s, whereas the total
> >> throughput of DB disks is ~200MB/s.
> >
> > This isn't normal: it should be around 190MB/s.
>
> Do you mean that my real throughput is actually lower that what I've measured?
> Anyway I don't think the warning is a result of too high IO

No, I was only being ironic (toward w$). The problem you face isn't very easy to
fix because w$ lacks the usual *nix tools.
You should search the web for equivalent tools (iotop, system I/O analysis, etc.) in
order to identify which program(s) are creating this disk traffic.

As a first step, you could take a look at taskmgr: maybe the program is using
some CPU and you'll be able to identify it while it writes to the disk.

--
"I'd love to go out with you, but I'm converting my calendar watch from
Julian to Gregorian."

On Fri, Nov 4, 2011 at 8:09 PM, Jean-Yves F. Barbier <12ukwn@gmail.com> wrote:

> No, I was only ironic (toward w$) - the problem you face isn't very easy to
> fix because w$ lacks *nix usual tools.
> You should search the web for such tools (iotop, analyse system i/o, etc) in
> order to be able to identify which program(s) is creating this disk flow.
>
> At first you could take a look into taskmgr: may be the program is using
> some CPU resource and you'll be able to identify it while it writes to the disk.

Thanks! That's indeed how we found that the system's overall disk IO never
exceeds 10MB/s at peak times.

The server is a 32-core Xeon X7550 with 64GB RAM; storage is 140GB internal SAS
+ 1TB FC SAN, all dedicated to PG only.

postgresql.conf modifications:

    max_connections = 500
    effective_cache_size = 32GB
    maintenance_work_mem = 64MB
    shared_buffers = 512MB
    temp_buffers = 16MB
    work_mem = 8MB
    shared_preload_libraries = $libdir/pg_stat_statements
    checkpoint_segments = 30

We also used SQLIO to do some benchmarking. I'm no expert; I chose SQLIO
because it was simple and didn't need any DB setup.
The problem is that there is no SQLIO guideline specific to PG, so I'm not
sure my results are valid at all :)

Here are my SQLIO results for writing 8kB blocks using 32 threads, each thread
writing sequentially to its own file for 60 seconds:

    $ sqlio.exe -kW -t32 -s60 -b8 -fsequential -Ffiles32.txt
    .. snip initialization ..
    CUMULATIVE DATA:
    throughput metrics:
    IOs/sec: 10031.31
    MBs/sec:    78.36

These results led me to believe that the problem is not disk IO.

--
Best regards,
Tair Sabirgaliev
"BEE Software" Ltd.
Republic of Kazakhstan, 010000 Astana, Sarayshyk str. 34, sect. 27
Tel.: +7 (7172) 56-89-31
Mob.: +7 (702) 2173359
e-mail: tair.sabirgaliev@bee.kz
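
As a sanity check on the SQLIO figures above, the reported throughput should be roughly the IO rate multiplied by the block size. The sketch below redoes that arithmetic. The function names are hypothetical, and the 1024 KB-per-MB convention is an assumption inferred from the numbers, not something stated in the thread.

```python
def throughput_mb_per_s(ios_per_sec, block_kb):
    """Convert an IO rate and a block size in KB into MB/s (1 MB = 1024 KB, assumed)."""
    return ios_per_sec * block_kb / 1024.0


def utilisation(observed_mb_per_s, capacity_mb_per_s):
    """Fraction of the available throughput that is actually in use."""
    return observed_mb_per_s / capacity_mb_per_s


# Figures quoted in the thread: 10031.31 IOs/sec at 8 KB blocks,
# ~10 MB/s observed under load against ~200 MB/s of total disk throughput.
benchmark = throughput_mb_per_s(10031.31, 8)
share = utilisation(10.0, 200.0)
print(f"benchmark: {benchmark:.2f} MB/s, production load: {share:.0%} of disk capacity")
```

The computed benchmark value comes out at about 78.37 MB/s, which agrees with the 78.36 MB/s that SQLIO reported, and the observed 10MB/s is about 5% of the stated ~200MB/s. Both are consistent with the poster's conclusion that the disks are not saturated.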