Re: pgstat wait timeout (RE: contrib/cache_scan) - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: pgstat wait timeout (RE: contrib/cache_scan)
Msg-id CAMkU=1wS_UPTFzXJ2SCi8Z84Uu1AuO=N=MTz5j0EFe+=s7ANGg@mail.gmail.com
In response to Re: pgstat wait timeout (RE: contrib/cache_scan)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pgstat wait timeout (RE: contrib/cache_scan)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Kouhei Kaigai <kaigai@ak.jp.nec.com> writes:
> > WARNING:  pgstat wait timeout
> > WARNING:  pgstat wait timeout
> > WARNING:  pgstat wait timeout
> > WARNING:  pgstat wait timeout

> > Once I got above messages, write performance is dramatically
> > degraded, even though I didn't take detailed investigation.

> > I could reproduce it on the latest master branch without my
> > enhancement, so I guess it is not a problem something special
> > to me.
> > One other strangeness is, right now, this problem is only
> > happen on my virtual machine environment - VMware ESXi 5.5.0.
> > I couldn't reproduce the problem on my physical environment
> > (Fedora20, core i5-4570S).

> We've seen sporadic reports of that sort of behavior for years, but no
> developer has ever been able to reproduce it reliably.  Now that you've
> got a reproducible case, do you want to poke into it and see what's going
> on?

I didn't know we were trying to reproduce it, nor that it was a mystery.  Do anything that causes serious IO constipation, and you will probably see that message.  For example, turn off synchronous_commit and run the default pgbench transaction at a scale that is large but still fits comfortably in RAM, then wait for a checkpoint sync phase to kick in.
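
A rough sketch of that recipe, in case it helps anyone set it up.  It only builds the pgbench command lines and does not run them.  Everything specific in it is an assumption rather than something from the report: the database name "bench", the client count and duration, the target of about half of RAM, and especially the figure of roughly 15 MB of data per unit of pgbench scale factor, which is only a ballpark.  synchronous_commit is turned off per session through the libpq PGOPTIONS environment variable.

```python
import os
import shlex

# Ballpark assumption: a pgbench database occupies very roughly 15 MB per
# unit of scale factor. Check the real size on your system before relying on it.
APPROX_MB_PER_SCALE_UNIT = 15


def pick_scale(ram_mb, fraction=0.5):
    """Return a scale factor whose data set should take about `fraction` of RAM."""
    return max(1, int(ram_mb * fraction / APPROX_MB_PER_SCALE_UNIT))


def build_commands(dbname, scale, clients=8, seconds=600):
    """Build (initialize, run) pgbench command lines as argument lists.

    The run command uses pgbench's default transaction script.
    The client count and duration are arbitrary illustrative values.
    """
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    run = ["pgbench", "-c", str(clients), "-j", str(clients),
           "-T", str(seconds), dbname]
    return init, run


def session_env():
    """Copy the environment and turn synchronous_commit off for new sessions."""
    env = dict(os.environ)
    env["PGOPTIONS"] = "-c synchronous_commit=off"
    return env


if __name__ == "__main__":
    # Hypothetical machine with 16 GB of RAM and a database named "bench".
    scale = pick_scale(ram_mb=16384)
    for cmd in build_commands("bench", scale):
        print("PGOPTIONS='-c synchronous_commit=off'", shlex.join(cmd))
```

To actually run it, pass each argument list to subprocess.run() with env=session_env(), let the load run long enough to span at least one checkpoint, and watch the server log for the warning.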

The pgstat wait timeout is a symptom, not the cause.

Cheers,

Jeff
