Thread: Re: pgstat wait timeout (by Robert Schnabel)
Hi! We've got the same issue with PostgreSQL 8.4.1 on 32 bit Windows 2003 Server RC2. Every minute the message "pgstat wait timeout" appears in Event Viewer / Application Log. We've tried to restart the service. It was looking like the service stopped successfully, but two postgres.exe processes left running. One of them was utilizing one core. It had no tcp connections in any state. It was "C:/Program Files/PostgreSQL/8.4/bin/postgres.exe" "--forkcol" "1228" sysinternals' procexp.exe showes the following stack for this process: """ ntkrnlpa.exe+0x8dace ntkrnlpa.exe+0x29a62 ntkrnlpa.exe+0x33178 hal.dll+0x6199 hal.dll+0x63d9 hal.dll+0x6577 hal.dll+0x3902 ntkrnlpa.exe+0x89721 ntdll.dll!KiFastSystemCallRet mswsock.dll!ServiceMain+0x2949 mswsock.dll!NSPStartup+0x1893 WS2_32.dll!WSAUnhookBlockingHook+0x486 WS2_32.dll!WSAUnhookBlockingHook+0x660 WS2_32.dll!WSAEventSelect+0x2f postgres.exe!mdpostckpt+0x181b38 postgres.exe!mdpostckpt+0x18bbdb """ After it was killed the following appeared in the application log: LOG: statistics collector process (PID 2680) exited with exit code 1 The service was started again and was working for aprox. 18 hours with no errors. Then the message started appearing again. The plan is to try PG 8.4.2. Advices and questions are welcome ;) -- Alexey sms stands for save my soul
Alexey Luchko wrote: > We've tried to restart the service. It was looking like the service > stopped successfully, but two postgres.exe processes left running. > One of them was utilizing one core. It had no tcp connections in any > state. It was > "C:/Program Files/PostgreSQL/8.4/bin/postgres.exe" "--forkcol" "1228" > > sysinternals' procexp.exe showes the following stack for this process: > """ > ntkrnlpa.exe+0x8dace > ntkrnlpa.exe+0x29a62 > ntkrnlpa.exe+0x33178 > hal.dll+0x6199 > hal.dll+0x63d9 > hal.dll+0x6577 > hal.dll+0x3902 > ntkrnlpa.exe+0x89721 > ntdll.dll!KiFastSystemCallRet > mswsock.dll!ServiceMain+0x2949 > mswsock.dll!NSPStartup+0x1893 > WS2_32.dll!WSAUnhookBlockingHook+0x486 > WS2_32.dll!WSAUnhookBlockingHook+0x660 > WS2_32.dll!WSAEventSelect+0x2f > postgres.exe!mdpostckpt+0x181b38 > postgres.exe!mdpostckpt+0x18bbdb > """ This stack trace seems bogus. The stats collector should not be calling mdpostckpt at all. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On 16-12-2009 16:28, Alvaro Herrera wrote: > Alexey Luchko wrote: > >> We've tried to restart the service. It was looking like the service >> stopped successfully, but two postgres.exe processes left running. >> One of them was utilizing one core. It had no tcp connections in any >> state. It was >> "C:/Program Files/PostgreSQL/8.4/bin/postgres.exe" "--forkcol" "1228" >> >> sysinternals' procexp.exe showes the following stack for this process: >> """ >> ntkrnlpa.exe+0x8dace >> ntkrnlpa.exe+0x29a62 >> ntkrnlpa.exe+0x33178 >> hal.dll+0x6199 >> hal.dll+0x63d9 >> hal.dll+0x6577 >> hal.dll+0x3902 >> ntkrnlpa.exe+0x89721 >> ntdll.dll!KiFastSystemCallRet >> mswsock.dll!ServiceMain+0x2949 >> mswsock.dll!NSPStartup+0x1893 >> WS2_32.dll!WSAUnhookBlockingHook+0x486 >> WS2_32.dll!WSAUnhookBlockingHook+0x660 >> WS2_32.dll!WSAEventSelect+0x2f >> postgres.exe!mdpostckpt+0x181b38 >> postgres.exe!mdpostckpt+0x18bbdb >> """ > > This stack trace seems bogus. The stats collector should not be calling > mdpostckpt at all. It seems that the process left and the stack trace is from the process waiting for the stats collector. -- Alexey
Alvaro Herrera <alvherre@commandprompt.com> writes: > Alexey Luchko wrote: >> postgres.exe!mdpostckpt+0x181b38 >> postgres.exe!mdpostckpt+0x18bbdb > This stack trace seems bogus. The stats collector should not be calling > mdpostckpt at all. Given the size of the offsets, it's pretty clear that it's not actually in mdpostckpt at all. Probably this is a stripped executable and mdpostckpt is just the nearest global symbol. IOW, not necessarily bogus, but still useless :-( regards, tom lane
On Wed, Dec 16, 2009 at 16:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Alexey Luchko wrote: >>> postgres.exe!mdpostckpt+0x181b38 >>> postgres.exe!mdpostckpt+0x18bbdb > >> This stack trace seems bogus. =A0The stats collector should not be calli= ng >> mdpostckpt at all. > > Given the size of the offsets, it's pretty clear that it's not actually > in mdpostckpt at all. =A0Probably this is a stripped executable and > mdpostckpt is just the nearest global symbol. If that is so, the OP should follow: http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreS= QL_backend_on_Windows specifically the part about symbols. Which it certainly looks like :) --=20 Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Dec 16, 2009 at 10:45 AM, Magnus Hagander <magnus@hagander.net> wro= te: > On Wed, Dec 16, 2009 at 16:12, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Alvaro Herrera <alvherre@commandprompt.com> writes: >>> Alexey Luchko wrote: >>>> postgres.exe!mdpostckpt+0x181b38 >>>> postgres.exe!mdpostckpt+0x18bbdb >> >>> This stack trace seems bogus. =A0The stats collector should not be call= ing >>> mdpostckpt at all. >> >> Given the size of the offsets, it's pretty clear that it's not actually >> in mdpostckpt at all. =A0Probably this is a stripped executable and >> mdpostckpt is just the nearest global symbol. > > If that is so, the OP should follow: > http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postgr= eSQL_backend_on_Windows > > specifically the part about symbols. > > Which it certainly looks like :) We should really take some of the vast quantity of really useful information that is in the wiki and try to index it somehow or incorporate it into the docs. I'm always learning about new pages that have good stuff on them, but I never seem to stumble across them organically. ...Robert
Robert Haas escribió: > We should really take some of the vast quantity of really useful > information that is in the wiki and try to index it somehow or > incorporate it into the docs. I'm always learning about new pages > that have good stuff on them, but I never seem to stumble across them > organically. Maybe we should have a new wiki category, say "pages", that groups all the existant pages in the wiki. That way you could browse it and find everything you'd ever want. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Wed, Dec 16, 2009 at 11:05 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Robert Haas escribi=F3: > >> We should really take some of the vast quantity of really useful >> information that is in the wiki and try to index it somehow or >> incorporate it into the docs. =A0I'm always learning about new pages >> that have good stuff on them, but I never seem to stumble across them >> organically. > > Maybe we should have a new wiki category, say "pages", that groups all > the existant pages in the wiki. =A0That way you could browse it and find > everything you'd ever want. Unfortunately, I'd likely find everything I didn't want, too. :-( ...Robert
On 16-12-2009 17:45, Magnus Hagander wrote: > If that is so, the OP should follow: > http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows > > specifically the part about symbols. > > Which it certainly looks like :) How can I find the stats collector's process? What are others processes of interest? Note. The problem appeared in a production system. It's preferably not to install lot's of additional software. Thanks in advance! -- Alexey sms stands for save my soul
On Wed, Dec 16, 2009 at 17:43, Alexey Luchko <luch@ank-sia.com> wrote: > On 16-12-2009 17:45, Magnus Hagander wrote: >> >> If that is so, the OP should follow: >> >> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows >> >> specifically the part about symbols. >> >> Which it certainly looks like :) > > How can I find the stats collector's process? Process explorer will let you see a Mutex named with a description of the process. Try that. > Note. The problem appeared in a production system. > It's preferably not to install lot's of additional software. The symbols aren't additional software, they are a part of postgresql. Something like process explorer or windbg is necessary though (but it's from microsoft, if that helps), because Windows doesn't ship with any usable debugging tools. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On 16/12/2009 11:55 PM, Robert Haas wrote: > On Wed, Dec 16, 2009 at 10:45 AM, Magnus Hagander<magnus@hagander.net> wrote: >> On Wed, Dec 16, 2009 at 16:12, Tom Lane<tgl@sss.pgh.pa.us> wrote: >>> Alvaro Herrera<alvherre@commandprompt.com> writes: >>>> Alexey Luchko wrote: >>>>> postgres.exe!mdpostckpt+0x181b38 >>>>> postgres.exe!mdpostckpt+0x18bbdb >>> >>>> This stack trace seems bogus. The stats collector should not be calling >>>> mdpostckpt at all. >>> >>> Given the size of the offsets, it's pretty clear that it's not actually >>> in mdpostckpt at all. Probably this is a stripped executable and >>> mdpostckpt is just the nearest global symbol. >> >> If that is so, the OP should follow: >> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows >> >> specifically the part about symbols. >> >> Which it certainly looks like :) > > We should really take some of the vast quantity of really useful > information that is in the wiki and try to index it somehow or > incorporate it into the docs. I'm always learning about new pages > that have good stuff on them, but I never seem to stumble across them > organically. MediaWiki offers features to make this a lot easier. The categories feature in particular is great: go to the page: Category:Blah and you'll see a listing of all pages containing a link to [[Category:Blah]]. The existing list of all categories can be found via the "special pages" link in the sidebar: http://wiki.postgresql.org/wiki/Special:Categories A "favourite" or "highlight" category would be one way to do what you're talking about. -- Craig Ringer
On Wed, Dec 16, 2009 at 9:03 PM, Craig Ringer <craig@postnewspapers.com.au> wrote: > On 16/12/2009 11:55 PM, Robert Haas wrote: >> >> On Wed, Dec 16, 2009 at 10:45 AM, Magnus Hagander<magnus@hagander.net> >> =A0wrote: >>> >>> On Wed, Dec 16, 2009 at 16:12, Tom Lane<tgl@sss.pgh.pa.us> =A0wrote: >>>> >>>> Alvaro Herrera<alvherre@commandprompt.com> =A0writes: >>>>> >>>>> Alexey Luchko wrote: >>>>>> >>>>>> postgres.exe!mdpostckpt+0x181b38 >>>>>> postgres.exe!mdpostckpt+0x18bbdb >>>> >>>>> This stack trace seems bogus. =A0The stats collector should not be >>>>> calling >>>>> mdpostckpt at all. >>>> >>>> Given the size of the offsets, it's pretty clear that it's not actually >>>> in mdpostckpt at all. =A0Probably this is a stripped executable and >>>> mdpostckpt is just the nearest global symbol. >>> >>> If that is so, the OP should follow: >>> >>> http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Post= greSQL_backend_on_Windows >>> >>> specifically the part about symbols. >>> >>> Which it certainly looks like :) >> >> We should really take some of the vast quantity of really useful >> information that is in the wiki and try to index it somehow or >> incorporate it into the docs. =A0I'm always learning about new pages >> that have good stuff on them, but I never seem to stumble across them >> organically. > > MediaWiki offers features to make this a lot easier. The categories featu= re > in particular is great: go to the page: > > =A0 Category:Blah > > and you'll see a listing of all pages containing a link to > [[Category:Blah]]. The existing list of all categories can be found via t= he > "special pages" link in the sidebar: > > =A0http://wiki.postgresql.org/wiki/Special:Categories > > A "favourite" or "highlight" category would be one way to do what you're > talking about. I don't think that's it exactly. What I think would be really useful is a page that is linked off the main page or some other reasonably prominent place that lists useful resources by topic. I guess the category interface could do that if it were maintained, but just "which pages are good" is not quite what I had in mind. ...Robert
Hi! On 16-12-2009 17:45, Magnus Hagander wrote: > http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows Thanks, it helps! Yesterday evening Windows on the server were restarted. Today the messages still appear, but not so regularly as before. There was one message per minute. Now the rate is lower. While I was looking at it the WARNING: pgstat wait timeout appeared at ~16:48, 16:54, 17:10, 17:15, 17:16. Then I disconnected. I've found the stats collector process by the stack trace: "C:/Program Files/PostgreSQL/8.4/bin/postgres.exe" "--forkcol" "1228" 17:00 """ ntkrnlpa.exe!ExAllocatePoolWithTag+0xfa0 ntkrnlpa.exe!KeInitializeMutant+0x1c ntkrnlpa.exe!MiFlushSectionInternal+0x89c ntkrnlpa.exe!MiFlushSectionInternal+0xba4 ntkrnlpa.exe!KeI386Call16BitCStyleFunction+0x55 ntkrnlpa.exe!PopFlushVolumeWorker+0x12e ntkrnlpa.exe!PopFlushVolumes+0x72 ntkrnlpa.exe!MiReserveSystemPtes+0x1ca ntdll.dll!KiFastSystemCall+0x3 ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForMultipleObjects+0xc kernel32.dll!WaitForMultipleObjectsEx+0x11a postgres.exe!pgwin32_waitforsinglesocket+0x1f0 postgres.exe!PgstatCollectorMain+0x175 postgres.exe!SubPostmasterMain+0x352 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ 17:12 """ ntkrnlpa.exe!KiSwapContext+0x26 ntkrnlpa.exe!KiSwapThread+0x2e5 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!KiSuspendThread+0x18 ntkrnlpa.exe!KiDeliverApc+0x117 ntkrnlpa.exe!KiSwapThread+0x300 ntkrnlpa.exe!KeWaitForMultipleObjects+0x3d7 ntkrnlpa.exe!ObpWaitForMultipleObjects+0x202 ntkrnlpa.exe!NtWaitForMultipleObjects+0xc8 ntkrnlpa.exe!KiFastCallEntry+0xfc ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForMultipleObjects+0xc kernel32.dll!WaitForMultipleObjectsEx+0x11a postgres.exe!pgwin32_waitforsinglesocket+0x1f0 postgres.exe!PgstatCollectorMain+0x175 postgres.exe!SubPostmasterMain+0x352 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ 17:18 """ ntkrnlpa.exe!KiSwapContext+0x26 ntkrnlpa.exe!KiSwapThread+0x2e5 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!KiSuspendThread+0x18 ntkrnlpa.exe!KiDeliverApc+0x117 ntkrnlpa.exe!KiSwapThread+0x300 ntkrnlpa.exe!KeWaitForMultipleObjects+0x3d7 ntkrnlpa.exe!ObpWaitForMultipleObjects+0x202 ntkrnlpa.exe!NtWaitForMultipleObjects+0xc8 ntkrnlpa.exe!KiFastCallEntry+0xfc ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForMultipleObjects+0xc kernel32.dll!WaitForMultipleObjectsEx+0x11a postgres.exe!pgwin32_waitforsinglesocket+0x1f0 postgres.exe!PgstatCollectorMain+0x175 postgres.exe!SubPostmasterMain+0x352 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ And autovacuum launcher: "C:/Program Files/PostgreSQL/8.4/bin/postgres.exe" "--forkavlauncher" "1232" """ ntkrnlpa.exe!ExAllocatePoolWithTag+0xfa0 ntkrnlpa.exe!KeInitializeMutant+0x1c ntkrnlpa.exe!MiFlushSectionInternal+0x89c ntkrnlpa.exe!MiFlushSectionInternal+0xba4 ntkrnlpa.exe!KeInitializeMutant+0x1c ntkrnlpa.exe!PoGetLightestSystemStateForEject+0x132 ntkrnlpa.exe!MiReserveSystemPtes+0x1ca ntdll.dll!KiFastSystemCall+0x3 ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForSingleObject+0xc kernel32.dll!WaitForSingleObjectEx+0xac kernel32.dll!WaitForSingleObject+0x12 postgres.exe!pg_usleep+0x36 postgres.exe!AutoVacLauncherMain+0x24c postgres.exe!SubPostmasterMain+0x2b5 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ 17:12 """ ntkrnlpa.exe!KiSwapContext+0x26 ntkrnlpa.exe!KiSwapThread+0x2e5 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!KiSuspendThread+0x18 ntkrnlpa.exe!KiDeliverApc+0x117 ntkrnlpa.exe!KiSwapThread+0x300 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!NtWaitForSingleObject+0x9a ntkrnlpa.exe!KiFastCallEntry+0xfc ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForSingleObject+0xc kernel32.dll!WaitForSingleObjectEx+0xac kernel32.dll!WaitForSingleObject+0x12 postgres.exe!pg_usleep+0x36 postgres.exe!AutoVacLauncherMain+0x24c postgres.exe!SubPostmasterMain+0x2b5 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ 17:17 """ ntkrnlpa.exe!KiSwapContext+0x26 ntkrnlpa.exe!KiSwapThread+0x2e5 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!KiSuspendThread+0x18 ntkrnlpa.exe!KiDeliverApc+0x117 ntkrnlpa.exe!KiSwapThread+0x300 ntkrnlpa.exe!KeWaitForSingleObject+0x346 ntkrnlpa.exe!NtWaitForSingleObject+0x9a ntkrnlpa.exe!KiFastCallEntry+0xfc ntdll.dll!KiFastSystemCallRet ntdll.dll!NtWaitForSingleObject+0xc kernel32.dll!WaitForSingleObjectEx+0xac kernel32.dll!WaitForSingleObject+0x12 postgres.exe!pg_usleep+0x36 postgres.exe!AutoVacLauncherMain+0x24c postgres.exe!SubPostmasterMain+0x2b5 postgres.exe!main+0x177 postgres.exe!__tmainCRTStartup+0x10f kernel32.dll!BaseProcessStart+0x23 """ Process Explorer showed the following for both processes in process properties / threads: """ postgres.exe!mainCRTStartup biolsp.dll!GetLspGuid+0xc00 biolsp.dll!GetLspGuid+0xc00 (second time) WS2HELP.dll!ApcThread postgres.exe!pg signal_thread """ Please, ask if one needs more details. -- Alexey sms stands for save my soul