Thread: BUG #8701: recover process hang on slave
The following bug has been logged on the website:

Bug reference: 8701
Logged by: amutu
Email address: amutu@amutu.com
PostgreSQL version: 9.1.9
Operating system: CentOS 6 x86-64
Description:

We have a master and two streaming slave PostgreSQL servers. We found that the replay_location of one slave is far behind the other.

For both slaves, sent_location is BF1/921F6000, and write_location and flush_location are similar. However, replay_location is BF1/9210DD10 on one server and 6DE/D958E8 on the other.

On the abnormal server, top shows a postgres process replaying WAL file 00000001000006DE00000000, and that process uses 100% of a CPU core.

I tried to restart the slave, but it failed. I got a core of the process; it shows:

Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `postgres: startup process recovering 00000001000006DE00000000'.
#0 0x00000000006264e8 in smgrclose ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.49.tl1.x86_64
(gdb) bt
#0 0x00000000006264e8 in smgrclose ()
#1 0x00000000006265c8 in smgrcloseall ()
#2 0x0000000000495322 in XLogDropDatabase ()
#3 0x0000000000516253 in dbase_redo ()
#4 0x0000000000492d40 in StartupXLOG ()
#5 0x0000000000495148 in StartupProcessMain ()
#6 0x00000000004ac26f in AuxiliaryProcessMain ()
#7 0x00000000005eb383 in StartChildProcess ()
#8 0x00000000005ef3dc in PostmasterMain ()
#9 0x0000000000590fe8 in main ()

amutu@amutu.com wrote:

> We have a master and two streaming slave PostgreSQL servers. We found that
> the replay_location of one slave is far behind the other.
>
> For both slaves, sent_location is BF1/921F6000, and write_location and
> flush_location are similar. However, replay_location is BF1/9210DD10 on one
> server and 6DE/D958E8 on the other.
>
> On the abnormal server, top shows a postgres process replaying WAL file
> 00000001000006DE00000000, and that process uses 100% of a CPU core.

Perhaps you can try running pg_xlogdump on the offending pg_xlog file?

-- 
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
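
To double-check which pg_xlog file the stalled replay_location falls in, the location can be mapped to a WAL segment file name. The following is only a minimal sketch, not PostgreSQL code: the function name wal_file_for_location is made up for illustration, and it assumes the default 16 MB segment size and timeline 1 (the 00000001 prefix of the file name in the report).

```python
# Assumption: default WAL segment size of 16 MB (not stated in the report).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_for_location(location: str, timeline: int = 1) -> str:
    """Map an 'X/Y' WAL location to its segment file name.

    Hypothetical helper for illustration. The file name is three 8-digit
    hex fields: timeline, log id, and segment number within that log.
    """
    log_hex, offset_hex = location.split("/")
    log_id = int(log_hex, 16)
    offset = int(offset_hex, 16)
    # Each segment covers WAL_SEGMENT_SIZE bytes of the log file.
    segment = offset // WAL_SEGMENT_SIZE
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


if __name__ == "__main__":
    # replay_location values taken from the bug report
    print(wal_file_for_location("6DE/D958E8"))
    print(wal_file_for_location("BF1/9210DD10"))
```

Under these assumptions, 6DE/D958E8 maps to 00000001000006DE00000000, which is the file the startup process is reported to be recovering.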

Hi,

On Wed, Dec 25, 2013 at 6:47 PM, <amutu@amutu.com> wrote:
> PostgreSQL version: 9.1.9

The latest minor release, 9.1.11, contains a number of important fixes affecting replication, so I suggest upgrading first, as soon as possible.

http://www.postgresql.org/docs/current/static/release-9-1-11.html
http://www.databasesoup.com/2013/12/why-you-need-to-apply-todays-update.html

> I tried to restart the slave, but it failed.

If that does not help, look at the thread below. It might be the same case.

http://www.postgresql.org/message-id/flat/E1VtTni-00082E-Jv@wrigleys.postgresql.org

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com