Re: walsender vs. XLogBackgroundFlush during shutdown - Mailing list pgsql-hackers

From:           Tomas Vondra
Subject:        Re: walsender vs. XLogBackgroundFlush during shutdown
Date:
Msg-id:         20190501150220.z27mt2y6gt3nssdf@development
In response to: Re: walsender vs. XLogBackgroundFlush during shutdown (Alexander Kukushkin <cyberdemn@gmail.com>)
Responses:      Re: walsender vs. XLogBackgroundFlush during shutdown
List:           pgsql-hackers

On Wed, May 01, 2019 at 02:13:10PM +0200, Alexander Kukushkin wrote:
>Hi Tomas,
>
>On Wed, 1 May 2019 at 02:28, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
>> I see there's an ongoing discussion about race conditions in walreceiver
>> blocking shutdown, so let me start a new thread about a race condition in
>> walsender during shutdown.
>>
>> The symptoms are rather simple - 'pg_ctl -m fast shutdown' gets stuck,
>> waiting for walsender processes to catch up and terminate indefinitely.
>
>I can confirm that. During the past couple of years we have observed such
>a problem a few times, and it is really annoying.
>
>> The reason for that is pretty simple - the walsenders are doing logical
>> decoding, and during shutdown they're waiting for WalSndCaughtUp=true.
>> But per XLogSendLogical() this only happens if
>>
>>     if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr())
>>     {
>>         WalSndCaughtUp = true;
>>         ...
>>     }
>
>After a couple of days of investigating and debugging I came to a
>slightly different conclusion: WalSndCaughtUp is set to true in my case.
>
>Since I am mostly a Python person, I decided to use psycopg2 for my
>investigation. I took the example from
>http://initd.org/psycopg/docs/advanced.html#logical-replication-quick-start
>as a starting point, created a slot and started the script.
>I wasn't running any transactions, therefore DemoConsumer.__call__ was
>never executed and cursor.send_feedback(flush_lsn=msg.data_start) was
>never called either. Basically, the only thing the psycopg2 internals
>were doing was periodically sending keepalive messages, or replying to
>keepalives sent by postgres.

OK, so that seems like a separate issue, somewhat unrelated to the issue
I reported. And I'm not sure it's a walsender issue - it seems it might
be a psycopg2 issue in not reporting the flush properly, no?

>Actually, the same problem will happen even when the consumer script
>does receive some messages, just not very intensively; it is simply a
>bit harder to reproduce.
>
>If you attach to the walsender with gdb, you'll see the following picture:
>
>(gdb) bt
>#0  0x00007fd6623d296a in __libc_send (fd=8, buf=0x55cb958dca08,
>    len=94, flags=0) at ../sysdeps/unix/sysv/linux/send.c:28
>#1  0x000055cb93aa7ce9 in secure_raw_write (port=0x55cb958d71e0,
>    ptr=0x55cb958dca08, len=94) at be-secure.c:318
>#2  0x000055cb93aa7b87 in secure_write (port=0x55cb958d71e0,
>    ptr=0x55cb958dca08, len=94) at be-secure.c:265
>#3  0x000055cb93ab6bf9 in internal_flush () at pqcomm.c:1433
>#4  0x000055cb93ab6b89 in socket_flush () at pqcomm.c:1409
>#5  0x000055cb93dac30b in send_message_to_frontend
>    (edata=0x55cb942b4380 <errordata>) at elog.c:3317
>#6  0x000055cb93da8973 in EmitErrorReport () at elog.c:1481
>#7  0x000055cb93da5abf in errfinish (dummy=0) at elog.c:481
>#8  0x000055cb93da852d in elog_finish (elevel=13, fmt=0x55cb93f32de3
>    "sending replication keepalive") at elog.c:1376
>#9  0x000055cb93bcae71 in WalSndKeepalive (requestReply=true) at
>    walsender.c:3358

Is it stuck in the send() call forever, or did you just happen to grab
this backtrace?

>
>All of the above probably looks like a brain dump, but I don't think it
>conflicts with Tomas's findings; it rather complements them.
>I am very glad that now I know how to mitigate the problem on the
>client side, but IMHO it is also very important to fix the server
>behavior, if that is at all possible.
>

I think having a report of an issue, with a way to reproduce it, is a
first (and quite important) step. So thanks for doing that.
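
To spell out why a missing flush confirmation can block a fast shutdown:
the walsender only exits once the client has acknowledged everything it
was sent. Roughly - this is a paraphrase of WalSndDone() in walsender.c
from the 11/12-era sources, not a verbatim quote, so check the actual
tree for details:

    /* Paraphrased sketch of walsender.c:WalSndDone(), not a verbatim quote. */
    static void
    WalSndDone(WalSndSendDataCallback send_data)
    {
        XLogRecPtr  replicatedPtr;

        /* make sure everything pending has been sent */
        send_data();

        /* use the client-reported flush position if valid, write otherwise */
        replicatedPtr = XLogRecPtrIsInvalid(MyWalSnd->flush) ?
            MyWalSnd->write : MyWalSnd->flush;

        if (WalSndCaughtUp && sentPtr == replicatedPtr &&
            !pq_is_send_pending())
        {
            /* everything sent and acknowledged - tell the client and exit */
            EndCommand("COPY 0", DestRemote);
            pq_flush();
            proc_exit(0);
        }

        /* not acknowledged yet - ask the client to reply, and keep waiting */
        if (!waiting_for_ping_response)
            WalSndKeepalive(true);
    }

So even with WalSndCaughtUp set, the walsender keeps looping and pinging
the client until the reported write/flush position reaches sentPtr, which
never happens if the client answers keepalives but never advances its
feedback.
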
That being said, I think those are two separate issues, with different
causes and likely different fixes. I don't think fixing the xlog flush
will resolve your issue, and vice versa.

FWIW attached is a patch that I used to reliably trigger the xlog flush
issue - it simply adds LogLogicalMessage() to the commit handler in the
built-in output plugin. So all you need to do is create a subscription,
start generating commits, and trigger a restart. The message is 8kB, so
it's definitely long enough to overflow to the next xlog page.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
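
The attached patch is not reproduced here. As a rough illustration of the
kind of change described in the mail - hypothetical, not the actual
attachment, with the commit callback name and the LogLogicalMessage()
signature assumed from the PG 12-era sources - it boils down to something
like this:

    /*
     * Hypothetical sketch of the change described in the mail, not the
     * actual attached patch: emit a large logical message from the commit
     * callback of the built-in output plugin (pgoutput), so that every
     * decoded commit writes an extra WAL record.
     */
    #include "postgres.h"

    #include "replication/logical.h"
    #include "replication/message.h"        /* LogLogicalMessage() */
    #include "replication/reorderbuffer.h"

    static void
    pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
                        XLogRecPtr commit_lsn)
    {
        /* ~8kB payload, enough to make the record spill onto the next xlog page */
        static const char filler[8192] = {0};

        /*
         * An 8kB record crossing a page boundary is what trips the
         * EndRecPtr >= GetFlushRecPtr() check quoted earlier in the thread.
         * Using a non-transactional message here is an assumption, not
         * necessarily what the actual patch does.
         */
        LogLogicalMessage("xlog-flush-test", filler, sizeof(filler), false);

        /* ... the rest of the original commit callback is unchanged ... */
    }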