Re: Slow client can delay replication despite max_standby_streaming_delay set - Mailing list pgsql-hackers
From: Fujii Masao
Subject: Re: Slow client can delay replication despite max_standby_streaming_delay set
Date:
Msg-id: ce425978-4c10-c534-68e5-c06d8c8e8647@oss.nttdata.com
In response to: Slow client can delay replication despite max_standby_streaming_delay set (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses: Re: Slow client can delay replication despite max_standby_streaming_delay set
List: pgsql-hackers
On 2021/11/17 14:39, Andrey Borodin wrote:
> Hi hackers!
>
> $subj was recently observed on one of our installations.
>
> Startup process backtrace
> #0  0x00007fd216660d27 in epoll_wait (epfd=525, events=0x55c688dfbde8, maxevents=maxevents@entry=1, timeout=timeout@entry=-1)
> #1  0x000055c687264be9 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffdf8089f00, cur_timeout=-1, set=0x55c688dfbd78)
> #2  WaitEventSetWait (set=set@entry=0x55c688dfbd78, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffdf8089f00, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=67108864) at /build/../src/backend/storage/ipc/latch.c:1000
> #3  0x000055c687265038 in WaitLatchOrSocket (latch=0x7fd1fa735454, wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0, wait_event_info=wait_event_info@entry=67108864)
> #4  0x000055c6872650f5 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0, wait_event_info=wait_event_info@entry=67108864)
> #5  0x000055c687276399 in ProcWaitForSignal (wait_event_info=wait_event_info@entry=67108864)
> #6  0x000055c68726c898 in ResolveRecoveryConflictWithBufferPin ()
> #7  0x000055c6872582c5 in LockBufferForCleanup (buffer=292159)
> #8  0x000055c687259447 in ReadBuffer_common (smgr=0x55c688deae40, relpersistence=relpersistence@entry=112 'p', forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=3751242, mode=mode@entry=RBM_ZERO_AND_CLEANUP_LOCK, strategy=strategy@entry=0x0, hit=0x7ffdf808a117 "\001")
> #9  0x000055c687259b6b in ReadBufferWithoutRelcache (rnode=..., forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=3751242, mode=mode@entry=RBM_ZERO_AND_CLEANUP_LOCK, strategy=strategy@entry=0x0)
> #10 0x000055c68705655f in XLogReadBufferExtended (rnode=..., forknum=MAIN_FORKNUM, blkno=3751242, mode=RBM_ZERO_AND_CLEANUP_LOCK)
> #11 0x000055c687056706 in XLogReadBufferForRedoExtended (record=record@entry=0x55c688dd2378, block_id=block_id@entry=0 '\000', mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=1 '\001', buf=buf@entry=0x7ffdf808a218)
> #12 0x000055c68700728b in heap_xlog_clean (record=0x55c688dd2378)
> #13 heap2_redo (record=0x55c688dd2378)
> #14 0x000055c68704a7eb in StartupXLOG ()
>
> Backend holding the buffer pin:
>
> #0  0x00007fd216660d27 in epoll_wait (epfd=5, events=0x55c688d67ca8, maxevents=maxevents@entry=1, timeout=timeout@entry=-1)
> #1  0x000055c687264be9 in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffdf808e070, cur_timeout=-1, set=0x55c688d67c38)
> #2  WaitEventSetWait (set=0x55c688d67c38, timeout=timeout@entry=-1, occurred_events=occurred_events@entry=0x7ffdf808e070, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=100663297)
> #3  0x000055c687185d7e in secure_write (port=0x55c688db4a60, ptr=ptr@entry=0x55c688dd338e, len=len@entry=2666)
> #4  0x000055c687190f4c in internal_flush ()
> #5  0x000055c68719118a in internal_putbytes (s=0x55c68f8dcc35 "", s@entry=0x55c68f8dcc08 "", len=65)
> #6  0x000055c687191262 in socket_putmessage (msgtype=<optimized out>, s=0x55c68f8dcc08 "", len=<optimized out>)
> #7  0x000055c687193431 in pq_endmessage (buf=buf@entry=0x7ffdf808e1a0)
> #8  0x000055c686fd1442 in printtup (slot=0x55c6894c2dc0, self=0x55c689326b40)
> #9  0x000055c687151962 in ExecutePlan (execute_once=<optimized out>, dest=0x55c689326b40, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x55c689281a28, estate=0x55c6892816f8)
> #10 standard_ExecutorRun (queryDesc=0x55c689289628, direction=<optimized out>, count=0, execute_once=<optimized out>)
> #11 0x00007fd2074100c5 in pgss_ExecutorRun (queryDesc=0x55c689289628, direction=ForwardScanDirection, count=0, execute_once=<optimized out>)
> #12 0x000055c68728a356 in PortalRunSelect (portal=portal@entry=0x55c688d858b8, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x55c689326b40)
> #13 0x000055c68728b988 in PortalRun (portal=portal@entry=0x55c688d858b8, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', run_once=run_once@entry=1 '\001', dest=dest@entry=0x55c689326b40, altdest=altdest@entry=0x55c689326b40, completionTag=0x7ffdf808e580 "")
> #14 0x000055c687287425 in exec_simple_query (query_string=0x55c688ec5e38 "select\n '/capacity/created-at-counter-by-time-v2' as sensor,\n round(extract(epoch from shipment_date))
> #15 0x000055c687289418 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x55c688dd3e28, dbname=<optimized out>, username=<optimized out>)
>
> I think the problem here is that secure_write() uses infinite timeout.
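
To spell out what the two backtraces show together: the startup process fires its max_standby_streaming_delay timeout and signals the backend holding the conflicting buffer pin, but that backend is parked in secure_write() with timeout = -1, part-way through sending a result row to a client that isn't reading. A minimal sketch of that wait pattern (illustrative C only, not the actual secure_write()/WaitEventSetWait code; the function name and flow are simplifications for exposition):

    #include <errno.h>
    #include <poll.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /*
     * Illustrative only.  The conflict signal handler merely sets a
     * cancel-pending flag; this loop wakes up (EINTR), but because a
     * protocol message is half-sent it cannot safely abort the query
     * mid-message, so it goes straight back to an unbounded wait --
     * still holding the buffer pin the startup process needs.
     */
    static ssize_t
    write_to_slow_client(int client_sock, const char *buf, size_t len)
    {
        struct pollfd pfd = { .fd = client_sock, .events = POLLOUT };

        for (;;)
        {
            int rc = poll(&pfd, 1, -1);   /* timeout = -1: no deadline */

            if (rc < 0 && errno == EINTR)
                continue;                 /* signal seen, cancel deferred */
            if (rc > 0 && (pfd.revents & POLLOUT))
                return send(client_sock, buf, len, 0);
        }
    }

So the standby-side deadline does expire, but the cancel it triggers is never acted on while the client socket stays write-blocked, and WAL replay waits for as long as the client is slow.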

Is this the same issue as one reported at [1]?

[1] https://www.postgresql.org/message-id/adce2c09-3bfc-4666-997a-c21991cb1eb1.mengjuan.cmj@alibaba-inc.com

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION