Re: AIO v2.5 - Mailing list pgsql-hackers
From | Alexander Lakhin |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | 554d4da2-9392-4098-9c91-4c34f222abbb@gmail.com Whole thread Raw |
In response to | Re: AIO v2.5 (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Hello Andres,
31.03.2025 02:46, Andres Freund wrote:
31.03.2025 02:46, Andres Freund wrote:
I've pushed most of these after some very light further editing. Yay. Thanks a lot for all the reviews! So far the buildfarm hasn't been complaining, but it's early days.
I found one complaint against commit 12ce89fd0, expressed as:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=scorpion&dt=2025-04-08%2001%3A36%3A41
149/334 postgresql:recovery / recovery/032_relfilenode_reuse ERROR 206.41s (exit status 255 or signal 127 SIGinvalid)
[01:41:53.249](0.620s) ok 8 - standby: post move contents as expected
[01:44:55.175](181.927s) Bail out! aborting wait: program timed out
stream contents: >><<
pattern searched for: (?^m:warmed_buffers)
032_relfilenode_reuse_standby.log contains:
2025-04-08 01:41:54.674 UTC [4013024][startup][33/0:0] DEBUG: waiting for all backends to process ProcSignalBarrier generation 7
2025-04-08 01:41:54.674 UTC [4013024][startup][33/0:0] CONTEXT: WAL redo at 0/43E5360 for Database/DROP: dir 1663/50001
2025-04-08 01:41:54.670 UTC [4013251][client backend][0/2:0] DEBUG: waiting for self with 0 pending
2025-04-08 01:41:54.674 UTC [4013251][client backend][0/2:0] ERROR: no free IOs despite no in-flight IOs
2025-04-08 01:41:54.674 UTC [4013251][client backend][0/2:0] STATEMENT: SELECT SUM(pg_prewarm(oid)) warmed_buffers FROM pg_class WHERE pg_relation_filenode(oid) != 0;
I could reproduce this error with the following TEMP_CONFIG:
log_min_messages = DEBUG4
log_connections = on
log_disconnections = on
log_statement = 'all'
log_line_prefix = '%m [%d][%p:%l][%b] %q[%a] '
fsync = on
io_method=io_uring
backtrace_functions = 'pgaio_io_wait_for_free'
When running 032_relfilenode_reuse.pl in a loop, I got failures on
iterations 10, 18, 6:
2025-04-20 15:56:01.310 CEST [][1517068:122][startup] DEBUG: updated min recovery point to 0/4296E90 on timeline 1
2025-04-20 15:56:01.310 CEST [][1517068:123][startup] CONTEXT: WAL redo at 0/4296E48 for Transaction/COMMIT: 2025-04-20 15:56:01.09363+02; inval msgs: catcache 21; sync
2025-04-20 15:56:01.310 CEST [][1517068:124][startup] DEBUG: waiting for all backends to process ProcSignalBarrier generation 5
2025-04-20 15:56:01.310 CEST [][1517068:125][startup] CONTEXT: WAL redo at 0/4296E90 for Database/DROP: dir 16409/50001
2025-04-20 15:56:01.312 CEST [postgres][1517075:144][client backend] [032_relfilenode_reuse.pl] DEBUG: waiting for self with 0 pending
2025-04-20 15:56:01.312 CEST [postgres][1517075:145][client backend] [032_relfilenode_reuse.pl] BACKTRACE:
pgaio_io_wait_for_free at aio.c:703:2
(inlined by) pgaio_io_acquire at aio.c:186:3
AsyncReadBuffers at bufmgr.c:1854:9
StartReadBuffersImpl at bufmgr.c:1425:18
(inlined by) StartReadBuffers at bufmgr.c:1497:9
read_stream_start_pending_read at read_stream.c:333:25
read_stream_look_ahead at read_stream.c:475:3
read_stream_next_buffer at read_stream.c:840:6
pg_prewarm at pg_prewarm.c:214:10
ExecInterpExpr at execExprInterp.c:927:7
ExecEvalExprNoReturn at executor.h:447:2
...
2025-04-20 15:56:01.312 CEST [][1517068:126][startup] DEBUG: finished waiting for all backends to process ProcSignalBarrier generation 5
2025-04-20 15:56:01.312 CEST [][1517068:127][startup] CONTEXT: WAL redo at 0/4296E90 for Database/DROP: dir 16409/50001
2025-04-20 15:56:01.314 CEST [postgres][1517075:146][client backend] [032_relfilenode_reuse.pl] ERROR: no free IOs despite no in-flight IOs
2025-04-20 15:56:01.314 CEST [postgres][1517075:147][client backend] [032_relfilenode_reuse.pl] BACKTRACE:
pgaio_io_wait_for_free at aio.c:735:3
(inlined by) pgaio_io_acquire at aio.c:186:3
AsyncReadBuffers at bufmgr.c:1854:9
StartReadBuffersImpl at bufmgr.c:1425:18
(inlined by) StartReadBuffers at bufmgr.c:1497:9
read_stream_start_pending_read at read_stream.c:333:25
read_stream_look_ahead at read_stream.c:475:3
read_stream_next_buffer at read_stream.c:840:6
pg_prewarm at pg_prewarm.c:214:10
ExecInterpExpr at execExprInterp.c:927:7
ExecEvalExprNoReturn at executor.h:447:2
...
I configured the build with:
CPPFLAGS="-O1" ./configure --enable-injection-points --enable-cassert --enable-debug --enable-tap-tests --with-liburing
I also encountered another failure when running this test:
t/032_relfilenode_reuse.pl .. 1/? Bailout called. Further testing stopped: aborting wait: program died
Core was generated by `postgres: standby: law postgres [local] SELECT '.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x0000700a5a24527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x0000700a5a2288ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x0000630ff2c5dcea in errfinish (filename=filename@entry=0x630ff2cdd139 "aio.c", lineno=lineno@entry=559,
funcname=funcname@entry=0x630ff2e85da8 <__func__.9> "pgaio_io_wait") at elog.c:599
#6 0x0000630ff2aae79c in pgaio_io_wait (ioh=ioh@entry=0x700a57b59700, ref_generation=3972) at aio.c:559
#7 0x0000630ff2aaefab in pgaio_io_wait_for_free () at aio.c:771
#8 pgaio_io_acquire (resowner=0x63100114dee8, ret=ret@entry=0x6310011efec8) at aio.c:186
#9 0x0000630ff2ab9137 in AsyncReadBuffers (operation=operation@entry=0x6310011efe88, nblocks_progress=nblocks_progress@entry=0x7ffd8d262034) at bufmgr.c:1854
#10 0x0000630ff2abc6cc in StartReadBuffersImpl (allow_forwarding=true, flags=0, nblocks=0x7ffd8d262034, blockNum=0, buffers=0x6310011efe3c,
operation=0x6310011efe88) at bufmgr.c:1425
#11 StartReadBuffers (operation=0x6310011efe88, buffers=buffers@entry=0x6310011efe3c, blockNum=0, nblocks=nblocks@entry=0x7ffd8d262034, flags=flags@entry=0)
at bufmgr.c:1497
#12 0x0000630ff2ab2f3f in read_stream_start_pending_read (stream=stream@entry=0x6310011efde0) at read_stream.c:328
#13 0x0000630ff2ab3363 in read_stream_look_ahead (stream=stream@entry=0x6310011efde0) at read_stream.c:475
#14 0x0000630ff2ab37e5 in read_stream_next_buffer (stream=stream@entry=0x6310011efde0, per_buffer_data=per_buffer_data@entry=0x0) at read_stream.c:837
#15 0x0000700a5adae60c in pg_prewarm (fcinfo=<optimized out>) at pg_prewarm.c:214
#16 0x0000630ff29004cc in ExecInterpExpr (state=0x6310012239c8, econtext=0x6310011e1920, isnull=0x0) at execExprInterp.c:926
#17 0x0000630ff291bfb4 in ExecEvalExprNoReturn (econtext=<optimized out>, state=<optimized out>) at ../../../src/include/executor/executor.h:445
#18 ExecEvalExprNoReturnSwitchContext (econtext=<optimized out>, state=<optimized out>) at ../../../src/include/executor/executor.h:486
#19 advance_aggregates (aggstate=aggstate@entry=0x6310011e14f8) at nodeAgg.c:820
#20 0x0000630ff291db18 in agg_retrieve_direct (aggstate=0x6310011e14f8) at nodeAgg.c:2540
#21 ExecAgg (pstate=0x6310011e14f8) at nodeAgg.c:2265
#22 0x0000630ff290deef in ExecProcNodeFirst (node=0x6310011e14f8) at execProcnode.c:469
...
Best regards,
Alexander Lakhin
Neon (https://neon.tech)
pgsql-hackers by date: