Re: AIO v2.5 - Mailing list pgsql-hackers
From | Alexander Lakhin |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | 062daca9-dfad-4750-9da8-b13388301ad9@gmail.com |
In response to | Re: AIO v2.5 (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Hello Andres,

14.04.2025 19:06, Andres Freund wrote:
> Unfortunately I'm several hundred iterations in, without reproducing the
> issue. I'm bad at statistics, but I think that makes it rather unlikely that I
> will, without changing some aspect.
>
> Was this an assert enabled build? What compiler and what optimization settings
> did you use? Do you have huge pages configured (so that the default
> huge_pages=try would end up with huge pages)?

Yes, I used --enable-cassert; no explicit optimization setting and no huge
pages configured.

pg_config says:
CONFIGURE =  '--enable-debug' '--enable-cassert' '--enable-tap-tests' '--with-liburing'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2

Please look at the complete script attached. I've just run it and got:
iteration 56 (jobs: 44)
Tue Apr 15 06:30:52 PM CEST 2025
dropdb: error: database removal failed: ERROR: could not read blocks 0..0 in file "global/1213": Operation canceled
2025-04-15 18:31:00.650 CEST [1612266] LOG: could not read blocks 0..0 in file "global/1213": Operation canceled
2025-04-15 18:31:00.650 CEST [1612266] CONTEXT: completing I/O on behalf of process 1612271
2025-04-15 18:31:00.650 CEST [1612266] STATEMENT: DROP DATABASE db3;

I used gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, but now I've also
reproduced the issue with CC=clang (18.1.3 (1ubuntu1)).

Please take a look also at the simple reproducer for the crash inside
pg_get_aios() I mentioned upthread:
for i in {1..100}; do
  numjobs=12
  echo "iteration $i"
  date
  for ((j=1;j<=numjobs;j++)); do
    ( createdb db$j; for k in {1..300}; do
        echo "CREATE TABLE t (a INT); CREATE INDEX ON t (a); VACUUM t; SELECT COUNT(*) >= 0 AS ok FROM pg_aios; " \
        | psql -d db$j >/dev/null 2>&1;
      done; dropdb db$j; ) &
  done
  wait
  psql -c 'SELECT 1' || break;
done

it fails for me as follows:
iteration 20
Tue Apr 15 07:21:29 PM EEST 2025
dropdb: error: connection to server on socket "/tmp/.s.PGSQL.55432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?
...
2025-04-15 19:21:30.675 EEST [3111699] LOG: client backend (PID 3320979) was terminated by signal 11: Segmentation fault
2025-04-15 19:21:30.675 EEST [3111699] DETAIL: Failed process was running: SELECT COUNT(*) >= 0 AS ok FROM pg_aios;
2025-04-15 19:21:30.675 EEST [3111699] LOG: terminating any other active server processes

>> I reproduced this error on three different machines (all are running
>> Ubuntu 24.04, two with kernel version 6.8, one with 6.11), with PGDATA
>> located on tmpfs.
> That's another variable to try - so far I've been trying this on 6.15.0-rc1
> [1]. I guess I'll have to set up a ubuntu 24.04 VM and try with that.
>
> Greetings,
>
> Andres Freund
>
>
> [1] I wanted to play with io_uring changes that were recently merged. Namely
> support for readv/writev of "fixed" buffers. That avoids needing to pin/unpin
> buffers while IO is ongoing, which turns out to be a noticeable bottleneck in
> some workloads, particularly when using 1GB huge pages.

Best regards,
Alexander Lakhin
Neon (https://neon.tech)
Attachment