Re: AIO v2.5 - Mailing list pgsql-hackers
From | Alexander Lakhin |
---|---|
Subject | Re: AIO v2.5 |
Date | |
Msg-id | 96abefe8-fa72-41f5-8840-0517125c24e3@gmail.com |
In response to | Re: AIO v2.5 (Alexander Lakhin <exclusion@gmail.com>) |
Responses | Re: AIO v2.5 |
List | pgsql-hackers |
Hello Andres,
07.04.2025 22:10, Alexander Lakhin wrote:
I ran it for a while in a VM, it hasn't triggered yet. Neither on xfs nor on tmpfs.
Before sharing the script I tested it on two of my machines, but I had
anticipated that the error might be hard to reproduce. Will try to reduce
the reproducer...
I've managed to reduce it to the following:
ulimit -n 4096
echo "
fsync = off
autovacuum = off
checkpoint_timeout = 30s
io_max_concurrency = 10
io_method = io_uring
" >> $PGDATA/postgresql.conf
pg_ctl -l server.log start

# Up to 1000 iterations, each with a random number (20..79) of concurrent jobs.
for i in `seq 1000`; do
  numjobs=$((20 + $RANDOM % 60))
  echo "iteration $i (jobs: $numjobs)"
  date
  for ((j=1;j<=numjobs;j++)); do
    (
      # Each job creates its own database and reloads tenk1 50 times.
      createdb db$j;
      for ((n=1;n<=50;n++)); do
        cat << EOF | psql -d db$j -a >>/dev/null 2>&1
DROP TABLE IF EXISTS tenk1;
CREATE TABLE tenk1 (
    unique1 int4,
    unique2 int4,
    two int4,
    four int4,
    ten int4,
    twenty int4,
    hundred int4,
    thousand int4,
    twothousand int4,
    fivethous int4,
    tenthous int4,
    odd int4,
    even int4,
    stringu1 name,
    stringu2 name,
    string4 name
);
COPY tenk1 FROM '.../src/test/regress/data/tenk.data';
EOF
      done;
    ) &
  done
  wait
  # Drop all databases concurrently; this is where the error shows up.
  for ((j=1;j<=numjobs;j++)); do dropdb db$j & done
  wait
  # Stop as soon as the server log contains an error.
  grep -A3 -E '(ERROR|could not read blocks )' server.log && break;
done
pg_ctl stop
It fails for me as below:
iteration 13 (jobs: 25)
Sun Apr 13 05:31:47 AM UTC 2025
iteration 14 (jobs: 67)
Sun Apr 13 05:31:50 AM UTC 2025
dropdb: error: database removal failed: ERROR: could not read blocks 0..0 in file "global/1213": Operation canceled
2025-04-13 05:31:58.930 UTC [1153451] LOG: could not read blocks 0..0 in file "global/1213": Operation canceled
2025-04-13 05:31:58.930 UTC [1153451] CONTEXT: completing I/O on behalf of process 1153456
2025-04-13 05:31:58.930 UTC [1153451] STATEMENT: DROP DATABASE db5;
2025-04-13 05:31:58.930 UTC [1153456] ERROR: could not read blocks 0..0 in file "global/1213": Operation canceled
2025-04-13 05:31:58.930 UTC [1153456] STATEMENT: DROP DATABASE db6;
2025-04-13 05:31:58.931 UTC [1034758] LOG: checkpoint complete: wrote 3 buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 s, total=0.002 s; sync files=0, longest=0.000 s, average=0.000 s; distance=18 kB, estimate=458931 kB; lsn=16/54589E08, redo lsn=16/54586F88
2025-04-13 05:31:58.931 UTC [1034758] LOG: checkpoint starting: immediate force wait
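In the excerpt above, the same failed read is reported twice: once as a LOG message by the process that completed the I/O, and once as an ERROR by the process that issued it. A small helper along the following lines could make such pairs easier to spot in a long server.log. This is only a sketch: the function name summarize_read_errors is made up for illustration, and it assumes the default log_line_prefix ('%m [%p] ') seen in the log lines above.

```shell
# Hypothetical helper, not part of the reproducer: for every
# "could not read blocks" message print "PID SEVERITY FILE BLOCKS REASON".
# Assumes the default log_line_prefix ('%m [%p] ').
summarize_read_errors() {
    grep -E '\] (LOG|ERROR): +could not read blocks ' "$1" |
        sed -E 's/^.*\[([0-9]+)\] (LOG|ERROR): +could not read blocks ([0-9]+\.\.[0-9]+) in file "([^"]+)": (.*)$/\1 \2 \4 \3 \5/'
}
```

It would be called as `summarize_read_errors server.log` after the loop has stopped.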
I reproduced this error on three different machines (all running
Ubuntu 24.04; two with kernel version 6.8, one with 6.11), with PGDATA
located on tmpfs.
Best regards,
Alexander Lakhin
Neon (https://neon.tech)