Re: AIO v2.5 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: AIO v2.5
Msg-id 4qk3ehe6w7x7hfrldei2hefjcb7v7nfmj2owl2ir64craqcapz@kbrao22ljxeb
In response to Re: AIO v2.5  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: AIO v2.5
List pgsql-hackers
Hi,

On 2025-04-13 09:00:01 +0300, Alexander Lakhin wrote:
> 07.04.2025 22:10, Alexander Lakhin wrote:
> > > I ran it for a while in a VM, it hasn't triggered yet. Neither on xfs nor on
> > > tmpfs.
> > 
> > Before sharing the script I tested it on two my machines, but I had
> > anticipated that the error can be hard to reproduce. Will try to reduce
> > the reproducer...
> 
> I've managed to reduce it to the following:

Thanks a lot for working on that!


> [reproducer]
> 
> It fails for me as below:
> iteration 13 (jobs: 25)
> Sun Apr 13 05:31:47 AM UTC 2025
> iteration 14 (jobs: 67)
> Sun Apr 13 05:31:50 AM UTC 2025
> dropdb: error: database removal failed: ERROR:  could not read blocks 0..0 in file "global/1213": Operation canceled
> 2025-04-13 05:31:58.930 UTC [1153451] LOG:  could not read blocks 0..0 in file "global/1213": Operation canceled
> 2025-04-13 05:31:58.930 UTC [1153451] CONTEXT:  completing I/O on behalf of process 1153456
> 2025-04-13 05:31:58.930 UTC [1153451] STATEMENT:  DROP DATABASE db5;
> 2025-04-13 05:31:58.930 UTC [1153456] ERROR:  could not read blocks 0..0 in file "global/1213": Operation canceled
> 2025-04-13 05:31:58.930 UTC [1153456] STATEMENT:  DROP DATABASE db6;
> 2025-04-13 05:31:58.931 UTC [1034758] LOG:  checkpoint complete: wrote 3
> buffers (0.0%), wrote 0 SLRU buffers; 0 WAL file(s) added, 0 removed, 0
> recycled; write=0.002 s, sync=0.001 s, total=0.002 s; sync files=0,
> longest=0.000 s, average=0.000 s; distance=18 kB, estimate=458931 kB;
> lsn=16/54589E08, redo lsn=16/54586F88
> 2025-04-13 05:31:58.931 UTC [1034758] LOG:  checkpoint starting: immediate force wait

Unfortunately I'm several hundred iterations in without reproducing the
issue. I'm bad at statistics, but I think that makes it rather unlikely that I
will reproduce it without changing some aspect of the setup.
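
To put rough numbers on that intuition, here is a minimal sketch. It assumes every iteration fails independently with the same probability, which may well not hold for a race like this one. The figures used are illustrative assumptions, not measurements: 1/14 is a crude guess based only on the quoted run failing at iteration 14, and 300 stands in for "several hundred".

```python
def prob_no_failure(p: float, n: int) -> float:
    """Probability of n consecutive clean iterations when each iteration
    fails independently with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be within [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - p) ** n


def upper_bound_failure_rate(n: int, confidence: float = 0.95) -> float:
    """Largest per-iteration failure probability still consistent with n
    clean iterations at the given confidence level.

    Solves (1 - p) ** n == 1 - confidence for p.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be within (0, 1)")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Hypothetical inputs, see the assumptions stated above.
    assumed_rate = 1.0 / 14.0
    clean_iterations = 300

    print("P(no failure) at assumed rate:",
          prob_no_failure(assumed_rate, clean_iterations))
    print("95% upper bound on failure rate here:",
          upper_bound_failure_rate(clean_iterations))
```

Under these assumptions the first number is vanishingly small, which suggests the two environments differ in some relevant way rather than this being bad luck.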

Was this an assert-enabled build? What compiler and what optimization settings
did you use? Do you have huge pages configured (so that the default
huge_pages=try would end up with huge pages)?
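
One way to answer the huge pages question on Linux is to look at the kernel's counters in /proc/meminfo. The sketch below is only an illustration: the helper names are made up for this example, and a non-zero HugePages_Total is a necessary condition rather than proof that the server's shared memory actually ended up on huge pages, since enough pages also have to be free at startup. On PostgreSQL 17 and later, `SHOW huge_pages_status;` reports what the server actually got.

```python
def parse_hugepage_info(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters and Hugepagesize (in kB) from the
    contents of /proc/meminfo."""
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key.startswith("HugePages_") or key == "Hugepagesize":
            fields = rest.split()
            if fields:
                # First field is the number; Hugepagesize has a "kB" suffix.
                info[key] = int(fields[0])
    return info


def explicit_huge_pages_reserved(info: dict) -> bool:
    """True if the kernel has any explicit huge pages reserved."""
    return info.get("HugePages_Total", 0) > 0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_hugepage_info(f.read())
    except OSError:
        # Not Linux, or /proc is not mounted.
        print("/proc/meminfo is not available on this system")
    else:
        print(info)
        print("explicit huge pages reserved:",
              explicit_huge_pages_reserved(info))
```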

So far I've been using a cassert-enabled build compiled with -O0, without
huge pages. After the current test run I'll switch to cassert+-O2.



> I reproduced this error on three different machines (all are running
> Ubuntu 24.04, two with kernel version 6.8, one with 6.11), with PGDATA
> located on tmpfs.

That's another variable to try - so far I've been trying this on 6.15.0-rc1
[1].  I guess I'll have to set up an Ubuntu 24.04 VM and try with that.

Greetings,

Andres Freund


[1] I wanted to play with io_uring changes that were recently merged. Namely
support for readv/writev of "fixed" buffers. That avoids needing to pin/unpin
buffers while IO is ongoing, which turns out to be a noticeable bottleneck in
some workloads, particularly when using 1GB huge pages.


