Re: Direct I/O - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Direct I/O
Msg-id CA+hUKGJ2JqN1O=kfdbZfVZKpTCkZXY4=nMwc1U4xe39YE66GTw@mail.gmail.com
In response to Re: Direct I/O  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Apr 9, 2023 at 9:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 2023-04-08 16:50:03.177 EDT [2023-04-08 16:50:03 EDT 3257645:3] 004_io_direct.pl LOG:  statement: select count(*) from t1
> 2023-04-08 16:50:03.316 EDT [2023-04-08 16:50:03 EDT 3257646:1] ERROR:  invalid page in block 56 of relation base/5/16384

> The fact that the error is happening in a parallel worker seems
> interesting ...

That's because it's running with debug_parallel_query=regress.  I've
been trying to repro that but no luck...  A different kind of failure
also showed up, where it counted the wrong number of tuples:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2023-04-08%2015%3A52%3A03

A paranoid explanation would be that this system is failing to provide
basic I/O coherency: we're writing pages out and then not reading the
same data back in.  Or of course there is a dumb bug... but why only
here?  It could of course be timing-sensitive, and it's interesting that
crake suffers from the "no unpinned buffers available" thing (which
should now be gone) with higher frequency; I'm keen to see if the
dodgy-read problem continues with a similar frequency now.
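
For illustration only, the simplest form of the coherency check implied
above can be sketched outside PostgreSQL as follows.  This is a rough
sketch under assumptions, not the test that failed on crake: the helper
names (page_pattern, check_coherency, find_bad_blocks) and the page
pattern are made up for the example, it assumes the default 8kB block
size, it uses a temporary directory that may not sit on the filesystem of
interest, and it falls back to buffered I/O where O_DIRECT is rejected.
Being single-process and sequential, it would not catch a timing-sensitive
problem.

```python
import mmap
import os
import tempfile

BLCKSZ = 8192  # PostgreSQL's default block size


def page_pattern(blkno):
    """A recognizable page image for block blkno (hypothetical test pattern)."""
    return blkno.to_bytes(4, "little") * (BLCKSZ // 4)


def check_coherency(path, nblocks, extra_flags):
    """Write nblocks pages, read them back, return the mismatching block numbers."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | extra_flags, 0o600)
    buf = mmap.mmap(-1, BLCKSZ)  # anonymous mapping, so page-aligned for O_DIRECT
    try:
        # Write every block first, so reads happen after all writes are issued.
        for blkno in range(nblocks):
            buf[:] = page_pattern(blkno)
            os.pwrite(fd, buf, blkno * BLCKSZ)
        bad = []
        for blkno in range(nblocks):
            nread = os.preadv(fd, [buf], blkno * BLCKSZ)
            if nread != BLCKSZ or buf[:] != page_pattern(blkno):
                bad.append(blkno)
        return bad
    finally:
        os.close(fd)
        buf.close()


def find_bad_blocks(nblocks=64):
    """Run the check in a temporary directory, preferring O_DIRECT if usable."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "coherency_test")
        try:
            return check_coherency(path, nblocks, getattr(os, "O_DIRECT", 0))
        except OSError:
            # O_DIRECT is not supported everywhere; retry with buffered I/O.
            return check_coherency(path, nblocks, 0)


if __name__ == "__main__":
    print("mismatching blocks:", find_bad_blocks())
```

On a system with coherent I/O the list of mismatching blocks should be
empty; any block number reported would be a page that did not read back
as written.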


