Re: Direct I/O - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Direct I/O
Date	April 9, 2023 23:43:37
Msg-id	3867491.1681073017@sss.pgh.pa.us
In response to	Re: Direct I/O (Thomas Munro <thomas.munro@gmail.com>)
Responses	Re: Direct I/O
List	pgsql-hackers

Thomas Munro <thomas.munro@gmail.com> writes:
> we have a page at offset 638976, and we can find all system calls that
> touched that offset:

> [pid 26031] 23:26:48.521123 pwritev(50,
> [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> iov_len=8192}], 1, 638976) = 8192

> [pid 26040] 23:26:48.568975 pwrite64(5,
> "\0\0\0\0\0Nj\1\0\0\0\0\240\3\300\3\0 \4
> \0\0\0\0\340\2378\0\300\2378\0"..., 8192, 638976) = 8192

> [pid 26040] 23:26:48.593157 pread64(6,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> 8192, 638976) = 8192

Boy, it's hard to look at that trace and not call it a filesystem bug.
Given the apparent dependency on COW, I wonder if this has something
to do with getting confused about which copy is current?
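
To make the anomaly in the quoted trace concrete, here is a rough sketch that replays the three calls as plain records and flags any read whose data differs from the most recent write at the same offset. The IoEvent type and find_stale_reads helper are hypothetical names invented for this illustration. The non-zero page image is a stand-in, since the trace truncates the real buffer contents.

```python
from dataclasses import dataclass


@dataclass
class IoEvent:
    """One system call from the trace, reduced to what matters here (hypothetical type)."""
    pid: int
    op: str       # "write" or "read"
    offset: int
    data: bytes


def find_stale_reads(events):
    """Return (read, last_write) pairs where the read disagrees with the latest write."""
    last_write = {}
    stale = []
    for ev in events:  # events are assumed to be in timestamp order
        if ev.op == "write":
            last_write[ev.offset] = ev
        elif ev.op == "read":
            prev = last_write.get(ev.offset)
            if prev is not None and prev.data != ev.data:
                stale.append((ev, prev))
    return stale


PAGE = 8192
OFFSET = 638976
zeros = bytes(PAGE)
# Assumption: a stand-in for the non-zero page written by pid 26040;
# the real contents are truncated in the strace output.
filled = b"\x01" * PAGE

events = [
    IoEvent(26031, "write", OFFSET, zeros),   # pwritev of a zero-filled page
    IoEvent(26040, "write", OFFSET, filled),  # pwrite64 of the real page image
    IoEvent(26040, "read", OFFSET, zeros),    # pread64 that returned zeros
]

stale = find_stale_reads(events)
for read, write in stale:
    print(f"pid {read.pid} read at {read.offset} does not match "
          f"the last write by pid {write.pid}")
```

Under these assumptions the only flagged event is the final pread64, which is the behavior that looks like a filesystem bug: the read returns the older, zero-filled copy of the page.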

Another thing that struck me is that the two calls from pid 26040
are issued on different FDs.  I checked the strace log and verified
that these do both refer to "base/5/16384".  It looks like there was
a cache flush at about 23:26:48.575023 that caused 26040 to close
and later reopen all its database relation FDs.  Maybe that is
somehow contributing to the filesystem's confusion?  And more to the
point, could that explain why other O_DIRECT users aren't up in arms
over this bug?  Maybe they don't switch FDs as readily as we do.
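
For anyone who wants to poke at the FD-switching idea, here is a minimal sketch of the access pattern only: write a page through one O_DIRECT descriptor, close it, then reopen the file and read the page back through a new descriptor. It is not a reproducer for the reported failure, which may well need concurrency and a COW filesystem. The roundtrip helper and the file name are made up for this example. It falls back to buffered I/O where O_DIRECT is rejected.

```python
import mmap
import os
import tempfile

PAGE = 8192
OFFSET = 638976  # same offset as in the trace


def roundtrip(path, extra_flags):
    """Write a page via one FD, close it, reopen, and read it back via another FD."""
    # Anonymous mmap regions are page-aligned, which O_DIRECT generally requires.
    wbuf = mmap.mmap(-1, PAGE)
    wbuf.write(b"\x01" * PAGE)
    rbuf = mmap.mmap(-1, PAGE)

    fd = os.open(path, os.O_RDWR | os.O_CREAT | extra_flags, 0o600)
    try:
        os.pwritev(fd, [wbuf], OFFSET)
    finally:
        os.close(fd)  # mimic the close that follows a cache flush

    fd = os.open(path, os.O_RDONLY | extra_flags)  # fresh FD for the same file
    try:
        nread = os.preadv(fd, [rbuf], OFFSET)
    finally:
        os.close(fd)

    return rbuf[:nread] == wbuf[:]


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "relation")  # hypothetical file name
    try:
        ok = roundtrip(path, getattr(os, "O_DIRECT", 0))
    except OSError:
        # Some filesystems (tmpfs on older kernels, for example) reject O_DIRECT.
        ok = roundtrip(path, 0)

print("read matches write:", ok)
```

On a correctly behaving filesystem the read should match the write. A mismatch on a COW filesystem would be a starting point for a bug report, not proof of the theory above.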

            regards, tom lane
