Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes - Mailing list pgsql-bugs

From Merlin Moncure
Subject Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes
Date
Msg-id BANLkTimvQDj8H6Vt4CvE84i5N8VsxzYUVA@mail.gmail.com
In response to Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes  (Антон Степаненко <zlobnynigga@yandex.ru>)
Responses Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes
List pgsql-bugs
2011/6/17 Антон Степаненко <zlobnynigga@yandex.ru>:
> 17.06.2011, 21:24, "Merlin Moncure" <mmoncure@gmail.com>:
>> 2011/6/17 Антон Степаненко <zlobnynigga@yandex.ru>:
>>
>>> 17.06.2011, 20:19, "Merlin Moncure" <mmoncure@gmail.com>:
>>>> On Fri, Jun 17, 2011 at 10:56 AM, Kevin Grittner
>>>> <Kevin.Grittner@wicourts.gov> wrote:
>>>>>> I still do not believe that this is hardware problem.
>>>>> How would an application cause a bus error?
>>>> unaligned memory access on risc maybe?  what's this running on?
>>>>
>>>> merlin
>>> *****:~$ cat /proc/cpuinfo
>>> processor       : 0
>>> ....
>>> processor       : 23
>>> vendor_id       : GenuineIntel
>>> cpu family      : 6
>>> model           : 44
>>> model name      : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
>>
>> hm, I'm wondering if this
>> (http://us.generation-nt.com/bug-626451-linux-image-mremap-returns-useless-pages-moving-anonymous-shared-mmap-access-causes-sigbus-help-203302832.html)
>> has anything to do with your problem.
>>
>> merlin
>
> Thank you very much, very interesting link. I've compiled it under my Ubuntu Lucid and it really does cause SIGBUS. But when compiled under CentOS (2.6.18) it does the same, so I am not sure that this is a bug.
> And even if it is, why does it occur only when buffers are set to 12 GB and filled...
> I've read some of the PostgreSQL sources, e.g. /src/backend/storage/smgr/md.c:
> void
> mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
>        char *buffer)
> {
> ..
>     if (nbytes != BLCKSZ)
>     {
>         if (nbytes < 0)
>             ereport(ERROR,
>                     (errcode_for_file_access(),
>                      errmsg("could not read block %u in file \"%s\": %m",
>                             blocknum, FilePathName(v->mdfd_vfd))));
>
>         /*
>          * Short read: we are at or past EOF, or we read a partial block at
>          * EOF.  Normally this is an error; upper levels should never try to
>          * read a nonexistent block.  However, if zero_damaged_pages is ON or
>          * we are InRecovery, we should instead return zeroes without
>          * complaining.  This allows, for example, the case of trying to
>          * update a block that was later truncated away.
>          */
>         if (zero_damaged_pages || InRecovery)
>             MemSet(buffer, 0, BLCKSZ);
>         else
>             ereport(ERROR,
>                     (errcode(ERRCODE_DATA_CORRUPTED),
>                      errmsg("could not read block %u in file \"%s\": read only %d of %d bytes",
>                             blocknum, FilePathName(v->mdfd_vfd),
>                             nbytes, BLCKSZ)));
>     }
> }
>
> This is the only place reporting errors like 'could not read block in file'.
> Then I looked at /src/backend/storage/file/fd.c:
> int
> FileRead(File file, char *buffer, int amount)
> {
> ..
> retry:
>     returnCode = read(VfdCache[file].fd, buffer, amount);
>
>     if (returnCode >= 0)
>         VfdCache[file].seekPos += returnCode;
>     else
>     {
>         /*
>          * Windows may run out of kernel buffers and return "Insufficient
>          * system resources" error.  Wait a bit and retry to solve it.
>          *
>          * It is rumored that EINTR is also possible on some Unix filesystems,
>          * in which case immediate retry is indicated.
>          */
> #ifdef WIN32
>         ...
> #endif
>         /* OK to retry if interrupted */
>         if (errno == EINTR)
>             goto retry;
>
>         /* Trouble, so assume we don't know the file position anymore */
>         VfdCache[file].seekPos = FileUnknownPos;
>     }
>
>     return returnCode;
> }
>
> First, the comment starting with 'It is rumored' looks suspicious =) But I am not a kernel developer, I am not even a C++ developer, so I trust the authors.
> I've read 'man read' and 'man 7 signal', and they say that syscalls can be interrupted by some signals, including SIGBUS, but when that happens they should return to normal behaviour.
> "the call will be automatically restarted after the signal handler returns if the SA_RESTART flag was used; otherwise the call will fail with the error EINTR" - from man 7 signal
> So as far as I understand, even if postgresql gets signal 7 it should experience EINTR and retry immediately. What I am trying to say is that I do not know why I am getting SIGBUS, but no matter where it comes from, according to the sources postgresql should just try to read one more time, and one more, and so on until the read succeeds. But I'm not quite sure what happens first: the SIGBUS or the 'could not read block' error.

I wonder if you are oversubscribing your memory, and are getting weird
errors when reading data into memory because the pages can't be
reserved to do that.  What happens when you enable overcommit and
attempt to start the server?
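
For reference, on Linux the current overcommit policy can be read from /proc/sys/vm/overcommit_memory (0 = heuristic, 1 = always overcommit, 2 = strict accounting). Below is a minimal sketch for checking it from C; the function names `overcommit_mode_name` and `read_overcommit_mode` are hypothetical, and the sketch assumes a Linux /proc filesystem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map the integer stored in /proc/sys/vm/overcommit_memory to a short
 * description of the policy. */
static const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting (no overcommit)";
    default: return "unknown";
    }
}

/* Read the policy number from `path` (normally
 * "/proc/sys/vm/overcommit_memory"). Returns -1 if the file cannot be
 * opened or does not start with an integer. */
static int read_overcommit_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

A caller could print `overcommit_mode_name(read_overcommit_mode("/proc/sys/vm/overcommit_memory"))` before starting the server, to see which policy is in effect.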

merlin
