Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes - Mailing list pgsql-bugs

From	Антон Степаненко
Subject	Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes
Date	June 17, 2011 10:51:09
Msg-id	438511308318660@web144.yandex.ru
In response to	Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses	Re: could not read block XXXXX in file "base/YYYYY/ZZZZZZ": read only 160 of 8192 bytes
List	pgsql-bugs

17.06.2011, 00:28, "Kevin Grittner" <Kevin.Grittner@wicourts.gov>:
> ***** ********** <zlobnynigga@yandex.ru> wrote:
>
>>  [4-1] 2011-06-16 17:40:27 UTC LOG:  startup process (PID 15292)
>>  was terminated by signal 7: Bus error
>>  Signal 7 means  hardware problems. But all 10 replicas crashed
>>  within 10 minutes, say from 13:35 to 13:45.
>>  One important thing - all replicas and master are running on
>>  openvz
>
> Were the PostgreSQL clusters sharing any hardware?
>
>>  there is no way to reject virtualization (it is a long story =))
>>
>>  Please, I do not want to discuss my decision to set buffers to
>>  12Gb and postgresql optimization at all. I just want to understand
>>  why I'm getting such errors.
>
> On the face of it, the most likely cause would seem to be hardware
> or the virtual environment.  Without knowing more about the exact
> messages on the replicas and how they compared to each other and the
> master it's hard to know whether any of the replica failures were
> from passing corrupted data from the master to the replicas, versus
> having a common hardware/vm flaw.
>
> -Kevin

I noticed that the crash takes place when shared buffers are almost full, i.e. SELECT SUM(size) FROM adm.buffercache()
returns 11670 about one minute before the crash. Furthermore, last night I set buffers to 11Gb, and it is working: no
crash, and all buffers are used (11120).
I still do not believe that this is a hardware problem. Each replica and the master run on a dedicated server; no
hardware is shared. There is only postgresql on each server and no other software (just crond, zabbix, atop).
Actually, openvz is used only for portability (to easily add new replicas or migrate one of them to a new server).
Messages on the replicas are all the same: "could not read block", then "signal 7". I copy-pasted the error log as is;
that is all I know.
The master did not crash, I think because it processes fewer SELECT queries, and therefore its buffers do not reach the
limit.
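
For reference, a comparable figure can probably be obtained without the custom adm.buffercache() function, assuming
the pg_buffercache contrib module is installed in the database. This is only a sketch and not the query used above;
the column aliases are illustrative.

```sql
-- Assumes the pg_buffercache contrib module is installed.
-- The view has one row per shared buffer; unused buffers have a NULL relfilenode.
SELECT count(*) AS buffers_in_use,
       pg_size_pretty(count(*) * current_setting('block_size')::bigint) AS size_in_use
FROM pg_buffercache
WHERE relfilenode IS NOT NULL;
```

Comparing size_in_use with the shared_buffers setting shortly before a crash would show how close the cache is to
being full.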
