Re: BUG #5238: frequent signal 11 segfaults - Mailing list pgsql-bugs

From Pavel Stehule
Subject Re: BUG #5238: frequent signal 11 segfaults
Date
Msg-id 162867790912130813p5c91fc0fib0881a93706626ad@mail.gmail.com
In response to Re: BUG #5238: frequent signal 11 segfaults  (Nagy Daniel <nagy.daniel@telekom.hu>)
List pgsql-bugs
2009/12/13 Nagy Daniel <nagy.daniel@telekom.hu>:
> I ran "select * from" on both tables. All rows were returned
> successfully, no error logs were produced during the selects.
>
> However there are usually many 23505 errors in indices, like:
> Dec 13 10:02:13 goldbolt postgres[21949]: [26-1]
> user=randirw,db=lovehunter ERROR:  23505: duplicate key value violates
> unique constraint "kepek_eredeti_uid_meret_idx"
> Dec 13 10:02:13 goldbolt postgres[21949]: [26-2]
> user=randirw,db=lovehunter LOCATION:  _bt_check_unique, nbtinsert.c:301
>
> There are many 58P01 errors as well, like:
> Dec 13 10:05:18 goldbolt postgres[7931]: [23-1] user=munin,db=lovehunter
> ERROR:  58P01: could not open segment 1 of relation base/16400/19856
> (target block 3014766): No such file or directory
> Dec 13 10:05:18 goldbolt postgres[7931]: [23-2] user=munin,db=lovehunter
> LOCATION:  _mdfd_getseg, md.c:1572
> Dec 13 10:05:18 goldbolt postgres[7931]: [23-3] user=munin,db=lovehunter
> STATEMENT:  SELECT count(*) FROM users WHERE nem='t'
>
> Reindexing sometimes helps, but the error logs appear again within
> hours.
>

You may have a hardware problem. Please check your hardware;
at a minimum, run a memory test.
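
On the 58P01 error quoted above: a relation is stored on disk as a series of fixed-size segment files, so a block number maps to a segment by integer division. The sketch below assumes the default build settings (8 kB blocks and 1 GB segments, i.e. 131072 blocks per segment); the name `locate_block` is only illustrative and is not part of PostgreSQL.

```python
# Assumed defaults; a server built with other settings would differ.
BLOCK_SIZE = 8192                                 # bytes per block (assumed)
SEGMENT_SIZE = 1024 ** 3                          # bytes per segment file (assumed)
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE   # 131072


def locate_block(block_number):
    """Return (segment number, block offset inside that segment)."""
    return divmod(block_number, BLOCKS_PER_SEGMENT)


segment, offset = locate_block(3014766)
print(f"segment {segment}, block {offset} within that segment")
```

Under these assumptions the target block 3014766 would lie about 23 GB into the relation, while the log reports that even segment 1 does not exist. If the table is much smaller than that, the block number probably does not come from valid data.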

Regards
Pavel Stehule

> Recently a new error appeared:
>
> Dec 13 03:46:55 goldbolt postgres[18628]: [15-1]
> user=randir,db=lovehunter ERROR:  XX000: tuple offset out of range: 0
> Dec 13 03:46:55 goldbolt postgres[18628]: [15-2]
> user=randir,db=lovehunter LOCATION:  tbm_add_tuples, tidbitmap.c:286
> Dec 13 03:46:55 goldbolt postgres[18628]: [15-3]
> user=randir,db=lovehunter STATEMENT:  SELECT * FROM valogatas WHERE
> uid!='16208' AND eletkor BETWEEN 39 AND 55 AND megyeid='1' AND
> keresettnem='f' AND dom='iwiw.hu' AND appid='2001434963' AND nem='t'
> ORDER BY random() DESC
>
>
>
> If there is on-disk corruption, would a complete dump and
> restore to another directory fix it?
>
> Apart from that, I think that pg shouldn't crash in case of
> on-disk corruptions, but log an error message instead.
> I'm sure that it's not that easy to implement as it seems,
> but nothing is impossible :)
>
>
> Regards,
>
> Daniel
>
>
> Tom Lane wrote:
>> Nagy Daniel <nagy.daniel@telekom.hu> writes:
>>> Here's a better backtrace:
>>
>> The crash location suggests a problem with a corrupted tuple, but it's
>> impossible to guess where the tuple came from.  In particular I can't
>> guess whether this reflects on-disk data corruption or some internal
>> bug.  Now that you have (some of) the query, can you put together a test
>> case?  Or try "select * from" each of the tables used in the query to
>> check for on-disk corruption.
>>
>>                         regards, tom lane
>
