Re: FlushRelationBuffers error - Mailing list pgsql-hackers

From Gaetano Mendola
Subject Re: FlushRelationBuffers error
Date
Msg-id 415C3ABB.4070206@bigfoot.com
In response to Re: FlushRelationBuffers error  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
Jan Wieck wrote:
> Any chance for bad memory?

I'd say near 0. However, who ever knows? The server is now up and running
again without glitches.

I suspect a race condition somewhere in the reindex operation.

With the 7.3 engine (see the archives) I had a duplicate error during
reindexes at least once a month. That was on a different server, and at
the time I worked around it by no longer reindexing the DB daily (so I
decreased the chances).

With 7.4 this is the first time, since November 2003, that I have seen this
error (and, by coincidence, during a reindex too), so I suspect that the race
condition is still there but is less likely to pop up.

Is it so dangerous to teach the postmaster to solve this kind of problem
without direct user intervention?

Regards
Gaetano Mendola






> On 9/30/2004 6:16 AM, Gaetano Mendola wrote:
>
>> Hi all,
>> I'm running postgres 7.4.5 on a linux box, this morning I got this
>> error on my logs:
>>
>> WARNING:  FlushRelationBuffers("exp_provider", 1836): block 1460 is
>> referenced (private 0, global 1)
>> ERROR:  FlushRelationBuffers returned -2
>> DEBUG:  AbortCurrentTransaction
>> PANIC:  cannot abort transaction 354676201, it was already committed
>>
>> after the recovery:
>>
>> ERROR:  could not access status of transaction 352975274
>> DEBUG:  AbortCurrentTransaction
>>
>> this messages for 5 hours
>>
>> I had my verbosity equal to terse ( I run the server with debug2 level
>> ) so I didn't see the
>> exactly reason for this, after putting verbosity to "verbose" I got
>> the entire message:
>>
>> ERROR:  58P01: could not access status of transaction 352975274
>> DETAIL:  could not open file "/var/lib/pgsql/data/pg_clog/0150": No
>> such file or directory
>> LOCATION:  SlruReportIOError, slru.c:609
>> DEBUG:  00000: AbortCurrentTransaction
>> LOCATION:  PostgresMain, postgres.c:2721
>>
>> In the pg_clog directory I had only the file 0152 !
>>
>> I had to create a 8k file with zeroes and I discover the offset:
>>
>> ERROR: XX000: could not access status of transaction 352975274
>> DETAIL:  could not read from file "/var/lib/pgsql/data/pg_clog/0150"
>> at offset 155648: Success
>> LOCATION:  SlruReportIOError, slru.c:630
>> DEBUG:  00000: AbortCurrentTransaction
>> LOCATION:  PostgresMain, postgres.c:2721
>>
>> After creating that file till to cover that offset the problem seems
>> be fixed.
>>
>> Info for hackers: exp_provider is an index and during that message a
>> reindex was in place.
>>
>> Some questions:
>> What about the 0151 file?
>> Don't you think that even with verbosity terse the message about the
>> file missing shall appear ?
>> Why emit the offset only if the file was found ?
>>
>> I have to thank Neil Conway that was helping me on IRC about this error.
>>
>> If you need further infos, please let me know.
>>
>> Regards
>> Gaetano Mendola
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 3: if posting/reading through Usenet, please send an appropriate
>>       subscribe-nomail command to majordomo@postgresql.org so that your
>>       message can get through to the mailing list cleanly
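
For readers trying to relate the transaction IDs in the quoted log to the pg_clog file name and the reported offset, here is a minimal sketch of the arithmetic. It is not taken from the thread: it assumes the default 8 kB block size, 2 status bits per transaction and 32 pages per segment file, which is my reading of clog.c and slru.c. The function names clog_location and min_segment_size are made up for this illustration.

```python
# Assumed constants (default build; my reading of clog.c / slru.c).
BLCKSZ = 8192                  # bytes per clog page
CLOG_XACTS_PER_BYTE = 4        # 2 status bits per transaction
SLRU_PAGES_PER_SEGMENT = 32    # pages per segment file

CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE


def clog_location(xid):
    """Return (segment file name, byte offset of the page holding xid).

    Hypothetical helper, not a PostgreSQL function.
    """
    page = xid // CLOG_XACTS_PER_PAGE
    segment = page // SLRU_PAGES_PER_SEGMENT
    offset = (page % SLRU_PAGES_PER_SEGMENT) * BLCKSZ
    return "%04X" % segment, offset


def min_segment_size(xid):
    """Smallest file size (bytes) that still contains the page for xid."""
    _, offset = clog_location(xid)
    return offset + BLCKSZ


if __name__ == "__main__":
    # Transaction IDs copied from the quoted log.
    for xid in (352975274, 354676201):
        name, offset = clog_location(xid)
        print(xid, name, offset, min_segment_size(xid))
```

Under these assumptions, transaction 352975274 maps to file 0150 at offset 155648, which agrees with the DETAIL lines above, and transaction 354676201 from the PANIC maps to file 0152, the only file that was left in the directory. It would also explain why an 8k file of zeroes was not enough: the file has to reach at least 163840 bytes to cover that page.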
 

