Re: GIN data corruption bug(s) in 9.6devel - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: GIN data corruption bug(s) in 9.6devel
Date
Msg-id CAMkU=1xEbup-ARaGOd2AQampJeZApq5WGNF9JHc86fw1CnJmtQ@mail.gmail.com
In response to Re: GIN data corruption bug(s) in 9.6devel  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: GIN data corruption bug(s) in 9.6devel  (Teodor Sigaev <teodor@sigaev.ru>)
List pgsql-hackers
On Tue, Apr 12, 2016 at 9:53 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
>
> With the pending cleanup patch, a backend will try to get the lock on the
> metapage with ConditionalLockPage. Will it interrupt the autovacuum worker?


No, ConditionalLockPage should not interrupt the autovacuum worker.

>>
>> Alvaro's recommendation, to let the cleaner off the hook once it
>> passes the page which was the tail page at the time it started, would
>> prevent any process from getting pinned down indefinitely, but would
>> not prevent the size of the list from increasing without bound.  I
>> think that would probably be good enough, because the current
>> throttling behavior is purely accidental and doesn't *guarantee* a
>> limit on the size of the pending list.
>
> Added, see attached patch (based on v3.1)
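
For readers following the thread, the stopping rule quoted above can be
sketched as a toy model. This is not the GIN code and not the attached
patch: the pending list is modelled as a plain queue of page numbers, and
the names cleanup_pending and on_page_cleaned are hypothetical, introduced
only for illustration.

```python
from collections import deque


def cleanup_pending(pending, on_page_cleaned=None):
    """Clean pages from the head of `pending`, stopping once the page that
    was the tail when cleanup started has been processed.

    Toy model only: `pending` is a deque of page numbers, and
    `on_page_cleaned` is a hypothetical hook used to simulate inserts
    that arrive while the cleaner is running.
    """
    if not pending:
        return []
    stop_at = pending[-1]  # tail page at the time cleanup starts
    cleaned = []
    while pending:
        page = pending.popleft()
        cleaned.append(page)
        if on_page_cleaned is not None:
            on_page_cleaned(page)  # concurrent inserts may append here
        if page == stop_at:
            break  # passed the original tail: the cleaner is off the hook
    return cleaned


# Simulate one new page being appended for every page cleaned.
pending = deque([1, 2, 3])
next_page = [4]


def concurrent_insert(_page):
    pending.append(next_page[0])
    next_page[0] += 1


cleaned = cleanup_pending(pending, concurrent_insert)
```

In this model the cleaner always terminates after the three pages it saw
at the start, but the list is just as long afterwards, which is the
trade-off described in the quoted paragraph: no process is pinned down
indefinitely, yet nothing bounds the size of the list.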

With this applied, I am getting a couple errors I have not seen before
after extensive crash recovery testing:

ERROR:  attempted to delete invisible tuple

ERROR:  unexpected chunk number 1 (expected 2) for toast value
100338365 in pg_toast_16425

I've restarted the test harness with intentional crashes turned off,
to see if the problems are related to crash recovery or are more
generic than that.

I've never seen these particular problems before, so don't have much
insight into what might be going on or how to debug it.

Cheers,

Jeff


