Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows - Mailing list pgsql-bugs
From | Peter Geoghegan |
---|---|
Subject | Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
Date | |
Msg-id | CAH2-WzkGLtffpGJoSp+cpN_q4VP9eF3-BhjZ+YxgAQa=O1niXA@mail.gmail.com |
In response to | Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: IRe: BUG #16792: silent corruption of GIN index resulting in SELECTs returning non-matching rows |
List | pgsql-bugs |
On Fri, Jul 16, 2021 at 5:30 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Hmm, seems we should fix that. But could a prematurely recycled deleted
> page cause permanent corruption?

If scans can find a page that is wholly unrelated to the expected page (and possibly even in the wrong high level page category), then it's really hard to predict what might happen. This could lead to real chaos. ginInsertCleanup() makes no attempt to perform basic validation of its assumptions about what kind of page this is, except for some assertions. We should have something like a "can't happen" error on !GinPageIsList() inside ginInsertCleanup() -- if we had that already then I might be able to reason about this problem. It wouldn't hurt to have similar checks in other code that deals with posting trees and entry trees, too.

ginInsertCleanup() is tolerant of all kinds of things. It's not just the lack of page-level sanity checks. It's also the basic approach to crash safety, which relies on the fact that GIN only does lossy index scans. My guess is that there could be lots of problems without it being obvious to users. Things really went downhill in ginInsertCleanup() starting in commit e956808328.

> On this page, the DATA flag is set, so it is an internal *posting* tree
> page.
>
> That's weird: the scan walked straight from an internal entry tree page
> (root, at blk 1) into an internal posting tree page (blk 1452). That
> doesn't make sense to me.

I agree that the internal entry tree page (root, at blk 1) looks sane, from what I've seen. The tuple sizes are plausible -- 16 byte index tuples aren't possible on an entry tree leaf page. Nor in a pending list page.

Anyway, this is roughly the kind of bug I had in mind. It's possible that the underlying problem doesn't actually involve ginInsertCleanup() -- as I said we have seen similar issues elsewhere (one such issue was fixed in commit 52ac6cd2d0). But as Alexander pointed out, that doesn't mean much. It's possible that this problem is 1 or 2 problems removed from the original problem, which really did start in ginInsertCleanup() -- who knows? Why shouldn't corruption lead to more corruption, given that we don't do much basic page level validation? We do at least sanitize within ginStepRight(), but we need to be more consistent about it.

> The next ReadBuffer call is this:
>
> > 2021-07-16 07:01:19 UTC LOG: ReadBuffer 1663/16390/16526 read gin blk 15559 (ginbtree.c:183 ginStepRight)
>
> Where did block 15559 come from? How come we're stepping right to it?
> It's not the right sibling of the previously accessed page, 1452. In
> fact, 15559 is a leaf posting tree page. I don't understand how that
> sequence of page reads could happen.

Maybe take a look at Block 1452 using pg_hexedit? pg_hexedit is designed to do well at interpreting quasi-corrupt data (or at least allowing the user to do so).

We see from your pg_filedump output that the tuple contents for the page are totally wild. We should not trust the reported right sibling page, given everything else -- is that really what Postgres thinks the right sibling is? I mean, clearly it doesn't. I think it's possible that pg_filedump is interpreting it in a way that is kind of wrong. If you saw the same page (1452) in pg_hexedit you might spot a pattern that pg_filedump output will never reveal. At least looking at the raw bytes might give you some idea.

--
Peter Geoghegan
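
The "can't happen" check on !GinPageIsList() suggested above for ginInsertCleanup() might look roughly like the following sketch. It is a standalone illustration, not PostgreSQL code: the flag bit values are assumed to mirror those in ginblock.h, SketchGinPage and sketch_check_pending_list_page() are hypothetical names, and a real backend check would presumably raise the error with elog(ERROR, ...) instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Page flag bits, assumed to mirror src/include/access/ginblock.h. */
#define GIN_DATA    (1 << 0)
#define GIN_LEAF    (1 << 1)
#define GIN_DELETED (1 << 2)
#define GIN_META    (1 << 3)
#define GIN_LIST    (1 << 4)

/* Hypothetical stand-in for the block number plus the page's opaque flags. */
typedef struct SketchGinPage
{
    uint32_t    blkno;
    uint16_t    flags;
} SketchGinPage;

/*
 * Returns true if the page carries the pending-list flag.  Otherwise it
 * writes a "can't happen" style message into errbuf and returns false, so
 * the caller can stop instead of processing a page of the wrong kind.
 */
static bool
sketch_check_pending_list_page(const SketchGinPage *page,
                               char *errbuf, size_t errlen)
{
    if ((page->flags & GIN_LIST) == 0)
    {
        snprintf(errbuf, errlen,
                 "unexpected non-list page at block %u (flags 0x%04x)",
                 (unsigned) page->blkno, (unsigned) page->flags);
        return false;
    }

    /* Page looks like a pending list page; clear any stale message. */
    if (errlen > 0)
        errbuf[0] = '\0';
    return true;
}
```

Calling the check on a page whose flags are just GIN_DATA (an internal posting tree page, like the one reported at blk 1452) returns false with a message naming the block, while a page with GIN_LIST set passes.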