Re: [BUG] Error in BRIN summarization - Mailing list pgsql-hackers

From: Alvaro Herrera
Subject: Re: [BUG] Error in BRIN summarization
Msg-id: 20200811234321.GA17597@alvherre.pgsql
In response to: Re: [BUG] Error in BRIN summarization (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses: Re: [BUG] Error in BRIN summarization
List: pgsql-hackers

On 2020-Jul-30, Anastasia Lubennikova wrote:

> While testing this fix, Alexander Lakhin spotted another problem. I
> simplified the test case to this:

Ah, good catch.  I think a cleaner way to fix this problem is to just
treat the range as not summarized and return NULL from there, as in
the attached patch.  When I run your test case with a telltale WARNING
added at that point, it's clear that this code path is being hit.

By returning NULL, we're forcing the caller to scan the heap, which is
not great.  But note that if you retry, and your VACUUM hasn't run yet
by the time we go through the loop again, the same thing would happen.
So it seems to me a good enough answer.
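
To make the "return NULL, caller scans the heap" idea concrete, here is a
toy model of it.  This is only a sketch and not the attached patch: the
names (toy_revmap, toy_slots, toy_get_summary, toy_range_may_contain) are
hypothetical and none of this is actual BRIN code.  It only shows that a
stale mapping can be reported as "not summarized", and that the caller
then has to assume the range may match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_RANGES     4
#define INVALID_SLOT (-1)

/* Hypothetical stand-in for a summary tuple; not a real BRIN struct. */
typedef struct ToySummary
{
    bool in_use;    /* false once the slot has been freed (desummarized) */
    int  min_val;
    int  max_val;
} ToySummary;

/* Toy "revmap": range number -> slot index, or INVALID_SLOT. */
static int toy_revmap[N_RANGES] = {0, 1, INVALID_SLOT, 3};

/* Slot 1 was freed, but the toy revmap entry for range 1 still points at it. */
static ToySummary toy_slots[N_RANGES] = {
    {true, 1, 10}, {false, 0, 0}, {false, 0, 0}, {true, 30, 40}
};

/*
 * Return the summary for a range, or NULL if the range has to be
 * treated as not summarized.
 */
const ToySummary *
toy_get_summary(int range_no)
{
    int slot;

    assert(range_no >= 0 && range_no < N_RANGES);

    slot = toy_revmap[range_no];
    if (slot == INVALID_SLOT)
        return NULL;            /* never summarized */
    if (!toy_slots[slot].in_use)
        return NULL;            /* stale mapping: behave as unsummarized */
    return &toy_slots[slot];
}

/* Caller side: NULL means the range cannot be excluded and must be scanned. */
bool
toy_range_may_contain(int range_no, int key)
{
    const ToySummary *s = toy_get_summary(range_no);

    if (s == NULL)
        return true;
    return key >= s->min_val && key <= s->max_val;
}
```

In this model range 1 is always scanned until something resummarizes it,
which is the "not great, but good enough" cost described above.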

A much more troubling thought is what happens if the range is
desummarized, then the index item is used for the summary of a different
range.  Then the index might end up returning corrupt results.
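
One conceivable guard against that scenario, sketched under loud
assumptions: suppose each index item records the first heap block of the
range it summarizes, so a lookup can notice that the item it reached
belongs to some other range.  ToyIndexItem, toy_checked_item and
PAGES_PER_RANGE are hypothetical names for illustration; this message does
not say whether or how the real code checks this.

```c
#include <stddef.h>

/* Assumed fixed range size for the toy model. */
#define PAGES_PER_RANGE 128u

/* Hypothetical index item that remembers which range it summarizes. */
typedef struct ToyIndexItem
{
    unsigned start_blk;     /* first heap block of the summarized range */
    int      min_val;
    int      max_val;
} ToyIndexItem;

/*
 * Return the item only if it really summarizes the range containing
 * heap_blk; otherwise return NULL so the caller treats the range as
 * not summarized instead of trusting a summary for a different range.
 */
const ToyIndexItem *
toy_checked_item(const ToyIndexItem *item, unsigned heap_blk)
{
    unsigned range_start = heap_blk - (heap_blk % PAGES_PER_RANGE);

    if (item == NULL || item->start_blk != range_start)
        return NULL;
    return item;
}
```

With a check like this, a reused item degrades into an extra heap scan
rather than a wrong answer.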

> At first, I tried to fix it by holding the lock on revmap->rm_currBuf until
> we locked the regular page, but it causes a deadlock with brinsummarize().
> It can be easily reproduced with the same test as above.
> Is there any rule about the order of locking revmap and regular pages in
> brin? I haven't found anything in README.

Umm, I thought that stuff was in the README, but it seems I didn't add
it there.  I think I had a .org file with my notes on that ... must be
on an older laptop disk, because it's not in my worktree for that.  I'll
see if I can fish it out.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

