Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Date
Msg-id 083d996a-4a8a-0e13-800a-851dd09ad8cc@2ndquadrant.com
In response to Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Hi,

On 10/31/2017 08:46 PM, Tom Lane wrote:
> I wrote:
>> maybe
>> we just have some run-of-the-mill bugs to find, like the off-the-end
>> bug I spotted in brin_doupdate.  There's apparently at least one
>> more, but given the error message it must be something like not
>> checking for a page to have turned into a revmap page.  Shouldn't
>> be too hard to find...
> 
> Actually, I think it might be as simple as the attached.
> brin_getinsertbuffer checks for the old page having turned into revmap,
> but the "samepage" path in brin_doupdate does not :-(
> 
> With this applied, Alvaro's version of the test case has survived
> without error for quite a bit longer than its former MTBF.  There
> might still be some issues though in other code paths.
> 

That does fix the crashes for me. I've been unable to reproduce any
crash even after one hour (before the patch it took only a couple of
minutes).

Unfortunately, I think we still have a problem ... I've been wondering
if we end up producing correct indexes, so I've done a simple test.

1) create the table as before

2) let the insert + vacuum run for some time, to see if there are
crashes (result: no crashes after one hour, inserting ~92M rows)

3) do a bunch of random updates on the data (while still doing the
concurrent vacuum in another session)

4) run a bunch of simple queries to compare the results, essentially

   -- BRIN index
   SET enable_bitmapscan = on;
   SELECT COUNT(*) FROM brin_test WHERE a = $1;


   -- seq scan
   SET enable_bitmapscan = off;
   SELECT COUNT(*) FROM brin_test WHERE a = $1;

and unfortunately what I get is not particularly pleasant:

    test=# set enable_bitmapscan = on;
    SET
    test=# select count(*) from brin_test where a = 0;
     count
    -------
      9062
    (1 row)

    test=# set enable_bitmapscan = off;
    SET
    test=# select count(*) from brin_test where a = 0;
     count
    -------
      9175
    (1 row)
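
To check more than one value, the same comparison can be looped over a
range of keys in a single session. This is only a sketch, not the
attached script: the 0..99 key range and the integer type of column "a"
are assumptions, and it relies on the planner actually choosing a BRIN
bitmap scan when enable_bitmapscan is on (worth confirming with EXPLAIN).
It should be run while no other session is modifying the table, otherwise
the two counts can differ for legitimate reasons.

```sql
DO $$
DECLARE
    idx_count bigint;
    seq_count bigint;
BEGIN
    -- Assumed key range; adjust to the values actually present in brin_test.a.
    FOR v IN 0..99 LOOP
        -- Allow bitmap scans, so the BRIN index can be used.
        PERFORM set_config('enable_bitmapscan', 'on', true);
        EXECUTE 'SELECT count(*) FROM brin_test WHERE a = $1'
            INTO idx_count USING v;

        -- Disable bitmap scans, forcing a plan that bypasses the BRIN index.
        PERFORM set_config('enable_bitmapscan', 'off', true);
        EXECUTE 'SELECT count(*) FROM brin_test WHERE a = $1'
            INTO seq_count USING v;

        -- Report only the keys where the two plans disagree.
        IF idx_count <> seq_count THEN
            RAISE NOTICE 'a = %: bitmap scan = %, seq scan = %',
                v, idx_count, seq_count;
        END IF;
    END LOOP;
END
$$;
```

EXECUTE is used so that each query is planned again under the current
setting rather than reusing a cached plan, and the third argument of
set_config() keeps the changes local to the DO block's transaction.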

Attached is a SQL script with the commands I used. You'll need to copy
the commands into multiple psql sessions, though, to simulate concurrent
activity.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
