Re: BUG #9741: Mininal case for the BUG #9735: Error: "ERROR: tuple offset out of range: 0" during bitmap scan - Mailing list pgsql-bugs

From Maxim Boguk
Subject Re: BUG #9741: Mininal case for the BUG #9735: Error: "ERROR: tuple offset out of range: 0" during bitmap scan
Date
Msg-id CAK-MWwRVVsn0zfgdvzcyBKbWkgf5CQKrCD5YD8LqRQaLGNchwg@mail.gmail.com
In response to BUG #9741: Mininal case for the BUG #9735: Error: "ERROR: tuple offset out of range: 0" during bitmap scan  (maxim.boguk@gmail.com)
Responses Re: BUG #9741: Mininal case for the BUG #9735: Error: "ERROR: tuple offset out of range: 0" during bitmap scan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
> gate_platbox=# set enable_indexscan to 0;
> SET
> (force bitmap scan)
> gate_platbox=# explain analyze select * from transactions where id=53265020;
> ERROR:  tuple offset out of range: 0
>
> This tuple was frozen not long ago:
>
> gate_platbox=# select id,xmin,xmax,cmin,cmax,ctid from transactions where id=53265020;
>     id    | xmin | xmax | cmin | cmax |    ctid
> ----------+------+------+------+------+-------------
>  53265020 |    2 |    0 |    1 |    1 | (2413168,1)
>
> pageinspect on the index used also shows nothing suspicious:
> select gs.i,t.* from (select generate_series(1,94961) as i) as gs,
> bt_page_items('transactions_pkey', i) as t where ctid::text='(2413168,1)';
>    i   | itemoffset |    ctid     | itemlen | nulls | vars |          data
> -------+------------+-------------+---------+-------+------+-------------------------
>  88472 |         93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
>
> Any ideas what could be wrong with the database?
>

Some playing with GDB led me to the btgetbitmap function in nbtree.c,
with the following contents of the
BTScanOpaque so = (BTScanOpaque) scan->opaque
structure:

(gdb) p *(BTScanOpaque) scan->opaque
$21 = {
  qual_ok = 1 '\001',
  numberOfKeys = 1,
  keyData = 0x7f099afaad80,
  arrayKeyData = 0x0,
  numArrayKeys = 0,
  arrayKeys = 0x0,
  arrayContext = 0x0,
  killedItems = 0x0,
  numKilled = 0,
  currTuples = 0x0,
  markTuples = 0x0,
  markItemIndex = -1,
  currPos = {
    buf = 137250,
    nextPage = 88473,
    moreLeft = 0 '\000',
    moreRight = 0 '\000',
    nextTupleOffset = 0,
    firstItem = 0,
    lastItem = 1,
    itemIndex = 1,
    items =       {[0] = {
        heapTid = {
          ip_blkid = {
            bi_hi = 36,
            bi_lo = 53872
          },
          ip_posid = 1
        },
        indexOffset = 93,
        tupleOffset = 0
      },
      [1] = {
        heapTid = {
          ip_blkid = {
            bi_hi = 35,
            bi_lo = 44171
          },
          ip_posid = 0
        },
        indexOffset = 94,
        tupleOffset = 0
      },
...

It can be seen that there is a weird second element in the items array, with
ip_posid = 0.
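
As a cross-check, the bi_hi/bi_lo pairs in the gdb output can be turned into heap block numbers by hand. The following is only a minimal sketch, assuming the usual PostgreSQL layout in which a block number is stored as two 16-bit halves and offsets within a page start at 1. The helper names block_number and is_valid_offset are made up for illustration and are not PostgreSQL functions.

```python
def block_number(bi_hi: int, bi_lo: int) -> int:
    """Combine the two 16-bit halves of a block id into one block number."""
    return (bi_hi << 16) | bi_lo


def is_valid_offset(ip_posid: int) -> bool:
    """Offsets within a heap page are 1-based, so 0 never names a real tuple."""
    return ip_posid >= 1


# (bi_hi, bi_lo, ip_posid) copied from items[0] and items[1] in the gdb dump
items = [(36, 53872, 1), (35, 44171, 0)]

for hi, lo, posid in items:
    status = "valid" if is_valid_offset(posid) else "invalid offset"
    print((block_number(hi, lo), posid), status)
```

Under these assumptions the first item decodes to (2413168,1), which is the ctid of the row, while the second decodes to block 2337931 with offset 0, which is not a usable tuple position.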

That led me to check the next entry in the transactions_pkey index (the one
with offset=94):

select * from bt_page_items('transactions_pkey', 88472) where itemoffset in (93,94);
 itemoffset |    ctid     | itemlen | nulls | vars |          data
------------+-------------+---------+-------+------+-------------------------
         93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
         94 | (2337931,0) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00

OK, so now we have two entries for the same value in the index.
The index on the master database has the same content.
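
That both entries carry the same key can be checked by decoding the data column. A minimal sketch, assuming the key is an 8-byte integer stored in little-endian byte order (as on x86 hardware); decode_int8_key is a hypothetical helper written for this illustration, not part of pageinspect.

```python
def decode_int8_key(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as a little-endian signed 64-bit integer."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return int.from_bytes(raw, byteorder="little", signed=True)


# data column of itemoffset 93 and 94 as shown by bt_page_items
data_93 = "7c c2 2c 03 00 00 00 00"
data_94 = "7c c2 2c 03 00 00 00 00"

print(decode_int8_key(data_93), decode_int8_key(data_94))
```

With those assumptions both items decode to 53265020, the id used in the failing query.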

But the second entry is very likely not visible on the master db.

So this confirms my idea that there is some permanent corruption of the
visibility map on the replicas (and probably on the master database as well).

What's strange is that the problem could not be fixed via vacuum freeze with
vacuum_freeze_table_age=0.

Any suggestions on what I can do next?
Unfortunately a dump/restore of the database is not an option at this moment
(it is a huge project with strict online read-write requirements).

Is there any way to force a rebuild of the visibility map for the table
without an access exclusive lock (e.g. without cluster/vacuum full)?

I am going to try upgrading the database to 9.3.4 today and see whether it
fixes the issue. But I see nothing about the visibility map in the release notes.

Kind Regards,
Maksym
