Re: BRIN INDEX value - Mailing list pgsql-hackers

From Amit Langote
Subject Re: BRIN INDEX value
Msg-id 55E82CF9.7000402@lab.ntt.co.jp
In response to BRIN INDEX value  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: BRIN INDEX value  (Tatsuo Ishii <ishii@postgresql.org>)
Re: BRIN INDEX value  (Tatsuo Ishii <ishii@postgresql.org>)
List pgsql-hackers
On 9/3/2015 5:49 PM, Tatsuo Ishii wrote:
>
> However I inserted data *after* creating index, the value is
> different.
> VACUUM;
> VACUUM
> SELECT * FROM brin_revmap_data(get_raw_page('brinidx', 1)) WHERE pages != '(0,0)'::tid;
>  pages
> -------
>  (2,1)
>  (2,2)
>  (2,3)
>  (2,4)
> (4 rows)
>
> SELECT * FROM brin_page_items(get_raw_page('brinidx', 2), 'brinidx');
>  itemoffset | blknum | attnum | allnulls | hasnulls | placeholder |      value
> ------------+--------+--------+----------+----------+-------------+------------------
>           1 |      0 |      1 | f        | f        | f           | {1 .. 28928}
>           2 |    128 |      1 | f        | f        | f           | {28929 .. 57856}
>           3 |    256 |      1 | f        | f        | f           | {57857 .. 86784}
>           4 |    384 |      1 | f        | f        | f           | {1 .. 100000}
> (4 rows)
> ===============================================================
>
> How the index value for block 384 could be {1 .. 100000}?
>

The summarization during VACUUM invokes IndexBuildHeapRangeScan(), which is
passed scanStartBlock and scanNumBlocks. If scanStartBlock + scanNumBlocks
exceeds heapTotalBlocks, then further down the line heapgettup() may start
returning tuples from the beginning of the heap, given the following code in it:

  page++;
  if (page >= scan->rs_nblocks)
      page = 0;

  finished = (page == scan->rs_startblock) ||
               (scan->rs_numblocks != InvalidBlockNumber ?
                 --scan->rs_numblocks == 0 :
                 false);

Here, finished indicates whether the scan thinks the end of the heap has been reached.

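In this case, scan->rs_startblock is 384, set by IndexBuildHeapRangeScan()
using heap_setscanlimits(). When page wraps around to 0 it does not equal
rs_startblock, and rs_numblocks has not yet counted down to zero if the
requested range extends past the end of the heap, so finished stays false
and the scan continues from block 0.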

That helps explain why 1 becomes the min for that brin tuple.
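
To make the wraparound concrete, here is a minimal standalone sketch that
replays the same advance/finish logic outside the backend. ToyScan and
count_wrapped_pages() are hypothetical names made up for this illustration;
only the three rs_* fields and the loop body mirror the excerpt above. It is
a toy model under those assumptions, not the actual heapgettup() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical stand-in for the scan state: only the fields the excerpt uses. */
typedef struct
{
    BlockNumber rs_nblocks;     /* total blocks in the heap */
    BlockNumber rs_startblock;  /* first block of the requested range */
    BlockNumber rs_numblocks;   /* number of blocks requested */
} ToyScan;

/*
 * Visit pages using the same advance/finish test as the excerpt.
 * Returns how many visited pages lie before rs_startblock, i.e. pages
 * reached only because the scan wrapped around to block 0.
 * *total_visited receives the total number of pages visited.
 */
static unsigned
count_wrapped_pages(ToyScan *scan, unsigned *total_visited)
{
    BlockNumber page = scan->rs_startblock;
    bool        finished = false;
    unsigned    wrapped = 0;

    assert(scan->rs_startblock < scan->rs_nblocks);
    *total_visited = 0;

    while (!finished)
    {
        (*total_visited)++;
        if (page < scan->rs_startblock)
            wrapped++;          /* this page is outside the intended range */

        /* Same logic as the excerpt from heapgettup() */
        page++;
        if (page >= scan->rs_nblocks)
            page = 0;

        finished = (page == scan->rs_startblock) ||
            (scan->rs_numblocks != InvalidBlockNumber ?
             --scan->rs_numblocks == 0 :
             false);
    }

    return wrapped;
}
```

For example, with an assumed heap of 443 blocks (a size picked for
illustration, not taken from the report), rs_startblock = 384 and
rs_numblocks = 128, this visits 128 pages, 69 of which are blocks 0..68.
With a heap of 512 blocks the same request visits only blocks 384..511 and
nothing wraps.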

The attached hack fixes the symptom, but it is perhaps not the correct fix for this.

Thanks,
Amit
