[HACKERS] Parallel Bitmap scans a bit broken - Mailing list pgsql-hackers

From David Rowley
Subject [HACKERS] Parallel Bitmap scans a bit broken
Date
Msg-id CAKJS1f8OtrHE+-P+=E=4ycnL29e9idZKuaTQ6o2MbhvGN9D8ig@mail.gmail.com
Responses Re: [HACKERS] Parallel Bitmap scans a bit broken  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
I was just doing some testing on [1] when I noticed that there's a problem with parallel bitmap index scans.

Test case:

Apply the patch from [1], then:

=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from generate_Series(1,1000000);
INSERT 0 1000000
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Gather  (cost=3623.52..11267.45 rows=5000 width=4)
   Workers Planned: 2
   ->  Parallel Bitmap Heap Scan on r1  (cost=2623.52..9767.45 rows=2083 width=4)
         Recheck Cond: (value = 555)
         ->  Bitmap Index Scan on r1_value_idx  (cost=0.00..2622.27 rows=522036 width=0)
               Index Cond: (value = 555)
(6 rows)

=# explain analyze select * from r1 where value=555;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

The crash occurs in tbm_shared_iterate() at:

PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];


In tbm_prepare_shared_iterate() I see that tbm->npages is zero. I'm not sure whether bringetbitmap() ends up setting npages differently from btgetbitmap().

In any case, because npages is 0, tbm->ptpages is never allocated in tbm_prepare_shared_iterate():

    if (tbm->npages)
    {
        tbm->ptpages = dsa_allocate(tbm->dsa, sizeof(PTIterationArray) +
                                    tbm->npages * sizeof(int));

so when tbm_shared_iterate() runs this code:

    /*
     * If both chunk and per-page data remain, must output the numerically
     * earlier page.
     */
    if (istate->schunkptr < istate->nchunks)
    {
        PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
        PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
        BlockNumber chunk_blockno;

        chunk_blockno = chunk->blockno + istate->schunkbit;

        if (istate->spageptr >= istate->npages ||
            chunk_blockno < page->blockno)
        {
            /* Return a lossy page indicator from the chunk */
            output->blockno = chunk_blockno;
            output->ntuples = -1;
            output->recheck = true;
            istate->schunkbit++;

            LWLockRelease(&istate->lock);
            return output;
        }
    }

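it fails, because idxpages points to random memory: idxpages[istate->spageptr] is read to compute "page" before the istate->spageptr >= istate->npages check is reached.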
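
To illustrate the ordering problem described above, here is a minimal standalone sketch. It is not the tidbitmap.c code and not necessarily how the authors would fix it: Entry, IterState and next_is_lossy() are hypothetical, simplified stand-ins, and the shared-memory and locking parts are left out. The only point it makes is that the page index array is read after the bounds check rather than before it, so a bitmap with chunks but no exact pages (npages == 0, idxpages unallocated) is handled without touching idxpages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for PagetableEntry. */
typedef struct
{
    unsigned    blockno;
} Entry;

/* Hypothetical, simplified stand-in for the shared iterator state. */
typedef struct
{
    const Entry *ptbase;    /* page table entries */
    const int  *idxpages;   /* may be NULL when npages == 0 */
    const int  *idxchunks;  /* may be NULL when nchunks == 0 */
    int         npages;
    int         nchunks;
    int         spageptr;   /* next exact page to return */
    int         schunkptr;  /* next lossy chunk to return */
    int         schunkbit;  /* next bit within the current chunk */
} IterState;

/*
 * Return true and set *blockno when the next block to output comes from
 * a lossy chunk; return false when an exact page should be output first
 * or no chunks remain.
 */
static bool
next_is_lossy(const IterState *st, unsigned *blockno)
{
    assert(blockno != NULL);

    if (st->schunkptr >= st->nchunks)
        return false;

    const Entry *chunk = &st->ptbase[st->idxchunks[st->schunkptr]];
    unsigned    chunk_blockno = chunk->blockno + (unsigned) st->schunkbit;

    /*
     * Short-circuit evaluation: idxpages is only read once we know an
     * exact page remains, so an unallocated idxpages is never touched.
     */
    if (st->spageptr >= st->npages ||
        chunk_blockno < st->ptbase[st->idxpages[st->spageptr]].blockno)
    {
        *blockno = chunk_blockno;
        return true;
    }
    return false;
}
```

With npages == 0 and idxpages == NULL, next_is_lossy() returns the chunk's block number without ever indexing idxpages, which is the case that crashes in the excerpt above.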

This is probably a simple fix for the authors, so I'm passing it along. I can't quite see how the part above is meant to work.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
