Re: Incorrect result of bitmap heap scan. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Incorrect result of bitmap heap scan.
Msg-id 6nev7zfx57h5pkqpgdp4q37xjcj2dxqilsgbb5ac5lhub4whsb@ap7ptu6yqyj6
In response to Re: Incorrect result of bitmap heap scan.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Incorrect result of bitmap heap scan.
List pgsql-hackers
Hi,

On 2024-12-02 13:39:43 -0500, Peter Geoghegan wrote:
> I guess it's natural to suspect more recent work -- commit 7c70996e is
> about 6 years old. But the race condition that I suspect is at play
> here is very narrow.

FWIW, the test I just posted shows the issue down to 11 (although for 11 one
has to remove the (TRUNCATE false)). 10 returns correct results.

I don't think the race is particularly narrow. A vacuum only has to complete
between the start of the bitmap indexscan and the end of the bitmap heapscan,
which leaves a lot of time with an expensive query.


I suspect one contributor to this avoiding attention till now was that the
optimization is fairly narrow:

            /*
             * We can potentially skip fetching heap pages if we do not need
             * any columns of the table, either for checking non-indexable
             * quals or for returning data.  This test is a bit simplistic, as
             * it checks the stronger condition that there's no qual or return
             * tlist at all. But in most cases it's probably not worth working
             * harder than that.
             */
            need_tuples = (node->ss.ps.plan->qual != NIL ||
                           node->ss.ps.plan->targetlist != NIL);

Even an entry in the targetlist that does not depend on the current row
disables the optimization.
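
To illustrate how narrow that test is, here is a minimal standalone sketch of
the same condition. ToyPlan and toy_need_tuples are hypothetical names made up
for this example, and the real code compares List pointers against NIL rather
than counting entries:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the scan's plan node (not a PostgreSQL type).
 * Only whether each list is empty matters for the test.
 */
typedef struct ToyPlan
{
    int nquals;     /* entries in the scan-level qual list */
    int ntargets;   /* entries in the scan-level targetlist */
} ToyPlan;

/*
 * Mirrors the shape of the need_tuples test quoted above: heap fetches can
 * only be skipped when there is no qual and no targetlist at all.
 */
bool
toy_need_tuples(const ToyPlan *plan)
{
    assert(plan != NULL);
    return plan->nquals != 0 || plan->ntargets != 0;
}
```

With this model, a plan with ntargets = 1 and nquals = 0 still needs tuples,
no matter what that single targetlist entry computes. Only the plan with both
lists empty qualifies for skipping the heap.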

Because such a scan cannot return any content for those rows, it's also
somewhat hard to actually notice that the results are wrong...



> It's pretty unlikely that there'll be a dead-to-all TID returned to a
> scan (not just dead to our MVCC snapshot, dead to everybody's) that is
> subsequently concurrently removed from the index, and then set
> LP_UNUSED in the heap. It's probably impossible if you don't have a
> small table -- VACUUM just isn't going to be fast enough to get to the
> leaf page after the bitmap index scan, but still be able to get to the
> heap before its corresponding bitmap heap scan (that uses the VM as an
> optimization) can do the relevant visibility checks (while it could
> happen with a large table and a slow bitmap scan, the chances of the
> VACUUM being precisely aligned with the bitmap scan, in just the wrong
> way, seem remote in the extreme).

A cursor, waiting for IO, waiting for other parts of the query, ... can all
make this window almost arbitrarily large.
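
The ordering being discussed can be sketched as a toy model. This is not
PostgreSQL code: all Toy* / toy_* names are hypothetical, the page has four
line pointers, the all_visible flag stands in for the visibility map bit, and
index cleanup is not modeled. It only assumes the sequence described in this
thread: the bitmap is built first, vacuum then sets the dead item LP_UNUSED
and marks the page all-visible, and the heap scan finally trusts the VM
instead of fetching the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ITEMS_PER_PAGE 4

/* Hypothetical, drastically simplified line pointer states. */
typedef enum { TOY_LP_NORMAL, TOY_LP_DEAD, TOY_LP_UNUSED } ToyLpState;

typedef struct ToyPage
{
    ToyLpState items[TOY_ITEMS_PER_PAGE];
    bool       all_visible;     /* stands in for the visibility map bit */
} ToyPage;

/* Step 1: the "bitmap index scan" records every item the index points at,
 * including a dead-to-all item that has not been vacuumed yet. */
void
toy_build_bitmap(const ToyPage *page, bool bitmap[TOY_ITEMS_PER_PAGE])
{
    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
        bitmap[i] = (page->items[i] != TOY_LP_UNUSED);
}

/* Step 2: a concurrent "vacuum" reclaims dead items and, since only live
 * items remain, marks the page all-visible. */
void
toy_vacuum(ToyPage *page)
{
    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
        if (page->items[i] == TOY_LP_DEAD)
            page->items[i] = TOY_LP_UNUSED;
    page->all_visible = true;
}

/* Step 3: the "bitmap heap scan". With skip_fetch, an all-visible page is
 * never looked at and one row is emitted per bitmap entry. */
int
toy_heap_scan(const ToyPage *page, const bool bitmap[TOY_ITEMS_PER_PAGE],
              bool skip_fetch)
{
    int rows = 0;

    assert(page != NULL);
    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
    {
        if (!bitmap[i])
            continue;
        if (skip_fetch && page->all_visible)
            rows++;             /* trusts the stale bitmap */
        else if (page->items[i] == TOY_LP_NORMAL)
            rows++;             /* checked against the page itself */
    }
    return rows;
}

/* Runs the three steps in the problematic order and returns the row count. */
int
toy_race_demo(bool skip_fetch)
{
    ToyPage page = {
        { TOY_LP_NORMAL, TOY_LP_DEAD, TOY_LP_NORMAL, TOY_LP_NORMAL },
        false
    };
    bool bitmap[TOY_ITEMS_PER_PAGE];

    toy_build_bitmap(&page, bitmap);
    toy_vacuum(&page);
    return toy_heap_scan(&page, bitmap, skip_fetch);
}
```

In this model toy_race_demo(true) counts 4 rows while toy_race_demo(false)
counts 3. Nothing in the model depends on how much time passes between steps
1 and 3, which is why anything that stretches that interval widens the window.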

Greetings,

Andres Freund


