Re: Fix for parallel BTree initialization bug - Mailing list pgsql-hackers

From Jameson, Hunter 'James'
Subject Re: Fix for parallel BTree initialization bug
Date
Msg-id C8E3188F-9AE8-4F99-85B1-8F35293536A1@amazon.com
In response to Re: Fix for parallel BTree initialization bug  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi, I spent some time trying to create a repro (other than testing it on the production instance where we encountered
the bug), but was unable to create one within a reasonable time.
 

The tricky part is that the bug symptoms are run-time symptoms -- so not only do you need, first, to satisfy conditions
(1), (2), and (3) without the query optimizer optimizing them away, but you also need, second, a query that runs
long enough for one or more of the parallel workers' state machines to get confused. (This wasn't a problem on the
production instance where we encountered the bug and I tested the fix.)
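
To make condition (3) from the quoted report concrete, here is a toy Python sketch. It is not PostgreSQL code. It assumes a hypothetical query shape such as `WHERE a IN (1, 5) AND a > 3`, and it assumes the scan visits the IN-list elements in sorted order. The names `satisfiable_keys` and `lower_bound` are invented for illustration only.

```python
def satisfiable_keys(in_list, lower_bound):
    """For each IN-list element (sorted, as assumed above), report whether it
    can also satisfy a hypothetical extra index qual `col > lower_bound`."""
    return [(key, key > lower_bound) for key in sorted(in_list)]


# Hypothetical shape: WHERE a IN (1, 5) AND a > 3
keys = satisfiable_keys([5, 1], lower_bound=3)
first_key, first_ok = keys[0]
```

Here the first IN-list element (1) can never pass the extra filter while a later one (5) can, which is the kind of situation condition (3) describes. Whether the planner keeps such a pair of quals intact for a real query is exactly the part that is hard to control.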
 

Also, third -- passing InvalidBlockNumber to ReadBuffer() generally just appends a new block to the relation, so the bug
doesn't even result in an error condition on an RW instance. (The production instance was RO...) So the bug, although
very small!, is annoying!
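
The third point can be sketched with a toy model in Python. This is not the real ReadBuffer(); `ToyRelation` and `toy_read_buffer` are hypothetical names, and the read-only branch is only an assumption about why the problem surfaced on an RO instance (the actual error is not given in this message). The one detail taken from PostgreSQL itself is the numeric value of InvalidBlockNumber.

```python
INVALID_BLOCK_NUMBER = 0xFFFFFFFF  # same value as PostgreSQL's InvalidBlockNumber


class ToyRelation:
    """Toy stand-in for a relation: a list of blocks plus a read-only flag."""

    def __init__(self, nblocks, read_only=False):
        self.blocks = [f"block {i}" for i in range(nblocks)]
        self.read_only = read_only


def toy_read_buffer(rel, blkno):
    """Hypothetical model of the behaviour described above."""
    if blkno == INVALID_BLOCK_NUMBER:
        # Treated as a request for a new block, so the relation is extended.
        if rel.read_only:
            # Assumption: extending is not possible on a read-only instance.
            raise RuntimeError("cannot extend relation on a read-only instance")
        rel.blocks.append(f"block {len(rel.blocks)}")
        return len(rel.blocks) - 1
    if blkno >= len(rel.blocks):
        raise IndexError(f"block {blkno} is past the end of the relation")
    return blkno


# RW case: the bogus block number silently grows the relation.
rw = ToyRelation(nblocks=3)
new_blkno = toy_read_buffer(rw, INVALID_BLOCK_NUMBER)

# RO case: the same call fails, which is how the bug becomes visible.
ro = ToyRelation(nblocks=3, read_only=True)
try:
    toy_read_buffer(ro, INVALID_BLOCK_NUMBER)
    ro_failed = False
except RuntimeError:
    ro_failed = True
```

In the toy RW case the caller gets back a valid-looking block and no error, which mirrors why the symptom is easy to miss outside a read-only setup.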
 

James

On 9/9/20, 6:14 AM, "Amit Kapila" <amit.kapila16@gmail.com> wrote:

 



    On Tue, Sep 8, 2020 at 11:55 PM Jameson, Hunter 'James'
    <hunjmes@amazon.com> wrote:
    >
    > Hi, I ran across a small (but annoying) bug in initializing parallel BTree scans, which causes the parallel-scan
    > state machine to get confused.
 
    >
    >
    > To reproduce, you need a query that:
    >
    >
    >
    > 1. Executes parallel BTree index scan;
    >
    > 2. Has an IN-list of size > 1;
    >
    > 3. Has an additional index filter that makes it impossible to satisfy the first IN-list condition.
    >
    >
    >
    > (We encountered such a query, and therefore the bug, on a production instance.)
    >
    >

    I think I can understand what you are pointing out here but it would
    be great if you can have a reproducible test case because that will
    make it apparent and we might want to include that in the regression
    tests if possible.

    --
    With Regards,
    Amit Kapila.

