Thread: [HACKERS] Parallel Bitmap Heap Scans segfaults due to (tbm->dsa==NULL) onPostgreSQL 10

Hi,

It seems that Q19 from TPC-H is consistently failing with segfaults due
to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).

I'm not very familiar with how the dsa is initialized and passed around,
but I only see the failures when the bitmap is constructed by a mix of
BitmapAnd and BitmapOr operations.

Another interesting observation is that setting force_parallel_mode=on
may not be enough - there really need to be multiple parallel workers,
which is why the simple query does cpu_tuple_cost=1.

Attached is a bunch of files:

1) details for "full" query:

* query.sql
* plan.txt
* backtrace.txt

2) details for the "minimal" query triggering the issue:

* query-minimal.sql
* plan-minimal.txt
* backtrace-minimal.txt



regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment
On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Hi,
>
> It seems that Q19 from TPC-H is consistently failing with segfaults due
> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>
> I'm not very familiar with how the dsa is initialized and passed around,
> but I only see the failures when the bitmap is constructed by a mix of
> BitmapAnd and BitmapOr operations.
>
I think I have got the issue, bitmap_subplan_mark_shared is not
properly pushing the isshared flag to lower level bitmap index node,
and because of that tbm_create is passing NULL dsa while creating the
tidbitmap.  So this problem will come in very specific combination of
BitmapOr and BitmapAnd when BitmapAnd is the first subplan for the
BitmapOr.  If BitmapIndex is the first subplan under BitmapOr then
there is no problem because BitmapOr node will create the tbm by
itself and isshared is set for BitmapOr.

Attached patch fixing the issue for me.  I will thoroughly test this
patch with other scenario as well.  Thanks for reporting.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

On 10/12/2017 02:40 PM, Dilip Kumar wrote:
> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> Hi,
>>
>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>
>> I'm not very familiar with how the dsa is initialized and passed around,
>> but I only see the failures when the bitmap is constructed by a mix of
>> BitmapAnd and BitmapOr operations.
>>
> I think I have got the issue, bitmap_subplan_mark_shared is not
> properly pushing the isshared flag to lower level bitmap index node,
> and because of that tbm_create is passing NULL dsa while creating the
> tidbitmap.  So this problem will come in very specific combination of
> BitmapOr and BitmapAnd when BitmapAnd is the first subplan for the
> BitmapOr.  If BitmapIndex is the first subplan under BitmapOr then
> there is no problem because BitmapOr node will create the tbm by
> itself and isshared is set for BitmapOr.
> 
> Attached patch fixing the issue for me.  I will thoroughly test this
> patch with other scenario as well.  Thanks for reporting.
> 

Yep, this fixes the failures for me.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On Thu, Oct 12, 2017 at 6:37 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
>
> On 10/12/2017 02:40 PM, Dilip Kumar wrote:
>> On Thu, Oct 12, 2017 at 4:31 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> Hi,
>>>
>>> It seems that Q19 from TPC-H is consistently failing with segfaults due
>>> to calling tbm_prepare_shared_iterate() with (tbm->dsa==NULL).
>>>
>>> I'm not very familiar with how the dsa is initialized and passed around,
>>> but I only see the failures when the bitmap is constructed by a mix of
>>> BitmapAnd and BitmapOr operations.
>>>
>> I think I have got the issue, bitmap_subplan_mark_shared is not
>> properly pushing the isshared flag to lower level bitmap index node,
>> and because of that tbm_create is passing NULL dsa while creating the
>> tidbitmap.  So this problem will come in very specific combination of
>> BitmapOr and BitmapAnd when BitmapAnd is the first subplan for the
>> BitmapOr.  If BitmapIndex is the first subplan under BitmapOr then
>> there is no problem because BitmapOr node will create the tbm by
>> itself and isshared is set for BitmapOr.
>>
>> Attached patch fixing the issue for me.  I will thoroughly test this
>> patch with other scenario as well.  Thanks for reporting.
>>
>
> Yep, this fixes the failures for me.
>
Thanks for confirming.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On Thu, Oct 12, 2017 at 9:14 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> Yep, this fixes the failures for me.
>>
> Thanks for confirming.

Committed and back-patched to v10.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers