On Wed, Apr 29, 2026 at 11:42 AM Tomas Vondra <tomas@vondra.me> wrote:
> Thanks for the report. I'm able to reproduce the crash using your
> reproducer script. At first I was confused about why you need a BRIN index
> when this report is about btree, but I suppose that's just to force a
> parallel index scan. There are easier ways to do that, though, e.g. by
> increasing cpu_tuple_cost. Then it's enough to query just the one rel.
I pushed the fix a short while ago, but didn't include the tests. I
don't think that the added test cycles would have paid for themselves.
> How did you discover this issue? I don't think anyone else reported such
> crashes, so presumably it's not quite common.
There were many ways that this issue could accidentally fail to fail.
For example, if even one of the skip arrays happened to be on a text
column, there'd almost certainly have been no crash. In general we're
very conservative about the space we request. We have to be, because
the request is made only once, long before we really know what nbtree
preprocessing will do, or how many arrays it'll output.
> It does fix it for me, but I don't know enough about the skip scan
> internals to say if the fix is right.
_bt_parallel_serialize_arrays assumes that btscan->btps_arrElems[] has
so->numArrayKeys-many elements -- with and without the fix. The
easiest way to see that the fix is correct is by noticing that
_bt_parallel_serialize_arrays expects a certain layout in shared
memory that btestimateparallelscan wasn't fully handling.
When btestimateparallelscan estimated the amount of shared memory that
the scan will require, it previously neglected to account for how skip
arrays could contribute to the value of so->numArrayKeys. With
Siddharth's fix in place, we conservatively assume that preprocessing
will add the maximum possible number of skip arrays (i.e., that it will
use the largest possible so->numArrayKeys) when we determine the
btscan->btps_arrElems[] space overhead. That makes
_bt_parallel_serialize_arrays agree with btestimateparallelscan.
> Is there something we could do to deal with this class of bugs (buffer
> overflow in shared memory)? For buffers in private memory we have tools
> like valgrind and sentinels to make these issues more obvious, but for
> shared memory that's not the case ... :-(
I'm not sure that Valgrind style instrumentation would have actually
caught this issue.
As I said, our conservative approach could mask the issue in many
ways. Plus the test case involved an index with the maximum 32 index
columns, and an input scan key on the very last index column, which is
obviously very atypical.
--
Peter Geoghegan