Thread: Re: Can rs_cindex be < 0 for bitmap heap scans?
On Thu, Oct 24, 2024 at 3:45 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Hi,
HeapScanDescData->rs_cindex (the current index into the array of
visible tuples stored in the heap scan descriptor) is used for
multiple scan types, but for bitmap heap scans, AFAICT, it should
never be < 0.
As such, I find this test in heapam_scan_bitmap_next_tuple() pretty confusing.
if (hscan->rs_cindex < 0 || hscan->rs_cindex >= hscan->rs_ntuples)
I know it seems innocuous, but I find it pretty distracting. Am I
missing something? Could it somehow be < 0?
If not, I propose removing that part of the if statement like in the
attached patch.
You are right it can never be < 0 in this case at least. In fact you don't need to explicitly set it to 0 in initscan[1], because before calling heapam_scan_bitmap_next_tuple() we must call heapam_scan_bitmap_next_block() and this function is initializing this to 0 (hscan->rs_cindex = 0;). Anyway no object even if you prefer to initialize in initscan(), just the point is that we are already doing it for each block before fetching tuples from the block.
@@ -378,6 +378,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
ItemPointerSetInvalid(&scan->rs_ctup.t_self);
scan->rs_cbuf = InvalidBuffer;
scan->rs_cblock = InvalidBlockNumber;
+ scan->rs_cindex = 0;
ItemPointerSetInvalid(&scan->rs_ctup.t_self);
scan->rs_cbuf = InvalidBuffer;
scan->rs_cblock = InvalidBlockNumber;
+ scan->rs_cindex = 0;
Thanks for the reply, Dilip! On Thu, Oct 24, 2024 at 12:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Oct 24, 2024 at 3:45 AM Melanie Plageman <melanieplageman@gmail.com> wrote: >> >> HeapScanDescData->rs_cindex (the current index into the array of >> visible tuples stored in the heap scan descriptor) is used for >> multiple scan types, but for bitmap heap scans, AFAICT, it should >> never be < 0. > > You are right it can never be < 0 in this case at least. Cool, thanks for confirming. I will commit this change then. > In fact you don't need to explicitly set it to 0 in initscan[1], because before calling heapam_scan_bitmap_next_tuple()we must call heapam_scan_bitmap_next_block() and this function is initializing this to 0 (hscan->rs_cindex= 0;). Anyway no object even if you prefer to initialize in initscan(), just the point is that we are alreadydoing it for each block before fetching tuples from the block. I am inclined to still initialize it to 0 in initscan(). I was refactoring the bitmap heap scan code to use the read stream API and after moving some things around, this field was no longer getting initialized in heapam_scan_bitmap_next_block(). While that may not be a good reason to change it in this patch, most of the other fields in the heap scan descriptor (like rs_cbuf and rs_cblock) are re-initialized in initscan(), so it seems okay to do that there even though it isn't strictly necessary on master. - Melanie
On Thu, Oct 24, 2024 at 7:11 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
Thanks for the reply, Dilip!
On Thu, Oct 24, 2024 at 12:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Oct 24, 2024 at 3:45 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
>>
>> HeapScanDescData->rs_cindex (the current index into the array of
>> visible tuples stored in the heap scan descriptor) is used for
>> multiple scan types, but for bitmap heap scans, AFAICT, it should
>> never be < 0.
>
> You are right it can never be < 0 in this case at least.
Cool, thanks for confirming. I will commit this change then.
+1
> In fact you don't need to explicitly set it to 0 in initscan[1], because before calling heapam_scan_bitmap_next_tuple() we must call heapam_scan_bitmap_next_block() and this function is initializing this to 0 (hscan->rs_cindex = 0;). Anyway no object even if you prefer to initialize in initscan(), just the point is that we are already doing it for each block before fetching tuples from the block.
I am inclined to still initialize it to 0 in initscan(). I was
refactoring the bitmap heap scan code to use the read stream API and
after moving some things around, this field was no longer getting
initialized in heapam_scan_bitmap_next_block(). While that may not be
a good reason to change it in this patch, most of the other fields in
the heap scan descriptor (like rs_cbuf and rs_cblock) are
re-initialized in initscan(), so it seems okay to do that there even
though it isn't strictly necessary on master.
On Fri, Oct 25, 2024 at 10:07 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
On Fri, Oct 25, 2024 at 10:29 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Tom suggested off-list that if rs_cindex can't be zero, then it should
> be unsigned. I checked the other scan types using the
> HeapScanDescData, and it seems none of them use values of rs_cindex or
> rs_ntuples < 0. As such, here is a patch making both rs_ntuples and
> rs_cindex unsigned.
{
HeapTuple tuple = &(scan->rs_ctup);
Page page;
- int lineindex;
- int linesleft;
+ uint32 lineindex;
+ uint32 linesleft;
IMHO we can't make "lineindex" as uint32, because just see the first code block[1] of heapgettup_pagemode(), we use this index as +ve (Forward scan )as well as -ve (Backward scan).
[1]
if (likely(scan->rs_inited))
{
/* continue from previously returned page/tuple */
page = BufferGetPage(scan->rs_cbuf);
lineindex = scan->rs_cindex + dir;
if (ScanDirectionIsForward(dir))
{
/* continue from previously returned page/tuple */
page = BufferGetPage(scan->rs_cbuf);
lineindex = scan->rs_cindex + dir;
if (ScanDirectionIsForward(dir))
--Refer definition of ScanDirection
typedef enum ScanDirection
{
BackwardScanDirection = -1,
NoMovementScanDirection = 0,
ForwardScanDirection = 1
} ScanDirection;
{
BackwardScanDirection = -1,
NoMovementScanDirection = 0,
ForwardScanDirection = 1
} ScanDirection;
On Sat, Oct 26, 2024 at 12:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Oct 25, 2024 at 10:07 PM Melanie Plageman <melanieplageman@gmail.com> wrote: >> >> On Fri, Oct 25, 2024 at 10:29 AM Melanie Plageman >> <melanieplageman@gmail.com> wrote: >> > >> > Tom suggested off-list that if rs_cindex can't be zero, then it should >> > be unsigned. I checked the other scan types using the >> > HeapScanDescData, and it seems none of them use values of rs_cindex or >> > rs_ntuples < 0. As such, here is a patch making both rs_ntuples and >> > rs_cindex unsigned. >> > > @@ -943,8 +945,8 @@ heapgettup_pagemode(HeapScanDesc scan, > { > HeapTuple tuple = &(scan->rs_ctup); > Page page; > - int lineindex; > - int linesleft; > + uint32 lineindex; > + uint32 linesleft; > > IMHO we can't make "lineindex" as uint32, because just see the first code block[1] of heapgettup_pagemode(), we use thisindex as +ve (Forward scan )as well as -ve (Backward scan). > > [1] > if (likely(scan->rs_inited)) > { > /* continue from previously returned page/tuple */ > page = BufferGetPage(scan->rs_cbuf); > > lineindex = scan->rs_cindex + dir; > if (ScanDirectionIsForward(dir)) > > --Refer definition of ScanDirection > typedef enum ScanDirection > { > BackwardScanDirection = -1, > NoMovementScanDirection = 0, > ForwardScanDirection = 1 > } ScanDirection; Yes, so in the case of backwards scan, if scan->rs_cindex is 0, when dir is -1, lineindex will wrap around, but we don't use it in that case because linesleft will be 0 and then we will not meet the conditions to execute the code in the loop under continue_page. And in the loop, when adding -1 to lineindex, for backwards scan, it always starts at linesleft and ticks down to 0. We could add an if statement above the goto that says something like if (linesleft > 0) goto continue_page; Would that make it clearer? - Melanie
On Sat, Oct 26, 2024 at 5:30 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
On Sat, Oct 26, 2024 at 12:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Oct 25, 2024 at 10:07 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>>
>> On Fri, Oct 25, 2024 at 10:29 AM Melanie Plageman
>> <melanieplageman@gmail.com> wrote:
>> >
>> > Tom suggested off-list that if rs_cindex can't be zero, then it should
>> > be unsigned. I checked the other scan types using the
>> > HeapScanDescData, and it seems none of them use values of rs_cindex or
>> > rs_ntuples < 0. As such, here is a patch making both rs_ntuples and
>> > rs_cindex unsigned.
>>
>
> @@ -943,8 +945,8 @@ heapgettup_pagemode(HeapScanDesc scan,
> {
> HeapTuple tuple = &(scan->rs_ctup);
> Page page;
> - int lineindex;
> - int linesleft;
> + uint32 lineindex;
> + uint32 linesleft;
>
> IMHO we can't make "lineindex" as uint32, because just see the first code block[1] of heapgettup_pagemode(), we use this index as +ve (Forward scan )as well as -ve (Backward scan).
Yes, so in the case of backwards scan, if scan->rs_cindex is 0, when
dir is -1, lineindex will wrap around, but we don't use it in that
case because linesleft will be 0 and then we will not meet the
conditions to execute the code in the loop under continue_page. And in
the loop, when adding -1 to lineindex, for backwards scan, it always
starts at linesleft and ticks down to 0.
Yeah you are right, although the lineindex will wrap around when rs_cindex is 0 , it would not be used. So, it won’t actually cause any issues, but I’m not comfortable with the unintentional wraparound. I would have left "scan->rs_cindex" as int itself but I am fine with whatever you prefer.
We could add an if statement above the goto that says something like
if (linesleft > 0)
goto continue_page;
Would that make it clearer?
Not sure it would make it clearer. In fact, In common cases it would add an extra instruction to check the if (linesleft > 0).
On Mon, Oct 28, 2024 at 2:27 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sat, Oct 26, 2024 at 5:30 PM Melanie Plageman <melanieplageman@gmail.com> wrote: >> >> Yes, so in the case of backwards scan, if scan->rs_cindex is 0, when >> dir is -1, lineindex will wrap around, but we don't use it in that >> case because linesleft will be 0 and then we will not meet the >> conditions to execute the code in the loop under continue_page. And in >> the loop, when adding -1 to lineindex, for backwards scan, it always >> starts at linesleft and ticks down to 0. > > > Yeah you are right, although the lineindex will wrap around when rs_cindex is 0 , it would not be used. So, it won’t actuallycause any issues, but I’m not comfortable with the unintentional wraparound. I would have left "scan->rs_cindex"as int itself but I am fine with whatever you prefer. So, I don't see -1 as different from the wrapped around value with an unsigned data type. Also neither is a valid number of entries in rs_vistuples (which is limited to MaxHeapTuplesPerPage). That being said, I can see how having lineindex be an invalid value could be confusing -- either with an unsigned or signed data type for rs_cindex. I think to make this clear we would have to add another special case for backwards and no movement scan. For now, I've committed the version of the patch I proposed above (v3). >> We could add an if statement above the goto that says something like >> if (linesleft > 0) >> goto continue_page; >> >> Would that make it clearer? > > > Not sure it would make it clearer. In fact, In common cases it would add an extra instruction to check the if (linesleft> 0). Yes, indeed. Thank you for taking a look and being so responsive on this thread. - Melanie
On Wed, Dec 18, 2024 at 1:23 PM Ranier Vilela <ranier.vf@gmail.com> wrote: > > Hi. > > Em qua., 18 de dez. de 2024 às 14:01, Melanie Plageman <melanieplageman@gmail.com> escreveu: >> >> For now, I've committed the version of the patch I proposed above (v3). > > What happens if *rs_tuples* equal to zero in function *SampleHeapTupleVisible*? > end = hscan->rs_ntuples - 1; Ah yes, it seems like just changing the top if statement to if (scan->rs_flags & SO_ALLOW_PAGEMODE && hscan->rs_ntuples > 0) Thanks for identifying this. > Would be good to fix the binary search too, now that unsigned types are used. You just mean the one in SampleHeapTupleVisible(), right? > Patch attached. I'm not sure the attached patch is quite right because if rs_ntuples is 0, it should still enter the first if statement and then return false. In fact, with your patch, I think we would incorrectly not return a value when rs_ntuples is 0 from SampleHeapTupleVisible(). How about one of these options: option 1: most straightforward fix diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index d0e5922eed7..fa03bedd4b8 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -2577,6 +2577,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, if (scan->rs_flags & SO_ALLOW_PAGEMODE) { + int start, end; + + if (hscan->rs_ntuples == 0) + return false; + /* * In pageatatime mode, heap_prepare_pagescan() already did visibility * checks, so just look at the info it left in rs_vistuples[]. @@ -2586,8 +2591,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, * in increasing order, but it's not clear that there would be enough * gain to justify the restriction. */ - int start = 0, - end = hscan->rs_ntuples - 1; + start = 0; + end = hscan->rs_ntuples - 1; while (start <= end) { option 2: change the binary search code a bit more but avoid the extra branch introduced by option 1. diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index d0e5922eed7..c8e25bdd197 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -2586,18 +2586,17 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, * in increasing order, but it's not clear that there would be enough * gain to justify the restriction. */ - int start = 0, - end = hscan->rs_ntuples - 1; + uint32 start = 0, end = hscan->rs_ntuples; - while (start <= end) + while (start < end) { - int mid = (start + end) / 2; + int mid = (start + end) / 2; OffsetNumber curoffset = hscan->rs_vistuples[mid]; if (tupoffset == curoffset) return true; else if (tupoffset < curoffset) - end = mid - 1; + end = mid; else start = mid + 1; } - Melanie
On Wed, Dec 18, 2024 at 3:13 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > option 1: > most straightforward fix > > diff --git a/src/backend/access/heap/heapam_handler.c > b/src/backend/access/heap/heapam_handler.c > index d0e5922eed7..fa03bedd4b8 100644 > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -2577,6 +2577,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > > if (scan->rs_flags & SO_ALLOW_PAGEMODE) > { > + int start, end; > + > + if (hscan->rs_ntuples == 0) > + return false; > + > /* > * In pageatatime mode, heap_prepare_pagescan() already did visibility > * checks, so just look at the info it left in rs_vistuples[]. > @@ -2586,8 +2591,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > * in increasing order, but it's not clear that there would be enough > * gain to justify the restriction. > */ > - int start = 0, > - end = hscan->rs_ntuples - 1; > + start = 0; > + end = hscan->rs_ntuples - 1; > > while (start <= end) > { > > option 2: > change the binary search code a bit more but avoid the extra branch > introduced by option 1. > > diff --git a/src/backend/access/heap/heapam_handler.c > b/src/backend/access/heap/heapam_handler.c > index d0e5922eed7..c8e25bdd197 100644 > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -2586,18 +2586,17 @@ SampleHeapTupleVisible(TableScanDesc scan, > Buffer buffer, > * in increasing order, but it's not clear that there would be enough > * gain to justify the restriction. > */ > - int start = 0, > - end = hscan->rs_ntuples - 1; > + uint32 start = 0, end = hscan->rs_ntuples; > > - while (start <= end) > + while (start < end) > { > - int mid = (start + end) / 2; > + int mid = (start + end) / 2; > OffsetNumber curoffset = hscan->rs_vistuples[mid]; > > if (tupoffset == curoffset) > return true; > else if (tupoffset < curoffset) > - end = mid - 1; > + end = mid; > else > start = mid + 1; > } I pushed the straightforward option for now so that it's fixed. - Melanie
On Thu, Dec 19, 2024 at 8:18 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > I pushed the straightforward option for now so that it's fixed. I think this binary search code now has a risk of underflow. If 'mid' is calculated as zero, the second 'if' branch will cause 'end' to underflow. Maybe we need to do something like below. --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -2600,7 +2600,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, if (tupoffset == curoffset) return true; else if (tupoffset < curoffset) + { + if (mid == 0) + return false; end = mid - 1; + } else start = mid + 1; } Alternatively, we can revert 'start' and 'end' to signed int as they were before. Thanks Richard
Hi.
Sorry for not responding quickly.
I have been without communication until now.
I have been without communication until now.
Em qua., 18 de dez. de 2024 às 17:13, Melanie Plageman <melanieplageman@gmail.com> escreveu:
On Wed, Dec 18, 2024 at 1:23 PM Ranier Vilela <ranier.vf@gmail.com> wrote:
>
> Hi.
>
> Em qua., 18 de dez. de 2024 às 14:01, Melanie Plageman <melanieplageman@gmail.com> escreveu:
>>
>> For now, I've committed the version of the patch I proposed above (v3).
>
> What happens if *rs_tuples* equal to zero in function *SampleHeapTupleVisible*?
> end = hscan->rs_ntuples - 1;
Ah yes, it seems like just changing the top if statement to
if (scan->rs_flags & SO_ALLOW_PAGEMODE &&
hscan->rs_ntuples > 0)
Thanks for identifying this.
> Would be good to fix the binary search too, now that unsigned types are used.
You just mean the one in SampleHeapTupleVisible(), right?
> Patch attached.
I'm not sure the attached patch is quite right because if rs_ntuples
is 0, it should still enter the first if statement and then return
false. In fact, with your patch, I think we would incorrectly not
return a value when rs_ntuples is 0 from SampleHeapTupleVisible().
I'm wondering if *rs_tuples* is zero, would be run the second search *HeapTupleSatisfiesVisibility*.
How about one of these options:
option 1:
most straightforward fix
diff --git a/src/backend/access/heap/heapam_handler.c
b/src/backend/access/heap/heapam_handler.c
index d0e5922eed7..fa03bedd4b8 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2577,6 +2577,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (scan->rs_flags & SO_ALLOW_PAGEMODE)
{
+ int start, end;
+
+ if (hscan->rs_ntuples == 0)
+ return false;
+
/*
* In pageatatime mode, heap_prepare_pagescan() already did visibility
* checks, so just look at the info it left in rs_vistuples[].
@@ -2586,8 +2591,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
* in increasing order, but it's not clear that there would be enough
* gain to justify the restriction.
*/
- int start = 0,
- end = hscan->rs_ntuples - 1;
+ start = 0;
+ end = hscan->rs_ntuples - 1;
while (start <= end)
{
option 2:
change the binary search code a bit more but avoid the extra branch
introduced by option 1.
diff --git a/src/backend/access/heap/heapam_handler.c
b/src/backend/access/heap/heapam_handler.c
index d0e5922eed7..c8e25bdd197 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2586,18 +2586,17 @@ SampleHeapTupleVisible(TableScanDesc scan,
Buffer buffer,
* in increasing order, but it's not clear that there would be enough
* gain to justify the restriction.
*/
- int start = 0,
- end = hscan->rs_ntuples - 1;
+ uint32 start = 0, end = hscan->rs_ntuples;
- while (start <= end)
+ while (start < end)
{
- int mid = (start + end) / 2;
+ int mid = (start + end) / 2;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
- end = mid - 1;
+ end = mid;
else
start = mid + 1;
}
I'm looking at the commit and the replies in the thread.
best regards,
Ranier Vilela
Em qua., 18 de dez. de 2024 às 23:50, Richard Guo <guofenglinux@gmail.com> escreveu:
On Thu, Dec 19, 2024 at 8:18 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I pushed the straightforward option for now so that it's fixed.
I think this binary search code now has a risk of underflow. If 'mid'
is calculated as zero, the second 'if' branch will cause 'end' to
underflow.
Maybe we need to do something like below.
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2600,7 +2600,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
+ {
+ if (mid == 0)
+ return false;
end = mid - 1;
+ }
else
start = mid + 1;
}
Alternatively, we can revert 'start' and 'end' to signed int as they
were before.
How would it be *signed*?
Wouldn't overflow happen in this case?
rs_tuples now can be
UINT_MAX = 4294967295
best regards,
Ranier Vilela
Em qua., 18 de dez. de 2024 às 20:18, Melanie Plageman <melanieplageman@gmail.com> escreveu:
On Wed, Dec 18, 2024 at 3:13 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> option 1:
> most straightforward fix
>
> diff --git a/src/backend/access/heap/heapam_handler.c
> b/src/backend/access/heap/heapam_handler.c
> index d0e5922eed7..fa03bedd4b8 100644
> --- a/src/backend/access/heap/heapam_handler.c
> +++ b/src/backend/access/heap/heapam_handler.c
> @@ -2577,6 +2577,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
>
> if (scan->rs_flags & SO_ALLOW_PAGEMODE)
> {
> + int start, end;
> +
> + if (hscan->rs_ntuples == 0)
> + return false;
> +
> /*
> * In pageatatime mode, heap_prepare_pagescan() already did visibility
> * checks, so just look at the info it left in rs_vistuples[].
> @@ -2586,8 +2591,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
> * in increasing order, but it's not clear that there would be enough
> * gain to justify the restriction.
> */
> - int start = 0,
> - end = hscan->rs_ntuples - 1;
> + start = 0;
> + end = hscan->rs_ntuples - 1;
>
> while (start <= end)
> {
>
> option 2:
> change the binary search code a bit more but avoid the extra branch
> introduced by option 1.
>
> diff --git a/src/backend/access/heap/heapam_handler.c
> b/src/backend/access/heap/heapam_handler.c
> index d0e5922eed7..c8e25bdd197 100644
> --- a/src/backend/access/heap/heapam_handler.c
> +++ b/src/backend/access/heap/heapam_handler.c
> @@ -2586,18 +2586,17 @@ SampleHeapTupleVisible(TableScanDesc scan,
> Buffer buffer,
> * in increasing order, but it's not clear that there would be enough
> * gain to justify the restriction.
> */
> - int start = 0,
> - end = hscan->rs_ntuples - 1;
> + uint32 start = 0, end = hscan->rs_ntuples;
>
> - while (start <= end)
> + while (start < end)
> {
> - int mid = (start + end) / 2;
> + int mid = (start + end) / 2;
> OffsetNumber curoffset = hscan->rs_vistuples[mid];
>
> if (tupoffset == curoffset)
> return true;
> else if (tupoffset < curoffset)
> - end = mid - 1;
> + end = mid;
> else
> start = mid + 1;
> }
I pushed the straightforward option for now so that it's fixed.
How about:
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2da4e4da13..1620f0c3db 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2574,7 +2574,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (scan->rs_flags & SO_ALLOW_PAGEMODE)
{
- uint32 start,
+ int64 start,
end;
if (hscan->rs_ntuples == 0)
@@ -2594,7 +2594,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
while (start <= end)
{
- uint32 mid = (start + end) / 2;
+ int64 mid = (start + end) >> 1;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
index 2da4e4da13..1620f0c3db 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2574,7 +2574,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (scan->rs_flags & SO_ALLOW_PAGEMODE)
{
- uint32 start,
+ int64 start,
end;
if (hscan->rs_ntuples == 0)
@@ -2594,7 +2594,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
while (start <= end)
{
- uint32 mid = (start + end) / 2;
+ int64 mid = (start + end) >> 1;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
best regards
Ranier Vilela
- Melanie
On Wed, Dec 18, 2024 at 9:50 PM Richard Guo <guofenglinux@gmail.com> wrote: > > On Thu, Dec 19, 2024 at 8:18 AM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > I pushed the straightforward option for now so that it's fixed. > > I think this binary search code now has a risk of underflow. If 'mid' > is calculated as zero, the second 'if' branch will cause 'end' to > underflow. Thanks, Richard! > Maybe we need to do something like below. > > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -2600,7 +2600,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > if (tupoffset == curoffset) > return true; > else if (tupoffset < curoffset) > + { > + if (mid == 0) > + return false; > end = mid - 1; > + } > else > start = mid + 1; > } > > Alternatively, we can revert 'start' and 'end' to signed int as they > were before. What about this instead: diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index 2da4e4da13e..fb90fd869c2 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -2574,11 +2574,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, if (scan->rs_flags & SO_ALLOW_PAGEMODE) { - uint32 start, - end; - - if (hscan->rs_ntuples == 0) - return false; + uint32 start = 0, + end = hscan->rs_ntuples; /* * In pageatatime mode, heap_prepare_pagescan() already did visibility @@ -2589,10 +2586,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, * in increasing order, but it's not clear that there would be enough * gain to justify the restriction. */ - start = 0; - end = hscan->rs_ntuples - 1; - while (start <= end) + while (start < end) { uint32 mid = (start + end) / 2; OffsetNumber curoffset = hscan->rs_vistuples[mid]; @@ -2600,7 +2595,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, if (tupoffset == curoffset) return true; else if (tupoffset < curoffset) - end = mid - 1; + end = mid; else start = mid + 1; } Or to make it easier to read, here: uint32 start = 0, end = hscan->rs_ntuples; while (start < end) { uint32 mid = (start + end) / 2; OffsetNumber curoffset = hscan->rs_vistuples[mid]; if (tupoffset == curoffset) return true; else if (tupoffset < curoffset) end = mid; else start = mid + 1; } I think this gets rid of any subtraction and is still the same. - Melanie
On Fri, Dec 20, 2024 at 7:50 AM Melanie Plageman <melanieplageman@gmail.com> wrote: > On Wed, Dec 18, 2024 at 9:50 PM Richard Guo <guofenglinux@gmail.com> wrote: > > I think this binary search code now has a risk of underflow. If 'mid' > > is calculated as zero, the second 'if' branch will cause 'end' to > > underflow. > What about this instead: > > diff --git a/src/backend/access/heap/heapam_handler.c > b/src/backend/access/heap/heapam_handler.c > index 2da4e4da13e..fb90fd869c2 100644 > --- a/src/backend/access/heap/heapam_handler.c > +++ b/src/backend/access/heap/heapam_handler.c > @@ -2574,11 +2574,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > > if (scan->rs_flags & SO_ALLOW_PAGEMODE) > { > - uint32 start, > - end; > - > - if (hscan->rs_ntuples == 0) > - return false; > + uint32 start = 0, > + end = hscan->rs_ntuples; > > /* > * In pageatatime mode, heap_prepare_pagescan() already did visibility > @@ -2589,10 +2586,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > * in increasing order, but it's not clear that there would be enough > * gain to justify the restriction. > */ > - start = 0; > - end = hscan->rs_ntuples - 1; > > - while (start <= end) > + while (start < end) > { > uint32 mid = (start + end) / 2; > OffsetNumber curoffset = hscan->rs_vistuples[mid]; > @@ -2600,7 +2595,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer, > if (tupoffset == curoffset) > return true; > else if (tupoffset < curoffset) > - end = mid - 1; > + end = mid; > else > start = mid + 1; > } > > Or to make it easier to read, here: > > uint32 start = 0, > end = hscan->rs_ntuples; > > while (start < end) > { > uint32 mid = (start + end) / 2; > OffsetNumber curoffset = hscan->rs_vistuples[mid]; > > if (tupoffset == curoffset) > return true; > else if (tupoffset < curoffset) > end = mid; > else > start = mid + 1; > } > > I think this gets rid of any subtraction and is still the same. Looks correct to me. BTW, I kind of doubt that the overflow risk fixed in 28328ec87 is a real issue in real-world scenarios. Is it realistically possible to have more than INT_MAX tuples fit on one heap page? If it is, then the statement below could also be prone to overflow. uint32 mid = (start + end) / 2; Maybe you want to change it to: uint32 mid = start + (end - start) / 2; Thanks Richard
Em qui., 19 de dez. de 2024 às 19:50, Melanie Plageman <melanieplageman@gmail.com> escreveu:
On Wed, Dec 18, 2024 at 9:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
>
> On Thu, Dec 19, 2024 at 8:18 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I pushed the straightforward option for now so that it's fixed.
>
> I think this binary search code now has a risk of underflow. If 'mid'
> is calculated as zero, the second 'if' branch will cause 'end' to
> underflow.
Thanks, Richard!
> Maybe we need to do something like below.
>
> --- a/src/backend/access/heap/heapam_handler.c
> +++ b/src/backend/access/heap/heapam_handler.c
> @@ -2600,7 +2600,11 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
> if (tupoffset == curoffset)
> return true;
> else if (tupoffset < curoffset)
> + {
> + if (mid == 0)
> + return false;
> end = mid - 1;
> + }
> else
> start = mid + 1;
> }
>
> Alternatively, we can revert 'start' and 'end' to signed int as they
> were before.
What about this instead:
diff --git a/src/backend/access/heap/heapam_handler.c
b/src/backend/access/heap/heapam_handler.c
index 2da4e4da13e..fb90fd869c2 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2574,11 +2574,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (scan->rs_flags & SO_ALLOW_PAGEMODE)
{
- uint32 start,
- end;
-
- if (hscan->rs_ntuples == 0)
- return false;
+ uint32 start = 0,
+ end = hscan->rs_ntuples;
/*
* In pageatatime mode, heap_prepare_pagescan() already did visibility
@@ -2589,10 +2586,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
* in increasing order, but it's not clear that there would be enough
* gain to justify the restriction.
*/
- start = 0;
- end = hscan->rs_ntuples - 1;
- while (start <= end)
+ while (start < end)
{
uint32 mid = (start + end) / 2;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
@@ -2600,7 +2595,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
- end = mid - 1;
+ end = mid;
else
start = mid + 1;
}
Or to make it easier to read, here:
uint32 start = 0,
end = hscan->rs_ntuples;
while (start < end)
{
uint32 mid = (start + end) / 2;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
end = mid;
else
start = mid + 1;
}
I think this gets rid of any subtraction and is still the same.
Look goods to me.
I think that you propose, can get rid of the early test too.
Note the way we can avoid an overflow in the mid calculation.
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 9f17baea5d..bd1335276a 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2577,9 +2577,6 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
uint32 start,
end;
- if (hscan->rs_ntuples == 0)
- return false;
-
/*
* In pageatatime mode, heap_prepare_pagescan() already did visibility
* checks, so just look at the info it left in rs_vistuples[].
@@ -2590,17 +2587,17 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
* gain to justify the restriction.
*/
start = 0;
- end = hscan->rs_ntuples - 1;
+ end = hscan->rs_ntuples;
- while (start <= end)
+ while (start < end)
{
- uint32 mid = (start + end) / 2;
+ uint32 mid = (start + end) >> 1;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
- end = mid - 1;
+ end = mid;
else
start = mid + 1;
}
index 9f17baea5d..bd1335276a 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2577,9 +2577,6 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
uint32 start,
end;
- if (hscan->rs_ntuples == 0)
- return false;
-
/*
* In pageatatime mode, heap_prepare_pagescan() already did visibility
* checks, so just look at the info it left in rs_vistuples[].
@@ -2590,17 +2587,17 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
* gain to justify the restriction.
*/
start = 0;
- end = hscan->rs_ntuples - 1;
+ end = hscan->rs_ntuples;
- while (start <= end)
+ while (start < end)
{
- uint32 mid = (start + end) / 2;
+ uint32 mid = (start + end) >> 1;
OffsetNumber curoffset = hscan->rs_vistuples[mid];
if (tupoffset == curoffset)
return true;
else if (tupoffset < curoffset)
- end = mid - 1;
+ end = mid;
else
start = mid + 1;
}
best regards,
Ranier Vilela
On Thu, Dec 19, 2024 at 9:36 PM Richard Guo <guofenglinux@gmail.com> wrote: > > On Fri, Dec 20, 2024 at 7:50 AM Melanie Plageman > <melanieplageman@gmail.com> wrote: > Looks correct to me. > > BTW, I kind of doubt that the overflow risk fixed in 28328ec87 is a > real issue in real-world scenarios. Is it realistically possible to > have more than INT_MAX tuples fit on one heap page? > > If it is, then the statement below could also be prone to overflow. > > uint32 mid = (start + end) / 2; > > Maybe you want to change it to: > > uint32 mid = start + (end - start) / 2; I've done this and pushed. Thanks to you and Ranier for your help in identifying and fixing this! - Melanie