Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id aa3e7e5d-a452-4df0-b928-d9765b2d7088@enterprisedb.com
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers

On 4/7/24 16:24, Melanie Plageman wrote:
> On Sun, Apr 7, 2024 at 7:38 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 4/7/24 06:17, Melanie Plageman wrote:
>>> On Sun, Apr 07, 2024 at 02:27:43AM +0200, Tomas Vondra wrote:
>>>> On 4/6/24 23:34, Melanie Plageman wrote:
>>>>> ...
>>>>>>
>>>>>> I realized it makes more sense to add a FIXME (I used XXX. I'm not when
>>>>>> to use what) with a link to the message where Andres describes why he
>>>>>> thinks it is a bug. If we plan on fixing it, it is good to have a record
>>>>>> of that. And it makes it easier to put a clear and accurate comment.
>>>>>> Done in 0009.
>>>>>>
>>>>>>> OK, thanks. If think 0001-0008 are ready to go, with some minor tweaks
>>>>>>> per above (tuple vs. tuples etc.), and the question about the recheck
>>>>>>> flag. If you can do these tweaks, I'll get that committed today and we
>>>>>>> can try to get a couple more patches in tomorrow.
>>>>>
>>>>> Attached v19 rebases the rest of the commits from v17 over the first
>>>>> nine patches from v18. All patches 0001-0009 are unchanged from v18. I
>>>>> have made updates and done cleanup on 0010-0021.
>>>>>
>>>>
>>>> I've pushed 0001-0005, I'll get back to this tomorrow and see how much
>>>> more we can get in for v17.
>>>
>>> Thanks! I thought about it a bit more, and I got worried about the
>>>
>>>       Assert(scan->rs_empty_tuples_pending == 0);
>>>
>>> in heap_rescan() and heap_endscan().
>>>
>>> I was worried if we don't complete the scan it could end up tripping
>>> incorrectly.
>>>
>>> I tried to come up with a query which didn't end up emitting all of the
>>> tuples on the page (using a LIMIT clause), but I struggled to come up
>>> with an example that qualified for the skip fetch optimization and also
>>> returned before completing the scan.
>>>
>>> I could work a bit harder tomorrow to try and come up with something.
>>> However, I think it might be safer to just change these to:
>>>
>>>       scan->rs_empty_tuples_pending = 0
>>>
>>
>> Hmmm, good point. I haven't tried, but wouldn't something like "SELECT 1
>> FROM t WHERE column = X LIMIT 1" do the trick? Probably in a join, as a
>> correlated subquery?
> 
> Unfortunately (or fortunately, I guess) that exact thing won't work
> because even constant values in the target list disqualify it for the
> skip fetch optimization.
> 
> Being a bit too lazy to look at planner code this morning, I removed
> the target list requirement like this:
> 
> -           need_tuples = (node->ss.ps.plan->qual != NIL ||
> -                          node->ss.ps.plan->targetlist != NIL);
> +           need_tuples = (node->ss.ps.plan->qual != NIL);
> 
> And can easily trip the assert with this:
> 
> create table foo (a int);
> insert into foo select i from generate_series(1,10)i;
> create index on foo(a);
> vacuum foo;
> select 1 from (select 2 from foo limit 3);
> 
> Anyway, I don't know if we could find a query that does actually hit
> this. The only bitmap heap scan queries in the regress suite that meet
> the
>    BitmapHeapScanState->ss.ps.plan->targetlist == NIL
> condition are aggregates (all are count(*)).
> 
> I'll dig a bit more later, but do you think this is worth adding an
> open item for? Even though I don't have a repro yet?
> 

Try this:

create table t (a int, b int) with (fillfactor=10);
insert into t select mod((i/22),2), (i/22) from generate_series(0,1000)
S(i);
create index on t(a);
vacuum analyze t;

set enable_indexonlyscan = off;
set enable_seqscan = off;
explain (analyze, verbose) select 1 from (values (1)) s(x) where exists
(select * from t where a = x);

KABOOM!

#2  0x000078a16ac5fafe in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3  0x000078a16ac4887f in __GI_abort () at abort.c:79
#4  0x0000000000bb2c5a in ExceptionalCondition (conditionName=0xc42ba8
"scan->rs_empty_tuples_pending == 0", fileName=0xc429c8 "heapam.c",
lineNumber=1090) at assert.c:66
#5  0x00000000004f68bb in heap_endscan (sscan=0x19af3a0) at heapam.c:1090
#6  0x000000000077a94c in table_endscan (scan=0x19af3a0) at
../../../src/include/access/tableam.h:1001

So yeah, this assert is not quite correct. It's not breaking anything at
the moment, so we can fix it now or add it as an open item.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Next
From: Daniel Gustafsson
Date:
Subject: Re: Cluster::restart dumping logs when stop fails