Re: Re: bt Scankey in another contradictory case - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Re: bt Scankey in another contradictory case |
Date | |
Msg-id | CAH2-Wzno9NJSJdROx5immswvEiCSb+GH7oOSMtrn4r3DQ+HMRg@mail.gmail.com Whole thread Raw |
In response to | Re: Re: bt Scankey in another contradictory case ("bigbro_wq@hotmail.com" <bigbro_wq@hotmail.com>) |
List | pgsql-hackers |
On Sun, Sep 1, 2024 at 11:44 AM bigbro_wq@hotmail.com <bigbro_wq@hotmail.com> wrote: > I have reanalysed the code of function _bt_first. I notice that using a multi-attribute index > if we can't identify the starting boundaries and the following attributes markded not required , > that means we need start at first or last page in the index to examine every tuple to satisfy the > qual or not, in the meantime the scan will be stopped while the first attribute evaluated failed. I'm not sure of what it is that you're trying to draw attention to. The behavior of _bt_first doesn't seem relevant to this patch to me. For one thing, _bt_first doesn't actually care about whether or not a scan key is marked required -- _bt_first independently applies its own similar rules to determine which scan keys can be used in the insertion scan key used to find an initial position on the leaf level. More importantly, whether a scan key is marked required doesn't seem relevant to this patch (that is relevant to _bt_preprocess_keys, but doesn't seem relevant to the changes that you propose to make to _bt_preprocess_keys). > For instance: > create table c_s( x int, y int); > insert into c_s select generate_series(1, 20000), generate_series(1, 20000); > create index my_c_s_idx on c_s using btree(x,y); > explain (analyze, buffers) select * from c_s where x >4000 and y >10 and y <10 order by x desc; Your patch (which I haven't tested for myself) is based on the observation that "y > 10 and y < 10" is a contradictory condition. This is true regardless of any other factor, such as whether we're able to mark the "y" scan key required. All that _bt_first has to do is return when it notices "so->qual_ok" was set to false within _bt_preprocess_keys. It doesn't matter if there is an "x > 4000", and it doesn't matter if you use a composite index like this that completely omits a condition on "x". > What's more, if i change the number 4000 to 1000. > ----------------------------------------------------------------------------------------------------- > Sort (cost=441.01..441.01 rows=1 width=8) (actual time=2.974..2.975 rows=0 loops=1) > Sort Key: x DESC > Sort Method: quicksort Memory: 25kB > Buffers: shared hit=89 > -> Seq Scan on c_s (cost=0.00..441.00 rows=1 width=8) (actual time=2.971..2.971 rows=0 loops=1) > Filter: ((x > 1000) AND (y > 10) AND (y < 10)) > Rows Removed by Filter: 20000 > Buffers: shared hit=89 > Planning: > Buffers: shared hit=2 > Planning Time: 0.113 ms > Execution Time: 2.990 ms > (12 rows) > > The planner choosed the Seq Scan, and the executor have done the unnecessary jobs 20000 times. It's true that a sequential scan won't ever apply the logic from _bt_preprocess_keys, nor any similar logic. A sequential scan with contradictory quals will therefore not be detected in cases where it would have been detected had we used an index scan with the same quals. This inconsistency doesn't make much sense; it's just an implementation deficiency. It's historical. We've talked about moving the current _bt_preprocess_keys logic to plan time before. See Tom's remarks at the end of this email: https://www.postgresql.org/message-id/2587523.1647982549@sss.pgh.pa.us I think that it would be possible to move _bt_preprocess_keys to plan time, and to generalize it to work with sequential scans, but that is a much larger project. It seems out of scope. I think that you should just focus on the immediate problem of not detecting contradictory quals that happen to involve (> or >=) and (< or <=) operators. It's a completely independent problem, and one that is well scoped. -- Peter Geoghegan
pgsql-hackers by date: