Re: Avoiding superfluous buffer locking during nbtree backwards scans - Mailing list pgsql-hackers

From Masahiro Ikeda
Subject Re: Avoiding superfluous buffer locking during nbtree backwards scans
Date
Msg-id f8efb9c0f8d1a71b44fd7f8e42e49c25@oss.nttdata.com
Whole thread Raw
In response to Avoiding superfluous buffer locking during nbtree backwards scans  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi, thanks for working on these improvements.

I noticed an unexpected behavior where the index scan continues instead of stopping, even when it detects that there are no tuples that match the conditions. (I observed this while reviewing the skip scan patch, though it isn't directly related to this issue.)

On 2024-10-12 08:29, Peter Geoghegan wrote:
> On Thu, Oct 10, 2024 at 1:41 PM Peter Geoghegan <pg@bowt.ie> wrote:
> * We now reset currPos state (including even its moreLeft/moreRight
> fields) within _bt_parallel_seize, automatically and regardless of any
> other details.

IIUC, the above change is the root cause. Commit 1bd4bc8 adds a reset of the currPos state in _bt_parallel_seize(). However, this change can overwrite currPos.moreRight, which should be preserved before calling _bt_readnextpage().

* Test case

-- Prepare
DROP TABLE IF EXISTS test;
CREATE TABLE test (smallint smallint, bool bool);
INSERT INTO test (SELECT -20000+i%40000, random()>0.5 FROM generate_series(1, 1_000_000) s(i));
CREATE INDEX test_smallint ON test (smallint);
VACUUM ANALYZE test;

-- Check the number of pages of the index
=# SELECT relpages FROM pg_class WHERE relname = 'test_smallint';
  relpages
----------
       937
(1 row)

-- Test
=# SET max_parallel_workers = 0;
=# EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT COUNT(*) FROM test WHERE smallint < -10000;
                                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Finalize Aggregate  (cost=5170.23..5170.24 rows=1 width=8) (actual time=71.352..71.402 rows=1 loops=1)
    Output: count(*)
    Buffers: shared hit=934
    ->  Gather  (cost=5170.01..5170.22 rows=2 width=8) (actual time=71.344..71.395 rows=1 loops=1)
          Output: (PARTIAL count(*))
          Workers Planned: 2
          Workers Launched: 0
          Buffers: shared hit=934
          ->  Partial Aggregate  (cost=4170.01..4170.02 rows=1 width=8) (actual time=71.199..71.199 rows=1 loops=1)
                Output: PARTIAL count(*)
                Buffers: shared hit=934
                ->  Parallel Index Only Scan using test_smallint on public.test  (cost=0.42..3906.27 rows=105495 width=0) (actual time=0.062..49.137 rows=250000 loops=1)
                      Output: "smallint"
                      Index Cond: (test."smallint" < '-10000'::integer)
                      Heap Fetches: 0
                      Buffers: shared hit=934  -- Not the result I expected: almost all relpages are read to retrieve only 25% of the tuples.
                                               -- Without commit 1bd4bc8, the number was '236' in my environment.
  Planning Time: 0.105 ms
  Execution Time: 71.454 ms
(18 rows)

Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION


