Home > mailing lists

Re: New access method for b-tree. - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: New access method for b-tree.
Date	February 2 02:54:14
Msg-id	a652515e-278a-4838-815b-ecd2ad4495f6@vondra.me Whole thread Raw
In response to	New access method for b-tree. (Alexandre Felipe <alexandre.felipe@tpro.io>)
Responses	Re: New access method for b-tree. Re: New access method for b-tree.
List	pgsql-hackers

Tree view

Hello Felipe,

On 2/1/26 11:02, Alexandre Felipe wrote:
> Hello Hackers,
> 
> Please check this out,
> 
> It is an access method to scan a table sorting by the second column of
> an index, filtered on the first.
> Queries like
> SELECT x, y FROM grid
> WHERE x in (array of Nx elements)
> ORDER BY y, x
> LIMIT M
> 
> Can execute streaming the rows directly from disk instead of loading
> everything.
> 
> Using  btree index on (x, y)
> 
> On a grid with N x N will run by fetching only what is necessary
> A skip scal will run with O(N * Nx) I/O, O(N x Nx) space, O(N x Nx *
> log( N * Nx)) comput (assuming a generic in memory sort)
> 
> The proposed access method does it O(M + Nx) I/O, O(Nx) space, and O(M *
> log(Nx)) compute.
> 

So how does this compare to skip scan in practice? It's hard to compare,
as the patch does not implement an actual access path, but I tried this:

  CREATE TABLE merge_test_int (
    prefix_col int4,
    suffix_col int4
  );

  INSERT INTO merge_test_int
  SELECT p, s
  FROM generate_series(1, 10000) AS p,
       generate_series(1, 1000) AS s;
  CREATE INDEX merge_test_int_idx
      ON merge_test_int (prefix_col, suffix_col);

and then

1) master

  SELECT * FROM merge_test_int
  WHERE prefix_col IN (1,3,4,5,6,7,8,9,10,11,12,13,14,15)
    AND suffix_col >= 900
  ORDER BY suffix_col LIMIT 100;

vs.

2) merge scan

  SELECT * FROM test_btree_merge_scan_int(

      'merge_test_int',
      'merge_test_int_idx',
      ARRAY[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
      900,
      100);

And with explain analyze we get this:

1) Buffers: shared hit=26 read=25

2) Buffers: shared hit=143 read=17

So it seems to access many more buffers, even if the number of reads is
lower. Presumably the merge scan is not always better than skip scan,
probably depending on number of prefixes in the query etc. What is the
cost model to decide between those two?

If you had to construct the best case and worst cases (vs. skip scan),
what would that look like?

I'm also wondering how common is the targeted query pattern? How common
it is to have an IN condition on the leading column in an index, and
ORDER BY on the second one?

regards

-- 
Tomas Vondra

pgsql-hackers by date:

From: Gyan Sreejith
Date: 02 February, 02:05:19
Subject: Re: [Proposal] Adding Log File Capability to pg_createsubscriber

From: Michael Paquier
Date: 02 February, 02:54:21
Subject: Re: AIX support

Re: New access method for b-tree. - Mailing list pgsql-hackers

Previous

Next