Home > mailing lists

Re: Using multidimensional indexes in ordinal queries - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: Using multidimensional indexes in ordinal queries
Date	June 21, 2010 18:29:15
Msg-id	AANLkTim8yryTqBV7rVucvN1h9SpIMgt8fFEmXJSO1_sE@mail.gmail.com Whole thread Raw
In response to	Re: Using multidimensional indexes in ordinal queries (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Using multidimensional indexes in ordinal queries
List	pgsql-hackers

Tree view

On Mon, Jun 21, 2010 at 5:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:

It seems like you can get more or less the same benefit from a
multicolumn btree index. On my system, with the individual btree
indices, the query ran in 7625 ms; with an additional index on (v1,
v2, v3), it ran in 94 ms. I didn't get the same plans as Alexander
did, though, so it may not really be apples to apples. See attached
session trace.

Benefit of multicolumn btree index was more or less the same than cube benefit because of very bad picksplit behavior in this case. I attached the patch which significally improves cube index search performance:

test=# explain (analyze, buffers) select * from test where cube(ARRAY[v1,v2,v3]) <@ cube(ARRAY[480,480,'-inf'::float8], ARRAY[500,500,'+inf'::float8]) order by cube(ARRAY[v1,v2,v3]) <*> 4 LIMIT 10;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=0.00..38.07 rows=10 width=28) (actual time=0.495..0.570 rows=10 loops=1)

Buffers: shared hit=21

-> Index Scan using test_cube_idx on test (cost=0.00..38064.52 rows=10000 width=28) (actual time=0.489..0.537 rows=10 loops=1)

Index Cond: (cube(ARRAY[v1, v2, v3]) <@ '(480, 480, -inf),(500, 500, inf)'::cube)

Sort Cond: (cube(ARRAY[v1, v2, v3]) <*> 4)

Buffers: shared hit=21

Total runtime: 0.659 ms

(7 rows)

Now this patch greatly increases tree construction time, but I believe that picksplit implementation, that is good enough for tree search and tree construction, can be found.

The trouble is that it's hard to think of
a way of teaching the planner about these cases without hard-coding
lots and lots of special-case kludges into the planner. Still, if
someone has a clever idea...

I think that two things can be done to improve the situation:

1) Make knngist deal with negative values. I think this will make easier using knngist just for sorting, not only k-neighbor searching.

2) Let gist interface methods take care about multicolumn indexes. I think that if cube index from the example above will be constructed on separate columns v1, v2, v3 then it would be easier for planner to use cube index for queries with filters on these columns. I don't know exactly how to do that.

Attachment

cube.gz

pgsql-hackers by date:

From: Andy Balholm
Date: 21 June 2010, 18:07:28
Subject: Re: dividing money by money

From: Robert Haas
Date: 21 June 2010, 18:59:08
Subject: Re: Using multidimensional indexes in ordinal queries

Re: Using multidimensional indexes in ordinal queries - Mailing list pgsql-hackers

Attachment

Previous

Next