Re: Cube Index Size - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: Cube Index Size
Date	June 1, 2011 12:18:50
Msg-id	BANLkTinRfzBz=ygsO+fckxN5sn62YVQ4qg@mail.gmail.com
In response to	Re: Cube Index Size (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List	pgsql-hackers

On Wed, Jun 1, 2011 at 3:37 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

My guess is that the picksplit algorithm performs poorly with that data. Unfortunately, I have no idea how to improve that.

The current cube picksplit function has no storage utilization guarantees, whereas Guttman's original picksplit has them (if the size of one group reaches some threshold, then all remaining entries go to the other group). Also, the current picksplit is a mix of Guttman's linear and quadratic algorithms: it picks seeds quadratically, but distributes entries linearly.
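
To illustrate the guarantee, here is a minimal Python sketch, not the contrib/cube C code. It assigns entries one by one in input order (the "linear" distribution) but applies Guttman's rule: once a group needs every remaining entry just to reach the minimum fill, it receives all of them. The names `Box`, `distribute_with_min_fill`, and `min_fill` are made up for this example, and the seeds are assumed to be already chosen. With M+1 entries in total, this is the same as saying that when one group reaches M+1-min_fill entries, the rest go to the other group.

```python
from typing import List, Sequence, Tuple

# A box is (lower corner, upper corner); hypothetical stand-in for a cube value.
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def union(a: Box, b: Box) -> Box:
    """Smallest box covering both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return lo, hi


def volume(b: Box) -> float:
    v = 1.0
    for lo, hi in zip(b[0], b[1]):
        v *= hi - lo
    return v


def enlargement(group: Box, entry: Box) -> float:
    """How much the group's bounding box grows if entry is added."""
    return volume(union(group, entry)) - volume(group)


def distribute_with_min_fill(
    entries: Sequence[Box], seed_a: int, seed_b: int, min_fill: int
) -> Tuple[List[int], List[int]]:
    """Linear distribution with a minimum-fill guarantee.

    Assumes 2 * min_fill <= len(entries), otherwise the guarantee
    cannot hold for both groups.
    """
    groups: Tuple[List[int], List[int]] = ([seed_a], [seed_b])
    boxes = [entries[seed_a], entries[seed_b]]
    rest = [i for i in range(len(entries)) if i not in (seed_a, seed_b)]

    for pos, i in enumerate(rest):
        remaining = len(rest) - pos  # unassigned entries, including i
        # Guttman's rule: a group that needs all remaining entries gets them.
        forced = [g for g in (0, 1) if len(groups[g]) + remaining <= min_fill]
        if forced:
            g = forced[0]
        else:
            d0 = enlargement(boxes[0], entries[i])
            d1 = enlargement(boxes[1], entries[i])
            g = 0 if d0 <= d1 else 1
        groups[g].append(i)
        boxes[g] = union(boxes[g], entries[i])
    return groups
```

With skewed data (seven boxes in a chain near one seed and a single far-away box as the other seed), `min_fill=0` leaves the far group with only its seed, while `min_fill=3` forces the last two entries into it. That is the trade-off described above: fuller pages, at the cost of a larger and possibly overlapping bounding box.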

I see the following ways of solving the picksplit problem for cube:

1) Add storage utilization guarantees to the current picksplit. This may increase overlaps, but should decrease index size.

2) Add storage utilization guarantees to the current picksplit and replace the entry distribution algorithm with the quadratic one. Picksplit will take more time, but it should give more stable and predictable results.

3) I have experimented with my own picksplit algorithm, which showed pretty good results on the tests I've run. But the current implementation is dirty and it requires more testing.
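
For option 2, the quadratic distribution step from Guttman's paper (PickNext) can be sketched roughly as follows. This is an illustrative Python sketch under my own assumptions, not the cube code and not the experimental algorithm from option 3: `Box`, `quadratic_distribute`, and `min_fill` are hypothetical names, and the seeds are assumed to be already picked. At each step it rescans all unassigned entries and takes the one whose preference between the two groups is strongest, which is where the extra cost comes from.

```python
from typing import List, Sequence, Tuple

# A box is (lower corner, upper corner); hypothetical stand-in for a cube value.
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def union(a: Box, b: Box) -> Box:
    """Smallest box covering both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return lo, hi


def volume(b: Box) -> float:
    v = 1.0
    for lo, hi in zip(b[0], b[1]):
        v *= hi - lo
    return v


def enlargement(group: Box, entry: Box) -> float:
    """How much the group's bounding box grows if entry is added."""
    return volume(union(group, entry)) - volume(group)


def quadratic_distribute(
    entries: Sequence[Box], seed_a: int, seed_b: int, min_fill: int
) -> Tuple[List[int], List[int]]:
    """Quadratic (PickNext) distribution with a minimum-fill guarantee.

    Assumes 2 * min_fill <= len(entries).
    """
    groups: Tuple[List[int], List[int]] = ([seed_a], [seed_b])
    boxes = [entries[seed_a], entries[seed_b]]
    unassigned = [i for i in range(len(entries)) if i not in (seed_a, seed_b)]

    while unassigned:
        # Storage utilization guarantee: a group that needs every
        # remaining entry to reach min_fill gets all of them.
        starving = [g for g in (0, 1)
                    if len(groups[g]) + len(unassigned) <= min_fill]
        if starving:
            groups[starving[0]].extend(unassigned)
            break

        # PickNext: the entry with the largest difference in enlargement.
        def preference(i: int) -> float:
            return abs(enlargement(boxes[0], entries[i])
                       - enlargement(boxes[1], entries[i]))

        i = max(unassigned, key=preference)
        unassigned.remove(i)

        # Smaller enlargement wins; ties go to smaller volume, then fewer entries.
        key0 = (enlargement(boxes[0], entries[i]), volume(boxes[0]), len(groups[0]))
        key1 = (enlargement(boxes[1], entries[i]), volume(boxes[1]), len(groups[1]))
        g = 0 if key0 <= key1 else 1
        groups[g].append(i)
        boxes[g] = union(boxes[g], entries[i])
    return groups
```

Because the next entry is chosen by strength of preference rather than by position in the input, the result depends less on the order in which entries arrive, which is one reason to expect more stable splits from this variant.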

------

With best regards,

Alexander Korotkov.
