Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: WIP: Fast GiST index build
Date	September 1, 2011 06:24:19
Msg-id	CAPpHfdsUTusxjB26cHnbVWCFkxQc87juJsTkkoPUj=GrJzU5ag@mail.gmail.com
In response to	Re: WIP: Fast GiST index build (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses	Re: WIP: Fast GiST index build Re: WIP: Fast GiST index build
List	pgsql-hackers

On Thu, Sep 1, 2011 at 12:59 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> So I changed the test script to generate the table as:
>
> CREATE TABLE points AS SELECT random() as x, random() as y FROM generate_series(1, $NROWS);
>
> The unordered results are in:
>
> testname                    | nrows     | duration        | accesses
> ----------------------------+-----------+-----------------+----------
> points unordered buffered   | 250000000 | 05:56:58.575789 |  2241050
> points unordered auto       | 250000000 | 05:34:12.187479 |  2246420
> points unordered unbuffered | 250000000 | 04:38:48.663952 |  2244228
>
> Although the buffered build doesn't lose as badly as it did with more overlap, it still doesn't look good :-(. Any ideas?

But that is still a lot of overlap: about 220 accesses per small-area request, which is roughly 10-20 times more than it should be without overlap. If we roughly assume that 10 times more overlap means only 1/10 of the tree is used for actual inserts, then that part of the tree can easily fit in the cache.
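
To make the rough arithmetic explicit, here is a minimal Python sketch. It only restates the numbers from this message (about 220 accesses per request, an overlap factor of 10 to 20) and the same rough assumption that an overlap factor of k leaves about 1/k of the tree receiving inserts. The function name overlap_estimate is made up for illustration, and nothing here is measured.

```python
def overlap_estimate(accesses_per_query, overlap_factor):
    """Back-of-envelope numbers only, not a measurement.

    Assumes (as in the message) that an overlap factor of k means
    queries touch k times more pages than necessary, and that roughly
    1/k of the tree is actually receiving inserts.
    """
    baseline_accesses = accesses_per_query / overlap_factor
    active_fraction = 1.0 / overlap_factor
    return baseline_accesses, active_fraction


# 220 accesses per request is the figure quoted above; 10 and 20 are
# the two ends of the estimated overlap range.
for factor in (10, 20):
    baseline, fraction = overlap_estimate(220, factor)
    print(f"overlap x{factor}: ~{baseline:.0f} accesses without overlap, "
          f"~{fraction:.0%} of the tree receiving inserts")
```

Under these assumptions, the part of the index that inserts actually touch is only 5-10% of the whole tree. Whether that part fits in the cache depends on the index size and memory of the test machine, which are not given here.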

You can try my splitting algorithm on your test setup (in this case I advise starting with a smaller number of rows, 100 M for example).

I'm asking Oleg for real-life datasets that cause trouble in practice. Probably those datasets are even larger, or the new linear split produces less overlap on them.

------
With best regards,
Alexander Korotkov.
