Home > mailing lists

Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: WIP: Fast GiST index build
Date	June 6, 2011 12:00:56
Msg-id	BANLkTimnkr-ECezSXDF6ZNkt1r1KNnajPA@mail.gmail.com Whole thread Raw
In response to	Re: WIP: Fast GiST index build (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List	pgsql-hackers

Tree view

Hi!

On Mon, Jun 6, 2011 at 2:51 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

On 06.06.2011 10:42, Heikki Linnakangas wrote:
I ran another test with a simple table generated with:

CREATE TABLE pointtest (p point);
INSERT INTO pointtest SELECT point(random(), random()) FROM
generate_series(1,50000000);

Generating a gist index with:

CREATE INDEX i_pointtest ON pointtest USING gist (p);

took about 15 hours without the patch, and 2 hours with it. That's quite
dramatic.

Oops, that was a rounding error, sorry. The run took about 2.7 hours with the patch, which of course should be rounded to 3 hours, not 2. Anyway, it is still a very impressive improvement.

I have similar results on 100 millions of rows: 21.6 hours without patch and 2 hours with patch. But I found a problem: index quality is worse. See following query plans. There test is relation where index was created in ordinal way, and test2 is relation where patch was used.

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on test (cost=4391.01..270397.31 rows=100000 width=20) (actual time=1.257..2.147 rows=838 loops=1)

Recheck Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)

Buffers: shared hit=968

-> Bitmap Index Scan on test_idx (cost=0.00..4366.01 rows=100000 width=0) (actual time=1.162..1.162 rows=838 loops=1)

Index Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)

Buffers: shared hit=131

Total runtime: 2.214 ms

(7 rows)

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on test2 (cost=4370.84..270377.13 rows=100000 width=20) (actual time=5.252..6.056 rows=838 loops=1)

Recheck Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)

Buffers: shared hit=1458

-> Bitmap Index Scan on test2_idx (cost=0.00..4345.84 rows=100000 width=0) (actual time=5.155..5.155 rows=838 loops=1)

Index Cond: (v <@ '(0.903,0.203),(0.9,0.2)'::box)

Buffers: shared hit=621

Total runtime: 6.121 ms

(7 rows)

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on test (cost=4391.01..270397.31 rows=100000 width=20) (actual time=2.148..2.977 rows=850 loops=1)

Recheck Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)

Buffers: shared hit=1099

-> Bitmap Index Scan on test_idx (cost=0.00..4366.01 rows=100000 width=0) (actual time=2.052..2.052 rows=850 loops=1)

Index Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)

Buffers: shared hit=249

Total runtime: 3.033 ms

(7 rows)

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on test2 (cost=4370.84..270377.13 rows=100000 width=20) (actual time=6.806..7.602 rows=850 loops=1)

Recheck Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)

Buffers: shared hit=1615

-> Bitmap Index Scan on test2_idx (cost=0.00..4345.84 rows=100000 width=0) (actual time=6.709..6.709 rows=850 loops=1)

Index Cond: (v <@ '(0.503,0.503),(0.5,0.5)'::box)

Buffers: shared hit=773

Total runtime: 7.667 ms

(7 rows)

We can see that index scan requires read of several times more pages. Original paper denotes such effect. It explains it by the routing rectangles in less optimal ways. But this effect wasn't so dramatic in tests provided in the paper. So, I have following thoughts about this problem:

1) Number of pages, which was readed from index is too large even with ordinal index build. Querying of small area requires read of hundred of pages. It probbably caused by picksplit implementation. I've version of picksplit algorithm which seems to be much more efficient. I'll do some benchmarks with my picksplit algorithm. I hope difference in index quality will be not so dramatic.

2) I can try to do some enchancements in fast build alogrithms which could improve tree quality. In original paper Hilbert heuristic was used to achive even better tree quality than tree which was created in ordinal way. But since we use GiST we are restricted by it's interface (or we have to create new interface functions(s), but I like to avoid it). I would like to try to do some ordering by penalty value in buffer emptying process and buffers relocation on split.

3) Probably, there is some bug which affects tree quality.

Could you please create a TODO list on the wiki page, listing all the missing features, known bugs etc. that will need to be fixed? That'll make it easier to see how much work there is left. It'll also help anyone looking at the patch to know which issues are known issues.

Sure. I'll create such list on wiki page. I believe that currenlty most important issue is index quality.

Meanwhile, it would still be very valuable if others could test this with different workloads. And Alexander, it would be good if at some point you could write some benchmark scripts too, and put them on the wiki page, just to see what kind of workloads have been taken into consideration and tested already. Do you think there's some worst-case data distributions where this algorithm would perform particularly badly?

I don't expect some bad cases in terms in IO. My most worrying is about index quality which is strongly related to data distribution.

------
With best regards,

Alexander Korotkov.

pgsql-hackers by date:

From: Robert Haas
Date: 06 June 2011, 12:00:09
Subject: Re: reducing the overhead of frequent table locks - now, with WIP patch

From: Heikki Linnakangas
Date: 06 June 2011, 12:02:21
Subject: Re: reducing the overhead of frequent table locks - now, with WIP patch

Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

Previous

Next