Re: WIP: Fast GiST index build - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WIP: Fast GiST index build
Date
Msg-id 4E044D68.5070605@enterprisedb.com
In response to Re: WIP: Fast GiST index build  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: WIP: Fast GiST index build
Optimizing pg_trgm makesign() (was Re: WIP: Fast GiST index build)
List pgsql-hackers
On 21.06.2011 13:08, Alexander Korotkov wrote:
> I've created section about testing in project wiki page:
> http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011#Testing_results
> Do you have any notes about table structure?

It would be nice to have links to the datasets and scripts used, so that 
others can reproduce the tests.

It's surprising that the search time differs so much between the 
point_ops tests with uniformly random data with 100M and 10M rows. Just 
to be sure I'm reading it correctly: a small search time is good, right? 
You might want to spell that out explicitly.

> As you can see I found that CPU usage might be much higher
> with gist_trgm_ops.

Yeah, that is a bit worrisome. 6 minutes without the patch and 18 
minutes with it.

> I believe it's due to relatively expensive penalty
> method in that opclass.

Hmm, I wonder if it could be optimized. I did a quick test, creating a 
gist_trgm_ops index on a list of English words from 
/usr/share/dict/words. oprofile shows that with the patch, 60% of the 
CPU time is spent in the makesign() function.
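
For readers unfamiliar with what a function like makesign() has to do, 
here is a rough Python sketch of folding a word's trigrams into a 
fixed-width bit signature. It is not the pg_trgm code: the 96-bit width, 
the use of CRC32 as the hash, and the names trigrams(), make_signature() 
and hamming_distance() are all assumptions made for illustration. Only 
the padding (two leading spaces, one trailing space) follows the pg_trgm 
documentation.

```python
import zlib

SIG_BITS = 96  # assumed signature width in bits; illustrative only


def trigrams(word):
    """Return the set of trigrams of a lowercased word, padded with two
    leading spaces and one trailing space."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def make_signature(word, sig_bits=SIG_BITS):
    """Fold the word's trigrams into a fixed-width bit signature.

    Each trigram is hashed (CRC32 here, an arbitrary choice) to one bit
    position, and that bit is set in the signature.
    """
    sig = 0
    for tri in trigrams(word):
        bit = zlib.crc32(tri.encode("utf-8")) % sig_bits
        sig |= 1 << bit
    return sig


def hamming_distance(a, b):
    """Count the bits that differ between two signatures."""
    return bin(a ^ b).count("1")
```

In this sketch the cost of building one signature grows with the number 
of trigrams in the word, so if a signature were rebuilt on every 
comparison that work would add up quickly. Whether that is what happens 
in the patched build is something the profile would have to confirm.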

> But, probably index build can be still faster when
> index doesn't fit cache even for gist_trgm_ops.

Yep.

> Also with that opclass index
> quality is slightly worse but the difference is not dramatic.

A 5-10% difference should be acceptable.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

