Re: GiST index performance - Mailing list pgsql-performance
From | Matthew Wakeling |
---|---|
Subject | Re: GiST index performance |
Date | |
Msg-id | alpine.DEB.2.00.0906101232020.4337@aragorn.flymine.org Whole thread Raw |
In response to | Re: GiST index performance (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: GiST index performance
|
List | pgsql-performance |
On Fri, 5 Jun 2009, Robert Haas wrote: > On Thu, Jun 4, 2009 at 12:33 PM, Matthew Wakeling<matthew@flymine.org> wrote: >> Do you have a recommendation for how to go about profiling Postgres, what >> profiler to use, etc? I'm running on Debian Linux x86_64. > > I mostly compile with --enable-profiling and use gprof. I know Tom > Lane has had success with oprofile for quick and dirty measurements > but I haven't quite figured out how to make all the bits work for that > yet. Okay, I recompiled Postgres 8.4 beta 2 with profiling, installed btree_gist and bioseg, and did a large multi-column (btree_gist, bioseg) index search. EXPLAIN ANALYSE SELECT * FROM location l1, location l2 WHERE l1.objectid = l2.objectid AND bioseg_create(l1.intermine_start, l1.intermine_end) && bioseg_create(l2.intermine_start, l2.intermine_end); QUERY PLAN ----------------------------------------------------------------------- Nested Loop (cost=0.01..9292374.77 rows=19468831 width=130) (actual time=0.337..24538315.569 rows=819811624 loops=1) -> Seq Scan on location l1 (cost=0.00..90092.17 rows=4030117 width=65) (actual time=0.033..2561.142 rows=4030122 loops=1) -> Index Scan using location_object_bioseg on location l2 (cost=0.01..1.58 rows=35 width=65) (actual time=0.467..5.990 rows=203 loops=4030122) Index Cond: ((l2.objectid = l1.objectid) AND (bioseg_create(l1.intermine_start, l1.intermine_end) && bioseg_create(l2.intermine_start,l2.intermine_end))) Total runtime: 24613918.894 ms (5 rows) Here is the top of the profiling result: Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls Ks/call Ks/call name 35.41 2087.04 2087.04 823841746 0.00 0.00 gistnext 15.36 2992.60 905.56 8560743382 0.00 0.00 fmgr_oldstyle 8.65 3502.37 509.77 3641723296 0.00 0.00 FunctionCall5 7.08 3919.87 417.50 3641723296 0.00 0.00 gistdentryinit 5.03 4216.59 296.72 6668 0.00 0.00 DirectFunctionCall1 3.84 4443.05 226.46 3641724371 0.00 0.00 FunctionCall1 2.32 4579.94 136.89 1362367466 0.00 0.00 hash_search_with_hash_value 1.89 4691.15 111.21 827892583 0.00 0.00 FunctionCall2 1.83 4799.27 108.12 FunctionCall6 1.77 4903.56 104.30 2799321398 0.00 0.00 LWLockAcquire 1.45 4989.24 85.68 1043922430 0.00 0.00 PinBuffer 1.37 5070.15 80.91 823844102 0.00 0.00 index_getnext 1.33 5148.29 78.15 1647683886 0.00 0.00 slot_deform_tuple 0.95 5204.36 56.07 738604164 0.00 0.00 heap_page_prune_opt The final cumulative time is 5894.06 seconds, which doesn't seem to match the query run time at all. Also, no calls to anything including "bioseg" in the name are recorded, although they are definitely called as the GiST support functions for that data type. Could someone give me a hand decyphering this result? It seems from this that the time is spent in the gistnext function (in src/backend/access/gist/gistget.c) and not its children. However, it's quite a large function containing several loops - is there a way to make the profiling result more specific? Matthew -- If you let your happiness depend upon how somebody else feels about you, now you have to control how somebody else feels about you. -- Abraham Hicks
pgsql-performance by date: