Re: GiST index performance - Mailing list pgsql-performance

From Yeb Havinga
Subject Re: GiST index performance
Date
Msg-id 4B9FAF4B.20907@gmail.com
Whole thread Raw
In response to Re: GiST index performance  (Matthew Wakeling <matthew@flymine.org>)
Responses Re: GiST index performance  (Yeb Havinga <yebhavinga@gmail.com>)
List pgsql-performance
Matthew Wakeling wrote:
>> Matthew Wakeling wrote:
>>> A second quite distinct issue is the general performance of GiST
>>> indexes
>>> which is also mentioned in the old thread linked from Open Items. For
>>> that, we have a test case at
>>> http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
>>> for
>>> btree_gist indexes. I have a similar example with the bioseg GiST
>>> index. I
>>> have completely reimplemented the same algorithms in Java for algorithm
>>> investigation and instrumentation purposes, and it runs about a hundred
>>> times faster than in Postgres. I think this is a problem, and I'm
>>> willing
>>> to do some investigation to try and solve it.
> I have not made any progress on this issue. I think Oleg and Teodor
> would be better placed working it out. All I can say is that I
> implemented the exact same indexing algorithm in Java, and it
> performed 100 times faster than Postgres. Now, Postgres has to do a
> lot of additional work, like mapping the index onto disc, locking
> pages, and abstracting to plugin user functions, so I would expect
> some difference - I'm not sure 100 times is reasonable though. I tried
> to do some profiling, but couldn't see any one section of code that
> was taking too much time. Not sure what I can further do.
Hello Mathew and list,

A lot of time spent in gistget.c code and a lot of functioncall5's to
the gist's consistent function which is out of sight for gprof.
Something different but related since also gist: we noticed before that
gist indexes that use a compressed form for index entries suffer from
repeated compress calls on query operands (see
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).

The btree_gist int4 compress function calls the generic
gbt_num_compress, which does a palloc. Maybe this palloc is allso hit al
lot when scanning the index, because the constants that are queries with
are repeatedly compressed and palloced.

regards,
Yeb Havinga



pgsql-performance by date:

Previous
From: Meena_Ramkumar
Date:
Subject: Postgres DB maintainenance - vacuum and reindex
Next
From: Scott Marlowe
Date:
Subject: Re: Postgres DB maintainenance - vacuum and reindex