Thread: Any "guide to indexes" exists?

Any "guide to indexes" exists?

From
"Jan Bilek"
Date:
Hello,
I would need more info about index types in postgre (btree, hash, gin and
gist) - is there any guide that explains in detail when to use which index
type? These index types have different performance with certain collumn
types and data characteristics store in them. There's not much info about it
pg docs. So, does any document describing detailed index usage or do you
have any personal recomendations when to use which index?

Thanks for you answers.

Regards,
Jan


Re: Any "guide to indexes" exists?

From
Peter Eisentraut
Date:
Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek:
> I would need more info about index types in postgre (btree, hash, gin and
> gist) - is there any guide that explains in detail when to use which index
> type?

You use btree unless you have a specific, particular reason to use one of the
other ones.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Any "guide to indexes" exists?

From
"Jan Bilek"
Date:
I was asking for these "specific/particular reasons". Im not the database
developer, to be expert on indexes (i know whats btree and hash - gin and
gist are rather mysterious for me).
Ok - btree is fine, but sometimes could be better to use gist - my question
is: when is that "sometimes"? Unless i know how to use the indexes, then
they are useless for me - am i right?

Note: We are using gin with tsearch2 vectors, but here we had no choice -
tsearch2 works only with gin and gist.

Regards,
Jan

> Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek:
>> I would need more info about index types in postgre (btree, hash, gin and
>> gist) - is there any guide that explains in detail when to use which
>> index
>> type?
>
> You use btree unless you have a specific, particular reason to use one of
> the
> other ones.
>
> --
> Peter Eisentraut
> http://developer.postgresql.org/~petere/


Re: Any "guide to indexes" exists?

From
"Pavel Stehule"
Date:
> Hello,
> I would need more info about index types in postgre (btree, hash, gin and
> gist) - is there any guide that explains in detail when to use which index
> type? These index types have different performance with certain collumn
> types and data characteristics store in them. There's not much info about it
> pg docs. So, does any document describing detailed index usage or do you
> have any personal recomendations when to use which index?

http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN-performance-postgresql.org-websearch-part-2.html
http://people.planetpostgresql.org/xzilla/index.php?/archives/278-PostgreSQL-full-text-search-testing.html


Summary:
* fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but slow update)
* others .. btree index (I don't know anybody who use hash index)

Regards
Pavel Stehule

Re: Any "guide to indexes" exists?

From
Jim Nasby
Date:
On May 7, 2007, at 9:44 AM, Pavel Stehule wrote:
>> Hello,
>> I would need more info about index types in postgre (btree, hash,
>> gin and
>> gist) - is there any guide that explains in detail when to use
>> which index
>> type? These index types have different performance with certain
>> collumn
>> types and data characteristics store in them. There's not much
>> info about it
>> pg docs. So, does any document describing detailed index usage or
>> do you
>> have any personal recomendations when to use which index?
>
> http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN-
> performance-postgresql.org-websearch-part-2.html
> http://people.planetpostgresql.org/xzilla/index.php?/archives/278-
> PostgreSQL-full-text-search-testing.html
>
>
> Summary:
> * fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but
> slow update)
> * others .. btree index (I don't know anybody who use hash index)

GiST can also be useful if you have to query in multiple dimensions,
which can occur outside the normal case of geometry. Best example I
know of is a table containing duration information in the form of
start_time and end_time. Trying to query for what events happened on
5/28/2005 will generally be much cheaper with a GiST index than a b-
tree.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: Any "guide to indexes" exists?

From
Karsten Hilbert
Date:
On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote:

> GiST can also be useful if you have to query in multiple dimensions,
> which can occur outside the normal case of geometry. Best example I
> know of is a table containing duration information in the form of
> start_time and end_time. Trying to query for what events happened on
> 5/28/2005 will generally be much cheaper with a GiST index than a b-
> tree.
Are you referring to queries with

    ... where some_timestamp between some_start and some_end ...

or

    ... where some_timestamp > some_start and some_timestamp < some_end ...

?

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: Any "guide to indexes" exists?

From
Tom Lane
Date:
"Jan Bilek" <bilekj@gmail.com> writes:
> Ok - btree is fine, but sometimes could be better to use gist - my question
> is: when is that "sometimes"? Unless i know how to use the indexes, then
> they are useless for me - am i right?

gist and gin are for indexing queries that btree is not capable of
dealing with, ie, the WHERE clauses are not simple scalar equality
or range checks.  All you need to pay attention to is whether the
index has an operator class that includes the WHERE operator you
want to use.

            regards, tom lane

Re: Any "guide to indexes" exists?

From
Chris Browne
Date:
bilekj@gmail.com ("Jan Bilek") writes:
> I was asking for these "specific/particular reasons". Im not the
> database developer, to be expert on indexes (i know whats btree and
> hash - gin and gist are rather mysterious for me).
> Ok - btree is fine, but sometimes could be better to use gist - my
> question is: when is that "sometimes"? Unless i know how to use the
> indexes, then they are useless for me - am i right?
>
> Note: We are using gin with tsearch2 vectors, but here we had no
> choice -
> tsearch2 works only with gin and gist.

>> You use btree unless you have a specific, particular reason to use
>> one of the
>> other ones.

Peter's comments are pretty germane.

You use btree if you can.  btree is fine for anything where values are
reasonable "scalar," and compare to one another either in a
straightforward scalar fashion, or piece-wise scalar, as with compound
keys where you start by comparing the first component, then, if it's
equal, considering the second, and so on.

You use GIST if you *need to*, if you have a data structure where
values *aren't* scalar, *can't* be compared (e.g. - where you can't
simply say A > B, B > C).
--
"cbbrowne","@","cbbrowne.com"
http://cbbrowne.com/info/wp.html
"Access to a COFF symbol table via ldtbread is even less abstract,
 really sucks in general, and should be banned from earth."
        -- SCSH 0.5.1 unix.c

Re: Any "guide to indexes" exists?

From
Jim Nasby
Date:
On May 7, 2007, at 11:01 AM, Karsten Hilbert wrote:
> On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote:
>> GiST can also be useful if you have to query in multiple dimensions,
>> which can occur outside the normal case of geometry. Best example I
>> know of is a table containing duration information in the form of
>> start_time and end_time. Trying to query for what events happened on
>> 5/28/2005 will generally be much cheaper with a GiST index than a b-
>> tree.
> Are you referring to queries with
>
>     ... where some_timestamp between some_start and some_end ...
>
> or
>
>     ... where some_timestamp > some_start and some_timestamp <
> some_end ...

Yes. :)

Say the table spans 10 years of records. You're looking for a
specific set of records that affect a day 5 years ago. Your where
clause will look something like this:

WHERE start_time < now() - 5 years AND end_time > now() - 5 years 1 day

B-tree indexes will be nearly useless in this case, because each one
is going to match on over half the table. But there are operators
that would let you treat this as a 2 dimensional problem and then use
GiST (or r-tree, but that's no longer recommended). For example, if
you create an index on:

box(point(start_time, start_time), point(end_time, end_time))

Then you just need to query for all rows that overlap the box defined
by:

box(point(now() - 5 years 1 day, now() - 5 years 1 day), point(now()
- 5 years, now() - 5 years))

You'd have to double-check the exact logic and see which overlaps
operator you want, but the basic idea is the same: you're converting
2 one-dimensional timelines (start_time and end_time) into a two-
dimensional timeline, and then using geometric functions to index that.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)