Thread: Any "guide to indexes" exists?
Hello, I would need more info about index types in postgre (btree, hash, gin and gist) - is there any guide that explains in detail when to use which index type? These index types have different performance with certain collumn types and data characteristics store in them. There's not much info about it pg docs. So, does any document describing detailed index usage or do you have any personal recomendations when to use which index? Thanks for you answers. Regards, Jan
Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek: > I would need more info about index types in postgre (btree, hash, gin and > gist) - is there any guide that explains in detail when to use which index > type? You use btree unless you have a specific, particular reason to use one of the other ones. -- Peter Eisentraut http://developer.postgresql.org/~petere/
I was asking for these "specific/particular reasons". Im not the database developer, to be expert on indexes (i know whats btree and hash - gin and gist are rather mysterious for me). Ok - btree is fine, but sometimes could be better to use gist - my question is: when is that "sometimes"? Unless i know how to use the indexes, then they are useless for me - am i right? Note: We are using gin with tsearch2 vectors, but here we had no choice - tsearch2 works only with gin and gist. Regards, Jan > Am Montag, 7. Mai 2007 16:09 schrieb Jan Bilek: >> I would need more info about index types in postgre (btree, hash, gin and >> gist) - is there any guide that explains in detail when to use which >> index >> type? > > You use btree unless you have a specific, particular reason to use one of > the > other ones. > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/
> Hello, > I would need more info about index types in postgre (btree, hash, gin and > gist) - is there any guide that explains in detail when to use which index > type? These index types have different performance with certain collumn > types and data characteristics store in them. There's not much info about it > pg docs. So, does any document describing detailed index usage or do you > have any personal recomendations when to use which index? http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN-performance-postgresql.org-websearch-part-2.html http://people.planetpostgresql.org/xzilla/index.php?/archives/278-PostgreSQL-full-text-search-testing.html Summary: * fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but slow update) * others .. btree index (I don't know anybody who use hash index) Regards Pavel Stehule
On May 7, 2007, at 9:44 AM, Pavel Stehule wrote: >> Hello, >> I would need more info about index types in postgre (btree, hash, >> gin and >> gist) - is there any guide that explains in detail when to use >> which index >> type? These index types have different performance with certain >> collumn >> types and data characteristics store in them. There's not much >> info about it >> pg docs. So, does any document describing detailed index usage or >> do you >> have any personal recomendations when to use which index? > > http://people.planetpostgresql.org/mha/index.php?/archives/112-GIN- > performance-postgresql.org-websearch-part-2.html > http://people.planetpostgresql.org/xzilla/index.php?/archives/278- > PostgreSQL-full-text-search-testing.html > > > Summary: > * fulltext, GIS, arrays .. gin, gist indexes (gin is faster, but > slow update) > * others .. btree index (I don't know anybody who use hash index) GiST can also be useful if you have to query in multiple dimensions, which can occur outside the normal case of geometry. Best example I know of is a table containing duration information in the form of start_time and end_time. Trying to query for what events happened on 5/28/2005 will generally be much cheaper with a GiST index than a b- tree. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote: > GiST can also be useful if you have to query in multiple dimensions, > which can occur outside the normal case of geometry. Best example I > know of is a table containing duration information in the form of > start_time and end_time. Trying to query for what events happened on > 5/28/2005 will generally be much cheaper with a GiST index than a b- > tree. Are you referring to queries with ... where some_timestamp between some_start and some_end ... or ... where some_timestamp > some_start and some_timestamp < some_end ... ? Thanks, Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
"Jan Bilek" <bilekj@gmail.com> writes: > Ok - btree is fine, but sometimes could be better to use gist - my question > is: when is that "sometimes"? Unless i know how to use the indexes, then > they are useless for me - am i right? gist and gin are for indexing queries that btree is not capable of dealing with, ie, the WHERE clauses are not simple scalar equality or range checks. All you need to pay attention to is whether the index has an operator class that includes the WHERE operator you want to use. regards, tom lane
bilekj@gmail.com ("Jan Bilek") writes: > I was asking for these "specific/particular reasons". Im not the > database developer, to be expert on indexes (i know whats btree and > hash - gin and gist are rather mysterious for me). > Ok - btree is fine, but sometimes could be better to use gist - my > question is: when is that "sometimes"? Unless i know how to use the > indexes, then they are useless for me - am i right? > > Note: We are using gin with tsearch2 vectors, but here we had no > choice - > tsearch2 works only with gin and gist. >> You use btree unless you have a specific, particular reason to use >> one of the >> other ones. Peter's comments are pretty germane. You use btree if you can. btree is fine for anything where values are reasonable "scalar," and compare to one another either in a straightforward scalar fashion, or piece-wise scalar, as with compound keys where you start by comparing the first component, then, if it's equal, considering the second, and so on. You use GIST if you *need to*, if you have a data structure where values *aren't* scalar, *can't* be compared (e.g. - where you can't simply say A > B, B > C). -- "cbbrowne","@","cbbrowne.com" http://cbbrowne.com/info/wp.html "Access to a COFF symbol table via ldtbread is even less abstract, really sucks in general, and should be banned from earth." -- SCSH 0.5.1 unix.c
On May 7, 2007, at 11:01 AM, Karsten Hilbert wrote: > On Mon, May 07, 2007 at 10:47:24AM -0500, Jim Nasby wrote: >> GiST can also be useful if you have to query in multiple dimensions, >> which can occur outside the normal case of geometry. Best example I >> know of is a table containing duration information in the form of >> start_time and end_time. Trying to query for what events happened on >> 5/28/2005 will generally be much cheaper with a GiST index than a b- >> tree. > Are you referring to queries with > > ... where some_timestamp between some_start and some_end ... > > or > > ... where some_timestamp > some_start and some_timestamp < > some_end ... Yes. :) Say the table spans 10 years of records. You're looking for a specific set of records that affect a day 5 years ago. Your where clause will look something like this: WHERE start_time < now() - 5 years AND end_time > now() - 5 years 1 day B-tree indexes will be nearly useless in this case, because each one is going to match on over half the table. But there are operators that would let you treat this as a 2 dimensional problem and then use GiST (or r-tree, but that's no longer recommended). For example, if you create an index on: box(point(start_time, start_time), point(end_time, end_time)) Then you just need to query for all rows that overlap the box defined by: box(point(now() - 5 years 1 day, now() - 5 years 1 day), point(now() - 5 years, now() - 5 years)) You'd have to double-check the exact logic and see which overlaps operator you want, but the basic idea is the same: you're converting 2 one-dimensional timelines (start_time and end_time) into a two- dimensional timeline, and then using geometric functions to index that. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)