Thread: 1-/2-dimensional indexes for common columns, rationale?
Hello, we have several indexes such as: create index foo1 on bla (a); create index foo2 on bla (b); create index foo3 on bla (a,b); They are all used often by frequently used queries (according to pg_statio_user_indexes), but we need somewhat higher INSERT/UPDATE performance (having tuned most other things) so we'd like to remove some. Which of the above would generally speaking be most redundant / best to remove? Is a 2-dimensional index always much slower than a 1-dimensional with the first column for queries on the first column? Any other suggestions? Thanks, Marinos
am Wed, dem 26.03.2008, um 15:18:53 +0100 mailte Marinos Yannikos folgendes: > Hello, > > we have several indexes such as: > > create index foo1 on bla (a); > create index foo2 on bla (b); > create index foo3 on bla (a,b); > > They are all used often by frequently used queries (according to > pg_statio_user_indexes), but we need somewhat higher INSERT/UPDATE > performance (having tuned most other things) so we'd like to remove some. Which version do you have? Since 8.1 pg can use a so called 'bitmap index scan', because of this feature i guess you don't need the index foo3. (if you have 8.1 or higher) Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
On Wed, 26 Mar 2008, A. Kretschmer wrote: >> create index foo1 on bla (a); >> create index foo2 on bla (b); >> create index foo3 on bla (a,b); > > Which version do you have? Since 8.1 pg can use a so called 'bitmap > index scan', because of this feature i guess you don't need the index > foo3. (if you have 8.1 or higher) Depending on your query, the bitmap index scan could be a good deal slower than index foo3. All of this depends on what queries you are going to be running, and how much you value insert performance compared to select performance. I know that foo3 can do everything that foo1 can, so foo1 could be viewed as redundant. I'd be interested in hearing from the Powers That Be whether foo2 is redundant too. It wasn't a while back. My impression is that foo3 isn't much more expensive to alter than foo1 - is that correct? Matthew -- Lord grant me patience, and I want it NOW!
A. Kretschmer schrieb: >> create index foo1 on bla (a); >> create index foo2 on bla (b); >> create index foo3 on bla (a,b); >>[...] > > Which version do you have? Since 8.1 pg can use a so called 'bitmap > index scan', because of this feature i guess you don't need the index > foo3. (if you have 8.1 or higher) 8.3.1 - foo3 is being used though in presence of both foo1 and foo2, so I'd suppose that it's a better choice even with bitmap index scan available... -mjy
am Wed, dem 26.03.2008, um 16:15:20 +0100 mailte Marinos Yannikos folgendes: > A. Kretschmer schrieb: > >>create index foo1 on bla (a); > >>create index foo2 on bla (b); > >>create index foo3 on bla (a,b); > >>[...] > > > >Which version do you have? Since 8.1 pg can use a so called 'bitmap > >index scan', because of this feature i guess you don't need the index > >foo3. (if you have 8.1 or higher) > > 8.3.1 - foo3 is being used though in presence of both foo1 and foo2, so > I'd suppose that it's a better choice even with bitmap index scan > available... Maybe... Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
Marinos Yannikos wrote: >> >> Which version do you have? Since 8.1 pg can use a so called 'bitmap >> index scan', because of this feature i guess you don't need the index >> foo3. (if you have 8.1 or higher) > > 8.3.1 - foo3 is being used though in presence of both foo1 and foo2, > so I'd suppose that it's a better choice even with bitmap index scan > available... > PostgreSQL can also partially use a multi-column index. For example, if you dropped your index on (a) Pg could use index (a,b) to help with queries for `a'. However, the index would be slower than an index on a alone would be. See: http://www.postgresql.org/docs/8.3/interactive/indexes-multicolumn.html As usual, the best answer is really "do some testing with your queries, and with EXPLAIN ANALYZE, and see what works best". Test with inserts too, because it's likely that the cost of updating each of the three indexes isn't equal. It might also be worth looking into using partial indexes if some of your data is "hotter" than others and perhaps more worth the index update cost. -- Craig Ringer
>>> create index foo1 on bla (a); >>> create index foo2 on bla (b); >>> create index foo3 on bla (a,b); You say you need faster INSERT performance. Getting rid of some indexes is a way, but can you tell a bit more about your hardware setup ? For instance, if you only have one HDD, put an extra HDD in the machine, and put the database on it, but leave the pg_xlog on the OS's disk. Or the reverse, depending on which disk is faster, and other factors. Since heavy INSERTs mean heavy log writing traffic, this almost doubles your write bandwidth for the cost of a disk. Cheap and efficient. You can also put the indexes on a third disk, but separating database and log on 2 disks will give you the most benefits. If you already have a monster hardware setup, though...