Re: Better index stategy for many fields with few values - Mailing list pgsql-performance

From Markus Schaber
Subject Re: Better index stategy for many fields with few values
Date
Msg-id 443CF9B4.2080009@logix-tt.com
Whole thread Raw
In response to Better index stategy for many fields with few values  (Oscar Picasso <oscgoogle@yahoo.com>)
Responses Re: Better index stategy for many fields with few values
List pgsql-performance
Hi, Oscar,

Oscar Picasso wrote:

> [ all the 20 paramXX columns are used in the query}

> How can I optimize this kind of query?

PostgreSQL 8.1 has so-called bitmap index scans, which can combine
several index scans before actually accessing the data.

So I think it's best to create an index on each of the paramXX columns,
and see with EXPLAIN ANALYZE what it is doing.

> I was thinking about using a multicolumns index, but I have read that
> we should limit multicolumns indice to at most 2 or 3 columns.

Yes, that's true, the index overhead gets too high.

> If that's true then 22 columns for a multicolumn incdex seems way too
> much. Or maybe it is workable as every column uses only a very limited
> set of values?

Yes, I think that a 22 column index is way too much, especially with the
new bitmap index scans available.

> I was also thinking about about using a functional index.

If there's a logical relation between those values that they can easily
combined, that may be a good alternative.


I just had another weird idea:

As your paramXX values can have only 10 parameters, it also might be
feasible to use a bunch of 10 conditional indices, like:

CREATE INDEX foo1 ON table (param1, param2 WHERE param0='1st value';
CREATE INDEX foo2 ON table (param1, param2 WHERE param0='2nd value';
CREATE INDEX foo3 ON table (param1, param2 WHERE param0='3rd value';
[...]

This way, you don't have the index bloat of a 3-column index, but 10
2-column indices that cover 1/10th of the table each.

For 22 columns, you'd need a bunch of seven such indices plus a
single-column one, or can use some 3+1 and some 2+1 column index.

I'd like to see the query plans from explain analyze.

Btw, I expect query planning time to get rather significant for so much
columns, so gequo tuning, tuning work_mem (for the bitmap scans) and
prepared statements will pay off.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

pgsql-performance by date:

Previous
From: PFC
Date:
Subject: Re: FOREIGN KEYS vs PERFORMANCE
Next
From: "Harry Hehl"
Date:
Subject: Re: Sequencial scan instead of using index