Thread: Indexing columns with low cardinality: persistent bitmap indexes?
Hello, I have a column with a small number of distinct values, indexing this one with a standard BTree is useless. How do I can index this column efficiently? I searched and it seems that pg doesn't support the creation of persistent bitmap indexes... Is that feature planned in next releases of pg? Thanks Bruno Lavoie
Bruno Lavoie escribió: > Hello, > > I have a column with a small number of distinct values, indexing this > one with a standard BTree is useless. How do I can index this column > efficiently? I searched and it seems that pg doesn't support the > creation of persistent bitmap indexes... It doesn't. > Is that feature planned in next releases of pg? There are some efforts to get it done, but don't hold your breath (it won't be in 8.4 either, as it has major problems currently.) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote: > Hello, > > I have a column with a small number of distinct values, indexing this > one with a standard BTree is useless. How do I can index this column > efficiently? I searched and it seems that pg doesn't support the > creation of persistent bitmap indexes... Is that feature planned in next > releases of pg? > > Thanks > Bruno Lavoie I would try partial indexes, as many as the distinct values. I'm not sure this would help, though. -- Fahrbahn ist ein graues Band weisse Streifen, grüner Rand
On Mon, Jan 12, 2009 at 4:16 PM, Reg Me Please <regmeplease@gmail.com> wrote: > On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote: >> Hello, >> >> I have a column with a small number of distinct values, indexing this >> one with a standard BTree is useless. How do I can index this column >> efficiently? I searched and it seems that pg doesn't support the >> creation of persistent bitmap indexes... Is that feature planned in next >> releases of pg? >> >> Thanks >> Bruno Lavoie > > I would try partial indexes, as many as the distinct values. > I'm not sure this would help, though. > you should create partial indexes only on those values that are a lower fraction on the table ie: if you have value "fraction of the table that has this value" 1 5% 2 3% 3 20% 4 25% 5 47% then only partial indexes on values 1 and 2 are of some value -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
On Mon, Jan 12, 2009 at 1:38 PM, Bruno Lavoie <bruno.lavoie@gmail.com> wrote: > Hello, > > I have a column with a small number of distinct values, indexing this one > with a standard BTree is useless. How do I can index this column > efficiently? I searched and it seems that pg doesn't support the creation of > persistent bitmap indexes... Is that feature planned in next releases of pg? You have a few options based on your access patterns. If you tend to access just one of these an get them all at once, then either clusting on this value, or partitioning your table will help. If you access your data using these values and other column values at the same time, then partial or multi-column indexes might help.
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Alvaro Herrera > Sent: Monday, January 12, 2009 12:41 PM > To: Bruno Lavoie > Cc: PostgreSQL > Subject: Re: [GENERAL] Indexing columns with low cardinality: > persistentbitmap indexes? > > Bruno Lavoie escribió: > > Hello, > > > > I have a column with a small number of distinct values, indexing this > > one with a standard BTree is useless. How do I can index this column > > efficiently? I searched and it seems that pg doesn't support the > > creation of persistent bitmap indexes... > > It doesn't. > > > Is that feature planned in next releases of pg? > > There are some efforts to get it done, but don't hold your breath (it > won't be in 8.4 either, as it has major problems currently.) Here is an interesting experiment: Application of Bitmap Index to Information Retrieval. K. Fujioka, Y. Uematsu, and M. Onizuka. WWW 2008 Source: [ACM] Synopsis: This paper proposes a hierarchical structure called HS-bitmap index to represent document-term matrix. The authorsimplemented their data structure on PostgreSQL and observed it to perform better than an inverted index. A short-comingmight be that HS-bitmap index takes more space than the inverted index even after compression. Note this work makes use of PostgreSQL but is unrelated to the on-going work of implementing bitmap index in PostgreSQL. http://portal.acm.org/citation.cfm?doid=1367497.1367680 Here is the research page where I found the above: http://www-users.cs.umn.edu/~kewu/annotated.html
Jaime-
Porque no utiliza Bitmap?
*Saludos Cordiales desde EEUU*
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Mon, 12 Jan 2009 16:20:40 -0500
> From: jcasanov@systemguards.com.ec
> To: regmeplease@gmail.com
> Subject: Re: [GENERAL] Indexing columns with low cardinality: persistent bitmap indexes?
> CC: pgsql-general@postgresql.org; bruno.lavoie@gmail.com
>
> On Mon, Jan 12, 2009 at 4:16 PM, Reg Me Please <regmeplease@gmail.com> wrote:
> > On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote:
> >> Hello,
> >>
> >> I have a column with a small number of distinct values, indexing this
> >> one with a standard BTree is useless. How do I can index this column
> >> efficiently? I searched and it seems that pg doesn't support the
> >> creation of persistent bitmap indexes... Is that feature planned in next
> >> releases of pg?
> >>
> >> Thanks
> >> Bruno Lavoie
> >
> > I would try partial indexes, as many as the distinct values.
> > I'm not sure this would help, though.
> >
>
> you should create partial indexes only on those values that are a
> lower fraction on the table
> ie: if you have
>
> value "fraction of the table that has this value"
> 1 5%
> 2 3%
> 3 20%
> 4 25%
> 5 47%
>
> then only partial indexes on values 1 and 2 are of some value
>
> --
> Atentamente,
> Jaime Casanova
> Soporte y capacitación de PostgreSQL
> Asesoría y desarrollo de sistemas
> Guayaquil - Ecuador
> Cel. +59387171157
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live™: Keep your life in sync. Check it out.
Porque no utiliza Bitmap?
*Saludos Cordiales desde EEUU*
Martin
______________________________________________
Disclaimer and confidentiality note
Everything in this e-mail and any attachments relates to the official business of Sender. This transmission is of a confidential nature and Sender does not endorse distribution to any party other than intended recipient. Sender does not necessarily endorse content contained within this transmission.
> Date: Mon, 12 Jan 2009 16:20:40 -0500
> From: jcasanov@systemguards.com.ec
> To: regmeplease@gmail.com
> Subject: Re: [GENERAL] Indexing columns with low cardinality: persistent bitmap indexes?
> CC: pgsql-general@postgresql.org; bruno.lavoie@gmail.com
>
> On Mon, Jan 12, 2009 at 4:16 PM, Reg Me Please <regmeplease@gmail.com> wrote:
> > On Monday 12 January 2009 21:38:02 Bruno Lavoie wrote:
> >> Hello,
> >>
> >> I have a column with a small number of distinct values, indexing this
> >> one with a standard BTree is useless. How do I can index this column
> >> efficiently? I searched and it seems that pg doesn't support the
> >> creation of persistent bitmap indexes... Is that feature planned in next
> >> releases of pg?
> >>
> >> Thanks
> >> Bruno Lavoie
> >
> > I would try partial indexes, as many as the distinct values.
> > I'm not sure this would help, though.
> >
>
> you should create partial indexes only on those values that are a
> lower fraction on the table
> ie: if you have
>
> value "fraction of the table that has this value"
> 1 5%
> 2 3%
> 3 20%
> 4 25%
> 5 47%
>
> then only partial indexes on values 1 and 2 are of some value
>
> --
> Atentamente,
> Jaime Casanova
> Soporte y capacitación de PostgreSQL
> Asesoría y desarrollo de sistemas
> Guayaquil - Ecuador
> Cel. +59387171157
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Windows Live™: Keep your life in sync. Check it out.
> > Hello, > > > > I have a column with a small number of distinct values, indexing this one > > with a standard BTree is useless. How do I can index this column > > efficiently? I searched and it seems that pg doesn't support the creation of > > persistent bitmap indexes... Is that feature planned in next releases of pg? > You have a few options based on your access patterns. If you tend to > access just one of these an get them all at once, then either clusting > on this value, or partitioning your table will help. How will clustering benefit this pattern? Won't a full table scan be required regardless of the table being clustered? And I thought the point of clustering was the organize the table by some indexed key, requiring fewer seeks and increasing the likelihood of the pages being in the cache .... if the index is never used in this case ( low cardinality ) would it still help? > If you access your data using these values and other column values at > the same time, then partial or multi-column indexes might help.