Thread: array column and b-tree index allowing only 8191 bytes

array column and b-tree index allowing only 8191 bytes

From: Celso Pinto
Hi all,

I'm checking out some features in pgsql and found out about an array
datatype. As I'm curious to find out how well it performs, I've created
a table that contains an integer[] column and a script to insert about
500K rows into it. The length of the integer[] value in each row is
random (it can be 10, 5000, or more than that), as are the values in it.
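
A minimal sketch of a setup like the one described, assuming
hypothetical table and column names and a made-up value range (the
original script isn't shown in the thread):

    CREATE TABLE array_test (
        id   serial PRIMARY KEY,
        vals integer[]
    );

    -- Insert ~500K rows with arrays of varying length (1 to 5000
    -- elements) and random values.  Referencing the outer column g
    -- keeps the subquery correlated, so it is evaluated per row.
    INSERT INTO array_test (vals)
    SELECT ARRAY(SELECT (random() * 100000)::int
                 FROM generate_series(1, (g % 5000) + 1))
    FROM generate_series(1, 500000) AS g;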

When trying to insert a row, I get the following error:

    index row requires 9796 bytes, maximum size is 8191

If I understood correctly, this is a limit of the b-tree index. Usually
you'd want to use another type of index but, again if I understood
correctly, those are meant mostly for full-text indexing.
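
For context, the failing index was presumably a plain b-tree on the
array column itself, along these lines (using the hypothetical table
from above):

    -- Each b-tree entry must store the whole array value, and an index
    -- row cannot exceed 8191 bytes, so a sufficiently large array makes
    -- the insert (or the index build) fail with the error above.
    CREATE INDEX array_test_vals_idx ON array_test (vals);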

Remember, I'm doing this for recreational purposes (the array type is
there, I wanted to check out its performance, so why not? :-) ).

So my questions are: is this at all possible? If so, is it possible to
increase that maximum size?

Regards,
Celso


Re: array column and b-tree index allowing only 8191 bytes

From: Alvaro Herrera
Celso Pinto wrote:

> So my questions are: is this at all possible? If so, is it possible to
> increase that maximum size?

Indexing the arrays themselves is probably pretty useless.  Try indexing
the elements, which you can do with the intarray contrib module.
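
A sketch of that suggestion against the hypothetical table from
upthread (gin__int_ops is intarray's GIN operator class; on modern
versions the module is installed with CREATE EXTENSION intarray):

    -- Index the array elements rather than the whole array:
    CREATE INDEX array_test_vals_gin ON array_test
        USING gin (vals gin__int_ops);

    -- Rows whose array contains the value 42:
    SELECT id FROM array_test WHERE vals @> ARRAY[42];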

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: array column and b-tree index allowing only 8191 bytes

From: Celso Pinto
Hi Alvaro,

thanks for the hint. I've since experimented with GIN and GiST and ran
a small pgbench test with a custom script.

Recalling from my previous message, the int[] in a row can have at most
5000 values. Based on that I judged GIN to be the best option, but
inserting is really slow. The test was performed on a small EC2
instance. I raised maintenance_work_mem to 512MB, but inserting 50K
rows still takes more than an hour.
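
For concreteness, the GiST counterpart of the earlier GIN sketch might
look like this; gist__intbig_ops is intarray's lossy, signature-based
operator class intended for larger arrays (gist__int_ops is the exact
one for small arrays):

    SET maintenance_work_mem = '512MB';   -- as described above

    CREATE INDEX array_test_vals_gist ON array_test
        USING gist (vals gist__intbig_ops);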

I also tested GiST: inserts run quickly, but running pgbench with 100
clients, each making 10 selects on a random value contained in the
int[], takes the machine load up to values such as 88, which is
definitely a no-go.
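
The custom script isn't shown in the thread; it was presumably
something along these lines (file name and value range made up;
\setrandom is the 8.x-era pgbench syntax):

    $ cat select_test.sql
    \setrandom v 1 100000
    SELECT id FROM array_test WHERE vals @> ARRAY[:v];

    $ pgbench -n -c 100 -t 10 -f select_test.sql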

What, if any, would be the recommended options to improve this
scenario? Not using intarray? :-)

Cheers,
Celso

On Sat, 2008-06-07 at 12:38 -0400, Alvaro Herrera wrote:
> Celso Pinto wrote:
>
> > So my questions are: is this at all possible? If so, is it possible to
> > increase that maximum size?
>
> Indexing the arrays themselves is probably pretty useless.  Try indexing
> the elements, which you can do with the intarray contrib module.



Re: array column and b-tree index allowing only 8191 bytes

From: Alvaro Herrera
Celso Pinto wrote:

> What, if any, would be the recommended options to improve this
> scenario? Not using intarray? :-)

Not using a broken design.  Arrays are a poor fit for the relational
model.  Avoid them.
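
Spelled out, the normalized alternative keeps one row per array
element, which an ordinary b-tree index handles with no row-size
problem (table and column names hypothetical, following the earlier
sketches):

    CREATE TABLE array_test_elem (
        parent_id integer NOT NULL REFERENCES array_test (id),
        val       integer NOT NULL
    );

    CREATE INDEX array_test_elem_val_idx ON array_test_elem (val);

    -- "Which rows contain the value 42?" becomes an indexed lookup:
    SELECT DISTINCT parent_id FROM array_test_elem WHERE val = 42;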

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support