Re: index structure for 114-dimension vector - Mailing list pgsql-performance

From	Andrew Lazarus
Subject	Re: index structure for 114-dimension vector
Date	April 20, 2007 21:05:33
Msg-id	3810566934.20070420172820@pillette.com
In response to	Re: index structure for 114-dimension vector (Mark Kirkwood <markir@paradise.net.nz>)
Responses	Re: index structure for 114-dimension vector Re: index structure for 114-dimension vector
List	pgsql-performance

Because I know the 25 closest are going to be fairly close in each
coordinate, I did try a multicolumn index on the last 6 columns and
used a +/- 0.1 or 0.2 tolerance on each. (Given the distribution of the
data in question, the 25 best are very probably inside that hypercube.)

This hypercube tended to have 10-20K records, and took at least 4
seconds to retrieve. I was a little surprised by how long that took.
So I'm wondering if my data representation is off the wall.
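
For readers following along, here is a minimal sketch of the approach described above: keep only the rows that fall inside the tolerance box on a few chosen coordinates, then rank those survivors by full Euclidean distance. It is plain Python for illustration, not the SQL actually run; the names `nearest_in_box`, `rows`, `box_dims` and `tol` are hypothetical, and the sample data is made up and uses 3 dimensions instead of 114.

```python
import heapq
import math


def nearest_in_box(rows, query, box_dims, tol, k=25):
    """Return the k rows closest to `query` among those inside the box.

    rows     -- iterable of equal-length numeric tuples (hypothetical data)
    query    -- the vector to search around
    box_dims -- indices of the coordinates used for the box prefilter
    tol      -- half-width of the box on each of those coordinates
    """
    def in_box(row):
        # Cheap test: every chosen coordinate is within +/- tol of the query.
        return all(abs(row[j] - query[j]) <= tol for j in box_dims)

    def dist(row):
        # Full Euclidean distance over all coordinates.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, query)))

    candidates = (row for row in rows if in_box(row))
    return heapq.nsmallest(k, candidates, key=dist)


# Made-up 3-dimensional example; the box is applied to the last coordinate only.
rows = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (1.0, 1.0, 1.0),
        (0.0, 0.0, 0.15), (0.0, 0.0, 0.5)]
closest = nearest_in_box(rows, (0.0, 0.0, 0.0), box_dims=[2], tol=0.2, k=2)
```

The prefilter only gives the true nearest neighbours if all of them lie inside the box. A point within Euclidean distance r of the query is within r on every single coordinate, so a tolerance at least as large as the distance to the 25th neighbour is always safe; a smaller tolerance is a bet on the data distribution.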

I should mention I also tried a cube index using gist on all 114
elements, but CREATE INDEX hadn't finished after 36 hours, when I killed
it. In retrospect, I'm not sure an index that took something like
6GB by itself would have been helpful on a box with 2GB of RAM.

MK> I don't think that will work for the vector norm i.e:

MK> |x - y| = sqrt(sum over j ((x[j] - y[j])^2))


MK> Cheers

MK> Mark


--
Sincerely,
 Andrew Lazarus        mailto:andrew@pillette.com
