Re: high-dimensional knn-GIST tests (was Re: Cube extension kNN support) - Mailing list pgsql-hackers

From Marcin Mańk
Subject Re: high-dimensional knn-GIST tests (was Re: Cube extension kNN support)
Msg-id CAK61fk4gh8qRc_0+yig4VnjCPpizUt-dq=dguxUVQ-D=Ztx_Ng@mail.gmail.com
In response to high-dimensional knn-GIST tests (was Re: Cube extension kNN support)  (Gordon Mohr <gojomo-pgsql@xavvy.com>)
List pgsql-hackers



On Thu, Oct 24, 2013 at 3:50 AM, Gordon Mohr <gojomo-pgsql@xavvy.com> wrote:
> On 9/22/13 4:38 PM, Stas Kelvich wrote:
>> Hello, hackers.
>>
>> Here is the patch that introduces kNN search for cubes with
>> Euclidean, taxicab and Chebyshev distances.
>
> Thanks for this! I decided to give the patch a try at the bleeding edge with some high-dimensional vectors, specifically the 1.4 million 1000-dimensional Freebase entity vectors from the Google 'word2vec' project:

I believe the curse of dimensionality is affecting you here. I think it is impossible to get an improvement over a sequential scan for 1000-dimensional vectors. Read here:
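
To make the effect concrete, here is a minimal sketch (not taken from the patch or from the word2vec data) that measures how the gap between the nearest and the farthest neighbour shrinks as the dimension grows. It assumes uniformly random points in the unit cube, which real embedding vectors are not, and the function name relative_contrast is made up for this illustration. The point is only that when all distances are nearly equal, an index has little basis for skipping subtrees, so a kNN search is likely to visit most of the data anyway.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for the Euclidean distances
    from one random query to n_points uniform points in [0, 1]^dim.

    Hypothetical helper for illustration only; uniform data is an assumption.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # math.dist needs Python 3.8+
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # A large value means the nearest neighbour is clearly closer than the
    # rest; a value near zero means all points are about equally far away.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

On data like this the contrast should be large in 2 dimensions and small in 1000. Real vectors often have lower intrinsic dimensionality than their nominal one, so it would be worth running the same measurement on a sample of the actual table before drawing conclusions.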


Regards
Marcin Mańk
