> From: Tom Lane <tgl@sss.pgh.pa.us>
> Sent: 25 June 2020 17:43
>
> Alastair McKinley <a.mckinley@analyticsengines.com> writes:
> > I know that Cube in its current form isn't suitable for nearest-neighbour
> > searching these vectors in their raw form (I have tried recompilation with
> > higher CUBE_MAX_DIM myself), but conceptually kNN GiST searches using Cubes
> > can be useful for these applications. There are other pre-processing
> > techniques that can be used to improve the speed of the search, but it
> > still ends up with a kNN search in a high-ish dimensional space.
>
> Is there a way to fix the numerical instability involved? If we could do
> that, then we'd definitely have a use-case justifying the work to make
> cube toastable.
I am not that familiar with the nature of the numerical instability, but it might be worth noting for additional
context that for the NN use case:
- The value of each dimension is likely to be between 0 and 1
- The L1 distance is meaningful for high numbers of dimensions, and *possibly* suffers less from the numeric issues
than Euclidean distance.
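
To pin down what I mean by the two metrics, here is a rough Python sketch. It is only an illustration: the function names, the 512-dimension figure and the random test vectors are my own assumptions, and it says nothing about how cube computes distances internally or whether L1 really is more stable inside GiST.

```python
import math
import random


def l1_distance(a, b):
    """Taxicab (L1) distance: sum of absolute per-dimension differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Euclidean (L2) distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical test data: two vectors with every component in [0, 1),
# matching the value range described above. 512 is an arbitrary choice.
random.seed(0)
dims = 512
a = [random.random() for _ in range(dims)]
b = [random.random() for _ in range(dims)]

print("L1:       ", l1_distance(a, b))
print("Euclidean:", euclidean_distance(a, b))
```

L1 needs no squaring and no square root, which is the only reason I suspect it might behave better numerically. In cube, these two metrics are what the `<#>` (taxicab) and `<->` (Euclidean) operators compute.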
Numerical stability isn't the only issue for high-dimensional kNN: GiST search performance currently degrades
with increasing N towards sequential scan performance, although maybe the two are related?
> regards, tom lane
Best regards,
Alastair