Re: scoring differences between bitmasks - Mailing list pgsql-general

From Todd A. Cook
Subject Re: scoring differences between bitmasks
Msg-id 43401FF0.9020907@blackducksoftware.com
In response to Re: scoring differences between bitmasks  (Ben <bench@silentmedia.com>)
List pgsql-general
Hi,

Try breaking the vector into four bigint columns (64 bits each) and building
a multi-column index, with the index columns ordered from the most evenly
distributed to the least.  Depending on the distribution of your data, you
may need only two or three columns in the index.  If you can CLUSTER the
table on that index, lookups should be really fast.  (This structure is a
tabular form of a linked trie.)
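
A minimal sketch of the splitting step, assuming the 256-bit vector is held
client-side as a non-negative Python integer.  The function names are
hypothetical, not part of any library.  Because bigint is a signed 64-bit
type, each chunk is reinterpreted as two's complement so it fits the column
range; this preserves equality but not unsigned sort order.

```python
MASK64 = (1 << 64) - 1  # low 64 bits


def to_signed64(u):
    """Reinterpret an unsigned 64-bit value as signed (two's complement)."""
    return u - (1 << 64) if u >= (1 << 63) else u


def split_vector(v):
    """Split a 256-bit integer into four signed 64-bit chunks,
    most significant chunk first."""
    if not 0 <= v < (1 << 256):
        raise ValueError("expected an unsigned 256-bit integer")
    return tuple(to_signed64((v >> shift) & MASK64)
                 for shift in (192, 128, 64, 0))


def join_chunks(chunks):
    """Inverse of split_vector: rebuild the 256-bit integer."""
    v = 0
    for c in chunks:
        v = (v << 64) | (c & MASK64)  # masking undoes the signed reinterpretation
    return v
```

Each of the four values would go into its own bigint column; which column
leads the index depends on how evenly its values are distributed in your
data, as described above.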

-- todd


Ben wrote:
> Yes, that's the straightforward way to do it. But given that my vectors
> are 256 bits in length, and that I'm going to eventually have about 4
> million of them to search through, I was hoping greater minds than mine
> had figured out how to do it faster, or how to compute some kind of
> indexing... somehow.
