Re: Speed Question - Mailing list pgsql-performance

From Manfred Koizar
Subject Re: Speed Question
Date
Msg-id flg90v07p4gs6mb03fd2vhpm5t4enkbn6a@4ax.com
In response to Re: Speed Question  (Noah Silverman <noah@allresearch.com>)
List pgsql-performance
On Sat, 21 Dec 2002 13:46:05 -0500, Noah Silverman
<noah@allresearch.com> wrote:
>Without divulging too many company
>secrets, we create a 32 key profile of an object.  We then have to be
>able to search the database to  find "similar" objects.

... where "similar" means that the value of each attribute lies within
a small range around the value of the corresponding attribute of the
reference object?
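
To make that reading of "similar" concrete, here is a minimal Python
sketch.  It assumes numeric attributes and one tolerance shared by all
of them; the name is_similar and the tolerance parameter are
hypothetical, not anything taken from Noah's application.

```python
def is_similar(reference, candidate, tolerance):
    """Return True if every attribute of candidate lies within
    +/- tolerance of the corresponding attribute of reference.

    Assumption: both profiles are equal-length sequences of numbers.
    """
    return all(abs(r - c) <= tolerance for r, c in zip(reference, candidate))
```

In SQL terms this corresponds to one BETWEEN condition per attribute,
all of them ANDed together.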

I fear a multicolumn b-tree index is not the optimal solution to this
problem, unless you have some extremely selective attributes you can
put at the start of the index.  Even then, I doubt that it makes
sense to include the last attribute (or the last few attributes)
in the index.

>In reality, we
>will probably have 20MM to 30MM rows in our table.  I need to very
>quickly find the matching records on a "test" object.

This seems to be a nice case for utilizing bitmaps for index scans.
Thus you would scan several single column indices and combine the
bitmaps before accessing the heap tuples.  This has been discussed on
-hackers and I believe it is a todo item.
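
The idea can be sketched in a few lines of Python.  This is only an
illustration of the technique, not how PostgreSQL does or would
implement it: the sorted lists stand in for single-column b-tree
indices, a Python int stands in for the bitmap, and every name
(build_index, range_bitmap, find_similar, tolerance) is hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_index(rows, col):
    """Sorted (value, row_id) pairs standing in for a single-column index."""
    return sorted((row[col], row_id) for row_id, row in enumerate(rows))


def range_bitmap(index, low, high):
    """Return an int whose bit row_id is set for each low <= value <= high."""
    start = bisect_left(index, (low, -1))
    end = bisect_right(index, (high, float("inf")))
    bitmap = 0
    for _, row_id in index[start:end]:
        bitmap |= 1 << row_id
    return bitmap


def find_similar(rows, indexes, reference, tolerance):
    """Scan one index per attribute, AND the bitmaps, then list matching rows."""
    combined = (1 << len(rows)) - 1  # start with every row as a candidate
    for col, index in enumerate(indexes):
        low = reference[col] - tolerance
        high = reference[col] + tolerance
        combined &= range_bitmap(index, low, high)
        if combined == 0:
            break  # no row can match any more
    # Only the rows that survive every index scan are fetched ("heap access").
    return [row_id for row_id in range(len(rows)) if (combined >> row_id) & 1]


rows = [(10, 20, 30), (11, 19, 30), (11, 25, 30), (50, 20, 30)]
indexes = [build_index(rows, col) for col in range(3)]
matches = find_similar(rows, indexes, (10, 20, 30), 1)
```

The point is that each index scan only has to deal with one attribute,
and the table itself is touched only for rows whose bit is set in all
of the bitmaps.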

I don't know whether GiST or R-Tree could help.  Is anybody listening
who knows?

>If you're really curious as to more details, let me know (I don't want
>to bore the group with our specifics)

The group is patient :-)

Servus
 Manfred
