Re: qsort again (was Re: [PERFORM] Strange Create - Mailing list pgsql-hackers

From Scott Lamb
Subject Re: qsort again (was Re: [PERFORM] Strange Create
Date
Msg-id 87FA36B4-CD97-4CF5-B9B3-02FEEA25CC95@slamb.org
Whole thread Raw
In response to Re: qsort again (was Re: [PERFORM] Strange Create  (Ron <rjpeace@earthlink.net>)
Responses Re: qsort again (was Re: [PERFORM] Strange Create  (Ron <rjpeace@earthlink.net>)
List pgsql-hackers
On Feb 16, 2006, at 8:32 AM, Ron wrote:
> Let's pretend that we have the typical DB table where rows are
> ~2-4KB apiece.  1TB of storage will let us have 256M-512M rows in
> such a table.
>
> A 32b hash code can be assigned to each row value such that only
> exactly equal rows will have the same hash code.
> A 32b pointer can locate any of the 256M-512M rows.
>
> Now instead of sorting 1TB of data we can sort 2^28 to 2^29 32b
> +32b= 64b*(2^28 to 2^29)=  2-4GB of pointers+keys followed by an
> optional pass to rearrange the actual rows if we so wish.

I don't understand this.

This is a true statement: (H(x) != H(y)) => (x != y)
This is not: (H(x) < H(y)) => (x < y)

Hash keys can tell you there's an inequality, but they can't tell you
how the values compare. If you want 32-bit keys that compare in the
same order as the original values, here's how you have to get them:

(1) sort the values into an array
(2) use each value's array index as its key

It reduces to the problem you're trying to use it to solve.


--
Scott Lamb <http://www.slamb.org/>



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: qsort again (was Re: [PERFORM] Strange Create Index
Next
From: "Dann Corbit"
Date:
Subject: Re: qsort again (was Re: [PERFORM] Strange Create Index