Re: qsort again (was Re: [PERFORM] Strange Create - Mailing list pgsql-hackers

From Ron
Subject Re: qsort again (was Re: [PERFORM] Strange Create
Date
Msg-id 7.0.1.0.2.20060217004822.039d46a8@earthlink.net
Whole thread Raw
In response to Re: qsort again (was Re: [PERFORM] Strange Create  (Ron <rjpeace@earthlink.net>)
Responses Re: qsort again (was Re: [PERFORM] Strange Create  (Markus Schaber <schabi@logix-tt.com>)
List pgsql-hackers
At 01:47 PM 2/16/2006, Ron wrote:
>At 12:19 PM 2/16/2006, Scott Lamb wrote:
>>On Feb 16, 2006, at 8:32 AM, Ron wrote:
>>>Let's pretend that we have the typical DB table where rows are
>>>~2-4KB apiece.  1TB of storage will let us have 256M-512M rows in
>>>such a table.
>>>
>>>A 32b hash code can be assigned to each row value such that only
>>>exactly equal rows will have the same hash code.
>>>A 32b pointer can locate any of the 256M-512M rows.
>>>
>>>Now instead of sorting 1TB of data we can sort 2^28 to 2^29 32b
>>>+32b= 64b*(2^28 to 2^29)=  2-4GB of pointers+keys followed by an
>>>optional pass to rearrange the actual rows if we so wish.
>>
>>I don't understand this.
>>
>>This is a true statement: (H(x) != H(y)) => (x != y)
>>This is not: (H(x) < H(y)) => (x < y)
>>
>>Hash keys can tell you there's an inequality, but they can't tell you
>>how the values compare. If you want 32-bit keys that compare in the
>>same order as the original values, here's how you have to get them:
>For most hash codes, you are correct.  There is a class of hash or
>hash-like codes that maintains the mapping to support that second statement.
>
>More later when I can get more time.
>Ron

OK, so here's _a_ way (there are others) to obtain a mapping such that
  if a < b then f(a) < f (b) and
  if a == b then f(a) == f(b)

Pretend each row is a integer of row size (so a 2KB row becomes a
16Kb integer; a 4KB row becomes a 32Kb integer; etc)
Since even a 1TB table made of such rows can only have 256M - 512M
possible values even if each row is unique, a 28b or 29b key is large
enough to represent each row's value and relative rank compared to
all of the others even if all row values are unique.

By scanning the table once, we can map say 0000001h (Hex used to ease
typing) to the row with the minimum value and 1111111h to the row
with the maximum value as well as mapping everything in between to
their appropriate keys.  That same scan can be used to assign a
pointer to each record's location.

We can now sort the key+pointer pairs instead of the actual data and
use an optional final pass to rearrange the actual rows if we wish.

That initial scan to set up the keys is expensive, but if we wish
that cost can be amortized over the life of the table so we don't
have to pay it all at once.  In addition, once we have created those
keys, then can be saved for later searches and sorts.

Further space savings can be obtained whenever there are duplicate
keys and/or when compression methods are used on the Key+pointer pairs.

Ron







pgsql-hackers by date:

Previous
From: Satoshi Nagayasu
Date:
Subject: Re: In Japan with Josh Berkus
Next
From: "Luke Lonergan"
Date:
Subject: Re: In Japan with Josh Berkus