Re: Hash support for arrays - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Hash support for arrays
Date
Msg-id AANLkTimkytUv9F7=bWkFCqz=CPyYzbP3c9idJXY6zmn8@mail.gmail.com
Whole thread Raw
In response to Re: Hash support for arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Hash support for arrays
List pgsql-hackers
On Sat, Oct 30, 2010 at 1:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, you mean use hash_any on the sequence of hash values?  Interesting
> idea; but hash_any doesn't look very amenable to being invoked
> incrementally, and I don't think I want to construct a temp array with
> enough space for the hashes of all the items in the input array ...

I had rather hoped without evidence that it would be easy enough to
use incrementally. Looking now I see your point; it doesn't lend
itself to that at all.

I suppose you could use crc where our interface does allow incremental use.

I'm not really familiar enough with the math to know whether CRC is
any better than your XOR mixing. I suspect they're pretty similar. I
suppose if you have arrays of various lengths containing repetitions
of a single value than the non-CRC would produce a hash of 0 for all
arrays with a length that's a multiple of 32. I'm not sure what CRC
would do.

Oh, one question that occurred to me you didn't mention including the
bounds of the array or the null bitmap in the hash function. I suppose
it's correct with or without them (or in the case of the null bitmap
another way to put it is the choice of the hash value to represent
null is arbitrary).


--
greg


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: ALTER OBJECT any_name SET SCHEMA name
Next
From: Tom Lane
Date:
Subject: Re: Hash support for arrays