Re: Hash support for arrays - Mailing list pgsql-hackers

From Nicolas Barbier
Subject Re: Hash support for arrays
Date
Msg-id AANLkTik+zqRBEYzy1X5p_6DnGA=V9BeMS=+HCg_hoSLE@mail.gmail.com
Whole thread Raw
In response to Re: Hash support for arrays  (Kenneth Marshall <ktm@rice.edu>)
Responses Re: Hash support for arrays
Re: Hash support for arrays
List pgsql-hackers
2010/11/2 Kenneth Marshall <ktm@rice.edu>:

> Given that our hash implimentation mixes the input data well (It does.
> I tested it.) then a simple rotate-and-xor method is all that should
> be needed to maintain all of the needed information. The original
> hash function has done the heavy lifting in this case.

Even with the perfect hash function for the elements, certain
combinations of elements could still lead to massive collisions. E.g.,
if repeated values are typical in the input data we are talking about,
then the rotate-and-xor method would still lead to collisions between
any array of the same values of certain lengths, regardless of the
value. In Tom's implementation, as he mentioned before, those
problematical lengths would be multiples of 32 (e.g., an array of 32
1s would collide with an array of 32 2s would collide with an array of
32 3s, etc).

Nicolas


pgsql-hackers by date:

Previous
From: Marc Cousin
Date:
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Next
From: Greg Stark
Date:
Subject: Re: pgsql: Bootstrap WAL to begin at segment logid=0 logseg=1 (000000010000