On Fri, Apr 13, 2012 at 10:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@mit.edu> writes:
>> On Fri, Apr 13, 2012 at 8:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> That's probably true, but I'm not sure it's worth worrying about -
>>> one-in-four-billion is a pretty small probability.
>
>> Is this not subject to the birthday paradox? If you have a given hash
>> you're worried about a collision with then you have a
>> one-in-four-billion chance. But if you have a collection of hashes and
>> you're worried about any collisions then it only takes about 64k
>> before there's likely a collision.
>
> ... so, if pg_stat_statements.max were set as high as 64k, you would
> need to worry.

Well... at 64k, you'd have roughly a 40% chance of at least one
collision, assuming a uniformly distributed 32-bit hash. But the
whole birthday paradox thing means that there's a non-trivial
collision probability even at lower numbers of entries: at 10k
entries it's already around 1%. Seems like maybe we ought to be
using 64 bits here...
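
For anyone who wants to check the arithmetic, here is a rough sketch of
the usual birthday-bound approximation. It assumes the hash values are
uniformly distributed, which the jumble calculation may well not
achieve; the function name and the entry counts are purely illustrative
and not taken from the pg_stat_statements code.

```python
import math


def collision_probability(n_entries: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_entries hashes collide.

    Assumes hashes are uniform over 2**hash_bits values (an idealization).
    Uses the standard approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    space = 2 ** hash_bits
    exponent = n_entries * (n_entries - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny,
    # which matters for the 64-bit case.
    return -math.expm1(-exponent)


# Illustrative entry counts only; compare 32-bit and 64-bit hashes.
for n in (1_000, 10_000, 65_536):
    p32 = collision_probability(n, 32)
    p64 = collision_probability(n, 64)
    print(f"{n:>6} entries: 32-bit {p32:.6f}, 64-bit {p64:.3e}")
```

Under that assumption, 65,536 entries gives roughly a 39% chance of a
collision with 32 bits, and a vanishingly small one with 64 bits.
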
> Realistically, I'm more worried about collisions due to inadequacies in
> the jumble calculation logic (Peter already pointed out some risk
> factors in that regard).

...especially if collisions are even more frequent than random chance
would suggest.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company