Re: Why hash OIDs? - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: Why hash OIDs?
Date	August 28, 2018 05:45:49
Msg-id	CAEepm=2F2qCo7+5wZKVM4311TUWmWYuupDqpyTwdT9nvW=7WYA@mail.gmail.com
In response to	Re: Why hash OIDs? (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Why hash OIDs?
List	pgsql-hackers

On Tue, Aug 28, 2018 at 2:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-08-28 13:50:43 +1200, Thomas Munro wrote:
> >> What bad thing would happen if we used OIDs directly as hash values in
> >> internal hash tables (that is, instead of uint32_hash() we'd use
> >> uint32_identity(), or somehow optimise it away entirely, as you can
> >> see in some C++ standard libraries for eg std::hash<int>)?
>
> > Oids are very much not equally distributed, so in all likelihood you'd
> > get cases where you currently have a reasonably well averaged out usage
> > of the hashtable, not be that anymore.
>
> Right.  In particular, most of our hash usages assume that all bits of
> the hash value are equally "random", so that we can just mask off the
> lowest N bits of the hash and not get values that are biased towards
> particular hash buckets.  It's unlikely that raw OIDs would have that
> property.

Yeah, it would be a terrible idea as a general hash function for use
in contexts that assume an "avalanche effect", i.e. that information is
spread out over all the bits (the hash join batching code wouldn't
work, for example).  I was wondering specifically about the limited
case of hash tables that are used to look things up in caches.
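
To make the masking concern quoted above concrete, here is a minimal
sketch in C. It is an illustration only: identity_hash(), mixing_hash()
and count_used_buckets() are hypothetical names, the mixer is the
well-known MurmurHash3 32-bit finalizer used as a stand-in rather than
PostgreSQL's uint32_hash(), and the key pattern (a fixed stride) is an
assumed example of unevenly distributed values, not a claim about how
OIDs are actually assigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*hash_fn) (uint32_t);

/* Identity "hash": the key itself is used as the hash value. */
uint32_t
identity_hash(uint32_t key)
{
    return key;
}

/*
 * Stand-in mixing function (MurmurHash3 32-bit finalizer).
 * Assumption for illustration; this is NOT PostgreSQL's uint32_hash().
 */
uint32_t
mixing_hash(uint32_t key)
{
    key ^= key >> 16;
    key *= 0x85ebca6bU;
    key ^= key >> 13;
    key *= 0xc2b2ae35U;
    key ^= key >> 16;
    return key;
}

/*
 * Hash the keys start, start + stride, start + 2 * stride, ... and pick
 * a bucket by masking off the low bits of the hash value.  Returns how
 * many distinct buckets received at least one key.
 */
size_t
count_used_buckets(hash_fn fn, uint32_t start, uint32_t stride,
                   size_t nkeys, uint32_t nbuckets)
{
    unsigned char used[1024] = {0};
    size_t      nused = 0;

    /* The mask trick only works for a power-of-two bucket count. */
    assert(nbuckets > 0 && nbuckets <= 1024);
    assert((nbuckets & (nbuckets - 1)) == 0);

    for (size_t i = 0; i < nkeys; i++)
    {
        uint32_t    key = start + (uint32_t) i * stride;
        uint32_t    bucket = fn(key) & (nbuckets - 1);

        if (!used[bucket])
        {
            used[bucket] = 1;
            nused++;
        }
    }
    return nused;
}
```

Under these assumptions, count_used_buckets(identity_hash, 16384, 16, 64, 64)
returns 4: keys that differ only by multiples of 16 share their low four
bits, so 64 keys pile into 4 of the 64 buckets.  The same call with
mixing_hash spreads the keys over more buckets, while identity_hash with
a stride of 1 puts 64 consecutive keys into 64 distinct buckets.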

-- 
Thomas Munro
http://www.enterprisedb.com
