Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)
Date
Msg-id 20141220051542.GM5023@alap3.anarazel.de
In response to Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)
List pgsql-hackers
On 2014-12-19 22:03:55 -0600, Jim Nasby wrote:
> I'm not suggesting we change BufferTag or BufferLookupEnt; clearly we
> can't simply throw away any of the fields I was talking about (well,
> except possibly tablespace ID. AFAICT that's completely redundant for
> searching because relid is UNIQUE).

It's actually not. BufferTags contain relnodes via RelFileNode - that's
not the relation's oid, but the filenode. And that's *not* guaranteed to
be unique across databases, unfortunately.
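
To make the point concrete, here is a minimal sketch of why the database
field has to take part in the lookup key. The Demo* types are simplified,
hypothetical stand-ins that mirror the shape of RelFileNode and BufferTag
(plain uint32_t fields instead of Oid, ForkNumber and BlockNumber); they
are not the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for RelFileNode. */
typedef struct DemoRelFileNode
{
    uint32_t spcNode;   /* tablespace */
    uint32_t dbNode;    /* database */
    uint32_t relNode;   /* filenode, NOT the relation's oid */
} DemoRelFileNode;

/* Hypothetical, simplified stand-in for BufferTag. */
typedef struct DemoBufferTag
{
    DemoRelFileNode rnode;
    uint32_t forkNum;
    uint32_t blockNum;
} DemoBufferTag;

/*
 * Two tags identify the same page only if every field matches.
 * Comparing relNode alone would conflate files from different
 * databases that happen to share a filenode.
 */
bool
demo_tags_equal(const DemoBufferTag *a, const DemoBufferTag *b)
{
    assert(a != NULL && b != NULL);
    return a->rnode.spcNode == b->rnode.spcNode &&
           a->rnode.dbNode == b->rnode.dbNode &&
           a->rnode.relNode == b->rnode.relNode &&
           a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}
```

With this, two tags that share relNode, fork and block number but differ
in dbNode compare as different pages, which is the behaviour the buffer
lookup needs.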

> What I am thinking is not using all of those fields in their raw form to
> calculate the hash value. IE: something analogous to:
>
> hash_any(SharedBufHash, (rot(forkNum, 2) | dbNode) ^ relNode) << 32 | blockNum)
> 
> perhaps that actual code wouldn't work, but I don't see why we couldn't do
> something similar... am I missing something?

I don't think that'd improve anything. Jenkins' hash already has quite
good mixing properties; I don't believe that the above would improve the
quality of the hash.
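
As an illustration only, the sketch below feeds the raw bytes of a tag-like
struct straight into a Jenkins-style mixer. It uses Bob Jenkins'
one-at-a-time hash because it is short enough to show in full; that is an
assumption made for brevity and is not the function behind hash_any().
DemoTag and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat tag: five 32-bit fields, so no padding bytes. */
typedef struct DemoTag
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
    uint32_t forkNum;
    uint32_t blockNum;
} DemoTag;

_Static_assert(sizeof(DemoTag) == 5 * sizeof(uint32_t),
               "DemoTag must not contain padding");

/* Bob Jenkins' one-at-a-time hash over an arbitrary byte string. */
uint32_t
demo_oaat_hash(const unsigned char *key, size_t len)
{
    uint32_t h = 0;

    for (size_t i = 0; i < len; i++)
    {
        h += key[i];        /* absorb one input byte */
        h += h << 10;       /* mix it into the higher bits */
        h ^= h >> 6;        /* and back into the lower bits */
    }
    /* final avalanche */
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Hash every field of the tag in its raw form, without pre-combining. */
uint32_t
demo_tag_hash(const DemoTag *tag)
{
    assert(tag != NULL);
    return demo_oaat_hash((const unsigned char *) tag, sizeof(*tag));
}
```

Because the mixer already spreads every input byte across the result,
combining the fields by hand first gains nothing, and a pre-mix that uses
| can even map distinct inputs to the same value before hashing starts.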

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


