Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber) - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)
Date
Msg-id 5494A9AF.6010604@BlueTreble.com
Whole thread Raw
In response to Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)  ("ktm@rice.edu" <ktm@rice.edu>)
List pgsql-hackers
On 12/18/14, 5:00 PM, Jim Nasby wrote:
> 2201582 20 -- Mostly LOCALLOCK and Shared Buffer

Started looking into this; perhaps https://code.google.com/p/fast-hash/ would be worth looking at, though it requires
uint64.

It also occurs to me that we're needlessly shoving a lot of 0's into the hash input by using RelFileNode and
ForkNumber.RelFileNode includes the tablespace Oid, which is pointless here because relid is unique per-database. We
alsohave very few forks and typically care about very few databases. If we crammed dbid and ForkNum together that gets
usdown to 12 bytes, which at minimum saves us the trip through the case logic. I suspect it also means we could
eliminateone of the mix() calls.
 

But I wonder if we could still do better, because we typically also won't have that many relations. Is there some fast
waywe could combine dbid, forkNum and relid into a uint32? That gets us down to 8 bytes, which means we could use
fash-hash,or a stripped down mix().
 

Unfortunately I don't know much about hash algorithms, so I don't know how practical any of this actually is, or what a
goodmethod for combining those fields would be. My current idea is something like (rot(forkNum, 2) | dbid) ^ relid, but
ifyou got unlucky with your oid values you could end up with a lot of collissions from that.
 

I can put some effort into this, but I'd like some guidance.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Commitfest problems
Next
From: Tom Lane
Date:
Subject: Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)