On Thu, May 24, 2018 at 11:50 PM, Andres Freund <andres@anarazel.de> wrote:
On 2018-05-25 09:40:10 +0300, Heikki Linnakangas wrote:
> On 25/05/18 09:25, Asim Praveen wrote:
> > My parochial vision of the overhead is restricted to 4 * NBuffers of
> > additional shared memory, as 4 bytes are being added to BufferTag. May I
> > please get some enlightenment?
>
> Any extra fields in BufferTag make computing the hash more expensive. It's a
> very hot code path, so any cycles spent are significant.
Indeed, very much so.
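To make the two costs mentioned above concrete, here is a rough sketch, not the actual PostgreSQL definitions. It assumes a simplified tag of five 4-byte fields; the names SketchTag, SketchTagWide, extra and extra_shared_bytes are made up for illustration, and FNV-1a is swapped in for the real hash function only because it is short. The point is that a byte-wise hash touches every byte of the key, so a wider tag costs more on every lookup in addition to the 4 * NBuffers of shared memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the buffer tag: five 4-byte fields (assumption). */
typedef struct SketchTag {
    uint32_t spc_id;
    uint32_t db_id;
    uint32_t relfilenode;
    uint32_t fork_num;
    uint32_t block_num;
} SketchTag;

/* The same tag with one extra 4-byte field (hypothetical name). */
typedef struct SketchTagWide {
    uint32_t spc_id;
    uint32_t db_id;
    uint32_t relfilenode;
    uint32_t fork_num;
    uint32_t block_num;
    uint32_t extra;
} SketchTagWide;

/* FNV-1a over the raw key bytes; the loop runs once per byte of the key. */
static uint32_t fnv1a(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;

    assert(key != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Additional shared memory needed when every buffer carries the wider tag. */
static size_t extra_shared_bytes(size_t nbuffers)
{
    return nbuffers * (sizeof(SketchTagWide) - sizeof(SketchTag));
}
```

With these definitions the wider tag is 24 bytes instead of 20, so each hash computation walks 20% more input.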
But I'm not sure we need anything in the tags themselves. We don't denote buffers for unlogged tables in the tag itself either. As Tom observed, the oids for temp tables are either unique or can be made unique easily enough. And the temporariness can be declared in a bit in the buffer header, rather than the tag itself. I don't see why a hash lookup would need to know that.
Currently, relfilenodes (specifically the spcid, dbid, relfilenode triple) for temp and regular tables can collide, because temp files have the "t_nnn" representation on disk. Because of this, the relfilenode allocation logic can assign the same relfilenode to a temp and a non-temp table. If relfilenode uniqueness can be achieved, then the need for adding anything to the buffer tag goes away.
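A small sketch of the collision described above, under stated assumptions: temp relation files are named with a "t" prefix plus the backend id (the tBBB_FFF form from the file layout docs), and the allocator is assumed to detect clashes only by checking whether the target file already exists. The helpers sketch_relpath and clash_goes_unnoticed are hypothetical, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build an illustrative on-disk path for a relation in the default tablespace. */
static int sketch_relpath(char *buf, size_t buflen, uint32_t db_id,
                          uint32_t relfilenode, bool is_temp, int backend_id)
{
    assert(buf != NULL && buflen > 0);
    if (is_temp)
        return snprintf(buf, buflen, "base/%u/t%d_%u",
                        (unsigned) db_id, backend_id, (unsigned) relfilenode);
    return snprintf(buf, buflen, "base/%u/%u",
                    (unsigned) db_id, (unsigned) relfilenode);
}

/*
 * Returns true when a temp and a regular relation with the same relfilenode
 * map to different files, so a file-existence check cannot see the clash.
 */
static bool clash_goes_unnoticed(uint32_t db_id, uint32_t relfilenode,
                                 int backend_id)
{
    char regular[64];
    char temp[64];

    sketch_relpath(regular, sizeof regular, db_id, relfilenode, false, 0);
    sketch_relpath(temp, sizeof temp, db_id, relfilenode, true, backend_id);
    return strcmp(regular, temp) != 0;
}
```

Since the two paths never match, both relations can end up with an identical (spcid, dbid, relfilenode) triple, which is why something extra is needed to tell their buffers apart today.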
When starting to work on the radix tree stuff I had, to address the size of buffer tag issue you mention above, a prototype patch that created a shared 'relfilenode' table. That guaranteed that relfilenodes are unique. That'd work here as well, and would allow us to get rid of a good chunk of the ugliness we have around allocating relfilenodes right now (like not unlinking files etc).
That would be great!
But more generally, I don't see why it'd be that problematic to just get rid of the backendid? I don't really see any technical necessity to have it.
It seems backendid was also added for the same reason: relfilenodes for temp tables are not unique. So yes, with uniqueness guaranteed, this can go away as well.
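For illustration, the direction discussed in this thread could look roughly like the sketch below, assuming relfilenodes are already unique. All names here (SketchTag, SketchBufferDesc, SKETCH_BM_TEMP, sketch_lookup) are hypothetical, and a linear scan stands in for the real hash table. The tag carries neither a backend id nor a temp marker; temporariness is a flag bit in the buffer header that the lookup never looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tag with no backend id and no temp marker (simplified, hypothetical). */
typedef struct SketchTag {
    uint32_t spc_id;
    uint32_t db_id;
    uint32_t relfilenode;
    uint32_t fork_num;
    uint32_t block_num;
} SketchTag;

/* Hypothetical header flag marking a buffer as belonging to a temp table. */
#define SKETCH_BM_TEMP (1u << 0)

typedef struct SketchBufferDesc {
    SketchTag tag;
    uint32_t  flags;   /* header bits, not part of the lookup key */
} SketchBufferDesc;

static bool sketch_tag_equal(const SketchTag *a, const SketchTag *b)
{
    return a->spc_id == b->spc_id && a->db_id == b->db_id &&
           a->relfilenode == b->relfilenode &&
           a->fork_num == b->fork_num && a->block_num == b->block_num;
}

/* Linear scan standing in for the hash lookup; compares tags only. */
static int sketch_lookup(const SketchBufferDesc *bufs, size_t n,
                         const SketchTag *key)
{
    assert(bufs != NULL && key != NULL);
    for (size_t i = 0; i < n; i++)
        if (sketch_tag_equal(&bufs[i].tag, key))
            return (int) i;
    return -1;
}

/* Temporariness is read from the header after the buffer has been found. */
static bool sketch_is_temp(const SketchBufferDesc *buf)
{
    return (buf->flags & SKETCH_BM_TEMP) != 0;
}
```

A caller would first find the buffer by tag and only then consult the flag, so the key being hashed stays as small as it is for regular tables.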