Re: Keeping temporary tables in shared buffers - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Keeping temporary tables in shared buffers
Msg-id 4b94cc72-c505-9625-11e2-563dc5d39398@iki.fi
In response to Re: Keeping temporary tables in shared buffers  (Asim Praveen <apraveen@pivotal.io>)
Responses Re: Keeping temporary tables in shared buffers  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On 25/05/18 09:25, Asim Praveen wrote:
> On Thu, May 24, 2018 at 8:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> So then you have to think about how to transition smoothly between "rel
>> is in local buffers" and "rel is in shared buffers", bearing in mind that
>> ever having the same page in two different buffers would be disastrous.
> 
> Local buffers would not be used at all if temp tables start residing in
> shared buffers.  The transition mentioned above shouldn't be needed.

What is the performance difference between the local buffer manager and 
the shared buffer manager? The local buffer manager avoids all the 
locking overhead, which has to amount to something, but how big a 
difference is it?

>> I think that would be a deal breaker right there, because of the
>> distributed overhead of making the tags bigger.  However, I don't
>> actually understand why you would need to do that.  Temp tables
>> have unique OIDs/relfilenodes anyway, don't they?  Or if I'm
>> misremembering and they don't, couldn't we make them so?
> 
> My parochial vision of the overhead is restricted to 4 * NBuffers of
> additional shared memory, as 4 bytes are being added to BufferTag.  May I
> please get some enlightenment?

Any extra fields in BufferTag make computing the hash more expensive. 
It's a very hot code path, so any cycles spent are significant.
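
To make that concrete, here is a rough sketch, not the actual PostgreSQL
code. The struct layouts are simplified stand-ins for RelFileNode and
BufferTag, the "backend" field is the hypothetical 4-byte addition under
discussion, and FNV-1a is only a placeholder for the real hash function.
A byte-wise hash does work proportional to the key length, so the extra
4 bytes are paid for on every lookup, whereas the memory cost is only
4 * NBuffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for RelFileNode: three 4-byte identifiers. */
typedef struct SketchRelFileNode
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
} SketchRelFileNode;

/* Simplified stand-in for the current BufferTag. */
typedef struct SketchBufferTag
{
    SketchRelFileNode rnode;
    int32_t  forkNum;
    uint32_t blockNum;
} SketchBufferTag;

/* Hypothetical variant with a 4-byte backend id added to the tag. */
typedef struct SketchBufferTagWithBackend
{
    SketchRelFileNode rnode;
    int32_t  backend;
    int32_t  forkNum;
    uint32_t blockNum;
} SketchBufferTagWithBackend;

static_assert(sizeof(SketchBufferTag) == 20, "expected a 20-byte tag");
static_assert(sizeof(SketchBufferTagWithBackend) == 24, "expected a 24-byte tag");

/* Placeholder hash (FNV-1a); the loop runs once per key byte. */
static uint32_t
sketch_hash_bytes(const void *key, size_t len)
{
    const unsigned char *p = (const unsigned char *) key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Extra shared memory needed if every buffer's tag grows by 4 bytes. */
static size_t
sketch_extra_tag_bytes(size_t nbuffers)
{
    return nbuffers * (sizeof(SketchBufferTagWithBackend) - sizeof(SketchBufferTag));
}
```

With 16384 buffers (128MB of 8kB pages), sketch_extra_tag_bytes() comes
to 64kB, which is negligible. The per-lookup hashing work is the part
that is not.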

In relation to Andres' patches to rewrite the buffer manager with a 
radix tree, there was actually some discussion of trying to make 
BufferTag *smaller*. For example, we could rearrange things so that 
pg_class.relfilenode is 64 bits wide. Then you could assume that it 
never wraps around, and is unique across all relations in the cluster. 
Then you could replace the 12-byte relfilenode+dbid+spcid triplet with 
just the 8-byte relfilenode. Doing something like that might be the 
solution here, too.
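
As a sketch of what that might look like (all names here are made up for
illustration, and the counter is a plain local variable rather than
anything shared or persistent): the tag shrinks from 20 to 16 bytes once
the 12-byte triplet is replaced by a single 8-byte identifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for today's 12-byte relfilenode+dbid+spcid triplet. */
typedef struct SketchTriplet
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
} SketchTriplet;

/* Hypothetical tag keyed by a cluster-wide unique 64-bit relfilenode. */
typedef struct SketchWideTag
{
    uint64_t relfilenode;
    int32_t  forkNum;
    uint32_t blockNum;
} SketchWideTag;

static_assert(sizeof(SketchTriplet) == 12, "triplet is 12 bytes");
static_assert(sizeof(SketchWideTag) == 16, "wide tag is 16 bytes");

/*
 * Hypothetical allocator: a monotonically increasing 64-bit counter.
 * The assumption is that 2^64 values are never exhausted in practice,
 * so a value is never handed out twice.
 */
static uint64_t
sketch_next_relfilenode(uint64_t *counter)
{
    return ++(*counter);
}

/* Tag comparison needs no database or tablespace fields any more. */
static bool
sketch_wide_tags_equal(const SketchWideTag *a, const SketchWideTag *b)
{
    return a->relfilenode == b->relfilenode &&
           a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}
```

If temp tables drew their relfilenodes from the same counter, they could
not collide with regular tables, and no backend id would be needed in
the tag.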

> Temp tables have unique filename on disk: t_<backendID>_<relfilenode>.  The
> logic to assign OIDs and relfilenodes, however, doesn't differ.  Given a
> RelFileNode, it is not possible to tell if it's a temp table or not.
> RelFileNodeBackend allows for that distinction but it's not used by buffer
> manager.

Could you store the backendid in BufferDesc, outside of BufferTag? Is it 
possible for a normal table and a temporary table to have the same 
relfilenode+dbid+spcid triplet?
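
To show why the second question matters, here is a sketch under stated
assumptions: the "backend" field in the descriptor is hypothetical, the
struct names are invented, and a linear scan stands in for the buffer
mapping hash table. If two relations could ever share the same tag, a
lookup keyed on the tag alone could not tell their buffers apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker meaning "buffer does not belong to a temp relation". */
#define SKETCH_INVALID_BACKEND (-1)

/* Simplified stand-in for the existing tag; its size stays unchanged. */
typedef struct SketchTag
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
    int32_t  forkNum;
    uint32_t blockNum;
} SketchTag;

/* Descriptor carrying the owning backend outside of the lookup key. */
typedef struct SketchBufferDesc
{
    SketchTag tag;
    int32_t   backend;
    bool      valid;
} SketchBufferDesc;

static bool
sketch_tags_equal(const SketchTag *a, const SketchTag *b)
{
    return a->spcNode == b->spcNode && a->dbNode == b->dbNode &&
           a->relNode == b->relNode && a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}

/* Returns the index of the first valid buffer with this tag, or -1. */
static int
sketch_lookup(const SketchBufferDesc *bufs, int nbufs, const SketchTag *tag)
{
    assert(bufs != NULL && tag != NULL);

    for (int i = 0; i < nbufs; i++)
    {
        if (bufs[i].valid && sketch_tags_equal(&bufs[i].tag, tag))
            return i;
    }
    return -1;
}
```

The lookup never looks at "backend", so this only works if the tag by
itself is already unique across regular and temporary relations.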

- Heikki

