Re: Keeping temporary tables in shared buffers - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Keeping temporary tables in shared buffers
Msg-id CAA4eK1JF-C+OD6cziQvNztXpaNVEcRECywU+vTbYeQGyPUZxsA@mail.gmail.com
In response to Keeping temporary tables in shared buffers  (Asim Praveen <apraveen@pivotal.io>)
Responses Re: Keeping temporary tables in shared buffers  (Asim Praveen <apraveen@pivotal.io>)
Re: Keeping temporary tables in shared buffers  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, May 25, 2018 at 6:33 AM, Asim Praveen <apraveen@pivotal.io> wrote:
> Hello
>
> We are evaluating the use of shared buffers for temporary tables.  The
> advantage being queries involving temporary tables can make use of parallel
> workers.
>

This is one way, but I think there are other choices as well.  We can
identify and flush all the dirty (local) buffers for the relation
being accessed in parallel.  Once the parallel operation has started,
we won't allow any write operation on them.  This could be expensive
if we have a lot of dirty local buffers for a particular relation.  If
we are worried about the cost of those writes, we can try a different
way to parallelize a temporary table scan.  At the beginning of the
scan, the leader backend remembers the dirty blocks present in its
local buffers and shares that list with the parallel workers, which
skip those blocks; the leader then scans all of the skipped blocks
itself.  This shouldn't incur much additional cost, as the skipped
blocks should already be present in the leader's local buffers.

I understand that none of these alternatives is straightforward, but
I think it is worth considering whether we have any better way to
allow parallel temporary table scans.


> Challenges:
> 1. We lose the performance benefit of local buffers.

Yeah, I think the cases where we need to drop temp relations will
become costlier, as they have to traverse all the shared buffers
instead of just the local buffers.
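
As a rough illustration of that cost, the toy model below drops a
relation by walking an entire buffer pool.  It is not PostgreSQL code:
the pool is just a Python list of owning relation names, and the
function name is hypothetical.  The point is only that the work is
proportional to the size of the pool being traversed, not to the size
of the relation.

```python
from typing import List, Optional, Tuple


def drop_relation_buffers(pool: List[Optional[str]],
                          rel: str) -> Tuple[int, int]:
    """Invalidate every buffer in pool that belongs to rel.

    Returns (buffers examined, buffers invalidated).  The whole pool is
    always examined, whatever the number of buffers the relation owns.
    """
    examined = 0
    invalidated = 0
    for i, owner in enumerate(pool):
        examined += 1
        if owner == rel:
            pool[i] = None  # mark the buffer as free
            invalidated += 1
    return examined, invalidated
```

With a small per-backend local pool the loop is short; running the same
drop against a much larger shared pool examines every shared buffer
even if the temp relation occupies only a handful of them.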

I think if we use shared buffers for temp relations, there will be
some overhead for other backends as well, especially when they need
to evict buffers.  It is quite possible that if the relation stays in
local buffers, we might not write it at all, but moving it to shared
buffers will increase its probability of being written to disk.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

