On Tue, Nov 12, 2013 at 11:45 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> Hello,
>
> This is a brief design proposal for a feature I'd like to implement on top
> of the custom-scan APIs. Because it (probably) requires a few additional
> base features beyond custom-scan, I'd like to get feedback from the hackers.
>
> The cache-only table scan named in the subject line is an alternative to
> a sequential scan that applies when all the referenced columns are cached.
> It allows a particular table to be scanned without storage access, which
> improves scan performance.
> So what? How does this differ from a large shared_buffers configuration?
> This mechanism caches only the columns referenced in the query, not
> whole records. That makes sense for workloads that scan a table with
> many columns where the qualifiers reference just a few of them, as is
> typical of analytic queries: the memory needed per cached record
> shrinks, so more records can be cached.
> In addition, it has another role from my standpoint: it also acts as a
> fast data supplier for GPU/MIC devices. When we move data to a GPU
> device, the source address has to be in a region marked as "page-locked",
> which is exempt from being swapped out, if we want CUDA or OpenCL to
> use asynchronous DMA transfer mode, the fastest one.
Wouldn't a columnar heap format be a better solution to this?
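
For a sense of scale on the memory argument quoted above, here is a rough back-of-the-envelope sketch. It is not taken from the proposal: the function name `rows_cacheable`, the 40-column table, the 8-byte column width, and the 1 GiB budget are all made-up assumptions, and it ignores tuple headers, alignment padding, and variable-length columns.

```python
from typing import Optional, Sequence


def rows_cacheable(budget_bytes: int,
                   column_widths: Sequence[int],
                   referenced: Optional[Sequence[int]] = None) -> int:
    """Estimate how many rows fit in a fixed memory budget.

    column_widths: assumed fixed width in bytes of every column in the table.
    referenced:    indexes of the columns the query touches; None means the
                   whole row is cached (roughly the shared_buffers case).
    """
    if referenced is None:
        bytes_per_row = sum(column_widths)
    else:
        bytes_per_row = sum(column_widths[i] for i in referenced)
    return budget_bytes // bytes_per_row


# Hypothetical table: 40 columns of 8 bytes each, query references 3 of them.
widths = [8] * 40
budget = 1 << 30  # 1 GiB of cache, an arbitrary figure

whole_rows = rows_cacheable(budget, widths)
three_columns = rows_cacheable(budget, widths, referenced=[0, 5, 17])

print(whole_rows)      # rows that fit when whole records are cached
print(three_columns)   # rows that fit when only 3 columns are cached
```

Under these assumptions the column-subset cache holds roughly 13 times as many rows in the same budget. A columnar heap format would see the same per-column arithmetic, so the numbers alone do not settle which approach is better.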