Re: What's needed for cache-only table scan? - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: What's needed for cache-only table scan?
Msg-id CADyhKSWTs4kwVtrbb+fz3MoVag-VgVOkyY=G9UZ2me96kOxTag@mail.gmail.com
In response to Re: What's needed for cache-only table scan?  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
2013/11/12 Claudio Freire <klaussfreire@gmail.com>:
> On Tue, Nov 12, 2013 at 11:45 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> Hello,
>>
>> This is a brief design proposal for a feature I'd like to implement on top of
>> the custom-scan APIs. Because it (probably) requires a few additional base
>> features beyond custom-scan, I'd like to get feedback from the hackers.
>>
>> The cache-only table scan named in the subject line is an alternative to
>> a sequential scan, usable when all the referenced columns are cached.
>> It allows a particular table to be scanned without storage access, and
>> thus improves scan performance.
>> So what? How does this differ from a large shared_buffers configuration?
>> This mechanism caches only the columns referenced in the query, not
>> whole records. It suits workloads that scan a table with many columns
>> while the qualifier references just a few of them, as is typical of
>> analytic queries: caching fewer columns reduces memory consumption
>> per record, so more records can be cached.
>> In addition, it has another role from my standpoint. It also serves as a
>> fast data supplier for GPU/MIC devices. When we move data to a GPU
>> device, the source address has to be in a region marked as "page-locked",
>> which is exempt from concurrent swap-out, if we want CUDA or OpenCL
>> to use asynchronous DMA transfer mode, the fastest one.
>
>
> Wouldn't a columnar heap format be a better solution to this?
>
I've implemented that using FDW; however, it requires the application to
adjust its SQL, replacing "CREATE TABLE" with "CREATE FOREIGN TABLE".
In addition, it loses good features of the regular heap, such as index scan
when its cost is lower than that of a sequential columnar scan.
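
To make the cache-only idea concrete, here is a minimal sketch in C. It is
not taken from any patch: ColumnCache, cache_bytes and cache_only_scan are
hypothetical names, and the columns are assumed to be fixed-width integers.
It only illustrates that a scan can be answered from the cached columns
alone, and that caching fewer columns costs proportionally less memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache holding only the columns a query references. */
typedef struct ColumnCache
{
    size_t         nrows;   /* number of cached records */
    const int32_t *col_a;   /* column referenced by the qualifier */
    const int32_t *col_b;   /* column referenced by the target list */
} ColumnCache;

/* Memory needed to cache ncols fixed-width columns for nrows records. */
size_t
cache_bytes(size_t nrows, size_t ncols, size_t col_width)
{
    return nrows * ncols * col_width;
}

/*
 * Evaluate "SELECT sum(b) ... WHERE a > threshold" using only the cached
 * columns, i.e. without touching the heap or storage.
 */
int64_t
cache_only_scan(const ColumnCache *cache, int32_t threshold)
{
    int64_t sum = 0;
    size_t  i;

    assert(cache != NULL);
    for (i = 0; i < cache->nrows; i++)
    {
        if (cache->col_a[i] > threshold)
            sum += cache->col_b[i];
    }
    return sum;
}
```

With this layout, caching 2 of, say, 50 four-byte columns takes
cache_bytes(nrows, 2, 4) instead of cache_bytes(nrows, 50, 4), so the same
amount of memory holds 25 times as many records.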

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


