Re: What's needed for cache-only table scan? - Mailing list pgsql-hackers

From: Kohei KaiGai
Subject: Re: What's needed for cache-only table scan?
Msg-id: CADyhKSUJHu6zNC1B-gJOR6v7CfysRqQpwM3RAuJxNJHaN7o1vg@mail.gmail.com
In response to: Re: What's needed for cache-only table scan?  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
2013/11/12 Robert Haas <robertmhaas@gmail.com>:
> On Tue, Nov 12, 2013 at 9:45 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> This is a brief design proposal for a feature I'd like to implement on top of
>> the custom-scan APIs. Because it will probably require a few additional base
>> features beyond custom-scan itself, I'd like to get feedback from the hackers.
>>
>> The cache-only table scan in the subject line is an alternative scan
>> logic to sequential scan, usable when all the referenced columns are cached.
>> It allows a table to be scanned without any storage access, and thus
>> improves scan performance.
>> So what? How is this different from a large shared_buffers configuration?
>> This mechanism caches only the columns referenced by the query, not the
>> whole of each record. That makes sense for workloads that scan a table
>> with many columns while the qualifiers reference just a few of them, as is
>> typical of analytic queries, because it reduces the memory consumed per
>> cached record and therefore allows more records to be cached.
>> In addition, from my standpoint it has another role: it can also act as a
>> fast data supplier for GPU/MIC devices. When we move data to a GPU
>> device, the source address has to be a region marked as "page-locked",
>> i.e. exempted from being swapped out, if we want CUDA or OpenCL to use
>> the asynchronous DMA transfer mode, which is the fastest one.
>>
>> Construction of this table cache is probably not a problem.
>> All we need to do is inject a custom-scan node in place of the seq-scan;
>> it can then build the table cache while performing a regular sequential
>> scan, even though the first access becomes a little slower.
>>
>> My concern is how to handle the case where the table gets modified.
>> A straightforward idea is that each cached entry that is modified is
>> invalidated through a callback mechanism.
>> Triggers can help in the case of INSERT, UPDATE, DELETE and
>> TRUNCATE. Furthermore, it would be better if an extension could inject
>> its own trigger definition at RelationBuildTriggers() on the fly,
>> so that the feature works transparently.
>> On the other hand, we have no way to get control around VACUUM.
>> I want a hook that allows extensions to get control when a page
>> gets vacuumed. Once we have such a hook, it becomes possible to
>> invalidate cached entries that are indexed by tid but have already been
>> vacuumed away. Its best location is probably lazy_scan_heap(), calling
>> back into the extension for each page getting vacuumed.
>>
>> What's your opinion?
>
> I think it's hard to say in the abstract.  I'd suggest adding the
> hooks you feel like you need and see what the resulting patch looks
> like.  Then post that and we can talk about how (and whether) to do it
> better.  My personal bet is that triggers are the wrong way to do
> something like this; I'd look to do it all with hooks.  Of course,
> figuring how to position those hooks so that they are maintainable and
> don't affect performance when not used is the tricky part.
>
Yep, right now the whole idea is still in my head, so it will take some
additional time to shape it into patch form.
The reason I'm inclined to use triggers is that they are already supported
at the places where I want to get control. That lets me avoid the effort of
adding and maintaining a similar facility within the limited time slot.
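
To make that point concrete, below is a minimal sketch of the kind of
row-level trigger function I have in mind. Only the trigger plumbing uses
existing APIs; ccache_invalidate_tuple() is a hypothetical entry point of
the (not yet written) cache module.

/*
 * Minimal sketch of a row-level AFTER trigger that drops the cached
 * copy of a modified tuple.  ccache_invalidate_tuple() is a
 * hypothetical entry point of the cache module, not an existing API.
 */
#include "postgres.h"
#include "access/htup.h"
#include "commands/trigger.h"
#include "fmgr.h"
#include "storage/itemptr.h"
#include "utils/rel.h"

PG_MODULE_MAGIC;

extern void ccache_invalidate_tuple(Relation rel, ItemPointer tid); /* hypothetical */

PG_FUNCTION_INFO_V1(ccache_invalidation_trigger);

Datum
ccache_invalidation_trigger(PG_FUNCTION_ARGS)
{
    TriggerData *trigdata = (TriggerData *) fcinfo->context;

    if (!CALLED_AS_TRIGGER(fcinfo))
        elog(ERROR, "not called as a trigger");

    /* on UPDATE/DELETE, tg_trigtuple is the old tuple whose cached copy must go */
    if (TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event) ||
        TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
        ccache_invalidate_tuple(trigdata->tg_relation,
                                &trigdata->tg_trigtuple->t_self);

    /* return value is ignored for AFTER ROW triggers */
    return PointerGetDatum(NULL);
}

An INSERT trigger would instead have to add the new tuple to the cache (or
mark the cache incomplete), and TRUNCATE, which fires statement-level
triggers only, would simply drop the whole cache for the relation.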

>> I'd like to find the best way to implement this table-caching
>> mechanism within the scope of the v9.4 feature set.
>> Any ideas, comments or suggestions are welcome.
>
> I think that Tom is right: there's not time to get something like this
> done for 9.4.  If you start working relatively soon, I think you could
> hope to have a good proposal in time for 9.5.
>
Yes, I agree his suggestion is realistic. It is nearly impossible to get
the whole table-cache mechanism done by the CF3 deadline this week.
So I plan to implement it as an extension using the facilities in v9.4.

It seems to me that the last piece needed to implement this feature (with
custom-scan as a prerequisite, of course) is a hook in vacuum.
Does it make sense to propose some other worthwhile but small contrib
module that utilizes this last piece? For example, a contrib module that
reports the progress of vacuum...
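
Just to sketch the shape of the interface I have in mind (the hook name,
its signature and everything else below are hypothetical; nothing like this
exists in core today), lazy_scan_heap() would call a function pointer once
per heap page it has processed, and a small contrib module could latch onto
it, for example to report vacuum progress:

/*
 * Hypothetical hook, assumed to be called from lazy_scan_heap() once
 * per heap page processed.  Neither the hook variable nor its type
 * exists in core; this only sketches the proposed interface.
 */
#include "postgres.h"
#include "fmgr.h"
#include "storage/block.h"
#include "utils/rel.h"

PG_MODULE_MAGIC;

typedef void (*vacuum_page_hook_type) (Relation rel,
                                       BlockNumber blkno,
                                       BlockNumber nblocks);
extern PGDLLIMPORT vacuum_page_hook_type vacuum_page_hook;   /* hypothetical */

static vacuum_page_hook_type prev_vacuum_page_hook = NULL;

/* example consumer: log a progress line every 1000 pages */
static void
report_vacuum_progress(Relation rel, BlockNumber blkno, BlockNumber nblocks)
{
    if (blkno % 1000 == 0)
        elog(LOG, "vacuum of \"%s\": %u of %u blocks processed",
             RelationGetRelationName(rel), blkno, nblocks);

    if (prev_vacuum_page_hook)
        prev_vacuum_page_hook(rel, blkno, nblocks);
}

void
_PG_init(void)
{
    prev_vacuum_page_hook = vacuum_page_hook;
    vacuum_page_hook = report_vacuum_progress;
}

The cache-only scan module would use exactly the same callback point, but
instead of logging it would drop cached entries whose tids fall on the page
just vacuumed.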

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>


