Re: Pluggable Storage - Andres's take - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Pluggable Storage - Andres's take
Msg-id CAJrrPGfQfiNE6Saw1edfCBZ5advfv=YxTwDRWJ4hUPZScvGmYA@mail.gmail.com
In response to Re: Pluggable Storage - Andres's take  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: Pluggable Storage - Andres's take  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On Mon, Sep 10, 2018 at 5:42 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Wed, Sep 5, 2018 at 2:04 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Tue, Sep 4, 2018 at 10:33 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

Thanks for the patches!

On 2018-09-03 19:06:27 +1000, Haribabu Kommi wrote:
> I found a couple of places where zheap uses extra logic to verify
> whether the AM is zheap and makes some extra decisions based on that.
> I am analyzing all of that extra code to see whether callbacks can
> handle it, and how. I can come back with more details later.

Yea, I think some of them will need to stay (particularly around
integrating undo) and some other ones we'll need to abstract.
 
OK. I will list all the areas that I found, along with my observations on
whether to abstract each one or leave it as is, and then implement around that.

The following are the changes where the code specifically checks whether
the relation is a zheap relation.

Overall, I found that it needs three new APIs, at the following locations,
plus one additional facility:
1. RelationSetNewRelfilenode
2. heap_create_init_fork
3. estimate_rel_size
4. A facility to provide handler options (such as skipping WAL).
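
To make the list above concrete, here is a minimal sketch of how such
callbacks could hang off a per-AM handler struct. Everything named Demo* or
demo_* is hypothetical and invented for illustration; these are not the
actual PostgreSQL structures or the API in the patch set, and the toy
bodies only imitate the behavior the real functions would need.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoRelation DemoRelation;

/* Hypothetical per-AM handler: one callback per location listed above. */
typedef struct DemoStorageAm {
    const char *name;
    void   (*set_new_filenode)(DemoRelation *rel);         /* cf. RelationSetNewRelfilenode */
    void   (*create_init_fork)(DemoRelation *rel);         /* cf. heap_create_init_fork */
    double (*estimate_rel_size)(const DemoRelation *rel);  /* cf. estimate_rel_size */
    bool    skip_wal;                                      /* example handler option */
} DemoStorageAm;

/* Hypothetical, heavily simplified stand-in for a relation descriptor. */
struct DemoRelation {
    const DemoStorageAm *am;
    unsigned filenode;
    bool     has_init_fork;
    double   pages;
};

/* Toy AM implementation: assigns a fresh filenode and starts empty. */
void demo_set_new_filenode(DemoRelation *rel)
{
    rel->filenode++;
    rel->pages = 0.0;
}

void demo_create_init_fork(DemoRelation *rel)
{
    rel->has_init_fork = true;
}

double demo_estimate_rel_size(const DemoRelation *rel)
{
    return rel->pages;
}

const DemoStorageAm demo_am = {
    "demo",
    demo_set_new_filenode,
    demo_create_init_fork,
    demo_estimate_rel_size,
    true
};

/* Generic caller: dispatches through the handler instead of testing for zheap. */
void demo_truncate(DemoRelation *rel)
{
    rel->am->set_new_filenode(rel);
}
```

The point of the sketch is only that the caller (demo_truncate here) no
longer needs an "is this zheap?" check; it dispatches through whatever
handler the relation carries.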

While porting the Fujitsu in-memory columnar store on top of pluggable
storage, I found that the callers of "heap_beginscan" expect the
returned data to always contain complete records, with all columns.

For example, in a sequential scan, the heap returns a slot containing
the tuple, or a value array of all the columns; the data is then
filtered, and the unnecessary columns are removed later by projection.
This works fine for row-based storage. For columnar storage, if
the storage knows that the upper layers need only particular columns,
it can return just those columns directly, and there is no
need for the projection step. This would also help columnar storage
return the required columns faster.

Would it be good to pass the plan to the storage, so that it can find out
which columns need to be returned? Also, if the storage itself can
handle the projection in some scenarios, the callers need to be
informed that there is no need to perform the projection
again.
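
As a rough illustration of the question above, here is a minimal sketch
of a scan interface that receives the list of needed columns and reports
back whether the storage already did the projection. All Demo*/demo_*
names, the fixed-size integer table, and the projection_done flag are
assumptions made up for this example; this is not the heap_beginscan
signature and not a proposed API.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NCOLS 3
#define DEMO_NROWS 4

/* Hypothetical column-major table: one array of values per column. */
typedef struct DemoColumnarTable {
    int columns[DEMO_NCOLS][DEMO_NROWS];
} DemoColumnarTable;

/* Hypothetical scan descriptor. */
typedef struct DemoScan {
    const DemoColumnarTable *table;
    bool   needed[DEMO_NCOLS];  /* columns the upper layers asked for */
    size_t next_row;
    bool   projection_done;     /* tells the caller to skip its own projection */
} DemoScan;

/* Begin a scan, passing the needed column numbers (0-based) to the storage. */
DemoScan demo_beginscan(const DemoColumnarTable *table,
                        const int *attnums, size_t nattnums)
{
    /* Fields not named here, including needed[], start out zero/false. */
    DemoScan scan = { .table = table, .next_row = 0, .projection_done = true };

    for (size_t i = 0; i < nattnums; i++) {
        if (attnums[i] >= 0 && attnums[i] < DEMO_NCOLS)
            scan.needed[attnums[i]] = true;
    }
    return scan;
}

/*
 * Fetch the next row. Only the needed columns are read from storage;
 * the others are reported as null. Returns false at end of scan.
 */
bool demo_getnext(DemoScan *scan, int *values, bool *isnull)
{
    if (scan->next_row >= DEMO_NROWS)
        return false;

    for (size_t col = 0; col < DEMO_NCOLS; col++) {
        isnull[col] = !scan->needed[col];
        values[col] = scan->needed[col]
            ? scan->table->columns[col][scan->next_row]
            : 0;
    }
    scan->next_row++;
    return true;
}
```

A caller that sees projection_done set could use the returned values
as is; a row-based AM would simply leave the flag false and return
every column, so the existing projection step would still run.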

Comments?

Regards,
Haribabu Kommi
Fujitsu Australia
