Re: [HACKERS] Pluggable storage - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: [HACKERS] Pluggable storage
Msg-id CAJrrPGfq2=qz2HmUUDjsueOfADOVAjx7Dbrodi9ZoAnDXO-TsA@mail.gmail.com
In response to Re: [HACKERS] Pluggable storage  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] Pluggable storage
List pgsql-hackers


On Thu, Jan 4, 2018 at 10:00 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
On Wed, Jan 3, 2018 at 10:08 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Wed, Dec 27, 2017 at 11:33 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

Also, I appreciate that the tuple_insert() and tuple_update() methods are now responsible for inserting index tuples.  This frees pluggable storage to implement other ways of interacting with indexes.  However, I didn't get the point of passing the InsertIndexTuples IndexFunc to them.  Currently, we always pass ExecInsertIndexTuples() as this argument.  As I understand it, the storage is free either to call ExecInsertIndexTuples() or to implement its own logic for interacting with indexes.  But I don't understand why we need a callback when tuple_insert() and tuple_update() can call ExecInsertIndexTuples() directly if needed.  Another thing is that tuple_delete() could also interact with indexes (especially once we enhance the index access method API), so we need to pass meta-information about indexes to tuple_delete() too.

The main reason I added the callback function is to avoid introducing a
dependency of the storage on executor functions. This way the storage can call the
function that is passed to it without knowing anything about the executor. In the
new patches I also added the function pointer to tuple_delete; currently it is
passed as NULL for heap. These APIs can be enhanced later.

Understood, but in order to implement alternative behavior with indexes (for example,
inserting index tuples into only some of the indexes), the storage AM will still have to call
executor functions.  So, yes, this needs to be enhanced.  Probably, we just need to implement
a nicer executor API for storage AMs.

OK.  
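
For readers following the callback discussion above, here is a minimal sketch of the
decoupling idea: the storage side receives a function pointer and never references an
executor symbol by name. Every name in it (DemoTuple, DemoStorage, demo_tuple_insert,
demo_exec_insert_index_tuples) is hypothetical and chosen for illustration only; this is
not the signature used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All types and functions here are hypothetical stand-ins, not PostgreSQL APIs. */
typedef struct DemoTuple { int key; } DemoTuple;

/* Callback type the "executor" hands to the storage AM. */
typedef int (*DemoInsertIndexTuplesFunc)(const DemoTuple *tuple, void *index_state);

/* "Executor" side: counts index entries, standing in for ExecInsertIndexTuples(). */
typedef struct DemoIndexState { int entries; } DemoIndexState;

int demo_exec_insert_index_tuples(const DemoTuple *tuple, void *index_state)
{
    DemoIndexState *state = index_state;

    (void) tuple;               /* a real implementation would build index entries */
    state->entries++;
    return 1;
}

/* "Storage" side: knows only the callback type, not the executor function. */
#define DEMO_MAX_TUPLES 8
typedef struct DemoStorage {
    DemoTuple slots[DEMO_MAX_TUPLES];
    int       ntuples;
} DemoStorage;

bool demo_tuple_insert(DemoStorage *storage, DemoTuple tuple,
                       DemoInsertIndexTuplesFunc index_func, void *index_state)
{
    assert(storage != NULL);
    if (storage->ntuples >= DEMO_MAX_TUPLES)
        return false;

    storage->slots[storage->ntuples] = tuple;

    /* The callback may be NULL, as it currently is for tuple_delete on heap. */
    if (index_func != NULL)
        index_func(&storage->slots[storage->ntuples], index_state);

    storage->ntuples++;
    return true;
}
```

A caller would pass demo_exec_insert_index_tuples together with a DemoIndexState, or NULL
to skip index maintenance. As noted above, a storage that wants to update only some of the
indexes would still need a richer executor API than this single callback.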
 
Apart from the rebase, I added a storage shared memory API; currently this API is used
only by syncscan. Also, all uses of the exposed syncscan functions outside the heap
have been removed.

This makes me uneasy.  You introduce two new hooks for size estimation and initialization
of the shared memory needed by storage AMs.  But if a storage AM is implemented in a shared
library, then this shared library can use our generic method for allocation of shared memory
(including memory needed by the storage AM).  If a storage AM is builtin, then hooks are also
not needed, because we know all our builtin storage AMs in advance.  To me, it would be
nice to encapsulate the heap AM's shared memory requirements in functions like
HeapAmShmemSize() and HeapAmShmemInit(), and not explicitly show outside that
this memory is needed for synchronized scan.  But separate hooks don't look justified to me.

Yes, I agree that for the builtin storages there is no need for hooks. But in the future,
if we want to support multiple storages in an instance, we may need hooks for shared memory
registration. I am fine with changing it.
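
A rough sketch of the encapsulation suggested above, under the assumption that the heap AM
exposes only a size function and an init function and keeps the synchronized-scan layout
private. The Demo-prefixed names and the struct layout are hypothetical and are not taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Private to the hypothetical heap AM: callers never learn that this memory
 * is used for synchronized scans.
 */
typedef struct DemoSyncScanState {
    int      nentries;
    unsigned start_block[16];
} DemoSyncScanState;

static DemoSyncScanState *demo_sync_scan_state = NULL;

/* How much shared memory the heap AM needs, for whatever internal purpose. */
size_t DemoHeapAmShmemSize(void)
{
    return sizeof(DemoSyncScanState);
}

/* Initialize the heap AM's part of an already-allocated region. */
void DemoHeapAmShmemInit(void *shmem_base)
{
    assert(shmem_base != NULL);
    demo_sync_scan_state = shmem_base;
    memset(demo_sync_scan_state, 0, sizeof(DemoSyncScanState));
}

/* Reports whether DemoHeapAmShmemInit() has run. */
int DemoHeapAmShmemIsInitialized(void)
{
    return demo_sync_scan_state != NULL;
}

/* Startup side: adds the builtin AM's request to the total, with no hook. */
size_t demo_total_shmem_size(size_t other_subsystems)
{
    return other_subsystems + DemoHeapAmShmemSize();
}
```

Because the builtin AM is known in advance, the startup code can call these two functions
directly; a hook would only become necessary if the set of storages were not known at
build time.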

Regards,
Hari Babu
Fujitsu Australia
