Re: [v9.3] writable foreign tables - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [v9.3] writable foreign tables
Date
Msg-id CAPpHfds=9aJK2OPjJJzER4XXa43YFDHckuAb3pmCdNpCVfZMZw@mail.gmail.com
In response to Re: [v9.3] writable foreign tables  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: [v9.3] writable foreign tables
Re: [v9.3] writable foreign tables
List pgsql-hackers
On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
2012/9/23 Kohei KaiGai <kaigai@kaigai.gr.jp>:
> 2012/8/29 Kohei KaiGai <kaigai@kaigai.gr.jp>:
>> 2012/8/28 Kohei KaiGai <kaigai@kaigai.gr.jp>:
>>> 2012/8/28 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> Kohei KaiGai <kaigai@kaigai.gr.jp> writes:
>>>>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>>>>> that is NULL for anything but virtual tuples from foreign tables?
>>>>
>>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>>>> carry a private datum between scan and modify stage.
>>>>
>>>> It's not.
>>>>
>>>>> Is it possible to utilize ctid field to move a private pointer?
>>>>
>>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
>>>> TID from scan to modify --- in fact, most of the time what the modify
>>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>>>> *got* a physical CTID field.
>>>>
>>>> Instead, the planner arranges for the TID to be carried up as an
>>>> explicit resjunk column named ctid.  (Currently this is done in
>>>> rewriteTargetListUD(), but see also preptlist.c which does some related
>>>> things for SELECT FOR UPDATE.)
>>>>
>>>> I'm inclined to think that what we need here is for FDWs to be able to
>>>> modify the details of that behavior, at least to the extent of being
>>>> able to specify a different data type than TID for the row
>>>> identification column.
>>>>
>>> Hmm. It seems to me a straight-forward solution rather than ab-use
>>> of ctid system column. Probably, cstring data type is more suitable
>>> to carry a private datum between scan and modify stage.
>>>
>>> One problem I noticed is how FDW driver returns an extra field that
>>> is in neither system nor regular column.
>>> Number of columns and its data type are defined with TupleDesc of
>>> the target foreign-table, so we also need a feature to extend it on
>>> run-time. For example, FDW driver may have to be able to extend
>>> a "virtual" column with cstring data type, even though the target
>>> foreign table does not have such a column.
>>>
>> I tried to investigate the related routines.
>>
>> TupleDesc of TupleTableSlot associated with ForeignScanState
>> is initialized at ExecInitForeignScan as literal.
>> ExecAssignScanType assigns TupleDesc of the target foreign-
>> table on tts_tupleDescriptor, "as-is".
>> It is the reason why IterateForeignScan cannot return a private
>> datum except for the columns being declared as regular ones.
>>
> The attached patch improved its design according to the upthread
> discussion. It now got away from ab-use of "ctid" field, and adopts
> a concept of pseudo-column to hold row-id with opaque data type
> instead.
>
> Pseudo-column is Var reference towards attribute-number larger
> than number of attributes on the target relation; thus, it is not
> a substantial object. It is normally unavailable to reference such
> a larger attribute number because TupleDesc of each ScanState
> associated with a particular relation is initialized at ExecInitNode.
>
> The patched ExecInitForeignScan was extended to generate its
> own TupleDesc including pseudo-column definitions on the fly,
> instead of relation's one, when scan-plan of foreign-table requires
> to have pseudo-columns.
>
> Right now, the only possible pseudo-column is "rowid" being
> injected at rewriteTargetListUD(). It has no data format
> restriction like "ctid" because of VOID data type.
> FDW extension can set an appropriate value on the "rowid"
> field in addition to contents of regular columns at
> IterateForeignScan method, to track which remote row should
> be updated or deleted.
>
> Another possible usage of this pseudo-column is push-down
> of target-list including complex calculation. It may enable to
> move complex mathematical formula into remote devices
> (such as GPU device?) instead of just a reference of Var node.
>
> This patch adds a new interface: GetForeignRelInfo being invoked
> from get_relation_info() to adjust width of RelOptInfo->attr_needed
> according to the target-list which may contain "rowid" pseudo-column.
> Some FDW extension may use this interface to push-down a part of
> target list into remote side, even though I didn't implement this
> feature on file_fdw.
>
> RelOptInfo->max_attr is a good marker whether the plan shall have
> pseudo-column reference. Then, ExecInitForeignScan determines
> whether it should generate a TupleDesc, or not.
>
> The "rowid" is fetched using ExecGetJunkAttribute as we are currently
> doing on regular tables using "ctid", then it shall be delivered to
> ExecUpdate or ExecDelete. We can never expect the first argument of
> them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
> argument of BR-trigger routines redefined also.
>
> [kaigai@iwashi sepgsql]$ cat ~/testfile.csv
> 10      aaa
> 11      bbb
> 12      ccc
> 13      ddd
> 14      eee
> 15      fff
> [kaigai@iwashi sepgsql]$ psql postgres
> psql (9.3devel)
> Type "help" for help.
>
> postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
> INFO:  ftbl is the target relation of UPDATE
> INFO:  fdw_file: BeginForeignModify method
> INFO:  fdw_file: UPDATE (lineno = 4)
> INFO:  fdw_file: UPDATE (lineno = 5)
> INFO:  fdw_file: UPDATE (lineno = 6)
> INFO:  fdw_file: EndForeignModify method
>  a  |                b
> ----+----------------------------------
>  13 | 77963b7a931377ad4ab5ad6a9cd718aa
>  14 | d2f2297d6e829cd3493aa7de4416a18f
>  15 | 343d9040a671c45832ee5381860e2996
> (3 rows)
>
> UPDATE 3
> postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
> INFO:  ftbl is the target relation of DELETE
> INFO:  fdw_file: BeginForeignModify method
> INFO:  fdw_file: DELETE (lineno = 2)
> INFO:  fdw_file: DELETE (lineno = 4)
> INFO:  fdw_file: DELETE (lineno = 6)
> INFO:  fdw_file: EndForeignModify method
>  a  |  b
> ----+-----
>  11 | bbb
>  13 | ddd
>  15 | fff
> (3 rows)
>
> DELETE 3
>
> In addition, there is a small improvement. ExecForeignInsert,
> ExecForeignUpdate and ExecForeignDelete get being able
> to return number of processed rows; that allows to push-down
> whole the statement into remote-side, if it is enough simple
> (e.g, delete statement without any condition).
>
> Even though it does not make matter right now, pseudo-columns
> should be adjusted when foreign-table is referenced with table
> inheritance feature, because an attribute number being enough
> large in parent table is not enough large in child table.
> We need to fix up them until foreign table feature got inheritance
> capability.
>
> I didn't update the documentation stuff because I positioned
> the state of this patch as proof-of-concept now. Please note that.
>
A tiny bit of this patch was updated. I noticed INTERNAL data type
is more suitable to move a private datum from scan-stage to
modify-stage, because its type-length is declared as SIZEOF_POINTER.

Also, I fixed up some obvious compiler warnings.

I've read the previous discussion about this patch. It has mostly concentrated on the question of how to identify a foreign table row. Your latest patch introduces a "rowid" pseudo-column for foreign table row identification. My notes are as follows:
1) AFAICS your patch is designed to support an arbitrary number of pseudo-columns, while only one is currently used. Do you see more use cases for pseudo-columns?
2) You wrote that an FDW may or may not support writes depending on whether it provides the corresponding functions. However, it's likely that some tables of the same FDW will be writable while others are not. I think we should have some mechanism for an FDW to tell whether a particular table is writable.
3) I have another point about identification of foreign rows which I haven't seen in the previous discussion. What if we restrict writable foreign tables to those that have a set of columns which uniquely identifies a row? Many replication systems have this restriction: replicated tables must have a unique key. In the case of a text or CSV file, I don't see why making the line-number column user-visible would be bad.
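
To make notes 2 and 3 concrete, here is a rough toy model in Python. It is only a sketch and not PostgreSQL's FDW API: the class ToyForeignTable, its writable flag and its methods are hypothetical names made up for illustration. The scan stage exposes the line number as an ordinary row key, the modify stage uses that key to find the row, and each table carries its own writable flag. The data and the two statements are taken from the test session quoted above.

```python
import hashlib


class ToyForeignTable:
    """Toy stand-in for a file-backed foreign table (hypothetical, not the FDW API)."""

    def __init__(self, lines, writable=True):
        # Each line is "a<TAB>b". A deleted line is kept as None so that the
        # line numbers of the remaining rows stay stable.
        self.lines = list(lines)
        self.writable = writable  # per-table flag, as suggested in note 2

    def scan(self):
        # Scan stage: yield (lineno, a, b); lineno is the user-visible row key.
        for lineno, line in enumerate(self.lines, start=1):
            if line is None:
                continue
            a, b = line.split("\t")
            yield lineno, int(a), b

    def _check_writable(self):
        if not self.writable:
            raise PermissionError("this foreign table is read-only")

    def update(self, lineno, a, b):
        # Modify stage: the row is located by its key, not by a private datum.
        self._check_writable()
        self.lines[lineno - 1] = f"{a}\t{b}"

    def delete(self, lineno):
        self._check_writable()
        self.lines[lineno - 1] = None


def md5_hex(text):
    return hashlib.md5(text.encode()).hexdigest()


ROWS = ["10\taaa", "11\tbbb", "12\tccc", "13\tddd", "14\teee", "15\tfff"]

# UPDATE ftbl SET b = md5(b) WHERE a > 12
tbl = ToyForeignTable(ROWS)
updated = [(n, a, b) for n, a, b in tbl.scan() if a > 12]
for n, a, b in updated:
    tbl.update(n, a, md5_hex(b))
print([n for n, _, _ in updated])  # line numbers 4, 5, 6

# DELETE FROM ftbl WHERE a % 2 = 1 (run against a fresh copy of the file)
tbl2 = ToyForeignTable(ROWS)
deleted = [(n, a, b) for n, a, b in tbl2.scan() if a % 2 == 1]
for n, _, _ in deleted:
    tbl2.delete(n)
print([n for n, _, _ in deleted])  # line numbers 2, 4, 6
```

In this model the row identifier needs no opaque data type at all, because it is just another column value carried from scan to modify. A table created with writable=False rejects update and delete even though the wrapper implements both.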
 
------
With best regards,
Alexander Korotkov.
