Re: inherit support for foreign tables - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: inherit support for foreign tables
Date
Msg-id 5487BB9D.7070605@lab.ntt.co.jp
Whole thread Raw
In response to Re: inherit support for foreign tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: inherit support for foreign tables
Re: inherit support for foreign tables
List pgsql-hackers
Hi Ashutosh,

Thanks for the review!

(2014/11/28 18:14), Ashutosh Bapat wrote:
> On Thu, Nov 27, 2014 at 3:52 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
>     (2014/11/17 17:55), Ashutosh Bapat wrote:
>         Here are my review comments for patch fdw-inh-3.patch.

>         Tests
>         -------
>         1. It seems like you have copied from testcase inherit.sql to
>         postgres_fdw testcase. That's a good thing, but it makes the
>         test quite
>         long. May be we should have two tests in postgres_fdw contrib
>         module,
>         one for simple cases, and other for inheritance. What do you say?

>     IMO, the test is not so time-consuming, so it doesn't seem worth
>     splitting it into two.

> I am not worried about the timing but I am worried about the length of
> the file and hence ease of debugging in case we find any issues there.
> We will leave that for the commiter to decide.

OK

>         Documentation
>         --------------------
>         1. The change in ddl.sgml
>         -        We will refer to the child tables as partitions, though
>         they
>         -        are in every way normal <productname>PostgreSQL</> tables.
>         +        We will refer to the child tables as partitions, though
>         we assume
>         +        that they are normal <productname>PostgreSQL</> tables.
>
>         adds phrase "we assume that", which confuses the intention
>         behind the
>         sentence. The original text is intended to highlight the equivalence
>         between "partition" and "normal table", where as the addition
>         esp. the
>         word "assume" weakens that equivalence. Instead now we have to
>         highlight
>         the equivalence between "partition" and "normal or foreign
>         table". The
>         wording similar to "though they are either normal or foreign tables"
>         should be used there.

>     You are right, but I feel that there is something out of place in
>     saying that there (5.10.2. Implementing Partitioning) because the
>     procedure there has been written based on normal tables only.  Put
>     another way, if we say that, I think we'd need to add more docs,
>     describing the syntax and/or the corresponding examples for
>     foreign-table cases.  But I'd like to leave that for another patch.
>     So, how about the wording "we assume *here* that", instead of "we
>     assume that", as I added the following notice in the previous
>     section (5.10.1. Overview)?
>
>     @@ -2650,7 +2669,10 @@ VALUES ('Albany', NULL, NULL, 'NY');
>           table of a single parent table.  The parent table itself is
>     normally
>           empty; it exists just to represent the entire data set.  You
>     should be
>           familiar with inheritance (see <xref linkend="ddl-inherit">)
>     before
>     -    attempting to set up partitioning.
>     +    attempting to set up partitioning.  (The setup and management of
>     +    partitioned tables illustrated in this section assume that each
>     +    partition is a normal table.  However, you can do that in a
>     similar way
>     +    for cases where some or all partitions are foreign tables.)

> This looks ok, though, I would like to see final version of the
> document. But I think, we will leave that for committer to handle.

OK

>         2. The wording "some kind of optimization" gives vague picture.
>         May be
>         it can be worded as "Since the constraints are assumed to be
>         true, they
>         are used in constraint-based query optimization like constraint
>         exclusion for partitioned tables.".
>         +    Those constraints are used in some kind of query
>         optimization such
>         +    as constraint exclusion for partitioned tables (see
>         +    <xref linkend="ddl-partitioning">).

>     Will follow your revision.

Done.

>         Code
>         -------
>         1. In the following change
>         +/*
>             * acquire_inherited_sample_rows -- acquire sample rows from
>         inheritance tree
>             *
>             * This has the same API as acquire_sample_rows, except that
>         rows are
>             * collected from all inheritance children as well as the
>         specified table.
>         - * We fail and return zero if there are no inheritance children.
>         + * We fail and return zero if there are no inheritance children or
>         there are
>         + * inheritance children that foreign tables.
>
>         The addition should be "there are inheritance children that *are all
>         *foreign tables. Note the addition "are all".

>     Sorry, I incorrectly wrote the comment.  What I tried to write is
>     "We fail and return zero if there are no inheritance children or if
>     we are not in VAC_MODE_SINGLE case and inheritance tree contains at
>     least one foreign table.".

> You might want to use "English" description of VAC_MODE_SINGLE instead
> of that macro in the comment, so that reader doesn't have to look up
> VAC_MODE_SINGLE. But I think, we will leave this for the committer.

I corrected the comments and translated the macro into the English
description.

>         2. The function has_foreign() be better named
>         has_foreign_child()? This

>     How about "has_foreign_table"?

> has_foreign_child() would be better, since these are "children" in the
> inheritance hierarchy and not mere "table"s.

Done.  But I renamed it to has_foreign_children() because it sounds more
natural at least to me.

>         function loops through all the tableoids passed even after it
>         has found
>         a foreign table. Why can't we return true immediately after
>         finding the
>         first foreign table, unless the side effects of heap_open() on
>         all the
>         table are required. But I don't see that to be the case, since these
>         tables are locked already through a previous call to
>         heap_open(). In the

>     Good catch!  Will fix.

>         same function instead of argument name parentrelId may be we
>         should use
>         name parent_oid, so that we use same notation for parent and
>         child table
>         OIDs.

>     Will fix.

Done.

>         3.  Regarding enum VacuumMode - it's being used only in case of
>         acquire_inherited_sample_rows(__) and that too, to check only a
>         single
>         value of the three defined there. May be we should just infer
>         that value
>         inside acquire_inherited_sample_rows(__) or pass a boolean true
>         or false
>         from do_analyse_rel() based on the VacuumStmt. I do not see need
>         for a
>         separate three value enum of which only one value is used and
>         also to
>         pass it down from vacuum() by changing signatures of the minion
>         functions.

>     I introduced that for possible future use.  See the discussion in [1].

> Will leave it for the commiter to decide.

I noticed that the signatures need not to be modified, as you said.
Thanks for pointing that out!  So, I revised the patch not to change the
signatures, though I left the enum, renaming it to AnalyzeMode.  Let's
have the committer's review.

>         4. In postgresGetForeignPlan(), the code added by this patch is
>         required
>         to handle the case when the row mark is placed on a parent table and
>         hence is required to be applied on the child table. We need a
>         comment
>         explaining this. Otherwise, the three step process to get the
>         row mark
>         information isn't clear for a reader.

>     Will add the comment.

Done.

>         5. In expand_inherited_rtentry() why do you need a separate variable
>         hasForeign? Instead of using that variable, you can actually
>         set/reset
>         rte->hasForeign since existence of even a single foreign child would
>         mark that member as true. - After typing this, I got the answer
>         when I
>         looked at the function code. Every child's RTE is initially a
>         copy of
>         parent's RTE and hence hasForeign status would be inherited by every
>         child after the first foreign child. I think, this reasoning
>         should be
>         added as comment before assignment to rte->hasForeign at the end
>         of the
>         function.

>     As you mentioned, I think we could set rte->hasForeign directly
>     during the scan for the inheritance set, without the separate
>     variable, hasForeign.  But ISTM that it'd improve code readability
>     to set rte->hasForeign using the separate variable at the end of the
>     function because rte->hasForeign has its meaning only when rte->inh
>     is true and because we know whether rte->inh is true, at the end of
>     the function.

> Fine. Please use "hasForeignChild" instead of just "hasForeign" without
> a clue as to what is "foreign" here.

Done.  But I renamed it to "hasForeignChildren".

>         6. The tests in foreign_data.sql are pretty extensive. Thanks
>         for that.
>         I think, we should also check the effect of each of the following
>         command using \d on appropriate tables.
>         +CREATE FOREIGN TABLE ft2 () INHERITS (pt1)
>         +  SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
>         'value');
>         +ALTER FOREIGN TABLE ft2 NO INHERIT pt1;
>         +DROP FOREIGN TABLE ft2;
>         +CREATE FOREIGN TABLE ft2 (
>         +   c1 integer NOT NULL,
>         +   c2 text,
>         +   c3 date
>         +) SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
>         'value');
>         +ALTER FOREIGN TABLE ft2 INHERIT pt1;

>     Will fix.

Done.

>     Apart from the above, I noticed that the patch doesn't consider to
>     call ExplainForeignModify during EXPLAIN for an inherited
>     UPDATE/DELETE, as shown below (note that there are no UPDATE remote
>     queries displayed):

Since there seems to be no objections, I updated the patch as proposed
upthread.  Here is an example.

postgres=# explain (format text, verbose) update parent as p set a = a *
2 returning *;
                                    QUERY PLAN
--------------------------------------------------------------------------------
  Update on public.parent p  (cost=0.00..202.33 rows=11 width=10)
    Output: p.a
    For public.ft1 p_1
      Remote SQL: UPDATE public.mytable_1 SET a = $2 WHERE ctid = $1
RETURNING a
    For public.ft2 p_2
      Remote SQL: UPDATE public.mytable_2 SET a = $2 WHERE ctid = $1
RETURNING a
    ->  Seq Scan on public.parent p  (cost=0.00..0.00 rows=1 width=10)
          Output: (p.a * 2), p.ctid
    ->  Foreign Scan on public.ft1 p_1  (cost=100.00..101.16 rows=5
width=10)
          Output: (p_1.a * 2), p_1.ctid
          Remote SQL: SELECT a, ctid FROM public.mytable_1 FOR UPDATE
    ->  Foreign Scan on public.ft2 p_2  (cost=100.00..101.16 rows=5
width=10)
          Output: (p_2.a * 2), p_2.ctid
          Remote SQL: SELECT a, ctid FROM public.mytable_2 FOR UPDATE
(14 rows)

Other changes:
* revised regression tests for contrib/file_fdw to refer to tableoid.
* revised docs a bit further.

Attached are updated patches.  Patch fdw-inh-5.patch has been created on
top of patch fdw-chk-5.patch.  Patch fdw-chk-5.patch is basically the
same as the previous one fdw-chk-4.patch, but I slightly modified that
one.  The changes are the following.
* added to foreign_data.sql more tests for your comments.
* revised docs on ALTER FOREIGN TABLE a bit further.

Thanks,

Best regards,
Etsuro Fujita

Attachment

pgsql-hackers by date:

Previous
From: Petr Jelinek
Date:
Subject: Re: Sequence Access Method WIP
Next
From: Michael Paquier
Date:
Subject: Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps