Re: inherit support for foreign tables - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: inherit support for foreign tables
Date
Msg-id CAFjFpRf1a=ZqpT69BBhq11gypNEyVqFxhkrCRL=Ry8Xo657MBg@mail.gmail.com
Whole thread Raw
In response to Re: inherit support for foreign tables  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: inherit support for foreign tables  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
We haven't heard anything from Horiguchi-san and Hanada-san for almost a week. So, I am fine marking it as "ready for committer". What do you say?


On Wed, Dec 10, 2014 at 8:48 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi Ashutosh,

Thanks for the review!

(2014/11/28 18:14), Ashutosh Bapat wrote:
On Thu, Nov 27, 2014 at 3:52 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
    (2014/11/17 17:55), Ashutosh Bapat wrote:
        Here are my review comments for patch fdw-inh-3.patch.

        Tests
        -------
        1. It seems like you have copied from testcase inherit.sql to
        postgres_fdw testcase. That's a good thing, but it makes the
        test quite
        long. May be we should have two tests in postgres_fdw contrib
        module,
        one for simple cases, and other for inheritance. What do you say?

    IMO, the test is not so time-consuming, so it doesn't seem worth
    splitting it into two.

I am not worried about the timing but I am worried about the length of
the file and hence ease of debugging in case we find any issues there.
We will leave that for the commiter to decide.

OK


        Documentation
        --------------------
        1. The change in ddl.sgml
        -        We will refer to the child tables as partitions, though
        they
        -        are in every way normal <productname>PostgreSQL</> tables.
        +        We will refer to the child tables as partitions, though
        we assume
        +        that they are normal <productname>PostgreSQL</> tables.

        adds phrase "we assume that", which confuses the intention
        behind the
        sentence. The original text is intended to highlight the equivalence
        between "partition" and "normal table", where as the addition
        esp. the
        word "assume" weakens that equivalence. Instead now we have to
        highlight
        the equivalence between "partition" and "normal or foreign
        table". The
        wording similar to "though they are either normal or foreign tables"
        should be used there.

    You are right, but I feel that there is something out of place in
    saying that there (5.10.2. Implementing Partitioning) because the
    procedure there has been written based on normal tables only.  Put
    another way, if we say that, I think we'd need to add more docs,
    describing the syntax and/or the corresponding examples for
    foreign-table cases.  But I'd like to leave that for another patch.
    So, how about the wording "we assume *here* that", instead of "we
    assume that", as I added the following notice in the previous
    section (5.10.1. Overview)?

    @@ -2650,7 +2669,10 @@ VALUES ('Albany', NULL, NULL, 'NY');
          table of a single parent table.  The parent table itself is
    normally
          empty; it exists just to represent the entire data set.  You
    should be
          familiar with inheritance (see <xref linkend="ddl-inherit">)
    before
    -    attempting to set up partitioning.
    +    attempting to set up partitioning.  (The setup and management of
    +    partitioned tables illustrated in this section assume that each
    +    partition is a normal table.  However, you can do that in a
    similar way
    +    for cases where some or all partitions are foreign tables.)

This looks ok, though, I would like to see final version of the
document. But I think, we will leave that for committer to handle.

OK

        2. The wording "some kind of optimization" gives vague picture.
        May be
        it can be worded as "Since the constraints are assumed to be
        true, they
        are used in constraint-based query optimization like constraint
        exclusion for partitioned tables.".
        +    Those constraints are used in some kind of query
        optimization such
        +    as constraint exclusion for partitioned tables (see
        +    <xref linkend="ddl-partitioning">).

    Will follow your revision.

Done.

        Code
        -------
        1. In the following change
        +/*
            * acquire_inherited_sample_rows -- acquire sample rows from
        inheritance tree
            *
            * This has the same API as acquire_sample_rows, except that
        rows are
            * collected from all inheritance children as well as the
        specified table.
        - * We fail and return zero if there are no inheritance children.
        + * We fail and return zero if there are no inheritance children or
        there are
        + * inheritance children that foreign tables.

        The addition should be "there are inheritance children that *are all
        *foreign tables. Note the addition "are all".

    Sorry, I incorrectly wrote the comment.  What I tried to write is
    "We fail and return zero if there are no inheritance children or if
    we are not in VAC_MODE_SINGLE case and inheritance tree contains at
    least one foreign table.".

You might want to use "English" description of VAC_MODE_SINGLE instead
of that macro in the comment, so that reader doesn't have to look up
VAC_MODE_SINGLE. But I think, we will leave this for the committer.

I corrected the comments and translated the macro into the English description.

        2. The function has_foreign() be better named
        has_foreign_child()? This

    How about "has_foreign_table"?

has_foreign_child() would be better, since these are "children" in the
inheritance hierarchy and not mere "table"s.

Done.  But I renamed it to has_foreign_children() because it sounds more natural at least to me.

        function loops through all the tableoids passed even after it
        has found
        a foreign table. Why can't we return true immediately after
        finding the
        first foreign table, unless the side effects of heap_open() on
        all the
        table are required. But I don't see that to be the case, since these
        tables are locked already through a previous call to
        heap_open(). In the

    Good catch!  Will fix.

        same function instead of argument name parentrelId may be we
        should use
        name parent_oid, so that we use same notation for parent and
        child table
        OIDs.

    Will fix.

Done.

        3.  Regarding enum VacuumMode - it's being used only in case of
        acquire_inherited_sample_rows(__) and that too, to check only a
        single
        value of the three defined there. May be we should just infer
        that value
        inside acquire_inherited_sample_rows(__) or pass a boolean true
        or false
        from do_analyse_rel() based on the VacuumStmt. I do not see need
        for a
        separate three value enum of which only one value is used and
        also to
        pass it down from vacuum() by changing signatures of the minion
        functions.

    I introduced that for possible future use.  See the discussion in [1].

Will leave it for the commiter to decide.

I noticed that the signatures need not to be modified, as you said. Thanks for pointing that out!  So, I revised the patch not to change the signatures, though I left the enum, renaming it to AnalyzeMode.  Let's have the committer's review.

        4. In postgresGetForeignPlan(), the code added by this patch is
        required
        to handle the case when the row mark is placed on a parent table and
        hence is required to be applied on the child table. We need a
        comment
        explaining this. Otherwise, the three step process to get the
        row mark
        information isn't clear for a reader.

    Will add the comment.

Done.

        5. In expand_inherited_rtentry() why do you need a separate variable
        hasForeign? Instead of using that variable, you can actually
        set/reset
        rte->hasForeign since existence of even a single foreign child would
        mark that member as true. - After typing this, I got the answer
        when I
        looked at the function code. Every child's RTE is initially a
        copy of
        parent's RTE and hence hasForeign status would be inherited by every
        child after the first foreign child. I think, this reasoning
        should be
        added as comment before assignment to rte->hasForeign at the end
        of the
        function.

    As you mentioned, I think we could set rte->hasForeign directly
    during the scan for the inheritance set, without the separate
    variable, hasForeign.  But ISTM that it'd improve code readability
    to set rte->hasForeign using the separate variable at the end of the
    function because rte->hasForeign has its meaning only when rte->inh
    is true and because we know whether rte->inh is true, at the end of
    the function.

Fine. Please use "hasForeignChild" instead of just "hasForeign" without
a clue as to what is "foreign" here.

Done.  But I renamed it to "hasForeignChildren".

        6. The tests in foreign_data.sql are pretty extensive. Thanks
        for that.
        I think, we should also check the effect of each of the following
        command using \d on appropriate tables.
        +CREATE FOREIGN TABLE ft2 () INHERITS (pt1)
        +  SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
        'value');
        +ALTER FOREIGN TABLE ft2 NO INHERIT pt1;
        +DROP FOREIGN TABLE ft2;
        +CREATE FOREIGN TABLE ft2 (
        +   c1 integer NOT NULL,
        +   c2 text,
        +   c3 date
        +) SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
        'value');
        +ALTER FOREIGN TABLE ft2 INHERIT pt1;

    Will fix.

Done.

    Apart from the above, I noticed that the patch doesn't consider to
    call ExplainForeignModify during EXPLAIN for an inherited
    UPDATE/DELETE, as shown below (note that there are no UPDATE remote
    queries displayed):

Since there seems to be no objections, I updated the patch as proposed upthread.  Here is an example.

postgres=# explain (format text, verbose) update parent as p set a = a * 2 returning *;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Update on public.parent p  (cost=0.00..202.33 rows=11 width=10)
   Output: p.a
   For public.ft1 p_1
     Remote SQL: UPDATE public.mytable_1 SET a = $2 WHERE ctid = $1 RETURNING a
   For public.ft2 p_2
     Remote SQL: UPDATE public.mytable_2 SET a = $2 WHERE ctid = $1 RETURNING a
   ->  Seq Scan on public.parent p  (cost=0.00..0.00 rows=1 width=10)
         Output: (p.a * 2), p.ctid
   ->  Foreign Scan on public.ft1 p_1  (cost=100.00..101.16 rows=5 width=10)
         Output: (p_1.a * 2), p_1.ctid
         Remote SQL: SELECT a, ctid FROM public.mytable_1 FOR UPDATE
   ->  Foreign Scan on public.ft2 p_2  (cost=100.00..101.16 rows=5 width=10)
         Output: (p_2.a * 2), p_2.ctid
         Remote SQL: SELECT a, ctid FROM public.mytable_2 FOR UPDATE
(14 rows)

Other changes:
* revised regression tests for contrib/file_fdw to refer to tableoid.
* revised docs a bit further.

Attached are updated patches.  Patch fdw-inh-5.patch has been created on top of patch fdw-chk-5.patch.  Patch fdw-chk-5.patch is basically the same as the previous one fdw-chk-4.patch, but I slightly modified that one.  The changes are the following.
* added to foreign_data.sql more tests for your comments.
* revised docs on ALTER FOREIGN TABLE a bit further.


Thanks,

Best regards,
Etsuro Fujita



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: logical column ordering
Next
From: "Amit Langote"
Date:
Subject: Re: On partitioning