Re: inherit support for foreign tables - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: inherit support for foreign tables
Date
Msg-id 5476FB4B.9060705@lab.ntt.co.jp
Whole thread Raw
In response to Re: inherit support for foreign tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: inherit support for foreign tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
(2014/11/17 17:55), Ashutosh Bapat wrote:
> Here are my review comments for patch fdw-inh-3.patch.

Thanks for the review!

> Tests
> -------
> 1. It seems like you have copied from testcase inherit.sql to
> postgres_fdw testcase. That's a good thing, but it makes the test quite
> long. May be we should have two tests in postgres_fdw contrib module,
> one for simple cases, and other for inheritance. What do you say?

IMO, the test is not so time-consuming, so it doesn't seem worth 
splitting it into two.

> Documentation
> --------------------
> 1. The change in ddl.sgml
> -        We will refer to the child tables as partitions, though they
> -        are in every way normal <productname>PostgreSQL</> tables.
> +        We will refer to the child tables as partitions, though we assume
> +        that they are normal <productname>PostgreSQL</> tables.
>
> adds phrase "we assume that", which confuses the intention behind the
> sentence. The original text is intended to highlight the equivalence
> between "partition" and "normal table", where as the addition esp. the
> word "assume" weakens that equivalence. Instead now we have to highlight
> the equivalence between "partition" and "normal or foreign table". The
> wording similar to "though they are either normal or foreign tables"
> should be used there.

You are right, but I feel that there is something out of place in saying 
that there (5.10.2. Implementing Partitioning) because the procedure 
there has been written based on normal tables only.  Put another way, if 
we say that, I think we'd need to add more docs, describing the syntax 
and/or the corresponding examples for foreign-table cases.  But I'd like 
to leave that for another patch.  So, how about the wording "we assume 
*here* that", instead of "we assume that", as I added the following 
notice in the previous section (5.10.1. Overview)?

@@ -2650,7 +2669,10 @@ VALUES ('Albany', NULL, NULL, 'NY');     table of a single parent table.  The parent table
itselfis normally     empty; it exists just to represent the entire data set.  You should be     familiar with
inheritance(see <xref linkend="ddl-inherit">) before
 
-    attempting to set up partitioning.
+    attempting to set up partitioning.  (The setup and management of
+    partitioned tables illustrated in this section assume that each
+    partition is a normal table.  However, you can do that in a similar way
+    for cases where some or all partitions are foreign tables.)

> 2. The wording "some kind of optimization" gives vague picture. May be
> it can be worded as "Since the constraints are assumed to be true, they
> are used in constraint-based query optimization like constraint
> exclusion for partitioned tables.".
> +    Those constraints are used in some kind of query optimization such
> +    as constraint exclusion for partitioned tables (see
> +    <xref linkend="ddl-partitioning">).

Will follow your revision.

> Code
> -------
> 1. In the following change
> +/*
>    * acquire_inherited_sample_rows -- acquire sample rows from
> inheritance tree
>    *
>    * This has the same API as acquire_sample_rows, except that rows are
>    * collected from all inheritance children as well as the specified table.
> - * We fail and return zero if there are no inheritance children.
> + * We fail and return zero if there are no inheritance children or
> there are
> + * inheritance children that foreign tables.
>
> The addition should be "there are inheritance children that *are all
> *foreign tables. Note the addition "are all".

Sorry, I incorrectly wrote the comment.  What I tried to write is "We 
fail and return zero if there are no inheritance children or if we are 
not in VAC_MODE_SINGLE case and inheritance tree contains at least one 
foreign table.".

> 2. The function has_foreign() be better named has_foreign_child()? This

How about "has_foreign_table"?

> function loops through all the tableoids passed even after it has found
> a foreign table. Why can't we return true immediately after finding the
> first foreign table, unless the side effects of heap_open() on all the
> table are required. But I don't see that to be the case, since these
> tables are locked already through a previous call to heap_open(). In the

Good catch!  Will fix.

> same function instead of argument name parentrelId may be we should use
> name parent_oid, so that we use same notation for parent and child table
> OIDs.

Will fix.

> 3.  Regarding enum VacuumMode - it's being used only in case of
> acquire_inherited_sample_rows() and that too, to check only a single
> value of the three defined there. May be we should just infer that value
> inside acquire_inherited_sample_rows() or pass a boolean true or false
> from do_analyse_rel() based on the VacuumStmt. I do not see need for a
> separate three value enum of which only one value is used and also to
> pass it down from vacuum() by changing signatures of the minion functions.

I introduced that for possible future use.  See the discussion in [1].

> 4. In postgresGetForeignPlan(), the code added by this patch is required
> to handle the case when the row mark is placed on a parent table and
> hence is required to be applied on the child table. We need a comment
> explaining this. Otherwise, the three step process to get the row mark
> information isn't clear for a reader.

Will add the comment.

> 5. In expand_inherited_rtentry() why do you need a separate variable
> hasForeign? Instead of using that variable, you can actually set/reset
> rte->hasForeign since existence of even a single foreign child would
> mark that member as true. - After typing this, I got the answer when I
> looked at the function code. Every child's RTE is initially a copy of
> parent's RTE and hence hasForeign status would be inherited by every
> child after the first foreign child. I think, this reasoning should be
> added as comment before assignment to rte->hasForeign at the end of the
> function.

As you mentioned, I think we could set rte->hasForeign directly during 
the scan for the inheritance set, without the separate variable, 
hasForeign.  But ISTM that it'd improve code readability to set 
rte->hasForeign using the separate variable at the end of the function 
because rte->hasForeign has its meaning only when rte->inh is true and 
because we know whether rte->inh is true, at the end of the function.

> 6. The tests in foreign_data.sql are pretty extensive. Thanks for that.
> I think, we should also check the effect of each of the following
> command using \d on appropriate tables.
> +CREATE FOREIGN TABLE ft2 () INHERITS (pt1)
> +  SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted" 'value');
> +ALTER FOREIGN TABLE ft2 NO INHERIT pt1;
> +DROP FOREIGN TABLE ft2;
> +CREATE FOREIGN TABLE ft2 (
> +   c1 integer NOT NULL,
> +   c2 text,
> +   c3 date
> +) SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted" 'value');
> +ALTER FOREIGN TABLE ft2 INHERIT pt1;

Will fix.

Apart from the above, I noticed that the patch doesn't consider to call 
ExplainForeignModify during EXPLAIN for an inherited UPDATE/DELETE, as 
shown below (note that there are no UPDATE remote queries displayed):

postgres=# explain verbose update parent set a = a * 2 where a = 5;                                     QUERY PLAN
------------------------------------------------------------------------------------- Update on public.parent
(cost=0.00..280.77rows=25 width=10)   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)         Output:
(parent.a* 2), parent.ctid         Filter: (parent.a = 5)   ->  Foreign Scan on public.ft1  (cost=100.00..140.38
rows=12width=10)         Output: (ft1.a * 2), ft1.ctid         Remote SQL: SELECT a, ctid FROM public.mytable_1 WHERE
((a= 
 
5)) FOR UPDATE   ->  Foreign Scan on public.ft2  (cost=100.00..140.38 rows=12 width=10)         Output: (ft2.a * 2),
ft2.ctid        Remote SQL: SELECT a, ctid FROM public.mytable_2 WHERE ((a = 
 
5)) FOR UPDATE
(10 rows)

So, I'd like to modify explain.c to show those queries like this:

postgres=# explain verbose update parent set a = a * 2 where a = 5;                                     QUERY PLAN
------------------------------------------------------------------------------------- Update on public.parent
(cost=0.00..280.77rows=25 width=10)   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)         Output:
(parent.a* 2), parent.ctid         Filter: (parent.a = 5)   Remote SQL: UPDATE public.mytable_1 SET a = $2 WHERE ctid =
$1  ->  Foreign Scan on public.ft1  (cost=100.00..140.38 rows=12 width=10)         Output: (ft1.a * 2), ft1.ctid
Remote SQL: SELECT a, ctid FROM public.mytable_1 WHERE ((a = 
 
5)) FOR UPDATE   Remote SQL: UPDATE public.mytable_2 SET a = $2 WHERE ctid = $1   ->  Foreign Scan on public.ft2
(cost=100.00..140.38rows=12 width=10)         Output: (ft2.a * 2), ft2.ctid         Remote SQL: SELECT a, ctid FROM
public.mytable_2WHERE ((a = 
 
5)) FOR UPDATE
(12 rows)

What do you say?

Sorry for the delay.

[1] http://www.postgresql.org/message-id/1316566782-sup-2678@alvh.no-ip.org

Best regards,
Etsuro Fujita



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: [pgsql-packagers] Palle Girgensohn's ICU patch
Next
From: Kouhei Kaigai
Date:
Subject: Re: [v9.5] Custom Plan API