Re: [HACKERS] reload-through-the-top-parent switch the partition table - Mailing list pgsql-hackers

From Rushabh Lathia
Subject Re: [HACKERS] reload-through-the-top-parent switch the partition table
Date
Msg-id CAGPqQf38k6X7gHQf2OsYG1nd4SRrSmmpnpWpZv3nj3_uF71u7g@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] reload-through-the-top-parent switch the partition table  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] reload-through-the-top-parent switch the partition table
List pgsql-hackers


On Thu, Aug 10, 2017 at 8:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
>> (1) seems like a pretty arbitrary restriction, so I don't like that
>> option.  (2) would hurt performance in some use cases.  Do we have an
>> option (3)?
>
> How about protecting option 2) with the load-via-partition-root protection.
> Means
> load the parents information even dump is not set only when
> load-via-partition-root
> & ispartition.
>
> This won't hurt performance for the normal cases.

Yes, that seems like the right approach.

+        Dump data via the top-most partitioned table (rather than partition
+        table) when dumping data for the partition table.

I think we should phrase this a bit more clearly, something like this:
When dumping a COPY or INSERT statement for a partitioned table,
target the root of the partitioning hierarchy which contains it rather
than the partition itself.  This may be useful when reloading data on
a server where rows do not always fall into the same partitions as
they did on the original server.  This could happen, for example, if
the partitioning column is of type text and the two system have
different definitions of the collation used to partition the data.


Done.
 
+    printf(_("  --load-via-partition-root    load partition table via
the root relation\n"));

"relation" seems odd to me here.  root table?


Done.
 
         /* Don't bother computing anything for non-target tables, either */
         if (!tblinfo[i].dobj.dump)
+        {
+            /*
+             * Load the parents information for the partition table when
+             * the load-via-partition-root option is set. As we need the
+             * parents information to get the partition root.
+             */
+            if (dopt->load_via_partition_root &&
+                tblinfo[i].ispartition)
+                findParentsByOid(&tblinfo[i], inhinfo, numInherits);
             continue;
+        }

Duplicating the call to findParentsByOid seems less then ideal.  How
about doing something like this:

if (tblinfo[i].dobj.dump)
{
    find_parents = true;
    mark_parents = true;
}
else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
    find_parents = true;

if (find_parents)
    findParentsByOid(&tblinfo[i], inhinfo, numInherits);

etc.


Done changes to avoid duplicate call to findParentsByOid().
 
The comments for this function also need some work - e.g. the function
header comment deserves some kind of update for these changes.


Done.
 
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+    Assert(tbinfo->ispartition);
+    Assert(tbinfo->numParents == 1);
+    if (tbinfo->parents[0]->ispartition)
+        return getRootTableInfo(tbinfo->parents[0]);
+
+    return tbinfo->parents[0];
+}

This code should iterate, not recurse, to avoid any risk of blowing
out the stack.


Done.

Please find attach patch with the changes.

Thanks,
Rushabh Lathia
Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Foreign tables privileges not shown in information_schema.table_privileges
Next
From: Michael Paquier
Date:
Subject: Re: [HACKERS] SCRAM protocol documentation