Re: [HACKERS] Bug in pg_dump --table and --exclude-table fordeclarative partition table handling. - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] Bug in pg_dump --table and --exclude-table fordeclarative partition table handling.
Date
Msg-id CAFjFpRdDCrk_DNk1+GHCLjmEZaKrVbQNsPrgKipA6TTbUmuXgQ@mail.gmail.com
Whole thread Raw
In response to [HACKERS] Bug in pg_dump --table and --exclude-table for declarative partitiontable handling.  (amul sul <sulamul@gmail.com>)
List pgsql-hackers
On Tue, May 9, 2017 at 12:43 PM, amul sul <sulamul@gmail.com> wrote:
> Hi,
>
> Current pg_dump --exclude-table option excludes partitioned relation
> and dumps all its child relations and vice versa for --table option, which
> I think is incorrect.
>
> In this case we might need to explore all partitions and exclude or include
> from dump according to given pg_dump option, attaching WIP patch proposing
> same fix.   Thoughts/Comments?
>

+1 for fixing this.

Since we didn't catch this issue earlier it looks like we don't have
testcases testing this scenario. May be we should add those in the
patch.

The way this patch has been written, there is a possibility that a
partition would be included multiple times in the list of tables, if
names of the partition and the parent table both match the pattern.
That's not a correctness issue, but it would slow down
simple_oid_list_member() a bit because of longer OID list.

I am not sure what would be the use of dumping a partitioned table
without its partitions or vice-versa. I think, instead, look
partitioned table and its partitions as a single unit as far as dump
is concerned. So, either we dump partitioned table along with all its
partitions or we don't dump any of those. In that case, we should
modify the query in expand_table_name_patterns(), to not include
partitions in the result. Rest of the code in the patch would take
care of including/excluding the partitions when the parent table is
included/excluded.

This patch looks up pg_inherits for every partitioned tables that it
encounters, which is costly. Instead, a fix with better performance is
to set a table dumpable based on the corresponding setting of its
parent. The parent is available in TableInfo structure, but the
comment there says that it's set only for dumpable objects. The
comment is correct since flagInhTables() skips the tables with dump =
false. May be we should modify this function not to skip the
partitions, find their parent tables and set the dump flat based on
the parents. This means that the immediate parent's dump flag should
be set correctly before any of its children are encountered by the
flagInhTables() for the case of multi-level partitioning to work. I
don't think we can guarantee that. May be we can modify
flagInhTables() to set the parent tables of parent if that's not done
already and then set the dump flag from top parent to bottom. If we do
that, we will have to add code in tflagInhTables() to skip the entries
whose parents have been set already since those might have been set
because of an earlier grand child.

Even better if we could dump the partitions along with partitioned
table instead of creating separate TableInfo entries for those. But I
think that's a lot of change.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Bug in pg_dump --table and --exclude-table fordeclarative partition table handling.
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] Bug in pg_dump --table and --exclude-table fordeclarative partition table handling.