Re: [HACKERS] Effect of dropping a partitioned table's column overtime - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] Effect of dropping a partitioned table's column overtime
Date
Msg-id 337463a4-b0b5-ba40-0965-492ffad8a740@lab.ntt.co.jp
Whole thread Raw
In response to [HACKERS] Effect of dropping a partitioned table's column over time  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] Effect of dropping a partitioned table's column over time  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Hi Thomas,

On 2017/08/07 10:58, Thomas Munro wrote:
> Hi hackers,
> 
> If you drop a column from a partitioned table then it has a TupleDesc
> that matches existing partitions, but new partitions created after
> that have non-same TupleDescs (according to convert_tuples_by_name)
> because they don't have the dropped column.  That means that inserts
> to partitions created later need to go via the deform->remap->form
> code path in tupconvert.c.  If you're using a time-based partitioning
> scheme where you add a new partition for each month and mostly insert
> into the current month, as is very common, then after dropping a
> column you'll eventually finish up sending ALL your inserts through
> tupconvert.c for the rest of time.

That's good observation.

> For example, having hacked my tree to print out a message to tell me
> if it had to convert a tuple:
> 
> postgres=# create table parent (a int, b int) partition by list (b);
> CREATE TABLE
> postgres=# create table child1 partition of parent for values in (1);
> CREATE TABLE
> postgres=# alter table parent drop column a;
> ALTER TABLE
> postgres=# create table child2 partition of parent for values in (2);
> CREATE TABLE
> postgres=# insert into parent values (1);
> NOTICE:  no map
> INSERT 0 1
> postgres=# insert into parent values (2);
> NOTICE:  map!
> INSERT 0 1
> 
> Of course there are other usage patterns where you might prefer it
> this way, because you'll mostly be inserting into partitions created
> before the change.  In general, would it be better for the partitioned
> table's TupleDesc to match partitions created before or after a
> change?  Since partitioned tables have no storage themselves, is there
> any technical reason we couldn't remove a partitioned table's dropped
> pg_attribute so that its TupleDesc matches partitions created later?

That means the parent's TupleDesc will begin mismatching that of all of
the existing partitions and they will suddenly need a map where they
didn't before.

I guess you considered it, but optimizing for the common case of range
partitioning where most of the inserts go to the newest partition will
hurt the other partitioning methods, like hash, where that won't
necessarily be true.

> Is there some way that tupconvert.c could make this type of difference
> moot?

Do you mean the difference arising due to dropped columns in either the
partitioned table or the table attached as a partition?

Thanks,
Amit




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Effect of dropping a partitioned table's column over time
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] scan on inheritance parent with no children in currentsession