Thread: how to "explain" some ddl

how to "explain" some ddl

From
Marc Millas
Date:
Hi,
We would like to understand where an alter table attach partition spend its time.
to my understanding, explain doesnt do this.

for a BI job we have a partitionned table with 1800+ partitions.
the feeding process of this table leeds to detach and attach partitions.
attaching do take time, something like 12 seconds by partition.
We need to understand where that time is spent (check constraints or check within the default partition or...)

So, how to ?

thanks,
regards,

Marc MILLAS
Senior Architect
+33607850334

Re: how to "explain" some ddl

From
Tom Lane
Date:
Marc Millas <marc.millas@mokadb.com> writes:
> We would like to understand where an alter table attach partition spend its
> time.
> to my understanding, explain doesnt do this.

Nope :-(.  As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.

There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.

> for a BI job we have a partitionned table with 1800+ partitions.

TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions.  We may never be at that
point, although people continue to chip away at the bottlenecks.

            regards, tom lane



Re: how to "explain" some ddl

From
Marc Millas
Date:
Hi,
thanks for the answer.
the pb is that the fact table do have mods for "old" data.
so the current scheme implies to truncate partitions and recreate them, and copy from ods to dm, etc which is better than millions (tens of) delete and vacuuming.
and so, the partitioning scheme is based on day s data. so the 1800+.

the other pb we do have is the very long planning time for most request. was 120 sec in r11, down to 60 sec in 12.
vs an exec time around 4 sec. Looks like the number of indexes is of paramount impact.
Can you take me to any doc about optimizing the index scheme for a fact table with 40 dimensions ?
thanks
regards,




Marc MILLAS
Senior Architect
+33607850334



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Marc Millas <marc.millas@mokadb.com> writes:
> We would like to understand where an alter table attach partition spend its
> time.
> to my understanding, explain doesnt do this.

Nope :-(.  As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.

There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.

> for a BI job we have a partitionned table with 1800+ partitions.

TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions.  We may never be at that
point, although people continue to chip away at the bottlenecks.

                        regards, tom lane

Re: how to "explain" some ddl

From
Marc Millas
Date:
Hi Tom,
a few tests later.
Looks like when you add a partition as default, all tupples of it are read, even if there is an index on the column that is the partition key.
this do explain our attach time. We are going to clean the default partition...

regards,

Marc MILLAS
Senior Architect
+33607850334



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Marc Millas <marc.millas@mokadb.com> writes:
> We would like to understand where an alter table attach partition spend its
> time.
> to my understanding, explain doesnt do this.

Nope :-(.  As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.

There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.

> for a BI job we have a partitionned table with 1800+ partitions.

TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions.  We may never be at that
point, although people continue to chip away at the bottlenecks.

                        regards, tom lane

Re: how to "explain" some ddl

From
Michel Pelletier
Date:
Marc,

If you add a check constraint that proves the new child partition has no out of bounds rows, then the ATTACH PARTITION will not block:

"Before running the ATTACH PARTITION command, it is recommended to create a CHECK constraint on the table to be attached matching the desired partition constraint. That way, the system will be able to skip the scan to validate the implicit partition constraint. Without the CHECK constraint, the table will be scanned to validate the partition constraint while holding an ACCESS EXCLUSIVE lock on that partition and a SHARE UPDATE EXCLUSIVE lock on the parent table. It may be desired to drop the redundant CHECK constraint after ATTACH PARTITION is finished."


As for your high dimension table with lots of indexes, are you sure they are all being used?  I almost always find my legacy customers have many indexes that are constantly being updated but are never used by their applications due to either "framework confusion" or just overzealous indexing.   Here's a good article by Laurenze Albe on the subject:


-Michel

On Tue, Jul 14, 2020 at 12:32 PM Marc Millas <marc.millas@mokadb.com> wrote:
Hi Tom,
a few tests later.
Looks like when you add a partition as default, all tupples of it are read, even if there is an index on the column that is the partition key.
this do explain our attach time. We are going to clean the default partition...

regards,

Marc MILLAS
Senior Architect
+33607850334



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Marc Millas <marc.millas@mokadb.com> writes:
> We would like to understand where an alter table attach partition spend its
> time.
> to my understanding, explain doesnt do this.

Nope :-(.  As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.

There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.

> for a BI job we have a partitionned table with 1800+ partitions.

TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions.  We may never be at that
point, although people continue to chip away at the bottlenecks.

                        regards, tom lane