How to "explain" some DDL
Hi,
We would like to understand where an ALTER TABLE ... ATTACH PARTITION spends its
time.
To my understanding, EXPLAIN doesn't do this.
For a BI job we have a partitioned table with 1800+ partitions.
The feeding process of this table leads to detaching and attaching partitions.
Attaching takes time, something like 12 seconds per partition.
We need to understand where that time is spent (check constraints, or the check
within the default partition, or...).
So, how can we do this?
thanks,
regards,
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
Marc Millas <marc.millas@mokadb.com> writes:
We would like to understand where an ALTER TABLE ... ATTACH PARTITION spends its
time.
To my understanding, EXPLAIN doesn't do this.
Nope :-(. As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.
There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.
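For what it's worth, those progress-monitoring views can be polled from a second
session while the command runs in the first; a minimal sketch (these catalog views
exist as of PostgreSQL 12, but they cover commands like CREATE INDEX and VACUUM,
not ATTACH PARTITION itself):

```sql
-- In a second session, while the DDL runs in the first:
-- progress of a CREATE INDEX / REINDEX
SELECT pid, phase, blocks_done, blocks_total
FROM pg_stat_progress_create_index;

-- progress of a VACUUM
SELECT pid, phase, heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;
```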
For a BI job we have a partitioned table with 1800+ partitions.
TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions. We may never be at that
point, although people continue to chip away at the bottlenecks.
regards, tom lane
Hi,
thanks for the answer.
The problem is that the fact table does get modifications to "old" data,
so the current scheme involves truncating partitions, recreating them,
copying from the ODS to the DM, etc., which is better than tens of millions
of deletes and the ensuing vacuuming.
And so the partitioning scheme is based on one day's data per partition, hence
the 1800+.
The other problem we have is the very long planning time for most requests:
it was 120 seconds in release 11, down to 60 seconds in 12, versus an execution
time of around 4 seconds. It looks like the number of indexes has a paramount
impact.
Can you point me to any doc about optimizing the index scheme for a fact
table with 40 dimensions?
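Incidentally, the planning-versus-execution split can be measured per query;
a minimal sketch (the table and filter are hypothetical):

```sql
-- EXPLAIN ANALYZE reports both "Planning Time" and "Execution Time"
-- at the bottom of its output; SUMMARY is on by default with ANALYZE.
EXPLAIN (ANALYZE, SUMMARY)
SELECT count(*)
FROM fact_table                  -- hypothetical fact table
WHERE day = DATE '2020-07-01';
```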
thanks
regards,
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
On Tue, Jul 14, 2020 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hi Tom,
A few tests later:
it looks like when you attach a partition while a default partition exists,
all tuples of the default partition are read, even if there is an index on the
column that is the partition key.
This does explain our attach time. We are going to clean out the default
partition...
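One way to see what sits in the default partition, and to relocate misplaced
rows before attaching, can be sketched as follows (all table and column names
are hypothetical, and a range partitioning scheme by day is assumed):

```sql
-- How many rows sit in the default partition, per day?
SELECT day, count(*)
FROM fact_default                -- hypothetical default partition
GROUP BY day
ORDER BY day;

-- Move one day's rows out of the default partition into a proper
-- partition, inside a single transaction.
BEGIN;
CREATE TABLE fact_2020_07_01 (LIKE fact_table INCLUDING DEFAULTS);
WITH moved AS (
    DELETE FROM fact_default
    WHERE day = DATE '2020-07-01'
    RETURNING *
)
INSERT INTO fact_2020_07_01 SELECT * FROM moved;
ALTER TABLE fact_table
    ATTACH PARTITION fact_2020_07_01
    FOR VALUES FROM ('2020-07-01') TO ('2020-07-02');
COMMIT;
```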
regards,
Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com
On Tue, Jul 14, 2020 at 7:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Marc,
If you add a CHECK constraint that proves the new child partition has no
out-of-bounds rows, then ATTACH PARTITION can skip the validation scan:
"Before running the ATTACH PARTITION command, it is recommended to create a
CHECK constraint on the table to be attached matching the desired partition
constraint. That way, the system will be able to skip the scan to validate
the implicit partition constraint. Without the CHECK constraint, the table
will be scanned to validate the partition constraint while holding an ACCESS
EXCLUSIVE lock on that partition and a SHARE UPDATE EXCLUSIVE lock on the
parent table. It may be desired to drop the redundant CHECK constraint
after ATTACH PARTITION is finished."
https://www.postgresql.org/docs/current/ddl-partitioning.html
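Following that recommendation, the attach sequence can be sketched like this
(table, column, and bound values are hypothetical):

```sql
-- Add a constraint matching the future partition bounds, so the
-- validation scan during ATTACH PARTITION can be skipped.
ALTER TABLE fact_2020_07_01
    ADD CONSTRAINT fact_2020_07_01_bounds
    CHECK (day >= DATE '2020-07-01' AND day < DATE '2020-07-02');

ALTER TABLE fact_table
    ATTACH PARTITION fact_2020_07_01
    FOR VALUES FROM ('2020-07-01') TO ('2020-07-02');

-- The constraint is now redundant with the implicit partition constraint.
ALTER TABLE fact_2020_07_01
    DROP CONSTRAINT fact_2020_07_01_bounds;
```

Note that adding the CHECK constraint still scans the table, but it does so
while holding strong locks only on that table, not on the parent.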
As for your high-dimension table with lots of indexes: are you sure they
are all being used? I almost always find my legacy customers have many
indexes that are constantly being updated but are never used by their
applications, due to either "framework confusion" or just overzealous
indexing. Here's a good article by Laurenz Albe on the subject:
https://www.cybertec-postgresql.com/en/get-rid-of-your-unused-indexes/
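Unused indexes can be spotted from the statistics views; a minimal sketch
(be sure to exclude indexes that enforce unique or primary key constraints
before dropping anything, and remember idx_scan counts only since the last
statistics reset):

```sql
-- Indexes that have never been scanned since statistics were last reset,
-- largest first.
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```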
-Michel
On Tue, Jul 14, 2020 at 12:32 PM Marc Millas <marc.millas@mokadb.com> wrote: