Questions about Partitioned Tables and Indexes - Mailing list pgsql-general
From | Evelyn Dibben |
---|---|
Subject | Questions about Partitioned Tables and Indexes |
Date | |
Msg-id | CADs92_+KSWg7=N70upjxs7E272MD6RANskbAftcNvZT6146D5g@mail.gmail.com Whole thread Raw |
Responses |
Re: Questions about Partitioned Tables and Indexes
|
List | pgsql-general |
I apologize for the lengthy post. I'm trying to get in all the details.
We recently upgraded our Postgres AWS RDS from 9.5 to 11.1.
We have several large partitioned tables implemented using inheritance that we are considering converting to declarative partitioning.
(I'm talking about 5TB of partitioned data). I want to be sure of my methodology before I push forward.
For example here is how we would have created a partitioned table with inheritance. The table has a primary key and an index. The inherited partition has a check constraint and an index. (Not shown is the trigger on the primary table that would put the new rows in the correct partition.)
CREATE TABLE test
(
date_key numeric(15,0) NOT NULL,
metric numeric(15,0) NOT NULL,
value numeric(28,5) NOT NULL,
CONSTRAINT test_pkey PRIMARY KEY (date_key,metric)
)
TABLESPACE pg_default;
CREATE INDEX test_idx1
ON test USING btree
(metric)
TABLESPACE pg_default;
CREATE TABLE test_201908
(
CONSTRAINT const_test_chk CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
)
INHERITS (test)
TABLESPACE pg_default;
CREATE INDEX test_idx1_201908
ON test_201908 USING btree
(metric)
TABLESPACE pg_default;
AMZGQ3DW=> \d+ edibben.test
Table "edibben.test"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Indexes:
"test_pkey" PRIMARY KEY, btree (date_key, metric)
"test_idx1" btree (metric)
Child tables: edibben.test_201908
AMZGQ3DW=> \d+ edibben.test_201908
Table "edibben.test_201908"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Indexes:
"test_idx1_201908" btree (metric)
Check constraints:
"const_test_chk" CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
Inherits: edibben.test
I know that I can convert this table into a declarative partitioned table by doing the following:
Create a new partitioned table:
CREATE TABLE test_part
(
date_key numeric(15,0) NOT NULL,
metric numeric(15,0) NOT NULL,
value numeric(28,5) NOT NULL,
CONSTRAINT test_part_pkey PRIMARY KEY (date_key,metric)
) PARTITION BY RANGE (date_key)
TABLESPACE pg_default;
CREATE INDEX test_part_idx1
ON test_part USING btree
(metric)
TABLESPACE pg_default;
Drop the inheritance on the test_201908 table:
alter table test_201908 no inherit test;
And then add this table to the partitioned table. The doco says to keep the check constraint in place until after the data is loaded.
alter table test_part
attach partition test_201908
for VALUES FROM (20190801) TO (20190831);
The partition shows up attached to the table:
\d+ edibben.test_part
Table "edibben.test_part"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition key: RANGE (date_key)
Indexes:
"test_part_pkey" PRIMARY KEY, btree (date_key, metric)
"test_part_idx1" btree (metric)
Partitions: edibben.test_201908 FOR VALUES FROM ('20190801') TO ('20190831')
My question is about what happens to the indexes. When you examine the partition you see the primary key inherited from the partition table
and the original index (test_idx1_201908).
AMZGQ3DW-> \d+ edibben.test_201908
Table "edibben.test_201908"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition of: edibben.test_part FOR VALUES FROM ('20190801') TO ('20190831')
Partition constraint: ((date_key IS NOT NULL) AND (date_key >= '20190801'::numeric(15,0)) AND (date_key < '20190831'::numeric(15,0)))
Indexes:
"test_201908_pkey" PRIMARY KEY, btree (date_key, metric)
"test_idx1_201908" btree (metric)
Check constraints:
"const_test_chk" CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
If I add a new partition to the test_part table
CREATE TABLE test_201909 PARTITION OF test_part
FOR VALUES FROM ('20190901') TO ('20190930');
The new table has the primary key and the index but the index has a system generated name.
$\d+ edibben.test_201909
Table "edibben.test_201909"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition of: edibben.test_part FOR VALUES FROM ('20190901') TO ('20190930')
Partition constraint: ((date_key IS NOT NULL) AND (date_key >= '20190901'::numeric(15,0)) AND (date_key < '20190930'::numeric(15,0)))
Indexes:
"test_201909_pkey" PRIMARY KEY, btree (date_key, metric)
"test_201909_metric_idx" btree (metric)
Looking at pg_class for the objects I just created:
AMZGQ3DW=> select relname, reltype, relkind,relowner from pg_class where relname like 'test%';
relname | reltype | relkind | relowner
------------------------+---------+---------+----------
test_201908 | 365444 | r | 98603
test_201908_pkey | 0 | i | 98603
test_idx1_201908 | 0 | i | 98603
test_201909 | 366498 | r | 98603
test_201909_metric_idx | 0 | i | 98603
test_201909_pkey | 0 | i | 98603
test_part | 365449 | p | 98603
test_part_idx1 | 0 | I | 98603
test_part_pkey | 0 | I | 98603
The indexes on the partitioned table have a relkind of I and the indexes on the partitions have a rekind of i. Looking at pg_indexes
there are no entries for the indexes on the primary table:
AMZGQ3DW=> select schemaname, tablename, indexname from pg_indexes where schemaname = 'edibben' and tablename = 'test_part';
schemaname | tablename | indexname
------------+-----------+-----------
(0 rows)
The indexes on the partitions do show up:
AMZGQ3DW=> select schemaname, tablename, indexname from pg_indexes where schemaname = 'edibben' and tablename like 'test%' order by tablename;
schemaname | tablename | indexname
------------+-------------+------------------------
edibben | test_201908 | test_201908_pkey
edibben | test_201908 | test_idx1_201908
edibben | test_201909 | test_201909_pkey
edibben | test_201909 | test_201909_metric_idx
So, is this partitioned table properly indexed? (yes there was a question buried in this mess). I can't find any documentation of how the
partitioned indexes work but it appears that the partitioned 'Index' is just a definition and that the real indexes are on the partitions themselves. Is there a way to list all of the indexes associated with a partitioned index? Is there a way to see if the partitioned index is valid?
Also, the doco talks about creating the index on the partitioned table with the CREATE INDEX ON ONLY option. I don't think this
applies to what I need to do. Am I right?
"As explained above, it is possible to create indexes on partitioned tables and they are applied automatically
to the entire hierarchy. This is very convenient, as not only the existing partitions will become indexed,
but also any partitions that are created in the future will. One limitation is that it's not possible to use
the CONCURRENTLY qualifier when creating such a partitioned index. To overcome long lock times,
it is possible to use CREATE INDEX ON ONLY the partitioned table; such an index is marked invalid,
and the partitions do not get the index applied automatically. The indexes on partitions can be created
separately using CONCURRENTLY, and later attached to the index on the parent using ALTER INDEX .. ATTACH PARTITION.
Once indexes for all partitions are attached to the parent index, the parent index is marked valid automatically."
We recently upgraded our Postgres AWS RDS from 9.5 to 11.1.
We have several large partitioned tables implemented using inheritance that we are considering converting to declarative partitioning.
(I'm talking about 5TB of partitioned data). I want to be sure of my methodology before I push forward.
For example here is how we would have created a partitioned table with inheritance. The table has a primary key and an index. The inherited partition has a check constraint and an index. (Not shown is the trigger on the primary table that would put the new rows in the correct partition.)
CREATE TABLE test
(
date_key numeric(15,0) NOT NULL,
metric numeric(15,0) NOT NULL,
value numeric(28,5) NOT NULL,
CONSTRAINT test_pkey PRIMARY KEY (date_key,metric)
)
TABLESPACE pg_default;
CREATE INDEX test_idx1
ON test USING btree
(metric)
TABLESPACE pg_default;
CREATE TABLE test_201908
(
CONSTRAINT const_test_chk CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
)
INHERITS (test)
TABLESPACE pg_default;
CREATE INDEX test_idx1_201908
ON test_201908 USING btree
(metric)
TABLESPACE pg_default;
AMZGQ3DW=> \d+ edibben.test
Table "edibben.test"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Indexes:
"test_pkey" PRIMARY KEY, btree (date_key, metric)
"test_idx1" btree (metric)
Child tables: edibben.test_201908
AMZGQ3DW=> \d+ edibben.test_201908
Table "edibben.test_201908"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Indexes:
"test_idx1_201908" btree (metric)
Check constraints:
"const_test_chk" CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
Inherits: edibben.test
I know that I can convert this table into a declarative partitioned table by doing the following:
Create a new partitioned table:
CREATE TABLE test_part
(
date_key numeric(15,0) NOT NULL,
metric numeric(15,0) NOT NULL,
value numeric(28,5) NOT NULL,
CONSTRAINT test_part_pkey PRIMARY KEY (date_key,metric)
) PARTITION BY RANGE (date_key)
TABLESPACE pg_default;
CREATE INDEX test_part_idx1
ON test_part USING btree
(metric)
TABLESPACE pg_default;
Drop the inheritance on the test_201908 table:
alter table test_201908 no inherit test;
And then add this table to the partitioned table. The doco says to keep the check constraint in place until after the data is loaded.
alter table test_part
attach partition test_201908
for VALUES FROM (20190801) TO (20190831);
The partition shows up attached to the table:
\d+ edibben.test_part
Table "edibben.test_part"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition key: RANGE (date_key)
Indexes:
"test_part_pkey" PRIMARY KEY, btree (date_key, metric)
"test_part_idx1" btree (metric)
Partitions: edibben.test_201908 FOR VALUES FROM ('20190801') TO ('20190831')
My question is about what happens to the indexes. When you examine the partition you see the primary key inherited from the partition table
and the original index (test_idx1_201908).
AMZGQ3DW-> \d+ edibben.test_201908
Table "edibben.test_201908"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition of: edibben.test_part FOR VALUES FROM ('20190801') TO ('20190831')
Partition constraint: ((date_key IS NOT NULL) AND (date_key >= '20190801'::numeric(15,0)) AND (date_key < '20190831'::numeric(15,0)))
Indexes:
"test_201908_pkey" PRIMARY KEY, btree (date_key, metric)
"test_idx1_201908" btree (metric)
Check constraints:
"const_test_chk" CHECK (date_key >= 20190801::numeric AND date_key <= 20190831::numeric)
If I add a new partition to the test_part table
CREATE TABLE test_201909 PARTITION OF test_part
FOR VALUES FROM ('20190901') TO ('20190930');
The new table has the primary key and the index but the index has a system generated name.
$\d+ edibben.test_201909
Table "edibben.test_201909"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
----------+---------------+-----------+----------+---------+---------+--------------+-------------
date_key | numeric(15,0) | | not null | | main | |
metric | numeric(15,0) | | not null | | main | |
value | numeric(28,5) | | not null | | main | |
Partition of: edibben.test_part FOR VALUES FROM ('20190901') TO ('20190930')
Partition constraint: ((date_key IS NOT NULL) AND (date_key >= '20190901'::numeric(15,0)) AND (date_key < '20190930'::numeric(15,0)))
Indexes:
"test_201909_pkey" PRIMARY KEY, btree (date_key, metric)
"test_201909_metric_idx" btree (metric)
Looking at pg_class for the objects I just created:
AMZGQ3DW=> select relname, reltype, relkind,relowner from pg_class where relname like 'test%';
relname | reltype | relkind | relowner
------------------------+---------+---------+----------
test_201908 | 365444 | r | 98603
test_201908_pkey | 0 | i | 98603
test_idx1_201908 | 0 | i | 98603
test_201909 | 366498 | r | 98603
test_201909_metric_idx | 0 | i | 98603
test_201909_pkey | 0 | i | 98603
test_part | 365449 | p | 98603
test_part_idx1 | 0 | I | 98603
test_part_pkey | 0 | I | 98603
The indexes on the partitioned table have a relkind of I and the indexes on the partitions have a rekind of i. Looking at pg_indexes
there are no entries for the indexes on the primary table:
AMZGQ3DW=> select schemaname, tablename, indexname from pg_indexes where schemaname = 'edibben' and tablename = 'test_part';
schemaname | tablename | indexname
------------+-----------+-----------
(0 rows)
The indexes on the partitions do show up:
AMZGQ3DW=> select schemaname, tablename, indexname from pg_indexes where schemaname = 'edibben' and tablename like 'test%' order by tablename;
schemaname | tablename | indexname
------------+-------------+------------------------
edibben | test_201908 | test_201908_pkey
edibben | test_201908 | test_idx1_201908
edibben | test_201909 | test_201909_pkey
edibben | test_201909 | test_201909_metric_idx
So, is this partitioned table properly indexed? (yes there was a question buried in this mess). I can't find any documentation of how the
partitioned indexes work but it appears that the partitioned 'Index' is just a definition and that the real indexes are on the partitions themselves. Is there a way to list all of the indexes associated with a partitioned index? Is there a way to see if the partitioned index is valid?
Also, the doco talks about creating the index on the partitioned table with the CREATE INDEX ON ONLY option. I don't think this
applies to what I need to do. Am I right?
"As explained above, it is possible to create indexes on partitioned tables and they are applied automatically
to the entire hierarchy. This is very convenient, as not only the existing partitions will become indexed,
but also any partitions that are created in the future will. One limitation is that it's not possible to use
the CONCURRENTLY qualifier when creating such a partitioned index. To overcome long lock times,
it is possible to use CREATE INDEX ON ONLY the partitioned table; such an index is marked invalid,
and the partitions do not get the index applied automatically. The indexes on partitions can be created
separately using CONCURRENTLY, and later attached to the index on the parent using ALTER INDEX .. ATTACH PARTITION.
Once indexes for all partitions are attached to the parent index, the parent index is marked valid automatically."
Evelyn Dibben
pgsql-general by date: