F.47. pgpro_gbtree — global index on a partitioned table #

The pgpro_gbtree extension enables creation and use of a global index on a partitioned table.

A global index is an index that is created for one or several columns of a partitioned table and does not include the partition key. At the physical level, a global index on a partitioned table contains data related to the indexed columns for all the partitions, which means that the global index allows you to perform index scan without iterating through all the partitions.

A global index provides the following benefits:

  • It enables a much faster search of rows of the partitioned table by a condition on the indexed columns. The search appears to be faster because the global index is not partitioned, and therefore, there is no need to perform index scans of individual partitions.

  • It lifts a Postgres Pro restriction that does not allow creation of a unique index for columns that do not contain the partition key.

F.47.1. Global Index Operation #

These are a few details of how a global index works:

  • A global index is created for a user-defined set of columns, and the primary key columns of the partitioned table are automatically added to the global index as INCLUDE columns.

  • An index scan with a global index is performed in two steps:

    1. The value is searched for in the global index.

    2. The value of the primary key in INCLUDE index columns is used to search for the partition and for the value in this partition.

  • A global index is not rebuilt for delete and DETACH PARTITION operations. So it is OK that a global index contains some outdated data, which can be cleaned up with VACUUM.

  • A global index is not partitioned. Therefore, the data size limitation of 32 TB hods for it. It is assumed that the global index fits these 32 TB as it contains only part of data of the partitioned table (this assumption may, however, appear not to be true in some circumstances).

F.47.2. Limitations #

Creation of global indexes has the following limitations:

  • The partitioned table for which a global index is created must have a primary key.

  • As for any index on a partitioned table, the CONCURRENTLY option cannot be used.

  • Expressions used instead of column names and predicates in the WHERE clause are not currently supported.

  • Specifying non-key columns (in the INCLUDE clause) is not supported. The CREATE INDEX command issues a warning if such columns are specified and ignores them.

Usage of global indexes has the following limitations:

  • A global index, like a standard b-tree index, is not suitable for columns that contain few unique values. An example of an unsuitable column is the gender column with two values M and F. In addition to a high overhead of the pure index, for the global index, when a new value is added, a hash sum of the value gets locked. With few possible values, modifying transactions are likely to block one another, that is, hang on this lock.

  • As a global index does not store information on row versions, a need in garbage collection for this index is not so important as for normal indexes. Therefore, autovacuum is not currently supported.

  • External links to the global index on a partitioned table from other tables are not currently supported.

  • It is forbidden to add a partitioned table with a global index, as a partition, to another partitioned table.

  • If a partitioned table has a global index, deleting its primary key is not allowed. Delete the global index first.

  • The CLUSTER command for global indexes is not currently supported.

  • The ON CONFLICT clause is not currently applicable to unique global indexes.

F.47.3. Installation #

The pgpro_gbtree extension is included in Postgres Pro Enterprise. To enable pgpro_gbtree, create the extension using the following query:

CREATE EXTENSION pgpro_gbtree;

F.47.4. Usage #

F.47.4.1. Creating and Removing Global Indexes #

To create a global index, execute the CREATE INDEX command specifying the gbtree access method from the pgpro_gbtree extension. The following command creates a global index for the t_int table:

CREATE INDEX t_int_v ON t_int USING gbtree (v);

To remove a global index, execute the DROP INDEX command. For example:

DROP INDEX t_int_v;

F.47.4.2. Using Global indexes in Queries #

It is not possible to implement a usual Postgres Pro behavior for a global index as in the standard implementation, an entire partitioned table, on which the global index is created, is not regarded as a node when the query plan is built. Plans get linked to partitions, and then the best scan method is chosen for each partition, such as SeqScan, IndexOnlyScan, IndexScan, and so on. For the Postgres Pro planner, a partitioned table only provides the list of partitions and is not considered a planning object. A global index can be used if the partitioned table becomes a planning object. And this only happens under the following conditions:

  • The query does not contain a condition on the partition key that excludes at least one partition. It is assumed that the search by a condition on the partition key in a query is more efficient than the the global index.

  • The query contains a condition on columns of the global index.

F.47.5. Example #

The following example illustrates usage of a global index.

-- Creating the pgpro_gbtree extension
CREATE EXTENSION pgpro_gbtree;

-- Creating a partitioned table with three partitions
CREATE TABLE t_int (i int PRIMARY KEY, v int, x int) PARTITION BY RANGE (i);
CREATE TABLE t_int_1 PARTITION OF t_int FOR VALUES FROM (0) TO (100);
CREATE TABLE t_int_2 PARTITION OF t_int FOR VALUES FROM (100) TO (200);
CREATE TABLE t_int_3 PARTITION OF t_int FOR VALUES FROM (200) TO (300);

-- Creating a global index for the v column
CREATE INDEX t_int_v ON t_int USING gbtree (v);

-- Creating a normal index for the x column
CREATE INDEX t_int_x ON t_int (x);

-- Query plan for a condition on the column with the global index
EXPLAIN SELECT * FROM t_int WHERE v > 100;

-- Query plan for a condition on the column with the normal index
EXPLAIN SELECT * FROM t_int WHERE x > 100;

The example produces the following output:

CREATE EXTENSION
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE INDEX
CREATE INDEX

                             QUERY PLAN
---------------------------------------------------------------------
 Index Scan using t_int_v on t_int  (cost=0.12..8.14 rows=0 width=0)
   Index Cond: (v > 100)
(2 rows)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Append  (cost=9.42..93.97 rows=2040 width=12)
   ->  Bitmap Heap Scan on t_int_1  (cost=9.42..27.92 rows=680 width=12)
         Recheck Cond: (x > 100)
         ->  Bitmap Index Scan on t_int_1_x_idx  (cost=0.00..9.25 rows=680 width=0)
               Index Cond: (x > 100)
   ->  Bitmap Heap Scan on t_int_2  (cost=9.42..27.92 rows=680 width=12)
         Recheck Cond: (x > 100)
         ->  Bitmap Index Scan on t_int_2_x_idx  (cost=0.00..9.25 rows=680 width=0)
               Index Cond: (x > 100)
   ->  Bitmap Heap Scan on t_int_3  (cost=9.42..27.92 rows=680 width=12)
         Recheck Cond: (x > 100)
         ->  Bitmap Index Scan on t_int_3_x_idx  (cost=0.00..9.25 rows=680 width=0)
               Index Cond: (x > 100)
(13 rows)

The result of this example shows that with the global index, the index scan is performed for the entire partitioned table, while with the normal index, the index scan is performed for each partition individually, and this search can be slow when the number of partitions is high.