Chapter 30. Hive Partitioning
Postgres Pro AXE supports partitioning analytical tables at creation time. Partitioning improves read query performance on large tables by grouping data into smaller, more manageable parts called partitions. A partition is a set of Parquet files built from the values of one or more table columns taken in a specific order. Each table has at least one partition.
Example 30.1.
Assume a table has an integer column column1 with values ranging from 1 to 10, and you run performance-critical analytical queries that filter on this column in the WHERE clause.
Storing table rows (or their RowID values) grouped by column1 values can significantly speed up such queries: to retrieve all rows with a specific value, such as column1 = 2, there is no need to scan the entire table, because the matching rows are already grouped together.
In Postgres Pro AXE, OLAP data is stored in Parquet files, which serve as the partitioning units for analytical tables. Within each file, every partition column has a fixed value, so the query engine can match the WHERE clause against the file metadata and skip files that cannot contain matching rows.
The solution uses Hive partitioning technology that splits analytical tables into multiple Parquet files based on a sequence of columns (partition keys). The files of a partitioned table are stored in the leaves of a directory tree, with subdirectory names in the column=value format. For example, s3://bucket/table/year=2024/month=01/data.parquet.
Example 30.2.
The example below shows a directory tree for an analytical table partitioned by the year and month columns. The orders directory is the root of the tree. Although the root directory is typically named after the table, it can be given any name.
orders
├── year=2021
│   ├── month=1
│   │   ├── file1.parquet
│   │   └── file2.parquet
│   └── month=2
│       └── file3.parquet
└── year=2022
    ├── month=11
    │   ├── file4.parquet
    │   └── file5.parquet
    └── month=12
        └── file6.parquet
You can partition the heap table directly by the year and month columns:
COPY orders TO 'orders' (FORMAT parquet, PARTITION_BY (year, month));
The first orders is the heap table name, and the second orders is the name of the root directory of the tree. The PARTITION_BY clause cannot contain expressions; however, partition columns can be computed dynamically as additional columns in a subquery:
COPY (SELECT *, year(timestamp) AS year, month(timestamp) AS month FROM services) TO 'test' (PARTITION_BY (year, month));
The OLAP data can be read from the directory tree using the hive_partitioning flag:
SELECT * FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true);
With this flag set, the values of the partition columns (year and month) are read from the names of the corresponding subdirectories rather than from the Parquet files themselves.
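For contrast, a minimal sketch (assuming that with the flag disabled the partition columns are simply not reconstructed from the directory names):

```sql
-- Without hive_partitioning, the year and month values encoded in the
-- subdirectory names are not exposed as columns; the result contains
-- only the columns physically stored in the Parquet files.
SELECT * FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = false);
```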
In this directory tree, Postgres Pro AXE skips reading Parquet files that are not required for query results. Consider the following command:
SELECT * FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true) WHERE year = 2022 AND month = 11;
It reads only the following Parquet files:
orders
└── year=2022
    └── month=11
        ├── file4.parquet
        └── file5.parquet
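To check that pruning actually takes place, you can inspect the query plan. The exact plan output is DBMS-specific; as a sketch, assuming Postgres Pro AXE supports the standard EXPLAIN command for such queries:

```sql
-- The plan should show that only the files under year=2022/month=11
-- are scanned; the remaining files are rejected by their
-- partition-column values before being read.
EXPLAIN SELECT *
FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true)
WHERE year = 2022 AND month = 11;
```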
Partitioning on a column with high cardinality (in the extreme case, a column with the UNIQUE attribute) produces an unreasonably large number of Parquet files (a separate file for each distinct value), which degrades query performance and can exceed the DBMS file limit for the partitioned table. Even with moderate column cardinality, partitioning does not help analytical queries that do not filter on the partition column.
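Before choosing a partition key, it can help to estimate how many leaf directories it would produce. A minimal sketch, using the orders table and the year and month columns from the examples above:

```sql
-- Number of distinct (year, month) combinations, which equals the
-- number of leaf directories COPY ... PARTITION_BY would create.
SELECT count(*) AS partition_count
FROM (SELECT DISTINCT year, month FROM orders) AS partitions;
```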
Partitioning can create a maximum of 10,000 Parquet files. If this threshold is exceeded, the operation is canceled.
Postgres Pro AXE can write OLAP data to at most 100 Parquet files simultaneously.
The recommended partition size is 100 MB.
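As a rough sketch of sizing a partition key against this recommendation (pg_table_size is the standard Postgres function returning the on-disk size of a heap table; the resulting Parquet files will differ in size because of compression):

```sql
-- Approximate average partition size: heap table size divided by the
-- number of distinct partition-key combinations. Aim for values on
-- the order of 100 MB (104857600 bytes).
SELECT pg_table_size('orders') /
       (SELECT count(*)
        FROM (SELECT DISTINCT year, month FROM orders) AS p)
       AS avg_partition_size_bytes;
```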