Chapter 15. Storages
Storages are physical locations of Parquet files and shared directories. Postgres Pro AXE supports the following types of storages:
Local storages: Arrays of NVMe (Non-Volatile Memory Express) disks on servers with Postgres Pro AXE installed.
Network storages: Network File System (NFS) shares.
S3 storages.
You can select the storage type using information provided in the table below.
| | Local Storage | Network Storage | S3 Storage |
|---|---|---|---|
| Throughput | High. Determined by the number of NVMe disks in the array | Medium. Limited by the capacity of the network interface of the server, network load, and network storage speed | Medium. Limited by the capacity of the network interface of the server, network load, and S3 storage speed |
| Data scalability | Medium. Determined by the number and volume of NVMe disks | High | High |
| Data distribution between servers | Not supported | Within the network of the organization | Global |
| High availability provider | Postgres Pro AXE server administrator | NFS administrator | S3 provider |
| Cost per terabyte and per storage request | Low | Medium | Depends on the S3 provider |
The metadata of storages is stored in the pga_storage metadata table.
Any directory structure can be used for storing the OLAP data, for example:
In a local or network storage:
root_path/db_name/schema_name/table_name
In an S3 storage:
s3://bucket/db_name/schema_name/table_name
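As an illustration of the layout above, the following Python sketch composes a table location from a storage root and the database, schema, and table names. The function name `table_location` and the sample values (`/mnt/axe_data`, `sales_db`, `public`, `orders`) are hypothetical and are not part of pgpro_axe; the sketch only shows that the same `db_name/schema_name/table_name` suffix can be used under either kind of root.

```python
from pathlib import PurePosixPath


def table_location(root: str, db_name: str, schema_name: str, table_name: str) -> str:
    """Join a storage root with the db/schema/table path components."""
    # PurePosixPath joins the components with "/" regardless of the host OS.
    suffix = PurePosixPath(db_name, schema_name, table_name)
    # Strip a trailing slash so the root and suffix are joined by exactly one "/".
    return f"{root.rstrip('/')}/{suffix}"


# Hypothetical local or network root.
print(table_location("/mnt/axe_data", "sales_db", "public", "orders"))
# Hypothetical S3 root.
print(table_location("s3://bucket", "sales_db", "public", "orders"))
```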
pgpro_axe can automatically export the OLAP data to multiple Parquet files, adding a unique number to each file name. OLAP data is easier to store as multiple Parquet files of the same size than as a single large file.
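The idea of splitting an export into numbered files of equal size can be sketched in Python as follows. This is not the pgpro_axe export mechanism: the function `plan_export`, the `data_N.parquet` naming pattern, and the row counts are assumptions chosen for illustration only.

```python
def plan_export(total_rows: int, rows_per_file: int, prefix: str = "data") -> list[tuple[str, int]]:
    """Return (file_name, row_count) pairs for an export split into chunks."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    plan = []
    # Each chunk gets a unique sequential number in its file name.
    for number, start in enumerate(range(0, total_rows, rows_per_file)):
        rows = min(rows_per_file, total_rows - start)
        plan.append((f"{prefix}_{number}.parquet", rows))
    return plan


print(plan_export(2_500_000, 1_000_000))
# [('data_0.parquet', 1000000), ('data_1.parquet', 1000000), ('data_2.parquet', 500000)]
```

All files except possibly the last one hold the same number of rows, which keeps file sizes roughly uniform.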
You can also use Hive partitioning to organize the OLAP data by partition keys in a directory hierarchy:
table_name
├── year=2024
│   ├── month=1
│   │   ├── file1.parquet
│   │   └── file2.parquet
│   └── month=2
│       └── file3.parquet
└── year=2025
    ├── month=11
    │   ├── file4.parquet
    │   └── file5.parquet
    └── month=12
        └── file6.parquet
This type of hierarchy can be useful for large historical tables when analytical queries require the data associated with a specific subset of partition keys.
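In a Hive-partitioned layout, the partition keys are encoded in the directory names as `key=value` pairs. The following Python sketch shows how these pairs can be recovered from a file path. The helper `parse_hive_partitions` is hypothetical and only illustrates the naming convention; it does not reflect how pgpro_axe reads partitions internally.

```python
def parse_hive_partitions(path: str) -> dict[str, str]:
    """Extract key=value directory components from a Hive-style path."""
    partitions = {}
    # The last component is the file name, so only directories are inspected.
    for part in path.split("/")[:-1]:
        key, sep, value = part.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


print(parse_hive_partitions("table_name/year=2024/month=1/file1.parquet"))
# {'year': '2024', 'month': '1'}
```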
Filter pushdown at the path level is also supported: when reading, paths that do not contain the required OLAP data are skipped.
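The effect of path-level filtering can be sketched in Python as follows. This is a simplified model under stated assumptions, not the pgpro_axe implementation: the functions `partition_values` and `prune_paths` are hypothetical, and filters are expressed as plain Python callables over the partition values taken from the directory names.

```python
from typing import Callable


def partition_values(path: str) -> dict[str, str]:
    """Collect key=value pairs from the directory components of a path."""
    return dict(part.split("=", 1) for part in path.split("/")[:-1] if "=" in part)


def prune_paths(paths: list[str], predicates: dict[str, Callable[[str], bool]]) -> list[str]:
    """Keep only the paths whose partition values satisfy every predicate."""
    kept = []
    for path in paths:
        values = partition_values(path)
        # A path that lacks a filtered key cannot be ruled out, so it is kept.
        if all(key not in values or test(values[key]) for key, test in predicates.items()):
            kept.append(path)
    return kept


paths = [
    "table_name/year=2024/month=1/file1.parquet",
    "table_name/year=2024/month=1/file2.parquet",
    "table_name/year=2024/month=2/file3.parquet",
    "table_name/year=2025/month=11/file4.parquet",
    "table_name/year=2025/month=11/file5.parquet",
    "table_name/year=2025/month=12/file6.parquet",
]

# Equivalent of a filter such as: year = 2025 AND month >= 12
selected = prune_paths(paths, {"year": lambda v: int(v) == 2025, "month": lambda v: int(v) >= 12})
print(selected)
# ['table_name/year=2025/month=12/file6.parquet']
```

Only one of the six files has to be opened in this example; the other five are excluded by their paths alone.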
Postgres Pro foreign servers are used to work with S3 storages. This allows storing connection parameters and user credentials securely within Postgres Pro without specifying them in functions. Currently, you can create only one S3 storage.