Chapter 15. Storages

Storages are physical locations of Parquet files and shared directories. Postgres Pro AXE supports the following types of storages:

  • Local storages: Arrays of NVMe (Non-Volatile Memory Express) disks on servers with Postgres Pro AXE installed.

  • Network storages: Network File System (NFS) shares.

  • S3 storages.

You can select the storage type using information provided in the table below.

                        Local Storage               Network Storage             S3 Storage
----------------------  --------------------------  --------------------------  --------------------------
Throughput              High: determined by the     Medium: limited by the      Medium: limited by the
                        number of NVMe disks in     capacity of the network     capacity of the network
                        the array                   interface of the server,    interface of the server,
                                                    network load, and network   network load, and S3
                                                    storage speed               storage speed

Data scalability        Medium: determined by the   High                        High
                        number and volume of NVMe
                        disks

Data distribution       Not supported               Within the network of the   Global
between servers                                     organization

High availability       Postgres Pro AXE server     NFS administrator           S3 provider
provider                administrator

Cost per terabyte and   Low                         Medium                      Depends on the S3 provider
per one storage
request

Storage metadata is stored in the pga_storage metadata table.

Any directory structure can be used for storing the OLAP data, for example:

  • In a local or network storage:

    root_path/db_name/schema_name/table_name
    
  • In an S3 storage:

      s3://bucket/db_name/schema_name/table_name
    
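The two layouts above can be produced by a small helper. The following Python sketch is only an illustration: the function names and the sample root path and bucket are hypothetical and are not part of pgpro_axe.

```python
from pathlib import PurePosixPath


def local_table_path(root_path: str, db_name: str, schema_name: str, table_name: str) -> str:
    """Build root_path/db_name/schema_name/table_name for a local or network storage."""
    return str(PurePosixPath(root_path) / db_name / schema_name / table_name)


def s3_table_path(bucket: str, db_name: str, schema_name: str, table_name: str) -> str:
    """Build s3://bucket/db_name/schema_name/table_name for an S3 storage."""
    return f"s3://{bucket}/{db_name}/{schema_name}/{table_name}"


if __name__ == "__main__":
    # Sample values are assumptions chosen for illustration only.
    print(local_table_path("/mnt/axe", "sales", "public", "orders"))
    print(s3_table_path("my-bucket", "sales", "public", "orders"))
```

Since any directory structure can be used, this layout is a convention rather than a requirement.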

pgpro_axe can automatically export the OLAP data to multiple Parquet files, adding a unique number to each file name. It is easier to store the OLAP data as multiple Parquet files of the same size.
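To illustrate the idea of equally sized, numbered files, the Python sketch below plans how a row range could be split. It does not reproduce the pgpro_axe export itself: the function name, the `data_N.parquet` naming scheme, and the row-based split are assumptions made for this example.

```python
def plan_export_files(total_rows: int, rows_per_file: int, prefix: str = "data"):
    """Return (file_name, first_row, end_row) tuples covering total_rows.

    Every file holds rows_per_file rows except possibly the last one.
    The naming scheme is hypothetical, not the one pgpro_axe uses.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    plan = []
    for number, start in enumerate(range(0, total_rows, rows_per_file)):
        end = min(start + rows_per_file, total_rows)  # end is exclusive
        plan.append((f"{prefix}_{number}.parquet", start, end))
    return plan


if __name__ == "__main__":
    for name, start, end in plan_export_files(250, 100):
        print(name, start, end)
```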

You can also use Hive partitioning to organize the OLAP data by partition keys in a directory hierarchy:

  table_name
  ├── year=2024
  │    ├── month=1
  │    │   ├── file1.parquet
  │    │   └── file2.parquet
  │    └── month=2
  │        └── file3.parquet
  └── year=2025
      ├── month=11
      │   ├── file4.parquet
      │   └── file5.parquet
      └── month=12
          └── file6.parquet

This type of hierarchy can be useful for large historical tables when analytical queries require the data associated with a specific subset of partition keys.
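The key=value directory convention shown in the tree can be built and parsed mechanically. The Python sketch below is an illustration only; the helper names are hypothetical, and partition values are read back as strings.

```python
def hive_partition_path(table_name: str, partitions: dict) -> str:
    """Join a table directory with key=value segments, in the given key order."""
    segments = [f"{key}={value}" for key, value in partitions.items()]
    return "/".join([table_name, *segments])


def parse_hive_partitions(path: str) -> dict:
    """Extract partition keys and values (as strings) from a Hive-style path."""
    result = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        if sep:  # only key=value segments describe partitions
            result[key] = value
    return result


if __name__ == "__main__":
    print(hive_partition_path("table_name", {"year": 2024, "month": 1}))
    print(parse_hive_partitions("table_name/year=2024/month=1/file1.parquet"))
```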

Filter pushdown at the path level is also supported: paths that do not contain the required OLAP data are skipped when reading.
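The effect of path-level filtering can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not the pgpro_axe implementation: the `prune_paths` helper is hypothetical, and the file list mirrors the example hierarchy above.

```python
def prune_paths(paths, predicate):
    """Keep only the paths whose partition keys satisfy the predicate.

    The predicate receives a dict of partition keys parsed from the path,
    so files in non-matching directories are never opened.
    """
    kept = []
    for path in paths:
        keys = {}
        for segment in path.split("/"):
            key, sep, value = segment.partition("=")
            if sep:
                keys[key] = value
        if predicate(keys):
            kept.append(path)
    return kept


PATHS = [
    "table_name/year=2024/month=1/file1.parquet",
    "table_name/year=2024/month=1/file2.parquet",
    "table_name/year=2024/month=2/file3.parquet",
    "table_name/year=2025/month=11/file4.parquet",
    "table_name/year=2025/month=12/file5.parquet".replace("file5", "file6"),
]

if __name__ == "__main__":
    # A filter such as "year = 2024 AND month = 1" touches two files only.
    print(prune_paths(PATHS, lambda k: k.get("year") == "2024" and k.get("month") == "1"))
```

In this sketch the filter is evaluated against directory names alone, which is why no Parquet file needs to be read to discard a path.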

Postgres Pro foreign servers are used to work with S3 storages. This allows storing connection parameters and user credentials securely within Postgres Pro without specifying them in functions. Currently, you can create only one S3 storage.