Chapter 5. pgpro_metastore

pgpro_metastore is a Postgres Pro extension for processing and storing OLAP data.

This extension allows you to manage analytical tables that contain OLAP data. Rows of analytical tables are stored as Parquet files in local, network, or S3 storage. Metadata of analytical tables is stored in metadata tables, in compliance with the DuckLake specification.

pgpro_metastore offers the following advantages:

  • Executing analytical queries 10–30 times faster and storing OLAP data in 5–10 times less space as compared to heap tables.

  • Using Parquet-compatible tools, such as Jupyter Notebook, for processing OLAP data.

  • Unlimited storage scalability, which allows you to store more OLAP data and access it faster.

  • Independent scalability of storage and compute resources.

  • Transparent schema that supports flexible OLAP data processing scenarios with the tools you need.

  • Granular storage access restrictions, similar to those in Postgres Pro.

  • Transactional data updates and consistency between the OLAP data and metadata.

  • Transactions with persistent IDs, which allow you to query the OLAP data and metadata by specific IDs and to restore the data to a specific timestamp.

  • Collecting statistics for all columns and transactions, which allows queries to read only the required Parquet files.
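
To illustrate how column statistics can narrow a query down to the required Parquet files, the following minimal Python sketch selects files by comparing a query range with per-file minimum and maximum values. It is not the pgpro_metastore implementation: the FileStats class, the files_to_scan function, the file names, and the order_id column are all hypothetical and serve only to show the general idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file statistics for a single column."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(stats, low, high):
    """Return paths of files whose [min, max] range overlaps [low, high]."""
    return [s.path for s in stats if s.max_value >= low and s.min_value <= high]


# Hypothetical statistics for an "order_id" column stored in three Parquet files.
catalog = [
    FileStats("part-0001.parquet", 1, 1000),
    FileStats("part-0002.parquet", 1001, 2000),
    FileStats("part-0003.parquet", 2001, 3000),
]

# Corresponds to a filter such as: WHERE order_id BETWEEN 1500 AND 2100
selected = files_to_scan(catalog, 1500, 2100)
print(selected)  # ['part-0002.parquet', 'part-0003.parquet']
```

In this sketch, the first file is skipped without being opened because its value range cannot contain matching rows.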