Chapter 5. pgpro_metastore
Table of Contents
- 5.1. Configuring the pgpro_metastore Catalog
- 5.2. Metadata Tables
  - 5.2.1. pga_snapshot Metadata Table
  - 5.2.2. pga_snapshot_changes Metadata Table
  - 5.2.3. pga_schema Metadata Table
  - 5.2.4. pga_table Metadata Table
  - 5.2.5. pga_storage Metadata Table
  - 5.2.6. pga_uri Metadata Table
  - 5.2.7. pga_folder Metadata Table
  - 5.2.8. pga_column Metadata Table
  - 5.2.9. pga_data_file Metadata Table
  - 5.2.10. pga_files_scheduled_for_deletion Metadata Table
  - 5.2.11. pga_table_stats Metadata Table
  - 5.2.12. pga_table_column_stats Metadata Table
  - 5.2.13. pga_file_column_statistics Metadata Table
  - 5.2.14. pga_transaction_log Metadata Table
- 5.3. Reading Metadata
- 5.4. Configuring Objects Visibility
pgpro_metastore is a Postgres Pro extension for processing and storing OLAP data.
This extension allows you to manage analytical tables that contain OLAP data. Rows of analytical tables are stored as Parquet files in local, network, or S3 storage. Metadata of analytical tables is stored in metadata tables, in compliance with the DuckLake specification.
pgpro_metastore offers the following advantages:
- Executing analytical queries 10–30 times faster and storing the OLAP data in 5–10 times less space as compared to heap tables.
- Using Parquet-compatible tools, such as Jupyter Notebook, for processing the OLAP data.
- Unlimited storage scalability, which allows increasing the amount of stored OLAP data and accessing it faster.
- Independent scalability of storage and compute resources.
- A transparent scheme that enables flexible OLAP data processing scenarios with the tools you need.
- Granular restriction of storage access, similar to Postgres Pro.
- Transactional data updates and consistency between the OLAP data and metadata.
- Transactions with persistent IDs, which allow querying the OLAP data and metadata based on specific IDs and restoring the data to specific timestamps.
- Collecting statistics for all columns and transactions, which allows queries to read only the required Parquet files.
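To illustrate how column statistics can limit a query to the required Parquet files, the following Python sketch filters a list of files by the minimum and maximum values recorded for one column. It is a conceptual illustration only and not pgpro_metastore code: the `FileColumnStats` class, the `files_to_scan` function, the file names, and the `order_id` column are all hypothetical, and the actual layout of the statistics is defined by the metadata tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileColumnStats:
    """Hypothetical per-file statistics for a single column (assumed layout)."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(stats, lower, upper):
    """Return paths of files whose [min_value, max_value] range overlaps [lower, upper]."""
    return [s.path for s in stats if s.max_value >= lower and s.min_value <= upper]


# Illustrative data: value ranges of a hypothetical "order_id" column in three files.
stats = [
    FileColumnStats("part-0001.parquet", 1, 1000),
    FileColumnStats("part-0002.parquet", 1001, 2000),
    FileColumnStats("part-0003.parquet", 2001, 3000),
]

# A filter such as "order_id BETWEEN 1500 AND 1600" needs only the second file.
print(files_to_scan(stats, 1500, 1600))
```

Files whose value range cannot match the filter are skipped without being opened, which is why statistics of this kind can reduce the amount of data read.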