Postgres Pro AXE : Documentation: 1: 5.3. Reading Metadata : Postgres Professional

5.3. Reading Metadata
Prev	Up	Chapter 5. pgpro_metastore	Home	Next

5.3. Reading Metadata #

5.3.1. Retrieving a Snapshot
5.3.2. Retrieving Analytical Schemas
5.3.3. Retrieving Analytical Tables
5.3.4. Retrieving Columns of an Analytical Table
5.3.5. Retrieving Filtered Parquet Files

5.3.1. Retrieving a Snapshot #

Execute the following query:

SELECT *
FROM pga_snapshot
WHERE snapshot_id = snapshot_ID;

Where snapshot_ID is the ID of the snapshot from the pga_snapshot metadata table.

To retrieve the latest snapshot, execute the following query:

SELECT *
FROM pga_snapshot
WHERE snapshot_id =
  (SELECT max(snapshot_id) FROM pga_snapshot);

5.3.2. Retrieving Analytical Schemas #

Execute the following query:

SELECT schema_id, schema_name
FROM pga_schema
WHERE
    snapshot_ID >= begin_snapshot AND
    (snapshot_ID < end_snapshot OR end_snapshot IS NULL);

Where snapshot_ID is the ID of the snapshot from the pga_snapshot metadata table.

5.3.3. Retrieving Analytical Tables #

Execute the following query:

SELECT table_id, table_name
FROM pga_table
WHERE
    schema_id = schema_ID AND
    snapshot_ID >= begin_snapshot AND
    (snapshot_ID < end_snapshot OR end_snapshot IS NULL);

Where:

schema_ID: The ID of the analytical schema from the pga_schema metadata table where analytical tables are created.
snapshot_ID: The ID of the snapshot from the pga_snapshot metadata table.

5.3.4. Retrieving Columns of an Analytical Table #

Execute the following query:

SELECT column_id, column_name, column_type
FROM pga_column
WHERE
    table_id = table_ID AND
    snapshot_ID >= begin_snapshot AND
    (snapshot_ID < end_snapshot OR end_snapshot IS NULL)
ORDER BY column_order;

Where:

table_ID: The ID of the analytical table from the pga_table metadata table where columns are created.
snapshot_ID: The ID of the snapshot from the pga_snapshot metadata table.

5.3.5. Retrieving Filtered Parquet Files #

You can retrieve Parquet files after retrieving columns of an analytical table. Retrieved Parquet can be filtered using statistics from the pga_file_column_statistics metadata table.

To retrieve Parquet files and filter them by column values, execute the following query:

SELECT data_file_id
FROM ducklake_file_column_stats
WHERE
    table_id = table_ID AND
    column_id = column_ID AND
    (SCALAR >= min_value OR min_value IS NULL) AND
    (SCALAR <= max_value OR max_value IS NULL);

Where:

table_ID: The ID of the analytical table from the pga_table metadata table associated with Parquet files.
column_ID: The ID of the column from the pga_column metadata table whose values are used for filtering Parquet files.
In this example, only Parquet files that do not contain scalar values in the column_ID column are retrieved.

You can filter column values using different conditions, such as "greater than (>)", by updating the query accordingly.

The minimum and maximum values of each column are stored as arrays and must be converted to integers.

Prev	Up	Next
5.2. Metadata Tables	Home	5.4. Configuring Objects Visibility

epub pdf