39.5. Retrieving Filtered Parquet Files

After retrieving the columns of an analytical table, you can retrieve its Parquet files. The retrieved files can be filtered using the statistics stored in the pga_file_column_statistics metadata table.

On the server with the metadata catalog, execute the following command:

SELECT data_file_id
FROM axe_catalog.pga_file_column_statistics
WHERE
    table_id = table_ID AND
    column_id = column_ID AND
    (SCALAR >= min_value OR min_value IS NULL) AND
    (SCALAR <= max_value OR max_value IS NULL);

Where:

  • table_ID: The ID of the analytical table that contains Parquet files, from the pga_table metadata table.

  • column_ID: The ID of the column whose values are used for filtering Parquet files, from the pga_column metadata table.

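  • SCALAR: The scalar value that is compared against the minimum and maximum values of the column.

    Only Parquet files whose range of column values (from min_value to max_value) includes the specified scalar value are retrieved.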

You can filter column values using other conditions, such as "greater than" (>), by changing the comparison operators in the command accordingly. The minimum and maximum values of each column are stored as arrays and must be converted to integers.
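
As an illustration of such a change, the following sketch adapts the command to a "greater than" condition. It assumes the same placeholders as above (table_ID, column_ID, SCALAR) and omits any conversion of the stored minimum and maximum values, which may still be required. A file can contain values greater than SCALAR only if its maximum value exceeds SCALAR, so the minimum value is not checked. Files without a recorded maximum are kept because they cannot be ruled out.

```sql
-- Sketch only: table_ID, column_ID, and SCALAR are placeholders to substitute.
-- Retrieves files that may contain column values greater than SCALAR.
SELECT data_file_id
FROM axe_catalog.pga_file_column_statistics
WHERE
    table_id = table_ID AND
    column_id = column_ID AND
    -- Keep the file if its maximum exceeds SCALAR or is unknown.
    (max_value > SCALAR OR max_value IS NULL);
```

For a "less than" condition, the same reasoning would apply to min_value instead: keep the files where min_value < SCALAR or min_value IS NULL.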