3.7. OLAP Data #
3.7.1. Adding the OLAP Data to an Analytical Table From Parquet Files #
Execute the following query:
SELECT metastore.add_files('table_name', 'path_to_Parquet_files');
Where:
table_name: The name of the analytical table to which the OLAP data will be added.path_to_Parquet_files: The path to the Parquet file or to the shared directory with Parquet files from which the OLAP data will be added.Enter the path relative to the shared directory from the
pga_foldermetadata table, starting with the name of this directory.Metadata of Parquet files and the analytical table must be compatible.
Example 3.31.
SELECT add_files('example_table', 'folder/file.parquet');
Once the query is executed, pgpro_metastore performs the following actions:
Verifies user privileges.
Ensures that metadata of Parquet files and the analytical table are compatible.
Copies Parquet files to the shared directory with Parquet files of the analytical table.
Creates a new snapshot in the
pga_snapshotmetadata table and a new Parquet file in thepga_data_filemetadata table.Updates analytical table and column statistics.
3.7.2. Copying the OLAP Data to an Analytical Table From Other Tables #
Execute the following query:
SELECT metastore.copy_table('table_name', SQL_query);
Where:
table_name: The name of the analytical table to which the OLAP data will be copied.SQL_query: The SQL query that determines which the OLAP data to copy, for example,SELECT * FROM my_pgtable.The SQL query must be a single valid
SELECTstatement. TheUNIONoperation is not supported.
Example 3.32.
SELECT copy_table('example_table', $$select * from example_pg_table$$);
Once the query is executed, pgpro_metastore performs the following actions:
Verifies user privileges.
Creates a new snapshot in the
pga_snapshotmetadata table.Creates a directory for results of the SQL query that will be saved as Parquet files.
Executes the SQL query and writes its results to Parquet files.
3.7.3. Deleting Parquet Files #
It is recommended to delete Parquet files during periods of low user activity, since Parquet files are not only deleted from storages, but also from all associated metadata tables.
To delete Parquet files:
Mark snapshots associated with Parquet files as expired:
SELECT metastore.expire_snapshot(
control_time);Where
control_timeis the threshold value for the snapshot creation date and time.You can specify this value in one of the following ways:
As a date in the yyyy-mm-dd format.
Example 3.33.
2025-11-28As a date in the yyyy-mm-dd format and time in the hh:mm:ss format.
Example 3.34.
2025-11-28 12:22:46As a date in the yyyy-mm-dd format and time in the hh:mm:ss format with time zone.
Example 3.35.
2025-11-28 12:22:46+03As a date in the yyyy-mm-dd format and time in the hh:mm:ss.ssssss format with time zone, where the fractional part of seconds must have no more than 6 digits.
Example 3.36.
2025-11-28 12:22:46.123456+03
Once the query is executed, pgpro_metastore performs the following actions:
Marks all snapshots whose
snapshot_timevalue is less thancontrol_timeas expired.Finds Parquet files in the
pga_data_filemetadata table whosebegin_snapshotvalues reference expired snapshots and creates associated Parquet files in thepga_files_scheduled_for_deletionmetadata table if they were not created by previousexpire_snapshotcalls.
Note
Expired snapshots currently affect only the
pga_data_filemetadata table and are not supported for other metadata tables that have thebegin_snapshotvalue.Delete Parquet files:
SELECT metastore.delete_expired_files();
Once the query is executed, pgpro_metastore performs the following actions:
Deletes Parquet files from
pga_files_scheduled_for_deletion,pga_data_file, andpga_file_column_statisticsmetadata tables.Recalculates statistics for analytical tables associated with deleted Parquet files in the
pga_table_statsmetadata table.Recalculates statistics for columns of analytical tables associated with deleted Parquet files in the
pga_table_column_statsmetadata table.Recalculates the
column_ordervalue for other Parquet files.