20.3. Deleting Parquet Files
In Postgres Pro AXE, the deletion of Parquet files is a two-stage process. First, you must mark snapshots referenced by entries from the pga_data_file metadata table as expired to exclude associated Parquet files from analytical queries. Then, you can delete Parquet files.
Before following this procedure, ensure that you are assigned the metastore_admin role. For more information, refer to Chapter 14.
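One possible way to check the role assignment from an SQL session is sketched below. It relies on the standard PostgreSQL function pg_has_role and assumes that being "assigned" the role corresponds to direct or indirect role membership; the column alias is only illustrative.

```sql
-- Returns true if the current user is a direct or indirect member of metastore_admin.
-- Assumption: role assignment here means ordinary PostgreSQL role membership.
SELECT pg_has_role(current_user, 'metastore_admin', 'MEMBER') AS has_metastore_admin;
```

If the result is false, the role likely still needs to be granted before the queries in this section are executed.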
To delete Parquet files:
Mark snapshots as expired in one of the following ways:
To mark snapshots created before a certain date and time as expired, execute the following query:
SELECT metastore.expire_snapshot(threshold_date_and_time);
Where threshold_date_and_time is the threshold value for the snapshot creation date and time.
The snapshot creation date and time are contained in the snapshot_time column of the pga_snapshot metadata table.
Supported date and time formats:
Date in the yyyy-mm-dd format.
Example 20.3.
2025-11-28
Date in the yyyy-mm-dd format and time in the hh:mm:ss format.
Example 20.4.
2025-11-28 12:22:46
Date in the yyyy-mm-dd format and time in the hh:mm:ss format with the time zone.
Example 20.5.
2025-11-28 12:22:46+03
Date in the yyyy-mm-dd format and time in the hh:mm:ss.ssssss format with time zone, where the fractional part of seconds must have no more than 6 digits.
Example 20.6.
2025-11-28 12:22:46.123456+03
Example 20.7.
SELECT metastore.expire_snapshot('2025-11-13 12:22:46.123456+03');
To mark snapshots with certain IDs as expired, execute the following query:
SELECT metastore.expire_snapshot('[list_of_snapshot_IDs]');
Where list_of_snapshot_IDs is a comma-separated list of snapshot IDs.
Snapshot IDs are contained in the snapshot_id column of the pga_snapshot metadata table.
Example 20.8.
SELECT metastore.expire_snapshot('[1,2,3,4]');
Once the query is executed, pgpro_metastore performs the following actions:
Marks snapshots as expired if their entries in the pga_snapshot metadata table have snapshot_time values that are less than threshold_date_and_time or have snapshot_id values that are equal to one of the specified snapshot IDs.
Finds entries in the pga_data_file metadata table whose begin_snapshot values reference expired snapshots and creates associated entries in the pga_files_scheduled_for_deletion metadata table if these entries were not already created by previous expire_snapshot calls.
Note
Other metadata tables whose begin_snapshot values reference expired snapshots are not impacted.
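Before calling expire_snapshot with a threshold, it may help to preview which snapshots and data files the threshold would cover. The queries below are a minimal sketch, not part of the documented procedure. They assume that the pga_snapshot and pga_data_file metadata tables are visible through the current search_path, that snapshot_time is comparable with a timestamp with time zone value, and that begin_snapshot stores snapshot_id values. The threshold is the one from the threshold-based example above.

```sql
-- Assumption: the pga_* metadata tables are visible through the current search_path;
-- schema-qualify them if your installation keeps them elsewhere.

-- Snapshots created before the threshold.
SELECT snapshot_id, snapshot_time
FROM pga_snapshot
WHERE snapshot_time < TIMESTAMPTZ '2025-11-13 12:22:46.123456+03'
ORDER BY snapshot_time;

-- Number of data files whose begin_snapshot references one of those snapshots.
-- Assumption: begin_snapshot holds snapshot_id values.
SELECT count(*) AS data_files_affected
FROM pga_data_file AS f
JOIN pga_snapshot AS s ON s.snapshot_id = f.begin_snapshot
WHERE s.snapshot_time < TIMESTAMPTZ '2025-11-13 12:22:46.123456+03';
```

The snapshot_id values returned by the first query could also be passed to the ID-based form of expire_snapshot if only some of the listed snapshots should be expired.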
Recreate Postgres Pro views from analytical tables associated with Parquet files so that these files are excluded from analytical queries.
Important
Recreate Postgres Pro views only during periods of zero user activity to prevent potential data loss.
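This section does not define how to confirm zero user activity. One rough check, sketched here with the standard PostgreSQL pg_stat_activity view, is to list the client sessions other than your own. Treating an empty result as "zero user activity" is an assumption, and activity that does not appear as a client backend would not be visible this way.

```sql
-- Lists client sessions other than the current one.
-- Assumption: an empty result is taken as a sign that no users are active.
SELECT pid, usename, state, query_start
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND pid <> pg_backend_pid();
```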
Delete Parquet files:
SELECT metastore.delete_expired_files();
Once the query is executed, pgpro_metastore performs the following actions:
Deletes Parquet files from the storage and associated entries from the pga_files_scheduled_for_deletion, pga_data_file, and pga_file_column_statistics metadata tables.
Recalculates statistics for entries associated with deleted Parquet files in the pga_table_stats and pga_table_column_stats metadata tables.
Recalculates column_order values for entries associated with deleted Parquet files in the pga_column metadata table.
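The whole procedure can be sketched as one sequence. The two function calls are the ones from this section. The count queries are an optional addition and assume that the pga_files_scheduled_for_deletion metadata table is visible through the current search_path. The commands for recreating views are not covered here and are left as a comment.

```sql
-- 1. Mark snapshots created before the threshold as expired.
SELECT metastore.expire_snapshot('2025-11-13 12:22:46.123456+03');

-- Optional: number of files now queued for deletion.
-- Assumption: the table is visible through the current search_path.
SELECT count(*) AS files_scheduled FROM pga_files_scheduled_for_deletion;

-- 2. Recreate the affected Postgres Pro views here, during a period of
--    zero user activity. The exact commands are outside this sketch.

-- 3. Delete the Parquet files that were scheduled for deletion.
SELECT metastore.delete_expired_files();

-- Optional: the count should drop as entries for deleted files are removed.
SELECT count(*) AS files_still_scheduled FROM pga_files_scheduled_for_deletion;
```

Comparing the two counts gives a quick indication of whether the scheduled files were actually processed.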