29.1. Adding Parquet Files to an Analytical Table (metastore.add_files)
Before running this procedure, place the Parquet files in a shared directory listed in the pga_folder metadata table.
Required privileges:
INSERT privilege on the analytical table.
SELECT privilege on the shared directory if Parquet files are added from this directory.
For more information about stored procedures and privileges, refer to Section 22.1.
Execute the following command:
SELECT metastore.add_files('table_name', 'path_to_Parquet_files', 'path_to_JSON');
Where:
table_name: The name of the analytical table to which Parquet files are added.
path_to_Parquet_files: The path to the Parquet files to add to the analytical table. This path must be within a shared directory and start with the name of this shared directory.
    For multiple Parquet files, specify a directory path ending with /. This directory must contain only the Parquet files being added.
    For a single Parquet file, specify a full path ending with the filename.
path_to_JSON: The path to a JSON file with Parquet file storage parameters. These parameters apply when new Parquet files are created. In the metastore.add_files stored procedure, the parameters are ignored for non-partitioned tables, since Parquet files are added as is, but apply to partitioned tables, where Parquet files are split into multiple files. In the metastore.copy_table stored procedure, the parameters always apply because new Parquet files are created from the SQL command results. For more information about partitioning, refer to Chapter 30.
    Optional parameter.
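The two forms of path_to_Parquet_files can be illustrated with the following sketch. The table name, the shared directory name folder, the sales subdirectory, and the file names are hypothetical and used only for illustration.

```sql
-- Multiple files: the path is a directory and ends with "/".
-- The (hypothetical) directory folder/sales/ must contain only the Parquet files being added,
-- so the JSON file is kept outside of it.
SELECT metastore.add_files('table_example', 'folder/sales/', 'folder/options.json');

-- Single file: the path ends with the filename.
SELECT metastore.add_files('table_example', 'folder/sales/part1.parquet', 'folder/options.json');
```

In both calls the path starts with the name of the shared directory, as required above.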
Postgres Pro AXE performs the following actions:
Verifies input parameters and user privileges.
Ensures metadata compatibility between Parquet files and the analytical table: the number, order, names, and types of columns must match.
Creates new entries in the pga_snapshot and pga_data_file metadata tables.
Copies Parquet files to the storage directory of the analytical table, to a new subdirectory with the snapshot ID as the name.
If Parquet files are added to a partitioned analytical table, they are split into multiple files based on partition columns, and a directory tree is created for these files.
Updates statistics in the pga_table_stats, pga_table_column_stats, and pga_file_column_statistics metadata tables.
Example 29.1. Executing the metastore.add_files stored procedure
SELECT metastore.add_files('table_example', 'folder/file.parquet', 'folder/options.json');
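Because path_to_JSON is described as optional, a call without it might look like the sketch below. This assumes that the parameter can simply be left out of the argument list, which this section does not confirm explicitly.

```sql
-- Assumption: the optional third argument (path_to_JSON) is omitted entirely.
SELECT metastore.add_files('table_example', 'folder/file.parquet');
```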