29.3. Description of Parquet File Storage Parameters
You can specify the following Parquet file storage parameters in a JSON file, and apply them when executing the metastore.add_files or metastore.copy_table stored procedure:
compression: The data compression algorithm. Possible values: snappy, zstd, gzip, lz4/lz4_raw, brotli, uncompressed.
compression_level: The data compression level. Possible values are from 1 to 22. Default value: 3. Optional parameter. It is ignored if any compression algorithm other than zstd is used.
row_group_size: The maximum number of rows in a row group. The larger the value, the better the compression. The smaller the value, the more threads are used when reading Parquet files, and the better the statistics filtering. Minimum value: 2048. Default value: 122_880. Recommended value range is from 100_000 to 1_000_000.
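To make the row_group_size trade-off concrete, the following minimal Python sketch estimates how many row groups a file would contain for a given setting. The function name min_row_groups and the sample row count of 10 million are assumptions made for illustration only. Because row_group_size is a maximum, the result is a lower bound on the number of row groups, not an exact figure.

```python
import math


def min_row_groups(total_rows: int, row_group_size: int = 122_880) -> int:
    """Return a lower bound on the number of row groups in a file.

    row_group_size caps the rows per group, so a file needs at least
    ceil(total_rows / row_group_size) groups. The name and signature
    are hypothetical and exist only for this illustration.
    """
    if row_group_size < 2048:
        raise ValueError("row_group_size must be at least 2048")
    return math.ceil(total_rows / row_group_size)


# Compare the default value with two values from the recommended range,
# using an assumed table of 10 million rows.
for size in (122_880, 500_000, 1_000_000):
    print(size, min_row_groups(10_000_000, size))
```

Fewer, larger row groups tend to compress better, while more, smaller row groups give readers more units to process in parallel and finer-grained statistics for filtering.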
Example 29.3.
{
    "compression": "zstd",
    "compression_level": 9,
    "row_group_size": 500000
}
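A parameters file like the one above can also be generated and checked before use. The sketch below is one possible way to do this in Python, assuming only the limits listed in this section. The helper build_storage_params is hypothetical, it treats lz4 and lz4_raw as two separate accepted spellings (an assumption), and its checks are client-side conveniences that do not replace the validation the stored procedures perform themselves.

```python
import json

# Values listed in this section. Treating "lz4" and "lz4_raw" as two
# separate spellings is an assumption of this sketch.
ALLOWED_COMPRESSION = {
    "snappy", "zstd", "gzip", "lz4", "lz4_raw", "brotli", "uncompressed",
}


def build_storage_params(compression, compression_level=None, row_group_size=None):
    """Build a dict of Parquet storage parameters (hypothetical helper).

    Optional parameters are omitted when not given, so the documented
    defaults apply.
    """
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported compression: {compression!r}")
    params = {"compression": compression}
    if compression_level is not None:
        if not 1 <= compression_level <= 22:
            raise ValueError("compression_level must be from 1 to 22")
        params["compression_level"] = compression_level
    if row_group_size is not None:
        if row_group_size < 2048:
            raise ValueError("row_group_size must be at least 2048")
        params["row_group_size"] = row_group_size
    return params


# Reproduce the values from the example above and serialize them as JSON.
params = build_storage_params("zstd", compression_level=9, row_group_size=500_000)
text = json.dumps(params, indent=4)
print(text)
```

The printed text can be saved to a file and supplied when executing metastore.add_files or metastore.copy_table. The helper requires compression to be passed explicitly, which is a choice of this sketch and not a documented requirement.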