The following is an example of the procopy configuration file fragment in the YAML format that lists procopy-specific configuration parameters:
version: 1
work_dir: ""
procopy_options:
  readers: 11
  loaders: 11
  batch_bytes: 1MiB
  quick_fetch_lob_size: 20MiB
  money_precision: 2
  window_rows_limit: 1000000
  disable_order_by: false
  inprogress_timeout: 10m0s
  show_load_percent: true
  snapshot_id: null
  parquet_options:
    path: ./parquet
    max_size: 300MiB
    page_size: 8KiB
    compression: GZIP
    row_group_size: 128MiB
For how to set values for time configuration parameters, refer to Section 4.3.6.
The following table explains procopy-specific configuration parameters:
Table 5.1. procopy-Specific Configuration Parameters
| Name | Description | Default Value | Example |
|---|---|---|---|
version | Configuration file version | ||
procopy_options | procopy global parameters | ||
procopy_options.readers | Number of parallel read processes. Each of them processes one task at a time until it is finished. | NumCPU/2 | |
procopy_options.loaders | Number of parallel write processes. Each process retrieves batches from the queue and applies them to the destination. | NumCPU/2 | |
procopy_options.load.max_retries | Number of batch sending attempts in case of error | 0 | |
procopy_options.load.retry_timeout | Time between batch sending attempts in case of error | 0 | |
procopy_options.log | Logging options | ||
procopy_options.batch_bytes | Global limitation on the size of one batch, in bytes. The parameter value must be greater than 0. | 1MiB | |
procopy_options.truncate | Global flag. If enabled, all destination tables will be truncated before the data loading. If not specified or set to Note that for non-completed tasks to migrate heaps or raw queries, it is required to set this value to | null | |
procopy_options.quick_fetch_lob_size | Threshold for the LOB size that defines how to fetch LOBs. If the size of a LOB is less than this value, the binary protocol is used and all the data in the LOB is loaded into random-access memory (RAM). If the size of the LOB is greater than this value, the LOB is read in parts. The value must be at least 1024 bytes. | 20MiB | |
procopy_options.bfile_copy | If specified, all the BFILE objects are copied as a whole from Oracle during the data loading. By default, only identifiers are copied, that is, the directory aliases and filenames. Note that prosync cannot load BFILE objects. | false | |
procopy_options.problem_rows | Section for setting up the work with rows that cannot be inserted | ||
procopy_options.problem_rows.local | Section for the local setting up | ||
procopy_options.problem_rows.local.save_dir | Path to the directory where the problematic rows will be stored | ||
procopy_options.money_precision | Number of fractional digits in the destination database. Depends on the lc_monetary configuration parameter on the destination server. Meaningful only for Postgres Pro and the MONEY type. In other cases, leave the default value. | 2 | |
procopy_options.window_rows_limit | Size of the sliding window. Number of rows selected at a time. | ||
procopy_options.disable_order_by | Disables loading of the data sorted by the unique primary key. In this case, the whole table is retrieved through a cursor. | ||
procopy_options.disable_index_hint | Disables index hints when creating select queries | ||
procopy_options.inprogress_timeout | Maximum time to read a data block | 1s | |
procopy_options.show_load_percent | Output the percent of the task execution | true | |
procopy_options.encoder_chan_size | Size of a buffer used by encoders | 10000 | |
procopy_options.snapshot_id | Global identifier of the PostgreSQL/Postgres Pro snapshot. Used if the snapshot configuration parameter is needed, but not specified for the task. | 10000 | |
procopy_options.sub_task_rows | Global setting of the number of rows in a task for parallel reading of tables. Used to split reading large tables into subtasks. The value of zero turns off splitting into subtasks. | 0 | |
procopy_options.enable_auto_snapshot | Global flag to enable automatic creation of a snapshot in PostgreSQL/Postgres Pro. If true, a snapshot is created to be used to get data. Can be redefined at the task level using the tasks.enable_auto_snapshot configuration parameter. | false | |
procopy_options.parquet_options | Section that specifies options related to the Parquet format destination | parquet_options: {path: ./parquet, max_size: 300MB, page_size: 8kB, compression: GZIP, row_group_size: 128MB} | |
procopy_options.parquet_options.path | Absolute or relative path to the directory where Parquet files will be stored | ./parquet | |
procopy_options.parquet_options.max_size | Maximum size of a Parquet file. If writing more data would make a file exceed this value, a new file is created. The value must be at least 1024 bytes. | 300MiB | |
procopy_options.parquet_options.page_size | Page size in a Parquet file. See https://parquet.apache.org/docs/file-format/configurations/ for the concept of a page in Parquet. The value must be at least 1024 bytes. | 8KiB | |
procopy_options.parquet_options.compression | Codec to compress data in a Parquet file. Supported codecs are UNCOMPRESSED, SNAPPY, GZIP, and LZ4. LZO, BROTLI, ZSTD, and LZ4_RAW codecs are not supported. See https://parquet.apache.org/docs/file-format/data-pages/compression/ for information on codecs for Parquet. | GZIP | |
procopy_options.parquet_options.row_group_size | Size of a row group in a Parquet file. See https://parquet.apache.org/docs/file-format/configurations/ for more details. The value must be at least 1024 bytes. | 128MiB | |
procopy_options.to_char | Key-value dictionary of column types that must be converted to a string at the database side. Only applicable for Oracle. Each key contains the column type, and the value is the format. Key-value pairs are comma-separated. | "to_char": {"DATE":"YYYY-MM-DD HH24:MI:SS", "NUMBER":""} | |
procopy_options.null_char_replace | A global parameter to replace the null symbol (the symbol with the zero code). Can be redefined at the task level for each specific column using the null_char_replace field of the tasks.transform configuration parameter. | If the source is Oracle and destination is PostgreSQL or Postgres Pro, an empty string (""), otherwise nil | null_char_replace: "\n" |
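
Several parameters in Table 5.1 do not appear in the example fragment at the top of this section. The following is a minimal sketch of how they might be written for an Oracle source. The nesting is an assumption inferred from the dotted parameter names in the table, the directory name and all values are purely illustrative, and the duration syntax follows the `10m0s` style used above (refer to Section 4.3.6 for the authoritative format).

```yaml
# Illustrative sketch only: the nesting is inferred from the dotted names
# in Table 5.1, and every value here is an example, not a recommendation.
version: 1
procopy_options:
  load:
    max_retries: 3          # resend a failed batch up to 3 times (default: 0)
    retry_timeout: 5s       # pause between attempts; assumed duration syntax
  problem_rows:
    local:
      save_dir: ./problem_rows   # hypothetical directory for rows that cannot be inserted
  sub_task_rows: 1000000    # split large tables into subtasks; 0 turns splitting off
  to_char:                  # Oracle only: column type -> format string
    DATE: "YYYY-MM-DD HH24:MI:SS"
    NUMBER: ""
  null_char_replace: ""     # replace the zero-code character with an empty string
```

The `to_char` mapping above is the block-style equivalent of the inline dictionary shown in the table. Per the table, `null_char_replace` can be redefined for a specific column at the task level.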