The following is an example of the procheck configuration file in the YAML format:
options:
skip_lobs: true
skip_auto_count_test: true
skip_passed_tables: true
source:
drivername: oracle
host: 192.168.23.251
port: 1523
database: db18c
username: pg_migr
password: <password>
destination:
drivername: pgx
host: 192.168.21.191
port: 5432
database: parus
username: data_quality
password: <password>
max_conn: 10
comparison_rules:
schema1:
source_schema: xpress
destination_schema: xpress
exclude_tables: [unnecessary_table]
tests:
- name: test1
tables: [all]
type: count
- name: test2
tables: [all]
type: max_id
schema2:
source_schema: src_schema
destination_schema: dst_schema
table_mapping:
src_table1: dst_table1
src_table2: dst_table2
src_table3: dst_table3
column_mapping:
src_table1:
src_column1: dst_column1
src_column2: dst_column2
skip_lobs:
table1: true
table2: false
exclude_columns:
table1: [column1, column2]
table2: [column3, column4]
source_where:
src_table4: id >= 100 AND id < 1000
tests:
- name: test1
tables: [src_table1, src_table2]
type: count
- name: test2
tables: [all]
exclude_tables: [src_table1]
exclude_columns:
table3: [column5]
type: all_rows
- name: test3
tables: [all_other]
type: max_id
source_where:
src_table3: src_id >= 100 AND src_id < 1000
destination_where:
dst_table3: dst_id >= 100 AND dst_id < 1000
- name: test4
tables: ['@filename.txt']
type: table_list
- name: test5
type: query
source_query: SELECT col1, COUNT(*) FROM src_schema.table1 GROUP BY col1
destination_query: '@query2.sql'
The table below explains procheck configuration parameters. Note that some global configuration parameters can be redefined at the level of the schema and some parameters can be specified both at the schema level and for individual tests. Lower-level configuration parameters with the same name redefine the higher-level ones. See an example of the configuration file:
Table 5.4. procheck Configuration Parameters
| Name | Description | Default Value | Example |
|---|---|---|---|
options | Global configuration parameters | ||
options.skip_lobs | Exclude columns of the CLOB, BLOB, BFILE, and SFILE types from row-wise comparison. For PostgreSQL, also exclude columns of types specified by the pg_hashed_types configuration parameter. The global value can be redefined for specific tables. | false | skip_lobs: true |
options.pg_hashed_types | Defines additional PostgreSQL data types that are compared by computing their MD5 hashes, in addition to those that are always compared this way, such as SFILE or types defined in the dbms_lob extension. The global value can be redefined for specific tables at the schema level or for an individual test. If the value is an empty list, the default value is used. | [BYTEA, TEXT] | |
options.compare_large_bfiles_by_size | Compare BFILE objects larger than 1 GB by sizes. | false — BFILE objects are compared by computing their hash values | |
options.skip_auto_count_test | Turns off automatic execution of count tests for tables for which tests were skipped due to lack of a required key | false | |
options.skip_passed_tables | Turns off the output of detailed information on tables with the PASSED test status to only output the total number of such tables. | false | |
options.all_rows_max_mistakes | Minimum number of errors that interrupts execution of all_rows tests | 0 — the tests are performed for all rows | |
options.chunk_size | Approximate number of rows in chunks into which the table is split during execution of all_rows tests. When the value is zero, the default value is used. If the value is less than zero, tables are not split into chunks. | 100000 | |
options.max_parallel_chunks | Maximum number of table chunks compared in parallel for all_rows tests. If the value is less than or equal to zero, the default value is used. | 10 | |
options.max_parallel_tables | Maximum number of tables compared in parallel. If the value is less than or equal to zero, the default value is used. | 50 | |
options.ora_empty_string_is_null | Defines processing of Oracle empty strings. If true, an empty string is equivalent to NULL. | true | |
options.ora_true_values | Array of string values equivalent to boolean true in Oracle | [Y] | [Y, YES] |
options.ora_false_values | Array of string values equivalent to boolean false in Oracle | [N] | [N, NO] |
options.comparison_accuracy | Minimum number of consequtive 9 or 0 in the fractional part that causes rounding of the number to the previous decimal place | 0 — no rounding | |
options.round_places | Number of decimal places to which the number is rounded when compared. If the value is less than zero, the whole-number part is rounded to the nearest 10^(-round_places) | 0 — no rounding | |
options.ora_string_types | List of Oracle data types for which COLLATE must be specified when sorting | [CHAR, VARCHAR2] | |
options.pg_string_types | List of PostgreSQL data types for which COLLATE must be specified when sorting | [BPCHAR, VARCHAR] | |
options.skip_type_mismatch | Skip errors of impossible comparison of different types in all_rows tests | false | |
source | Configuration parameters to set up the connection to the source DB | ||
source.drivername | Driver for the connection. Possible values: oracle — the godror driver, which works through the Oracle Instant Client, or go-ora — the go-ora driver that is developed in Go. | ||
source.host | Host name | ||
source.port | Connection port | 1521 for Oracle | |
source.database | Name of the database to connect to | ||
source.username | Connection login | ||
source.password | Connection password | ||
source.max_conn | Maximum number of connections to DB. If the value is less than or equal to zero, the number is unlimited. | Unlimited | |
destination | Configuration parameters to set up the connection to the destination DB | ||
destination.drivername | Driver for the connection. Possible values: postgresql or the pgx alias — the pgx driver. | ||
destination.host | Host name | ||
destination.port | Connection port | 5432 for PostgreSQL | |
destination.database | Name of the database to connect to | ||
destination.username | Connection login | ||
destination.password | Connection password | ||
destination.max_conn | Maximum number of connections to DB. If the value is less than or equal to zero, the number is unlimited. | Unlimited | |
comparison_rules | Data comparison rules | ||
comparison_rules.source_schema | Schema name in the source DB | ||
comparison_rules.destination_schema | Schema name in the destination DB | ||
comparison_rules.exclude_tables | Comma-separated list of tables excluded from all tests. Enclosed in square brackets. Can contain a special value, such as '@filename', which means that the list should be read from the file specified by this value. |
| |
comparison_rules.table_mapping | Mapping of table names | src_table1: dst_table1 src_table2: dst_table2 | |
comparison_rules.exclude_columns | List of table columns to be excluded from all tests. | exclude_columns: table1: [column1, column2] table2: [column3, column4] | |
comparison_rules.column_mapping | Mapping of column names in tables | src_table1: src_column1: dst_column1 src_column2: dst_column2 | |
comparison_rules.skip_lobs | Redefines the skip_lobs global parameter for single tables | skip_lobs: table1: true table2: false | |
comparison_rules.source_where | Filtering condition for the source table. An SQL statement that returns boolean. | src_table3: src_id >= 100 AND src_id < 1000 | |
comparison_rules.destination_where | Filtering condition for the destination table. An SQL statement that returns boolean. | dst_table3: src_id >= 100 AND src_id < 1000 | |
comparison_rules.tests | Tests for the comparison rule | ||
comparison_rules.tests.name | Name of the test | ||
comparison_rules.tests.type | Type of the test. See Table 6.1 for the list of possible values. | ||
comparison_rules.tests.tables | Comma-separated list of tables to perform the test for. Enclosed in square brackets. Can contain special values:
| [src_table1, src_table2] or [all_other] | |
comparison_rules.tests.source_query | Text of the source query for a test of the query type (see Table 6.1 for details). To read the query from a file, specify the value as '@file_name'. | SELECT col1, COUNT(*) FROM src_schema.table1 GROUP BY col1 | |
comparison_rules.tests.destination_query | Text of the destination query for a test of the query type (see Table 6.1 for details). To read the query from a file, specify the value as '@file_name'. | '@destination_query.sql' |