5.4. procheck Configuration Parameters #

The following is an example of the procheck configuration file in the YAML format:

options:
  skip_lobs: true
  skip_auto_count_test: true
  skip_passed_tables: true

source:
  drivername: oracle
  host: 192.168.23.251
  port: 1523
  database: db18c
  username: pg_migr
  password: <password>

destination:
  drivername: pgx
  host: 192.168.21.191
  port: 5432
  database: parus
  username: data_quality
  password: <password>
  max_conn: 10

comparison_rules:
  schema1:
    source_schema: xpress
    destination_schema: xpress
    exclude_tables: [unnecessary_table]
    tests:
      - name: test1
        tables: [all]
        type: count
      - name: test2
        tables: [all]
        type: max_id
  schema2:
    source_schema: src_schema
    destination_schema: dst_schema
    table_mapping:
      src_table1: dst_table1
      src_table2: dst_table2
      src_table3: dst_table3
    column_mapping:
      src_table1:
        src_column1: dst_column1
        src_column2: dst_column2
    skip_lobs:
      table1: true
      table2: false
    exclude_columns:
      table1: [column1, column2]
      table2: [column3, column4]
    source_where:
      src_table4: id >= 100 AND id  < 1000
    tests:
      - name: test1
        tables: [src_table1, src_table2]
        type: count
      - name: test2
        tables: [all]
        exclude_tables: [src_table1]
        exclude_columns:
          table3: [column5]
        type: all_rows
      - name: test3
        tables: [all_other]
        type: max_id
        source_where:
          src_table3: src_id >= 100 AND src_id  < 1000
        destination_where:
          dst_table3: dst_id >= 100 AND dst_id  < 1000
      - name: test4
        tables: ['@filename.txt']
        type: table_list
      - name: test5
        type: query
        source_query: SELECT col1, COUNT(*) FROM src_schema.table1 GROUP BY col1
        destination_query: '@query2.sql'

The table below explains procheck configuration parameters. Note that some global configuration parameters can be redefined at the level of the schema and some parameters can be specified both at the schema level and for individual tests. Lower-level configuration parameters with the same name redefine the higher-level ones. See an example of the configuration file:

Table 5.4. procheck Configuration Parameters

NameDescriptionDefault ValueExample
optionsGlobal configuration parameters  
options.skip_lobsExclude columns of the CLOB, BLOB, BFILE, and SFILE types from row-wise comparison. For PostgreSQL, also exclude columns of types specified by the pg_hashed_types configuration parameter. The global value can be redefined for specific tables.falseskip_lobs: true
options.pg_hashed_typesDefines additional PostgreSQL data types that are compared by computing their MD5 hashes, in addition to those that are always compared this way, such as SFILE or types defined in the dbms_lob extension. The global value can be redefined for specific tables at the schema level or for an individual test. If the value is an empty list, the default value is used. [BYTEA, TEXT] 
options.compare_large_bfiles_by_sizeCompare BFILE objects larger than 1 GB by sizes.falseBFILE objects are compared by computing their hash values 
options.skip_auto_count_testTurns off automatic execution of count tests for tables for which tests were skipped due to lack of a required keyfalse 
options.skip_passed_tablesTurns off the output of detailed information on tables with the PASSED test status to only output the total number of such tables.false 
options.all_rows_max_mistakesMinimum number of errors that interrupts execution of all_rows tests0 — the tests are performed for all rows 
options.chunk_sizeApproximate number of rows in chunks into which the table is split during execution of all_rows tests. When the value is zero, the default value is used. If the value is less than zero, tables are not split into chunks.100000 
options.max_parallel_chunksMaximum number of table chunks compared in parallel for all_rows tests. If the value is less than or equal to zero, the default value is used.10 
options.max_parallel_tablesMaximum number of tables compared in parallel. If the value is less than or equal to zero, the default value is used.50 
options.ora_empty_string_is_nullDefines processing of Oracle empty strings. If true, an empty string is equivalent to NULL.true 
options.ora_true_valuesArray of string values equivalent to boolean true in Oracle[Y][Y, YES]
options.ora_false_valuesArray of string values equivalent to boolean false in Oracle[N][N, NO]
options.comparison_accuracyMinimum number of consequtive 9 or 0 in the fractional part that causes rounding of the number to the previous decimal place0 — no rounding 
options.round_placesNumber of decimal places to which the number is rounded when compared. If the value is less than zero, the whole-number part is rounded to the nearest 10^(-round_places)0 — no rounding 
options.ora_string_typesList of Oracle data types for which COLLATE must be specified when sorting[CHAR, VARCHAR2] 
options.pg_string_typesList of PostgreSQL data types for which COLLATE must be specified when sorting[BPCHAR, VARCHAR] 
options.skip_type_mismatchSkip errors of impossible comparison of different types in all_rows testsfalse 
sourceConfiguration parameters to set up the connection to the source DB  
source.drivernameDriver for the connection. Possible values: oracle — the godror driver, which works through the Oracle Instant Client, or go-ora — the go-ora driver that is developed in Go.  
source.hostHost name  
source.portConnection port1521 for Oracle 
source.databaseName of the database to connect to  
source.usernameConnection login  
source.passwordConnection password  
source.max_connMaximum number of connections to DB. If the value is less than or equal to zero, the number is unlimited.Unlimited 
destinationConfiguration parameters to set up the connection to the destination DB  
destination.drivernameDriver for the connection. Possible values: postgresql or the pgx alias — the pgx driver.  
destination.hostHost name  
destination.portConnection port5432 for PostgreSQL 
destination.databaseName of the database to connect to  
destination.usernameConnection login  
destination.passwordConnection password  
destination.max_connMaximum number of connections to DB. If the value is less than or equal to zero, the number is unlimited.Unlimited 
comparison_rulesData comparison rules  
comparison_rules.source_schemaSchema name in the source DB  
comparison_rules.destination_schemaSchema name in the destination DB  
comparison_rules.exclude_tablesComma-separated list of tables excluded from all tests. Enclosed in square brackets. Can contain a special value, such as '@filename', which means that the list should be read from the file specified by this value. 
[unnecessary_table] or ['@filename.txt']
comparison_rules.table_mappingMapping of table names 
src_table1: dst_table1
src_table2: dst_table2
comparison_rules.exclude_columnsList of table columns to be excluded from all tests. 
exclude_columns:
  table1: [column1, column2]
  table2: [column3, column4]
comparison_rules.column_mappingMapping of column names in tables 
src_table1:
  src_column1: dst_column1
  src_column2: dst_column2
comparison_rules.skip_lobsRedefines the skip_lobs global parameter for single tables 
skip_lobs:
  table1: true
  table2: false
comparison_rules.source_whereFiltering condition for the source table. An SQL statement that returns boolean. src_table3: src_id >= 100 AND src_id < 1000
comparison_rules.destination_whereFiltering condition for the destination table. An SQL statement that returns boolean. dst_table3: src_id >= 100 AND src_id < 1000
comparison_rules.testsTests for the comparison rule  
comparison_rules.tests.nameName of the test  
comparison_rules.tests.typeType of the test. See Table 6.1 for the list of possible values.  
comparison_rules.tests.tablesComma-separated list of tables to perform the test for. Enclosed in square brackets. Can contain special values:
  • all — all tables in the schema

  • all_other — all tables in the schema that are not explicilty listed in test descriptions

  • '@filename' — the list should be read from the file specified by this value

 [src_table1, src_table2] or [all_other]
comparison_rules.tests.source_queryText of the source query for a test of the query type (see Table 6.1 for details). To read the query from a file, specify the value as '@file_name'. SELECT col1, COUNT(*) FROM src_schema.table1 GROUP BY col1
comparison_rules.tests.destination_queryText of the destination query for a test of the query type (see Table 6.1 for details). To read the query from a file, specify the value as '@file_name'. '@destination_query.sql'