procheck

procheck — utility to verify data after a migration

Synopsis

procheck [option...]

Description #

procheck is a utility for data verification using specialized tests. It is intended to be used after a data migration performed by procopy and prosync. The configuration parameters, specified in the configuration file, define how the tests are performed.

procheck performs the following verification:

  • Compares the values of unique identifiers in source and destination tables.

  • Compares the number of rows in tables.

  • Compares data row-by-row.

  • Compares query results.

Usage #

Generating the Configuration File #

To generate a sample configuration file with the config_file name, execute the command:

procheck --generate_config config_file

You can later update configuration values there.

Launching the DB Verification #

To launch the DB verification that is specified in the config_file configuration file, execute the command:

procheck -c config_file

Configuring the DB Connection #

The connection to the source DB is specified in the source section of the configuration file, and the connection to the destination DB is specified in the destination section of the configuration file. See also an example of the configuration file.

Specifying the Tests to Perform #

Tests that procheck performs are defined in the comparison_rules section of the configuration file. See also an example of the configuration file.

For each schema, the source and destination databases must be specified, as well as a list of tests.

To exclude unnecessary tables from tests, use the optional configuration parameter exclude_tables, which can be defined for the whole schema and for an individual test.

To exclude table columns from row-wise comparison, use the optional configuration parameter exclude_columns, which can also be defined for the whole schema and for an individual test.

You can define the mapping of table names between the source and destination databases at the schema level. To do this, use the optional configuration parameter table-mapping.

To define the mapping of column names between the source and destination databases at the schema level, use the optional configuration parameter column-mapping.

To specify tables to be included in tests, use the required configuration parameter tables, which should be defined for each test.

You can set the the optional global parameter skip_lobs to specify whether to compare large objects for specific tables. The global parameter can be redefined for specific tables using the skip_lobs parameter in the comparison_rules section.

Types of tests are listed in the table below:

Table 6.1. procheck Types of Tests

Configuration value of typeDescription
min_idComparison of minimum values of a unique key. Only performed for tables with a unique key that consists of one column.
max_idComparison of maximum values of a unique key. Only performed for tables with a unique key that consists of one column.
minmax_idComparison of minimum and maximum values of a unique key. Only performed for tables with a unique key that consists of one column.
countComparison of the number of rows in tables.
queryComparison of results of user queries to the source and destination DBs. The queries are specified in the source_query and destination_query parameters. When writing the queries, you must ensure the same order of rows and columns in the results. Column names do not matter for the comparison. The tables and exclude_tables parameters are not used for tests of this type.
all_rowsComparison of all rows in tables. For columns of the BLOB, CLOB, BYTEA, BFILE, SFILE, and TEXT types, values of the MD5 hash are compared. If the value of skip_lobs is true, comparison of these types of columns is skipped. The test is only performed if the table has a unique key. If this key is numeric and consists of one column, the table can be split into chunks by the value of this key to be compared in parallel. The maximum number of rows in a chunk and the maximum number of chunks processed in parallel are controlled by the chunk_size and max_parallel_chunks parameters, respectively.
table_listVerifying the availability of tables in DB.
col_typesVerifying the comparability of data types in the source and destination databases.

For tables for which tests were skipped because of the lack of a required key, count tests are automatically performed. If the count test is already specified in the configuration file for some table, the test is not repeated. Automatically preforming count tests can be turned off with the skip_auto_count_test configuration parameter.

Specifying the Row Filtering #

You can filter table rows to perform tests on. To specify filters for tables of the source DB, use the source_where configuration parameter. To specify filters for tables of the destination DB, use the destination_where configuration parameter. Both can be specified either at the schema level or for an individual test. Filters specified for individual tests override those specified at the schema level. The filter condition can be any SQL statement that returns the boolean type.

You need to specify destination_where only if column names in the filter for the destination DB differ from column names in the source DB. In this case, the table name in the destination DB must be specified.

Return Codes #

procheck may return one of the following codes:

  • 0 — Successful termination. All the tests have been successfully passed, that is, have the PASSED status.

  • 1 — The application terminated with an error.

  • 3 — The application terminated due to an error in the configuration or in command-line options.

  • 4 — The application terminated due to an error of the database access.

  • 5 — The application terminated due to an error of the file input/output.

  • 6 — The application terminated due to an error of the terminal input/output.

  • 7 — Normal termination, but at least one test completed with the ERROR status.

  • 8 — Normal termination, all the tests completed without errors, but at least one test completed with the FAILED status.

Command-Line Reference #

This section describes procheck commands. Optional parameters are enclosed in square brackets.

procheck #

procheck -c|--generate_config config_file
[--json] [-v|--version]
[config_options] [output_control_options]

Perfroms the tests if the -c option is specified and generates a procheck configuration file if the --generate_config option is specified. config_file specifies the name of the configuration file.

--json

Output the results of tests in the JSON format and save them to the config_file_results.json file. By default, the results are output in a plain-text tabular human-readable format and saved to the config_file_results.txt file.

-v
--version

Output the procheck version.

Command-Line Configuration Options #

Additionally, command-line configuration options can be specified, which override respective settings in the configuration file:

-p
--skip_passed_tables

Sets the skip_passed_tables configuration parameter to true, which means no display of detailed information on tables with the PASSED test status.

-s integer
--chunk_size integer

Sets the value of the chunk_size configuration parameter.

-g integer
--max_parallel_chunks integer

Sets the value of the max_parallel_chunks configuration parameter.

-t integer
--max_parallel_tables integer

Sets the value of the max_parallel_tables configuration parameter.

Command-Line Output-Control Options #

Command-line options can also be specified that control displaying the information on the screen:

-i

Only display information on tables: list of columns and their types.

--no_tui

Display no progress information for tests.

-q
--quiet

Display no progress information on tests and their results.

The --skip_passed_tables option above also controls displaying the information on the screen.