G.5. pgpro_validate — check data integrity of a Postgres Pro instance #
G.5.1. Overview #
The pgpro_validate utility provides tools for checking the integrity of the Postgres Pro instance files and objects, performing the following types of operations:
database block checksum verification
basedirectory verificationfile system integrity checks (
global,pg_xact,pg_wal, etc.)index integrity checks
referential integrity checks of system catalogs (
pg_depend,pg_shdepend, etc.)table verification
constraint verification (
NOT NULL,FOREIGN KEY,UNIQUE)file access permission verification
tablespace file verification
pgpro_validate operates in read-only mode and does not modify the database cluster. The utility can run locally on the host with the PGDATA directory or in remote mode via SSH, where the local instance launches pgpro_validate on a remote host and streams its output.
G.5.2. Installation #
pgpro_validate is provided with Postgres Pro Standard as a separate pre-built package pgpro-validate-std-16 (for the detailed installation instructions, refer to Chapter 16).
G.5.3. Considerations and Limitations #
Before you start working with pgpro_validate, note the operation specifics described below.
G.5.3.1. General #
For the following commands that require a database connection, the Postgres Pro server must be running:
amcheck,constraints,database_files,index,integrity,molotilka, andtablespace. These commands also require the database user to have superuser privileges or appropriate roles to read system catalogs and user tables.Commands that only access the file system —
checksum,permissions,postgres_files— can run while the server is stopped, provided that files are not modified during the check.The user running pgpro_validate must have read permissions for all files in
PGDATA.For the
amcheckcommand, the amcheck or amcheck_next extension must be installed in each target database. Databases without the extension are skipped with a warning.The
molotilkacommand requires the pageinspect extension. On the first startup, pgpro_validate attempts to install it automatically (CREATE EXTENSION IF NOT EXISTS pageinspect), so the database user must have appropriate privileges or the extension must be preinstalled.
G.5.3.2. CFS (Compressed File System) #
pgpro_validate supports database directory-level checks for CFS files using the following commands:
amcheck,constraints,database_files,index,integrity,molotilka, andtablespace.The
database_filescommand takes into account the*.cfmfiles and does not report them as orphaned.The
permissionsandpostgres_filescommands only check the file system metadata.The
checksum dirandchecksum filecommands do not work with CFS.
G.5.3.3. Remote Mode #
Remote SSH connections do not support interactive password entry. Key-based authentication must be set up in advance.
The database password is passed to the remote host via the standard input of an SSH session and exported on the remote side as the
PGPASSWORDenvironment variable. The password does not appear in command-line arguments orpsoutput.Major versions of the local and remote pgpro_validate instances must match. Minor version mismatches produce a warning but the check continues.
In plain output format, remote process output is streamed. In JSON format, output is buffered and validated as JSON, and a synthetic
{"error": ..., "exit_code": ...}object is returned on error.
G.5.4. Commands #
G.5.4.1. No Command #
pgpro_validate [common_options] [remote_options]
Runs all applicable checks sequentially.
Checks requiring a database connection run only when a connection string is provided, and those requiring PGDATA access run only when the --pgdata option is set. Otherwise, the checks are skipped with an informational message.
Example:
pgpro_validate \
--pgdata=/var/lib/postgresql/15/data \
--username=postgres \
--dbname=postgres
G.5.4.2. amcheck #
pgpro_validate [common_options] [remote_options] amcheck [--all] [--check-db=database_name] [--checkunique] [--exclude-schema=schema_name] [--heapallindexed]
Performs logical B-tree index verification using the amcheck extension.
This command requires only a database connection. The --pgdata option is not used.
--check-db=database_nameA comma-separated list of databases in which to check indexes. If this option is omitted, all databases are checked.
Unlike
indexandconstraints,amcheckuses the--check-dboption rather than--db.--exclude-schema=schema_nameA comma-separated list of schemas to exclude from the check.
--heapallindexedEnable additional check to ensure that all row versions visible according to the current snapshot are present in the index.
--checkuniqueAdditionally check unique constraint enforcement.
-a--allEnable all available checks (
--heapallindexedand--checkunique).
Example:
pgpro_validate amcheck --all \
--dbname=postgres \
--username=postgres
G.5.4.3. checksum dir #
pgpro_validate [common_options] [remote_options] checksum [--percent=percent] dir --pgdata=datadir[dirpath]
Verifies block checksums of all relation files in the specified directory.
This command does not require a database connection but requires the --pgdata option.
dirpath is an optional path to the directory to check. If this argument is omitted, the PGDATA/base directory is checked.
The --percent option specifies the percentage of randomly selected blocks to check within each file. Possible values — from 0 to 100. The default value is 100 (all blocks).
Example:
pgpro_validate checksum --percent=20 dir \
--pgdata=/var/lib/postgresql/15/data
G.5.4.4. checksum file #
pgpro_validate [common_options] [remote_options] checksum [--percent=percent] file --pgdata=datadir[--bstart=number] [--bend=number]path
Validates block checksums of a single relation file.
This command does not require a database connection but requires the --pgdata option, even if path points to a file outside PGDATA.
path is a path to the file to check. Must be a regular file (not a directory or symbolic link).
--bstart=numberThe starting block number (inclusive).
--bend=numberThe ending block number (inclusive). If this option is not specified, all blocks to the end of the file are checked.
--percent=percentThe percentage of randomly selected blocks from the specified range. Possible values — from
0to100. The default value is100.
Example:
pgpro_validate checksum file \
--pgdata=/var/lib/postgresql/15/data \
--bstart=0 --bend=1024 \
/var/lib/postgresql/15/data/base/16384/2619
G.5.4.5. constraints #
pgpro_validate [common_options] [remote_options] constraints [--db=database_name] [-t=table_name]
Checks integrity constraints (NOT NULL, FOREIGN KEY, UNIQUE) for user tables.
--db=database_nameA comma-separated list of databases in which to check constraints. If this option is omitted, all databases are checked.
-t=table_name--table=table_nameA comma-separated list of tables to check. If this option is omitted, all user tables are checked.
Example:
pgpro_validate constraints \
--db=mydb \
--table=public.orders,public.customers \
--username=postgres
G.5.4.6. database_files #
pgpro_validate [common_options] [remote_options] database_files --pgdata=datadir
Compares files in the base and tablespace directories (pg_tblspc) with entries in the pg_class system catalog. It detects orphaned files that have no corresponding relations, as well as relations for which the expected files are missing.
This command requires both access to the PGDATA directory and a database connection.
Example:
pgpro_validate database_files \
--pgdata=/var/lib/postgresql/15/data \
--username=postgres
G.5.4.7. index #
pgpro_validate [common_options] [remote_options] index [--db=database_name]
Checks index format version and metadata consistency.
--db=database_nameA comma-separated list of databases. If this option is omitted, all databases are checked.
Example:
pgpro_validate index \
--db=mydb \
--username=postgres
G.5.4.8. integrity #
pgpro_validate [common_options] [remote_options] integrity [--select-from-relation[=true|false]]
Checks referential integrity of system catalogs: consistency of pg_depend, pg_shdepend, pg_class, pg_namespace, pg_authid, etc.
--select-from-relation[=true|false]Enable an additional check that executes the
SELECTcommand for each found relation to verify readability.
Example:
pgpro_validate integrity \
--dbname=postgres \
--username=postgres \
--select-from-relation
G.5.4.9. molotilka #
pgpro_validate [common_options] [remote_options] molotilka --pgdata=datadir[--percent=percent] [-t=table_name]
Performs a full page scan of tables and detects corrupted pages.
--percent=percentThe percentage of randomly selected pages to scan. Possible values — from
0to100. The default value is100.-t=table_name--table=table_nameA comma-separated list of tables to scan. If this option is omitted, all user tables are scanned.
Example:
pgpro_validate molotilka --percent=10 \
--pgdata=/var/lib/postgresql/15/data \
--dbname=mydb \
--username=postgres
G.5.4.10. permissions #
pgpro_validate [common_options] [remote_options] permissions --pgdata=datadir
Checks the permissions, owner, and group of all files and directories in PGDATA against the expected Postgres Pro values.
This command does not require a database connection but requires the --pgdata option.
permissions runs in parallel. Use the --jobs option to set the number of threads.
Example:
pgpro_validate permissions --jobs=4 \
--pgdata=/var/lib/postgresql/15/data
G.5.4.11. postgres_files #
pgpro_validate [common_options] [remote_options] postgres_files --pgdata=datadir
Checks for the presence of the fixed set of files and directories that Postgres Pro expects to see in PGDATA, for example, PG_VERSION, postgresql.conf, pg_xact, pg_wal.
This command does not require a database connection but requires the --pgdata option.
Example:
pgpro_validate postgres_files \
--pgdata=/var/lib/postgresql/15/data
G.5.4.12. tablespace #
pgpro_validate [common_options] [remote_options] tablespace --pgdata=datadir
Checks tablespace directories: correctness of symbolic links in PGDATA/pg_tblspc, existence of target directories, presence of subdirectories matching the server version, and consistency with pg_tablespace.
This command requires both access to the PGDATA directory and a database connection.
Example:
pgpro_validate tablespace \
--pgdata=/var/lib/postgresql/15/data \
--dbname=postgres
G.5.5. Common Options #
The following options apply to all commands, except where explicitly noted otherwise.
-d=database_name--dbname=database_nameThe database name to connect to. The default value is
postgres.-h=host--hostname=hostThe host name or address of the Postgres Pro server.
-p=port--port=portThe port of the Postgres Pro server. The default value is
5432.-U=username--username=usernameThe database user name. The default value is
postgres.-W=password--password=passwordThe database user password. If this option is not specified, the
PGPASSWORDenvironment variable or a.pgpassfile is used.-D=datadir--pgdata=datadirThe path to the
PGDATAdata directory.--pgdatadoes not apply to theamcheckcommand.--tmp=pathThe path to the temporary directory.
/tmpis used by default.-j=num_threads--jobs=num_threadsThe number of parallel threads for the check. The value must be a positive integer. It defaults to the number of available CPU cores (
runtime.NumCPU()). If the specified value exceeds the number of cores, the command fails with the error “exceed number of CPU cores”.--format={json|plain}The output format. Plain format is used by default. If the option is set to
json, the standard log is disabled, and the result is written as a single JSON document to the standard output.--report-size=integerThe maximum number of error messages to output for a single check.
--log-level={error|warn|info|debug}The logging level.
-?--helpPrint the command-line help.
-v--versionPrint the pgpro_validate version.
G.5.6. Remote Mode Options #
pgpro_validate supports running checks on a remote server via SSH. The local instance launches pgpro_validate on the remote host with the passed parameters.
--remote-host=hostnameThe address of the remote host. Specifying this option activates remote mode.
--remote-path=pathThe path to the pgpro_validate binary on the remote host. This option is required in remote mode.
--remote-port=portThe SSH port. The default value is
22.-u=username--remote-user=usernameThe SSH user.
--ssh-key=pathThe path to the private SSH key file.
--ssh-config=pathThe path to the SSH configuration file.
--ssh-options=optionsAdditional SSH options, separated by a semicolon. For example:
StrictHostKeyChecking=no;UserKnownHostsFile=/dev/null.--remote-versionPrint the pgpro_validate version on the remote host and the local utility version, then exit. If the
--remote-hostoption is not set, the command exits with an error.
G.5.6.1. Example #
Example of a full remote server check:
pgpro_validate \
--remote-host=db-server \
--remote-path=/usr/bin/pgpro_validate \
--remote-user=dba \
--ssh-key=/home/dba/.ssh/id_db_server \
--pgdata=/var/lib/postgresql/15/data \
--username=postgres \
--dbname=mydb