On Fri, Feb 17, 2023 at 10:44:49PM +0100, Daniel Gustafsson wrote:
> When adding a check to pg_upgrade a while back I noticed in a profile that the
> cluster compatibility check phase spend a lot of time in connectToServer. Some
> of this can be attributed to data type checks which each run serially in turn
> connecting to each database to run the check, and this seemed like a place
> where we can do better.
>
> The attached patch moves the checks from individual functions, which each loops
> over all databases, into a struct which is consumed by a single umbrella check
> where all data type queries are executed against a database using the same
> connection. This way we can amortize the connectToServer overhead across more
> accesses to the database.
This change consolidates all the data type checks, so instead of 7 separate
loops through all the databases, there is just one. However, I wonder if
we are leaving too much on the table, as there are a number of other
functions that also loop over all the databases:
* get_loadable_libraries
* get_db_and_rel_infos
* report_extension_updates
* old_9_6_invalidate_hash_indexes
* check_for_isn_and_int8_passing_mismatch
* check_for_user_defined_postfix_ops
* check_for_incompatible_polymorphics
* check_for_tables_with_oids
* check_for_user_defined_encoding_conversions
I suspect consolidating get_loadable_libraries, get_db_and_rel_infos, and
report_extension_updates would be prohibitively complicated and not worth
the effort. old_9_6_invalidate_hash_indexes is only needed for unsupported
versions, so that might not be worth consolidating.
check_for_isn_and_int8_passing_mismatch only loops through all databases
when float8_pass_by_value in the control data differs, so that might not be
worth it, either. The last 4 are for supported versions and, from a very
quick glance, seem possible to consolidate. That would bring us to a total
of 11 separate loops that we could consolidate into one. However, the data
type checks seem to follow a nice pattern, so perhaps this is easier said
than done.
IIUC with the patch, pg_upgrade will immediately fail as soon as a single
check in a database fails. I believe this differs from the current
behavior where all matches for a given check in the cluster are logged
before failing. I wonder if it'd be better to perform all of the data type
checks in all databases before failing so that all of the violations are
reported. Else, users would have to run pg_upgrade, fix a violation, run
pg_upgrade again, fix another one, etc.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com