Re: Reducing connection overhead in pg_upgrade compat check phase - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Reducing connection overhead in pg_upgrade compat check phase
Date
Msg-id 542946CF-0D77-446E-9FA6-028982175320@yesql.se
Whole thread Raw
In response to Re: Reducing connection overhead in pg_upgrade compat check phase  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Reducing connection overhead in pg_upgrade compat check phase
List pgsql-hackers
> On 18 Feb 2023, at 06:46, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Feb 17, 2023 at 10:44:49PM +0100, Daniel Gustafsson wrote:
>> When adding a check to pg_upgrade a while back I noticed in a profile that the
>> cluster compatibility check phase spend a lot of time in connectToServer.  Some
>> of this can be attributed to data type checks which each run serially in turn
>> connecting to each database to run the check, and this seemed like a place
>> where we can do better.
>>
>> The attached patch moves the checks from individual functions, which each loops
>> over all databases, into a struct which is consumed by a single umbrella check
>> where all data type queries are executed against a database using the same
>> connection.  This way we can amortize the connectToServer overhead across more
>> accesses to the database.
>
> This change consolidates all the data type checks, so instead of 7 separate
> loops through all the databases, there is just one.  However, I wonder if
> we are leaving too much on the table, as there are a number of other
> functions that also loop over all the databases:
>
>     * get_loadable_libraries
>     * get_db_and_rel_infos
>     * report_extension_updates
>     * old_9_6_invalidate_hash_indexes
>     * check_for_isn_and_int8_passing_mismatch
>     * check_for_user_defined_postfix_ops
>     * check_for_incompatible_polymorphics
>     * check_for_tables_with_oids
>     * check_for_user_defined_encoding_conversions
>
> I suspect consolidating get_loadable_libraries, get_db_and_rel_infos, and
> report_extension_updates would be prohibitively complicated and not worth
> the effort.

Agreed, the added complexity of the code seems hard to justify unless there are
actual reports of problems.

I did experiment with reducing the allocations of namespaces and tablespaces
with a hashtable, see the attached WIP diff.  There is no measurable difference
in speed, but a synthetic benchmark where allocations cannot be reused shows
reduced memory pressure.  This might help on very large schemas, but it's not
worth pursuing IMO.

> old_9_6_invalidate_hash_indexes is only needed for unsupported
> versions, so that might not be worth consolidating.
> check_for_isn_and_int8_passing_mismatch only loops through all databases
> when float8_pass_by_value in the control data differs, so that might not be
> worth it, either.

Yeah, these two aren't all that interesting to spend cycles on IMO.

> The last 4 are for supported versions and, from a very
> quick glance, seem possible to consolidate.  That would bring us to a total
> of 11 separate loops that we could consolidate into one.  However, the data
> type checks seem to follow a nice pattern, so perhaps this is easier said
> than done.

There is that, refactoring the data type checks leads to removal of duplicated
code and a slight performance improvement.  Refactoring the other checks to
reduce overhead would be an interesting thing to look at, but this point in the
v16 cycle might not be ideal for that.

> IIUC with the patch, pg_upgrade will immediately fail as soon as a single
> check in a database fails.  I believe this differs from the current
> behavior where all matches for a given check in the cluster are logged
> before failing.

Yeah, that's wrong. Fixed.

> I wonder if it'd be better to perform all of the data type
> checks in all databases before failing so that all of the violations are
> reported.  Else, users would have to run pg_upgrade, fix a violation, run
> pg_upgrade again, fix another one, etc.

I think that's better, and have changed the patch to do it that way.

One change this brings is that check.c contains version specific checks in the
struct.  Previously these were mostly contained in version.c (some, like the
9.4 jsonb check was in check.c) which maintained some level of separation.
Splitting the array init is of course one option but it also seems a tad messy.
Not sure what's best, but for now I've documented it in the array comment at
least.

This version also moves the main data types check to check.c, renames some
members in the struct, moves to named initializers (as commented on by Justin
downthread), and adds some more polish here and there.

--
Daniel Gustafsson



Attachment

pgsql-hackers by date:

Previous
From: Jim Jones
Date:
Subject: Re: [PATCH] Add pretty-printed XML output option
Next
From: Dean Rasheed
Date:
Subject: Re: Incorrect command tag row count for MERGE with a cross-partition update