Home > mailing lists

Re: Reducing connection overhead in pg_upgrade compat check phase - Mailing list pgsql-hackers

From	Daniel Gustafsson
Subject	Re: Reducing connection overhead in pg_upgrade compat check phase
Date	July 10, 2023 14:43:23
Msg-id	A31FC972-FCBE-47CE-85F0-182B58C29422@yesql.se Whole thread Raw
In response to	Re: Reducing connection overhead in pg_upgrade compat check phase (Nathan Bossart <nathandbossart@gmail.com>)
Responses	Re: Reducing connection overhead in pg_upgrade compat check phase
List	pgsql-hackers

Tree view

> On 8 Jul 2023, at 23:43, Nathan Bossart <nathandbossart@gmail.com> wrote:

Thanks for reviewing!

> Since "script" will be NULL and "db_used" will be false in the first
> iteration of the loop, couldn't we move this stuff to before the loop?

We could.  It's done this way to match how all the other checks are performing
the inner loop for consistency.  I think being consistent is better than micro
optimizing in non-hot codepaths even though it adds some redundancy.

> nitpick: І think the current code has two spaces at the beginning of this
> format string.  Did you mean to remove one of them?

Nice catch, I did not. Fixed.

> Won't "script" always be initialized here?  If I'm following this code
> correctly, I think everything except the fclose() can be removed.

You are right that this check is superfluous.  This is again an artifact of
modelling the code around how the other checks work for consistency.  At least
I think that's a good characteristic of the code.

> I think this is unnecessary since we assign "cur_check" at the beginning of
> every loop iteration.  I see two of these.

Right, this is a pointless leftover from a previous version which used a
while() loop, and I had missed removing them.  Fixed.

> +static int    n_data_types_usage_checks = 7;
>
> Can we determine this programmatically so that folks don't need to remember
> to update it?

Fair point, I've added a counter loop to the beginning of the check function to
calculate it.

> IMHO it's a little strange that this is initialized to all "true", only
> because I think most other Postgres code does the opposite.

Agreed, but it made for a less contrived codepath in knowing when an error has
been seen already, to avoid duplicate error output, so I think it's worth it.

> Do you think we should rename these functions to something like
> "should_check_for_*"?  They don't actually do the check, they just tell you
> whether you should based on the version.

I've been pondering that too, and did a rename now along with moving them all
to a single place as well as changing the comments to make it clearer.

> In fact, I wonder if we could just add the versions directly to
> data_types_usage_checks so that we don't need the separate hook functions.

We could, but it would be sort of contrived I think since some check <= and
some == while some check the catversion as well (and new ones may have other
variants.  I think this is the least paint-ourselves-in-a-corner version, if we
feel it's needlessly complicated and no other variants are added we can revisit
this.

--
Daniel Gustafsson

Attachment

v6-0001-pg_upgrade-run-all-data-type-checks-per-connectio.patch

pgsql-hackers by date:

From: Melih Mutlu
Date: 10 July 2023, 14:41:11
Subject: Re: Changing types of block and chunk sizes in memory contexts

From: Alvaro Herrera
Date: 10 July 2023, 14:47:12
Subject: Re: remaining sql/json patches

Re: Reducing connection overhead in pg_upgrade compat check phase - Mailing list pgsql-hackers

Attachment

Previous

Next