Re: Reducing connection overhead in pg_upgrade compat check phase - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Reducing connection overhead in pg_upgrade compat check phase
Date
Msg-id 20230708214354.GA4044380@nathanxps13
Whole thread Raw
In response to Re: Reducing connection overhead in pg_upgrade compat check phase  (Daniel Gustafsson <daniel@yesql.se>)
Responses Re: Reducing connection overhead in pg_upgrade compat check phase
List pgsql-hackers
Thanks for the new patch.

On Thu, Jul 06, 2023 at 05:58:33PM +0200, Daniel Gustafsson wrote:
>> On 4 Jul 2023, at 21:08, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Also, I think it would be worth breaking check_for_data_types_usage() into
>> a few separate functions (or doing some other similar refactoring) to
>> improve readability.  At this point, the function is quite lengthy, and I
>> count 6 levels of indentation at some lines.
> 
> 
> It it is pretty big for sure, but it's also IMHO not terribly complicated as
> it's not really performing any hard to follow logic.
> 
> I have no issues refactoring it, but trying my hand at I was only making (what
> I consider) less readable code by having to jump around so I consider it a
> failure.  If you have any suggestions, I would be more than happy to review and
> incorporate those though.

I don't have a strong opinion about this.

+                for (int rowno = 0; rowno < ntups; rowno++)
+                {
+                    if (script == NULL && (script = fopen_priv(output_path, "a")) == NULL)
+                        pg_fatal("could not open file \"%s\": %s",
+                                 output_path,
+                                 strerror(errno));
+                    if (!db_used)
+                    {
+                        fprintf(script, "In database: %s\n", active_db->db_name);
+                        db_used = true;
+                    }

Since "script" will be NULL and "db_used" will be false in the first
iteration of the loop, couldn't we move this stuff to before the loop?

+                    fprintf(script, " %s.%s.%s\n",
+                            PQgetvalue(res, rowno, i_nspname),
+                            PQgetvalue(res, rowno, i_relname),
+                            PQgetvalue(res, rowno, i_attname));

nitpick: І think the current code has two spaces at the beginning of this
format string.  Did you mean to remove one of them?

+                if (script)
+                {
+                    fclose(script);
+                    script = NULL;
+                }

Won't "script" always be initialized here?  If I'm following this code
correctly, I think everything except the fclose() can be removed.

+            cur_check++;

I think this is unnecessary since we assign "cur_check" at the beginning of
every loop iteration.  I see two of these.

+static int    n_data_types_usage_checks = 7;

Can we determine this programmatically so that folks don't need to remember
to update it?

+    /* Prepare an array to store the results of checks in */
+    results = pg_malloc(sizeof(bool) * n_data_types_usage_checks);
+    memset(results, true, sizeof(*results));

IMHO it's a little strange that this is initialized to all "true", only
because I think most other Postgres code does the opposite.

+bool
+check_for_aclitem_data_type_usage(ClusterInfo *cluster)

Do you think we should rename these functions to something like
"should_check_for_*"?  They don't actually do the check, they just tell you
whether you should based on the version.  In fact, I wonder if we could
just add the versions directly to data_types_usage_checks so that we don't
need the separate hook functions.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Gurjeet Singh
Date:
Subject: Re: DecodeInterval fixes
Next
From: Heikki Linnakangas
Date:
Subject: Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)