Thread: [PATCH] Using named captures in Catalog::ParseHeader()

[PATCH] Using named captures in Catalog::ParseHeader()

From
Dagfinn Ilmari Mannsåker
Date:
Hi Hackers,

Peter's patch set for autogenerating syscache info
(https://postgr.es/m/75ae5875-3abc-dafc-8aec-73247ed41cde%40eisentraut.org)
touched on one of my least favourite parts of Catalog.pm: the
parenthesis-counting nightmare that is the parsing of catalog header
directives.

However, now that we require Perl 5.14, we can use the named capture
feature (introduced in Perl 5.10) to make that a lot clearer, as in the
attached patch.

While I was rewriting the regexes I noticed that they were inconsistent
about whether they accepted whitespace in the parameter lists, so I took
the liberty to make them consistently allow whitespace after the opening
paren and the commas, which is what most of them already did.

I've verified that the generated postgres.bki is identical to before,
and all tests pass.

- ilmari

From dd7ed959c93438045f9ff270feab33bb7b965c29 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Wed, 31 May 2023 14:00:58 +0100
Subject: [PATCH] Use named captures in Catalog::ParseHeader()

Now that we require Perl 5.14 we can use named captures and the %+
hash instead of having to count parenthesis groups manually.
---
 src/backend/catalog/Catalog.pm | 95 ++++++++++++++++++++--------------
 1 file changed, 55 insertions(+), 40 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 84aaeb002a..b15f513183 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -91,73 +91,88 @@ sub ParseHeader
         # Push the data into the appropriate data structure.
         # Caution: when adding new recognized OID-defining macros,
         # also update src/include/catalog/renumber_oids.pl.
-        if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/)
-        {
-            push @{ $catalog{toasting} },
-              { parent_table => $1, toast_oid => $2, toast_index_oid => $3 };
-        }
-        elsif (
-            /^DECLARE_TOAST_WITH_MACRO\(\s*(\w+),\s*(\d+),\s*(\d+),\s*(\w+),\s*(\w+)\)/
+        if (/^DECLARE_TOAST\(\s*
+             (?<parent_table>\w+),\s*
+             (?<toast_oid>\d+),\s*
+             (?<toast_index_oid>\d+)\s*
+             \)/x
           )
         {
-            push @{ $catalog{toasting} },
-              {
-                parent_table => $1,
-                toast_oid => $2,
-                toast_index_oid => $3,
-                toast_oid_macro => $4,
-                toast_index_oid_macro => $5
-              };
+            push @{ $catalog{toasting} }, {%+};
         }
         elsif (
-            /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(.+)\)/
+            /^DECLARE_TOAST_WITH_MACRO\(\s*
+             (?<parent_table>\w+),\s*
+             (?<toast_oid>\d+),\s*
+             (?<toast_index_oid>\d+),\s*
+             (?<toast_oid_macro>\w+),\s*
+             (?<toast_index_oid_macro>\w+)\s*
+             \)/x
+          )
+        {
+            push @{ $catalog{toasting} }, {%+};
+        }
+        elsif (
+            /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*
+             (?<index_name>\w+),\s*
+             (?<index_oid>\d+),\s*
+             (?<index_oid_macro>\w+),\s*
+             (?<index_decl>.+)\s*
+             \)/x
           )
         {
             push @{ $catalog{indexing} },
               {
                 is_unique => $1 ? 1 : 0,
                 is_pkey => $2 ? 1 : 0,
-                index_name => $3,
-                index_oid => $4,
-                index_oid_macro => $5,
-                index_decl => $6
-              };
-        }
-        elsif (/^DECLARE_OID_DEFINING_MACRO\(\s*(\w+),\s*(\d+)\)/)
-        {
-            push @{ $catalog{other_oids} },
-              {
-                other_name => $1,
-                other_oid => $2
+                %+,
               };
         }
         elsif (
-            /^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s*\(([^)]+)\),\s*(\w+),\s*\(([^)]+)\)\)/
+            /^DECLARE_OID_DEFINING_MACRO\(\s*
+             (?<other_name>\w+),\s*
+             (?<other_oid>\d+)\s*
+             \)/x
+          )
+        {
+            push @{ $catalog{other_oids} }, {%+};
+        }
+        elsif (
+            /^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s*
+             \((?<fk_cols>[^)]+)\),\s*
+             (?<pk_table>\w+),\s*
+             \((?<pk_cols>[^)]+)\)\s*
+             \)/x
           )
         {
             push @{ $catalog{foreign_keys} },
               {
                 is_array => $1 ? 1 : 0,
                 is_opt => $2 ? 1 : 0,
-                fk_cols => $3,
-                pk_table => $4,
-                pk_cols => $5
+                %+,
               };
         }
-        elsif (/^CATALOG\((\w+),(\d+),(\w+)\)/)
+        elsif (
+            /^CATALOG\(\s*
+             (?<catname>\w+),\s*
+             (?<relation_oid>\d+),\s*
+             (?<relation_oid_macro>\w+)\s*
+             \)/x
+          )
         {
-            $catalog{catname} = $1;
-            $catalog{relation_oid} = $2;
-            $catalog{relation_oid_macro} = $3;
+            @catalog{ keys %+ } = values %+;
 
             $catalog{bootstrap} = /BKI_BOOTSTRAP/ ? ' bootstrap' : '';
             $catalog{shared_relation} =
               /BKI_SHARED_RELATION/ ? ' shared_relation' : '';
-            if (/BKI_ROWTYPE_OID\((\d+),(\w+)\)/)
+            if (/BKI_ROWTYPE_OID\(\s*
+                 (?<rowtype_oid>\d+),\s*
+                 (?<rowtype_oid_macro>\w+)\s*
+                 \)/x
+              )
             {
-                $catalog{rowtype_oid} = $1;
-                $catalog{rowtype_oid_clause} = " rowtype_oid $1";
-                $catalog{rowtype_oid_macro} = $2;
+                @catalog{ keys %+ } = values %+;
+                $catalog{rowtype_oid_clause} = " rowtype_oid $+{rowtype_oid}";
             }
             else
             {
-- 
2.39.2


Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
John Naylor
Date:

On Thu, Jun 1, 2023 at 7:12 PM Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
>
> Hi Hackers,
>
> Peter's patch set for autogenerating syscache info
> (https://postgr.es/m/75ae5875-3abc-dafc-8aec-73247ed41cde%40eisentraut.org)
> touched on one of my least favourite parts of Catalog.pm: the
> parenthesis-counting nightmare that is the parsing of catalog header
> directives.
>
> However, now that we require Perl 5.14, we can use the named capture
> feature (introduced in Perl 5.10) to make that a lot clearer, as in the
> attached patch.
>
> While I was rewriting the regexes I noticed that they were inconsistent
> about whether they accepted whitespace in the parameter lists, so I took
> the liberty to make them consistently allow whitespace after the opening
> paren and the commas, which is what most of them already did.

LGTM

--
John Naylor
EDB: http://www.enterprisedb.com

Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
Dagfinn Ilmari Mannsåker
Date:
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:

> However, now that we require Perl 5.14, we can use the named capture
> feature (introduced in Perl 5.10) to make that a lot clearer, as in the
> attached patch.

Added to the open commitfest: https://commitfest.postgresql.org/43/4361/

- ilmari



Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
Michael Paquier
Date:
On Thu, Jun 01, 2023 at 01:12:22PM +0100, Dagfinn Ilmari Mannsåker wrote:
> While I was rewriting the regexes I noticed that they were inconsistent
> about whether they accepted whitespace in the parameter lists, so I took
> the liberty to make them consistently allow whitespace after the opening
> paren and the commas, which is what most of them already did.

That's the business with \s* in CATALOG.  Is that right?  Indeed,
that's more consistent.

> I've verified that the generated postgres.bki is identical to before,
> and all tests pass.

I find that pretty cool.  Nice.  Patch looks OK from here.
--
Michael

Attachment

Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
Dagfinn Ilmari Mannsåker
Date:
Michael Paquier <michael@paquier.xyz> writes:

> On Thu, Jun 01, 2023 at 01:12:22PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> While I was rewriting the regexes I noticed that they were inconsistent
>> about whether they accepted whitespace in the parameter lists, so I took
>> the liberty to make them consistently allow whitespace after the opening
>> paren and the commas, which is what most of them already did.
>
> That's the business with \s* in CATALOG.  Is that right?  Indeed,
> that's more consistent.

Yes, \s* means "zero or more whitespace characters".

>> I've verified that the generated postgres.bki is identical to before,
>> and all tests pass.
>
> I find that pretty cool.  Nice.  Patch looks OK from here.

Thanks for the review!

- ilmari



Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
Michael Paquier
Date:
On Wed, Jun 14, 2023 at 10:30:23AM +0100, Dagfinn Ilmari Mannsåker wrote:
> Thanks for the review!

v17 is now open, so applied this one.
--
Michael

Attachment

Re: [PATCH] Using named captures in Catalog::ParseHeader()

From
Dagfinn Ilmari Mannsåker
Date:
Michael Paquier <michael@paquier.xyz> writes:

> On Wed, Jun 14, 2023 at 10:30:23AM +0100, Dagfinn Ilmari Mannsåker wrote:
>> Thanks for the review!
>
> v17 is now open, so applied this one.

Thanks for committing!

- ilmari