Thread: [PATCH] Using named captures in Catalog::ParseHeader()
Hi Hackers, Peter's patch set for autogenerating syscache info (https://postgr.es/m/75ae5875-3abc-dafc-8aec-73247ed41cde%40eisentraut.org) touched on one of my least favourite parts of Catalog.pm: the parenthesis-counting nightmare that is the parsing of catalog header directives. However, now that we require Perl 5.14, we can use the named capture feature (introduced in Perl 5.10) to make that a lot clearer, as in the attached patch. While I was rewriting the regexes I noticed that they were inconsistent about whether they accepted whitespace in the parameter lists, so I took the liberty to make them consistently allow whitespace after the opening paren and the commas, which is what most of them already did. I've verified that the generated postgres.bki is identical to before, and all tests pass. - ilmari From dd7ed959c93438045f9ff270feab33bb7b965c29 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org> Date: Wed, 31 May 2023 14:00:58 +0100 Subject: [PATCH] Use named captures in Catalog::ParseHeader() Now that we require Perl 5.14 we can use named captures and the %+ hash instead of having to count parenthesis groups manually. --- src/backend/catalog/Catalog.pm | 95 ++++++++++++++++++++-------------- 1 file changed, 55 insertions(+), 40 deletions(-) diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 84aaeb002a..b15f513183 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -91,73 +91,88 @@ sub ParseHeader # Push the data into the appropriate data structure. # Caution: when adding new recognized OID-defining macros, # also update src/include/catalog/renumber_oids.pl. - if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/) - { - push @{ $catalog{toasting} }, - { parent_table => $1, toast_oid => $2, toast_index_oid => $3 }; - } - elsif ( - /^DECLARE_TOAST_WITH_MACRO\(\s*(\w+),\s*(\d+),\s*(\d+),\s*(\w+),\s*(\w+)\)/ + if (/^DECLARE_TOAST\(\s* + (?<parent_table>\w+),\s* + (?<toast_oid>\d+),\s* + (?<toast_index_oid>\d+)\s* + \)/x ) { - push @{ $catalog{toasting} }, - { - parent_table => $1, - toast_oid => $2, - toast_index_oid => $3, - toast_oid_macro => $4, - toast_index_oid_macro => $5 - }; + push @{ $catalog{toasting} }, {%+}; } elsif ( - /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(.+)\)/ + /^DECLARE_TOAST_WITH_MACRO\(\s* + (?<parent_table>\w+),\s* + (?<toast_oid>\d+),\s* + (?<toast_index_oid>\d+),\s* + (?<toast_oid_macro>\w+),\s* + (?<toast_index_oid_macro>\w+)\s* + \)/x + ) + { + push @{ $catalog{toasting} }, {%+}; + } + elsif ( + /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s* + (?<index_name>\w+),\s* + (?<index_oid>\d+),\s* + (?<index_oid_macro>\w+),\s* + (?<index_decl>.+)\s* + \)/x ) { push @{ $catalog{indexing} }, { is_unique => $1 ? 1 : 0, is_pkey => $2 ? 1 : 0, - index_name => $3, - index_oid => $4, - index_oid_macro => $5, - index_decl => $6 - }; - } - elsif (/^DECLARE_OID_DEFINING_MACRO\(\s*(\w+),\s*(\d+)\)/) - { - push @{ $catalog{other_oids} }, - { - other_name => $1, - other_oid => $2 + %+, }; } elsif ( - /^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s*\(([^)]+)\),\s*(\w+),\s*\(([^)]+)\)\)/ + /^DECLARE_OID_DEFINING_MACRO\(\s* + (?<other_name>\w+),\s* + (?<other_oid>\d+)\s* + \)/x + ) + { + push @{ $catalog{other_oids} }, {%+}; + } + elsif ( + /^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s* + \((?<fk_cols>[^)]+)\),\s* + (?<pk_table>\w+),\s* + \((?<pk_cols>[^)]+)\)\s* + \)/x ) { push @{ $catalog{foreign_keys} }, { is_array => $1 ? 1 : 0, is_opt => $2 ? 1 : 0, - fk_cols => $3, - pk_table => $4, - pk_cols => $5 + %+, }; } - elsif (/^CATALOG\((\w+),(\d+),(\w+)\)/) + elsif ( + /^CATALOG\(\s* + (?<catname>\w+),\s* + (?<relation_oid>\d+),\s* + (?<relation_oid_macro>\w+)\s* + \)/x + ) { - $catalog{catname} = $1; - $catalog{relation_oid} = $2; - $catalog{relation_oid_macro} = $3; + @catalog{ keys %+ } = values %+; $catalog{bootstrap} = /BKI_BOOTSTRAP/ ? ' bootstrap' : ''; $catalog{shared_relation} = /BKI_SHARED_RELATION/ ? ' shared_relation' : ''; - if (/BKI_ROWTYPE_OID\((\d+),(\w+)\)/) + if (/BKI_ROWTYPE_OID\(\s* + (?<rowtype_oid>\d+),\s* + (?<rowtype_oid_macro>\w+)\s* + \)/x + ) { - $catalog{rowtype_oid} = $1; - $catalog{rowtype_oid_clause} = " rowtype_oid $1"; - $catalog{rowtype_oid_macro} = $2; + @catalog{ keys %+ } = values %+; + $catalog{rowtype_oid_clause} = " rowtype_oid $+{rowtype_oid}"; } else { -- 2.39.2
On Thu, Jun 1, 2023 at 7:12 PM Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:
>
> Hi Hackers,
>
> Peter's patch set for autogenerating syscache info
> (https://postgr.es/m/75ae5875-3abc-dafc-8aec-73247ed41cde%40eisentraut.org)
> touched on one of my least favourite parts of Catalog.pm: the
> parenthesis-counting nightmare that is the parsing of catalog header
> directives.
>
> However, now that we require Perl 5.14, we can use the named capture
> feature (introduced in Perl 5.10) to make that a lot clearer, as in the
> attached patch.
>
> While I was rewriting the regexes I noticed that they were inconsistent
> about whether they accepted whitespace in the parameter lists, so I took
> the liberty to make them consistently allow whitespace after the opening
> paren and the commas, which is what most of them already did.
LGTM
--
John Naylor
EDB: http://www.enterprisedb.com
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes: > However, now that we require Perl 5.14, we can use the named capture > feature (introduced in Perl 5.10) to make that a lot clearer, as in the > attached patch. Added to the open commitfest: https://commitfest.postgresql.org/43/4361/ - ilmari
On Thu, Jun 01, 2023 at 01:12:22PM +0100, Dagfinn Ilmari Mannsåker wrote: > While I was rewriting the regexes I noticed that they were inconsistent > about whether they accepted whitespace in the parameter lists, so I took > the liberty to make them consistently allow whitespace after the opening > paren and the commas, which is what most of them already did. That's the business with \s* in CATALOG. Is that right? Indeed, that's more consistent. > I've verified that the generated postgres.bki is identical to before, > and all tests pass. I find that pretty cool. Nice. Patch looks OK from here. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Thu, Jun 01, 2023 at 01:12:22PM +0100, Dagfinn Ilmari Mannsåker wrote: >> While I was rewriting the regexes I noticed that they were inconsistent >> about whether they accepted whitespace in the parameter lists, so I took >> the liberty to make them consistently allow whitespace after the opening >> paren and the commas, which is what most of them already did. > > That's the business with \s* in CATALOG. Is that right? Indeed, > that's more consistent. Yes, \s* means "zero or more whitespace characters". >> I've verified that the generated postgres.bki is identical to before, >> and all tests pass. > > I find that pretty cool. Nice. Patch looks OK from here. Thanks for the review! - ilmari
On Wed, Jun 14, 2023 at 10:30:23AM +0100, Dagfinn Ilmari Mannsåker wrote: > Thanks for the review! v17 is now open, so applied this one. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Wed, Jun 14, 2023 at 10:30:23AM +0100, Dagfinn Ilmari Mannsåker wrote: >> Thanks for the review! > > v17 is now open, so applied this one. Thanks for committing! - ilmari