Thread: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
[PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
From
David Christensen
Date:
Fixes a build issue I ran into while adding some columns to system tables:

Throws a build error if we encounter a different number of fields in a
DATA() line than we expect for the catalog in question.

Previously, it was possible to silently ignore any mismatches at build
time which could result in symbol undefined errors at link time. Now
we stop and identify the infringing line as soon as we encounter it,
which greatly speeds up the debugging process.

--
David Christensen
PostgreSQL Team Manager
End Point Corporation
david@endpoint.com
785-727-1171
Attachment
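For readers without the attachment at hand, the kind of check described in this message can be sketched roughly as follows. This is an illustrative Python stand-in, not the Perl from the patch: the catalog name `pg_example`, the sample header text, the regular expressions, and the `check_data_lines` helper are all assumptions made for the example, and the real DATA() syntax may have cases this tokenizer does not cover.

```python
import re

# Hypothetical header excerpt; pg_example is an invented catalog, not a real one.
HEADER = '''\
#define Natts_pg_example 3
DATA(insert OID = 1 ( alpha 10 "first row" ));
DATA(insert OID = 2 ( beta 20 ));
DATA(insert OID = 3 ( gamma 30 "third row" ));
'''

NATTS_RE = re.compile(r'^#define\s+Natts_(\w+)\s+(\d+)')
DATA_RE = re.compile(
    r'^DATA\(insert(?:\s+OID\s*=\s*\d+)?\s+\(\s*(.*?)\s*\)\s*\);\s*$')
# A field is either a double-quoted string (which may contain spaces)
# or a run of non-whitespace characters.
FIELD_RE = re.compile(r'"(?:[^"\\]|\\.)*"|\S+')


class CatalogError(Exception):
    """Raised as soon as a DATA() line disagrees with the declared Natts."""


def check_data_lines(text, filename="pg_example.h"):
    """Return the parsed rows, failing on the first mismatched DATA() line."""
    natts = None
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = NATTS_RE.match(line)
        if m:
            natts = int(m.group(2))
            continue
        m = DATA_RE.match(line)
        if not m:
            continue
        if natts is None:
            raise CatalogError(
                f"{filename}:{lineno}: could not find Natts definition "
                "before start of DATA")
        fields = FIELD_RE.findall(m.group(1))
        if len(fields) != natts:
            raise CatalogError(
                f"{filename}:{lineno}: expected {natts} fields in DATA() "
                f"line, found {len(fields)}")
        rows.append(fields)
    return rows
```

With the sample text above, `check_data_lines(HEADER)` raises `CatalogError` naming line 3, the row that is one field short, rather than letting the mismatch surface later at link time.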
Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
From
Robert Haas
Date:
On Tue, Oct 6, 2015 at 9:15 AM, David Christensen <david@endpoint.com> wrote:
> Fixes a build issue I ran into while adding some columns to system tables:
>
> Throws a build error if we encounter a different number of fields in a
> DATA() line than we expect for the catalog in question.
>
> Previously, it was possible to silently ignore any mismatches at build
> time which could result in symbol undefined errors at link time. Now
> we stop and identify the infringing line as soon as we encounter it,
> which greatly speeds up the debugging process.

I think this is a GREAT idea, but this line made me laugh[1]:

+ warn "No Natts defined yet, silently skipping check...\n";

I suggest that we make that a fatal error. Like "Could not find
definition Natts_pg_proc before start of DATA".

Secondly, I don't think we should check this at this point in the
code, because then it blindly affects everybody who uses Catalog.pm.
Let's pick one specific place to do this check. I suggest genbki.pl,
inside the loop with this comment: "# Ordinary catalog with DATA
line(s)"

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] Because if you're producing a warning, it's not silent!
Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
From
David Christensen
Date:
> On Oct 8, 2015, at 11:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Oct 6, 2015 at 9:15 AM, David Christensen <david@endpoint.com> wrote:
>> Fixes a build issue I ran into while adding some columns to system tables:
>>
>> Throws a build error if we encounter a different number of fields in a
>> DATA() line than we expect for the catalog in question.
>>
>> Previously, it was possible to silently ignore any mismatches at build
>> time which could result in symbol undefined errors at link time. Now
>> we stop and identify the infringing line as soon as we encounter it,
>> which greatly speeds up the debugging process.
>
> I think this is a GREAT idea, but this line made me laugh[1]:
>
> + warn "No Natts defined yet, silently skipping check...\n";
>
> I suggest that we make that a fatal error. Like "Could not find
> definition Natts_pg_proc before start of DATA".

That’s fine with me; my main consideration was to make sure nothing broke in the status quo, including dependencies I was unaware of.

> Secondly, I don't think we should check this at this point in the
> code, because then it blindly affects everybody who uses Catalog.pm.
> Let's pick one specific place to do this check. I suggest genbki.pl,
> inside the loop with this comment: "# Ordinary catalog with DATA
> line(s)"

I’m happy to move it around, but if everything is in order, how will this affect things at all? If we’re in a good state this condition should never trigger.

--
David Christensen
PostgreSQL Team Manager
End Point Corporation
david@endpoint.com
785-727-1171
Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
From
Robert Haas
Date:
On Thu, Oct 8, 2015 at 12:43 PM, David Christensen <david@endpoint.com> wrote:
> I’m happy to move it around, but if everything is in order, how will this affect things at all? If we’re in a good state this condition should never trigger.

Right, but I think it ought to be Catalog.pm's job to parse the config
file. The job of complaining about what it contains is a job worth
doing, but it's not the same job. Personally, I hate it when parsers
take it upon themselves to do semantic analysis, because then what
happens if you want to write, say, a tool to repair a broken file?
You need to be able to read it in without erroring out so that you can
frob whatever's busted and write it back out, and the parser is
getting in your way. Maybe that's not going to come up here, but I'm
just explaining my general philosophy on these things...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
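The split between parsing and complaining described in this message might look something like the sketch below. It is written in Python purely for illustration, not in the Perl of Catalog.pm or genbki.pl, and every name in it (`Row`, `Catalog`, `parse_catalog`, `validate_catalog`, the `pg_example` catalog) is hypothetical. The parser records each row together with its line number and never rejects the input; a separate checker turns mismatches into messages.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

NATTS_RE = re.compile(r'^#define\s+Natts_\w+\s+(\d+)')
DATA_RE = re.compile(
    r'^DATA\(insert(?:\s+OID\s*=\s*\d+)?\s+\(\s*(.*?)\s*\)\s*\);\s*$')
# Double-quoted strings count as one field even if they contain spaces.
FIELD_RE = re.compile(r'"(?:[^"\\]|\\.)*"|\S+')


@dataclass
class Row:
    lineno: int          # line in the header file, kept for later diagnostics
    fields: List[str]


@dataclass
class Catalog:
    natts: Optional[int]  # None if no Natts definition was seen
    rows: List[Row]


def parse_catalog(text):
    """Parser: records what the file contains and never rejects it."""
    natts = None
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = NATTS_RE.match(line)
        if m:
            natts = int(m.group(1))
            continue
        m = DATA_RE.match(line)
        if m:
            rows.append(Row(lineno, FIELD_RE.findall(m.group(1))))
    return Catalog(natts, rows)


def validate_catalog(catalog, filename):
    """Checker: returns a list of problems; the caller decides what is fatal."""
    if catalog.natts is None:
        if catalog.rows:
            return [f"{filename}: no Natts definition found"]
        return []
    return [
        f"{filename}:{row.lineno}: expected {catalog.natts} fields, "
        f"found {len(row.fields)}"
        for row in catalog.rows
        if len(row.fields) != catalog.natts
    ]
```

Under this arrangement a build script could treat any non-empty result from `validate_catalog` as fatal, while a hypothetical repair tool could call only `parse_catalog` and still read a broken file.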
Re: [PATCH] Teach Catalog.pm how many attributes there should be per DATA() line
From
David Christensen
Date:
> On Oct 9, 2015, at 2:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Oct 8, 2015 at 12:43 PM, David Christensen <david@endpoint.com> wrote:
>> I’m happy to move it around, but if everything is in order, how will this affect things at all? If we’re in a good state this condition should never trigger.
>
> Right, but I think it ought to be Catalog.pm's job to parse the config
> file. The job of complaining about what it contains is a job worth
> doing, but it's not the same job. Personally, I hate it when parsers
> take it upon themselves to do semantic analysis, because then what
> happens if you want to write, say, a tool to repair a broken file?
> You need to be able to read it in without erroring out so that you can
> frob whatever's busted and write it back out, and the parser is
> getting in your way. Maybe that's not going to come up here, but I'm
> just explaining my general philosophy on these things...

Not disagreeing with you in general, but this is a very specific use case, and I think we lose the niceness of being able to tie back to the specific line number for the file in question. The alternative is to track that information as well in a separate structure which we then pass around, which seems like overkill.

The only two consumers of the catalog-specific data lines (at least via direct access in Perl) are genbki.pl and Gen_fmgrtab.pl. We would need to add these checks anyway in both call sites, so to me it seems important to bail early if we see any issues. I think raising the failure as soon as we notice it, with as much context as possible to fix it (i.e., as written), is the right choice. We can certainly pretty up the messages.

The consistency of the system catalogs in the development state is fundamental to whether there is any information that is sensible to query. By definition, if we are missing columns in the data rows this is a mistake, and whatever data is parsed here will be worse than useless (as who knows the order of the missing column, data can/will end up being misaligned). Thus I don’t believe that we’d want other (hypothetical) Catalog.pm consumers to try to use data that we know is bad.

--
David Christensen
PostgreSQL Team Manager
End Point Corporation
david@endpoint.com
785-727-1171