Re: Portal->commandTag as an enum - Mailing list pgsql-hackers
From: John Naylor
Subject: Re: Portal->commandTag as an enum
Msg-id: CACPNZCs1McaYxMrBXC39rMBfFerQYyaDFHL1-H=8JFUnRE7Rhw@mail.gmail.com
In response to: Re: Portal->commandTag as an enum (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses: Re: Portal->commandTag as an enum
List: pgsql-hackers
Hi Mark,

On Wed, Feb 19, 2020 at 10:40 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:

> This would only make sense to me if the string held in $_ had already been checked for safety, but Catalog.pm does very little to verify that the string is safe to eval. The assumption here, so far as I can infer, is that we don't embed anything dangerous in our .dat files, so this should be ok. That may be true for the moment, but I can imagine a day when we start embedding perl functions as quoted text inside a data file, or shell commands as text which look enough like perl for eval() to be able to execute them. Developers who edit these .dat files and mess up the quoting, and then rerun 'make' to get the new .c and .h files generated, may not like the side effects. Perhaps I'm being overly paranoid....

The use case for that seems slim. However, at a brief glance your module seems more robust in other ways.

> Rather than add more code generation logic based on the design of Catalog.pm, I wrote a perl based data file parser that parses .dat files and returns vivified perl data, as Catalog.pm does, but with much stricter parsing logic to make certain nothing dangerous gets eval()ed. I put the new module in DataFile.pm.
> [...]
> The new parser is more flexible about the structure of the data, which seems good to me for making it easier to add or modify data files in the future. The new parser does not yet have a means of hacking up the data to add autogenerated fields and such that Catalog.pm does, but I think a cleaner break between parsing and autovivifying fields would be good anyway.

Separation of concerns sounds like a good idea, but I've not fully thought it through. For one advantage, I think it might be nicer to have indexing.dat and toasting.dat instead of having to dig the info out of C macros. This was evident while recently experimenting with generating catcache control data.
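To illustrate the difference between eval()ing file contents and strict parsing, here is a minimal sketch in C (not from the patch; the names and the `key => 'value'` line format are assumptions for illustration). A whitelist tokenizer like this can only ever produce inert strings, whereas handing raw file text to an evaluator can execute whatever is embedded in it:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Accept only lines of the form:  identifier => 'literal',
 * and reject everything else.  Nothing here is interpreted as code,
 * so a mangled or malicious .dat line fails the parse instead of
 * being executed.  Purely illustrative, not the DataFile.pm logic.
 */
static bool
parse_dat_line(const char *line, char *key, size_t keysz,
			   char *val, size_t valsz)
{
	const char *p = line;
	size_t		n = 0;

	while (isspace((unsigned char) *p))
		p++;

	/* key: [A-Za-z_][A-Za-z0-9_]* */
	if (!isalpha((unsigned char) *p) && *p != '_')
		return false;
	while (isalnum((unsigned char) *p) || *p == '_')
	{
		if (n + 1 >= keysz)
			return false;
		key[n++] = *p++;
	}
	key[n] = '\0';

	while (isspace((unsigned char) *p))
		p++;
	if (strncmp(p, "=>", 2) != 0)
		return false;
	p += 2;
	while (isspace((unsigned char) *p))
		p++;

	/* value: single-quoted literal only */
	if (*p++ != '\'')
		return false;
	n = 0;
	while (*p && *p != '\'')
	{
		if (n + 1 >= valsz)
			return false;
		val[n++] = *p++;
	}
	if (*p++ != '\'')
		return false;
	val[n] = '\0';

	while (isspace((unsigned char) *p))
		p++;
	if (*p == ',')
	{
		p++;
		while (isspace((unsigned char) *p))
			p++;
	}
	return *p == '\0';
}
```

The point is only the shape of the guarantee: anything outside the whitelisted grammar is a parse error, never a side effect.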
As for the patch, I have not done a full review, but I have some comments based on a light read-through:

utils/Makefile:

+# location of commandtag.dat
+headerdir = $(top_srcdir)/src/include/utils

This variable name is too generic for what the comment says it is. A better abstraction, if we want one, would be the full path to the commandtag input file. The other script invocations in this Makefile don't do it this way, but that's a separate patch.

+# location to write generated headers
+sourcedir = $(top_srcdir)/src/backend/utils

Calling the output the source is bound to confuse people. The comment implies all generated headers, not just the ones introduced by the patch. I would just output to the current directory (i.e. have an --output option that defaults to an empty string). Also, if we want to output somewhere else, I would imagine it'd be under the top builddir, not srcdir.

+$(PERL) -I $(top_srcdir)/src/include/utils $< --headerdir=$(headerdir) --sourcedir=$(sourcedir) --inputfile=$(headerdir)/commandtag.dat

1. headerdir is entirely unused by the script.
2. We can default to the working directory for the output, as mentioned above.
3. -I $(top_srcdir)/src/include/utils is supposed to point to the dir containing DataFile.pm, but since gencommandtag.pl has "use lib ..." it's probably not needed here. I don't recall why we keep the "-I" elsewhere. (Ditto in Solution.pm.)

I'm thinking it would look something like this:

+$(PERL) $< --inputfile=$(top_srcdir)/src/include/utils/commandtag.dat

--
utils/misc/Makefile

+all: distprep
+
 # Note: guc-file.c is not deleted by 'make clean',
 # since we want to ship it in distribution tarballs.
 clean:
 	@rm -f lex.yy.c
+
+maintainer-clean: clean

Seems non-functional.

--
DataFiles.pm

I haven't studied this in detail, but I'll suggest that if this is meant to have wider application, maybe it should live in src/tools? I'm not familiar with using different IO routines depending on the OS -- what's the benefit of that?
--
gencommandtag.pl

slurp_without_comments() is unused.

sanity_check_data() seems longer than the main body of the script (minus header boilerplate), and I wonder if we can pare it down some. For one, I have difficulty imagining anyone would accidentally type an unprintable or non-ascii character in a command tag and somehow not realize it. For another, duplicating checks that were done earlier seems like a maintenance headache.

dataerror() is defined near the top, but other functions are defined at the bottom.

+# Generate all output internally before outputting anything, to avoid
+# partially overwriting generated files under error conditions

My personal preference is that having this as a design requirement sacrifices readability for unclear gain, especially since a "chunk" also includes things like header boilerplate. That said, the script is also short enough that it doesn't make a huge difference either way.

Speaking of boilerplate, it's better for readability to separate it from actual code such as:

typedef enum CommandTag
{
#define FIRST_COMMANDTAG COMMANDTAG_$sorted[0]->{taglabel})

--
tcop/dest.c

+ * We no longer display LastOid, but to preserve the wire protocol,
+ * we write InvalidOid where the LastOid used to be written.  For
+ * efficiency in the snprintf(), hard-code InvalidOid as zero.

Hmm, is hard-coding zero going to make any difference here?

--
John Naylor
https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
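For readers following along, here is a sketch of the general shape such a generated header and the dest.c change could take. The tag list, the marker macros beyond FIRST_COMMANDTAG, and the helper names are illustrative assumptions, not the patch's actual output:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical excerpt of a generated commandtag.h: one enum member per
 * entry in commandtag.dat, plus first/last markers for iteration and
 * array sizing.  This is a tiny illustrative subset of tags.
 */
typedef enum CommandTag
{
	COMMANDTAG_UNKNOWN,
	COMMANDTAG_DELETE,
	COMMANDTAG_INSERT,
	COMMANDTAG_SELECT,
	COMMANDTAG_UPDATE
} CommandTag;

#define FIRST_COMMANDTAG COMMANDTAG_UNKNOWN
#define LAST_COMMANDTAG  COMMANDTAG_UPDATE

/* Generated companion table mapping enum values to display strings. */
static const char *const tag_names[] = {
	"UNKNOWN", "DELETE", "INSERT", "SELECT", "UPDATE"
};

static const char *
GetCommandTagName(CommandTag tag)		/* name is an assumption */
{
	return tag_names[tag];
}

/*
 * The tcop/dest.c point: the INSERT completion tag on the wire is
 * "INSERT <oid> <rows>".  With LastOid gone, a literal 0 is written
 * where the oid used to go (hard-coded rather than formatted from
 * InvalidOid).  Sketch only; buffer handling simplified.
 */
static void
build_insert_tag(char *buf, size_t bufsz, unsigned long rows)
{
	snprintf(buf, bufsz, "INSERT 0 %lu", rows);
}
```

The marker macros let callers size lookup arrays as LAST_COMMANDTAG + 1 and loop over all tags without knowing the list's contents.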