Re: making the backend's json parser work in frontend code - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: making the backend's json parser work in frontend code
Date
Msg-id DF8FAC3A-E557-4AB4-8103-B830276B8637@enterprisedb.com
Whole thread Raw
In response to Re: making the backend's json parser work in frontend code  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: making the backend's json parser work in frontend code  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers

> On Jan 28, 2020, at 8:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Jan 27, 2020 at 3:05 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
>> Since you’ve committed your 0004 and 0005 patches, this v6 patch set is now based on a fresh copy of master.
>
> I think the first question for 0005 is whether want this at all.
> Initially, you proposed NOT committing it, but then Andrew reviewed it
> as if it were for commit. I'm not sure whether he was actually saying
> that it ought to be committed, though, or whether he just missed your
> remarks on the topic. Nobody else has really taken a position. I'm not
> 100% convinced that it's necessary to include this, but I'm also not
> particularly opposed to it. It's a fairly small amount of code, which
> is nice, and perhaps useful as a demonstration of how to use the JSON
> parser in a frontend application, which someone also might find nice.

Once Andrew reviewed it, I started thinking about it as something that might get committed.  In that context, I think
thereshould be a lot more tests in this new src/test/bin directory for other common code, but adding those as part of
thispatch just seems to confuse this patch. 

In addition to adding frontend tests for code already in src/common, the conversation in another thread about adding
frontendversions of elog and ereport seem like candidates for tests in this location.  Sure, you can add an elog into a
realfrontend tool, such as pg_ctl, and update the tests for that program to expect that elog’s output, but what if you
justwant to exhaustively test the elog infrastructure in the frontend spanning multiple locales, encodings, whatever?
You’vealso recently mentioned the possibility of having memory contexts in frontend code.  Testing those seems like a
goodfit, too. 

I decided to leave this in the next version of the patch set, v7.  v6 had three files, the second being something that
alreadygot committed in a different form, so this is now in v7-0002 whereas it had been in v6-0003.  v6-0002 has no
equivalentin v7. 

> Anyone else want to express an opinion?
>
> Meanwhile, here is a round of nitp^H^H^H^Hreview:
>
> -# installcheck and install should not recurse into the subdirectory "modules".
> +# installcheck and install should not recurse into the subdirectory "modules"
> +# nor "bin".
>
> I would probably have just changed this to:
>
> # installcheck and install should not recurse into "modules" or "bin"
>
> The details are arguable, but you definitely shouldn't say "the
> subdirectory" and then list two of them.

I read that as “nor [the subdirectory] bin” with the [the subdirectory] portion elided, and it doesn’t sound anomalous
tome, but your formulation is more compact.  I have used it in v7 of the patch set.  Thanks. 

>
> +This directory contains a set of programs that exercise functionality declared
> +in src/include/common and defined in src/common.  The purpose of these programs
> +is to verify that code intended to work both from frontend and backend code do
> +indeed work when compiled and used in frontend code.  The structure of this
> +directory makes no attempt to test that such code works in the backend, as the
> +backend has its own tests already, and presumably those tests sufficiently
> +exercide the code as used by it.
>
> "exercide" is not spelled correctly, but I also disagree with giving
> the directory so narrow a charter. I think you should just say
> something like:
>
> This directory contains programs that are built and executed for
> testing purposes,
> but never installed. It may be used, for example, to test that code in
> src/common
> works in frontend environments.

Your formulation sounds fine, and I’ve used it in v7.

> +# There doesn't seem to be any easy way to get TestLib to use the binaries from
> +# our directory, so we hack up a path to our binary and run that
> directly.  This
> +# seems brittle enough that some other solution should be found, if possible.
> +
> +my $test_json = join('/', $ENV{TESTDIR}, 'test_json');
>
> I don't know what the right thing to do here is. Perhaps someone more
> familiar with TAP testing can comment.

Yeah, I was hoping that might get a comment from Andrew.  I think if it works as-is on windows, we could just use it
thisway until it causes a problem on some platform or other.  It’s not a runtime issue, being only a build-time test,
andonly then when tap tests are enabled *and* running check-world, so nobody should really be adversely affected.  I’ll
likelyget around to testing this on Windows, but I don’t have any Windows environments set up yet, as that is still on
mytodo list. 

> + set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_test_json"));
>
> Do we need this? I guess we're not likely to bother with translations
> for a test program.

Removed.

> + /*
> + * Make stdout unbuffered to match stderr; and ensure stderr is unbuffered
> + * too, which it should already be everywhere except sometimes in Windows.
> + */
> + setbuf(stdout, NULL);
> + setbuf(stderr, NULL);
>
> Do we need this? If so, why?

For the current test setup, it is not needed.  The tap test executes this program (test_json) once per json string, and
exitsafter printing a single line.  Surely the tap test wouldn’t have problems hanging on an unflushed buffer for a
programthat has exited.  I was imagining this code might grow more complex, with the tap test communicating repeatedly
withthe same instance of test_json, such as if we extend the json parser to iterate over chunks of the input json
string.

I’ve removed this for v7, since we don’t need it yet.

> + char *json;
> + unsigned int json_len;
> + JsonLexContext *lex;
> + int client_encoding;
> + JsonParseErrorType parse_result;
> +
> + json_len = (unsigned int) strlen(str);
> + client_encoding = PQenv2encoding();
> +
> + json = strdup(str);
> + lex = makeJsonLexContextCstringLen(json, strlen(json),
> client_encoding, true /* need_escapes */);
> + parse_result = pg_parse_json(lex, &nullSemAction);
> + fprintf(stdout, _("%s\n"), (JSON_SUCCESS == parse_result ? "VALID" :
> "INVALID"));
> + return;
>
> json_len is set but not used.

You’re right.  I’ve removed it.

> Not entirely sure why we are using PQenv2encoding() here.

This program, which passes possibly json formatted strings into the parser, gets those strings from perl through the
shell. If locale settings on the machine where this runs might break something about that for a real client
application,then our test should break in the same way.  Hard-coding “C” or “POSIX” or whatever for the locale
side-stepspart of the issue we’re trying to test.  No? 

I’m leaving it as is for v7, but if you still disagree, I’ll change it.  Let me know what you want me to change it
*to*,though, as there is no obvious choice that I can see. 

> The trailing return is unnecessary.

Ok, I’ve removed it.

> I think it would be a good idea to use json_errdetail() in the failure
> case, print the error, and have the tests check that we got the
> expected error.

Oh, yeah, I like that idea.  That works, and is included in v7.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Attachment

pgsql-hackers by date:

Previous
From: Juan José Santamaría Flecha
Date:
Subject: Re: Allow to_date() and to_timestamp() to accept localized names
Next
From: Tom Lane
Date:
Subject: Re: Removing pg_pltemplate and creating "trustable" extensions