Thread: Force testing of query jumbling code in TAP tests

Force testing of query jumbling code in TAP tests

From
Michael Paquier
Date:
Hi all,

As mentioned two times on this thread, there is not much coverage for
the query jumbling code, even if it is in core:
https://www.postgresql.org/message-id/Y5BHOUhX3zTH/ig6@paquier.xyz

Ths issue is that we have the options to enable it, but only
pg_stat_statements is able to enable and stress it.  This causes
coverage to be missed for all query patterns that are not covered
directly by pg_stat_statements, like XML expressions, various DML
patterns, etc.  More aggressive testing would also ensure that no
nodes are marked as no_query_jumble while they should be included in a
computation.

Attached is a patch to improve that.  The main regression database is
able to cover everything, basically, so I'd like to propose the
addition of some extra configuration in 027_stream_regress.pl to
enable pg_stat_statements.  This could be added in the pg_upgrade
tests, but that felt a bit less adapted here.  Or can people think
about cases where checking pg_stat_statements makes more sense after
an upgrade or on a standby?  One thing that makes sense for a standby
is to check that the contents of pg_stat_statements are empty?

With this addition, the query jumbling gets covered at 95%~, while
https://coverage.postgresql.org/src/backend/nodes/index.html reports
currently 35%.

Thoughts or comments?
--
Michael

Attachment

Re: Force testing of query jumbling code in TAP tests

From
Andres Freund
Date:
Hi,

On 2023-02-13 14:00:36 +0900, Michael Paquier wrote:
> With this addition, the query jumbling gets covered at 95%~, while
> https://coverage.postgresql.org/src/backend/nodes/index.html reports
> currently 35%.
> 
> Thoughts or comments?

Shouldn't there at least be some basic verification of pg_stat_statements
output being sane after running the test? Even if that's perhaps just actually
printing the statements.

Greetings,

Andres Freund



Re: Force testing of query jumbling code in TAP tests

From
Michael Paquier
Date:
On Mon, Feb 13, 2023 at 09:45:12AM -0800, Andres Freund wrote:
> Shouldn't there at least be some basic verification of pg_stat_statements
> output being sane after running the test? Even if that's perhaps just actually
> printing the statements.

There is a total of 20k entries in pg_stat_statements if the max is
high enough to store everything.  Only dumping the first 100
characters of each query generates at least 1MB worth of logs, which
would bloat a lot of the buildfarm in each run.  So I would not do
that.  One thing may be perhaps to show a count of the queries in five
categories: select, insert, delete, update and the rest?
--
Michael

Attachment

Re: Force testing of query jumbling code in TAP tests

From
Andres Freund
Date:
Hi,

On 2023-02-14 16:04:16 +0900, Michael Paquier wrote:
> On Mon, Feb 13, 2023 at 09:45:12AM -0800, Andres Freund wrote:
> > Shouldn't there at least be some basic verification of pg_stat_statements
> > output being sane after running the test? Even if that's perhaps just actually
> > printing the statements.
> 
> There is a total of 20k entries in pg_stat_statements if the max is
> high enough to store everything.  Only dumping the first 100
> characters of each query generates at least 1MB worth of logs, which
> would bloat a lot of the buildfarm in each run.  So I would not do
> that.  One thing may be perhaps to show a count of the queries in five
> categories: select, insert, delete, update and the rest?

I didn't mean printing in the sense of outputting the statements to the tap
log. Maybe creating a temp table or such for all the queries. And yes, then
doing some top-level analysis on it like you describe sounds like a good idea.

Greetings,

Andres Freund



Re: Force testing of query jumbling code in TAP tests

From
Michael Paquier
Date:
On Tue, Feb 14, 2023 at 10:11:21AM -0800, Andres Freund wrote:
> I didn't mean printing in the sense of outputting the statements to the tap
> log. Maybe creating a temp table or such for all the queries. And yes, then
> doing some top-level analysis on it like you describe sounds like a good idea.

One idea would be something like that, that makes sure that reports
are generated for the most common query patterns:
WITH select_stats AS
 (SELECT upper(substr(query, 1, 6)) AS select_query
    FROM pg_stat_statements
    WHERE upper(substr(query, 1, 6)) IN ('SELECT', 'UPDATE',
                                         'INSERT', 'DELETE',
                                         'CREATE'))
 SELECT select_query, count(select_query) > 1 AS some_rows
   FROM select_stats
   GROUP BY select_query ORDER BY select_query;

Other ideas are welcome.  At least this would be a start.
--
Michael

Attachment

Re: Force testing of query jumbling code in TAP tests

From
Michael Paquier
Date:
On Thu, Feb 16, 2023 at 02:08:42PM +0900, Michael Paquier wrote:
> Other ideas are welcome.  At least this would be a start.

The main idea of the patch is here:

> +# Check some data from pg_stat_statements.
> +$node_primary->safe_psql('postgres', 'CREATE EXTENSION pg_stat_statements');
> +# This gathers data based on the first characters for some common query types,
> +# providing coverage for SELECT, DMLs, and some DDLs.
> +my $result = $node_primary->safe_psql(
> +    'postgres',
> +    qq{WITH select_stats AS
> +  (SELECT upper(substr(query, 1, 6)) AS select_query
> +     FROM pg_stat_statements
> +     WHERE upper(substr(query, 1, 6)) IN ('SELECT', 'UPDATE',
> +                                          'INSERT', 'DELETE',
> +                                          'CREATE'))
> +  SELECT select_query, count(select_query) > 1 AS some_rows
> +    FROM select_stats
> +    GROUP BY select_query ORDER BY select_query;});
> +is( $result, qq(CREATE|t
> +DELETE|t
> +INSERT|t
> +SELECT|t
> +UPDATE|t), 'check contents of pg_stat_statements on regression database');

Are there any objections to do what's proposed in the patch and
improve the testing coverage of query jumbling by default?
--
Michael

Attachment

Re: Force testing of query jumbling code in TAP tests

From
Michael Paquier
Date:
On Tue, Feb 28, 2023 at 02:06:05PM +0900, Michael Paquier wrote:
> Are there any objections to do what's proposed in the patch and
> improve the testing coverage of query jumbling by default?

Well, done this one as of d28a449.  More validation tests could always
be added later if there are better ideas.  Coverage of this code has
gone up to 94.4% at the end:
https://coverage.postgresql.org/src/backend/nodes/queryjumblefuncs.funcs.c.gcov.html
--
Michael

Attachment