Re: parallel.c is not marked as test covered - Mailing list pgsql-hackers

From Clément Prévost
Subject Re: parallel.c is not marked as test covered
Date
Msg-id CABaKae-eCW4d7SmgxoA8oBan5hhRnb0_qmHtQ580ANfPPNo8tw@mail.gmail.com
Whole thread Raw
In response to Re: parallel.c is not marked as test covered  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: parallel.c is not marked as test covered
List pgsql-hackers
On Sun, May 15, 2016 at 12:53:13PM +0000, Clément Prévost wrote:
> After some experiments, I found out that, for my setup (9b7bfc3a88ef7b), a
> parallel seq scan is used given both parallel_setup_cost
> and parallel_tuple_cost are set to 0 and given that the table is at least 3
> times as large as the biggest test table tenk1.

That worked because it pushed the table over the create_plain_partial_paths()
1000-block (8 MiB) threshold; see
http://www.postgresql.org/message-id/26842.1462941248@sss.pgh.pa.us

While the way you tested this is not wrong, I recommend instead querying an
inheritance parent if that achieves no less test coverage.  "SELECT count(*)
FROM a_star" gets a parallel plan under your cost parameters.  That avoids
building and discarding a relatively-large table.
 
Nice, this works well and is faster than copying tenk1 aggressively.
 
> The attached patch is a regression test using this method that is
> reasonably small and fast to run. I also hid the workers count from the
> explain output when costs are disabled as suggested by Tom Lane and Robert
> Haas on this same thread (
> http://www.postgresql.org/message-id/CA+TgmobBQS4ss3+CwoZOKgbsBqKfRndwc=hLiALAep5aXQCTDA@mail.gmail.com
> ).

It seems wrong to me for COSTS OFF to disable output that is not a cost
estimate.  file_fdw did that, but I don't want to proliferate that loose
interpretation of COSTS OFF.
 
Agreed, I reverted it. 

Unlike the example in the linked-to message, this test won't experience
variable worker count.  For the query you used and for the a_star query, low
input size makes the planner request exactly one worker.  If later planner
enhancements give this query a configuration-specific worker count, the test
could capture EXPLAIN output with PL/pgSQL and compare it to a regex.
 
I also considered setting max_parallel_degree to 1 to make the test more futur-proof but there is a rather long discussion on the setting name (https://www.postgresql.org/message-id/20160424035859.GB29404@momjian.us) so I can't decide on my own if it's worth the explicit dependency. 
 
> --- /dev/null
> +++ b/src/test/regress/sql/select_parallel.sql
> @@ -0,0 +1,22 @@
> +--
> +-- PARALLEL
> +--
> +
> +-- setup parallel test
> +create unlogged table tenk_parallel as table tenk1 with no data;
> +set parallel_setup_cost=0;
> +set parallel_tuple_cost=0;
> +
> +-- create a table with enough data to trigger parallel behavior
> +insert into tenk_parallel
> +  select tenk1.* from tenk1, generate_series(0,2);
> +
> +explain (costs off)
> +  select count(*) from tenk_parallel;
> +select count(*) from tenk_parallel;

As of today, "make installcheck" passes with "default_transaction_isolation =
serializable" in postgresql.conf.  Let's preserve that property.  You could
wrap the parallel queries in "begin isolation level repeatable read;"
... "commit;", or you could SET default_transaction_isolation itself.

I did add the transaction, but I don't get why this specific test should use this specific transaction isolation level.

On Tue, Jun 7, 2016 at 7:36 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
Please see also my message
<https://www.postgresql.org/message-id/92bec2a5-6d5b-6630-ce4d-2ed76f357973@2ndquadrant.com>
with a similar solution.  That test also contains a query that raises an
error, which is another code path that we should cover.

Added.
Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: COMMENT ON, psql and access methods
Next
From: Michael Paquier
Date:
Subject: Re: COMMENT ON, psql and access methods