Re: parallel.c is not marked as test covered - Mailing list pgsql-hackers
From | Clément Prévost |
---|---|
Subject | Re: parallel.c is not marked as test covered |
Date | |
Msg-id | CABaKae-eCW4d7SmgxoA8oBan5hhRnb0_qmHtQ580ANfPPNo8tw@mail.gmail.com Whole thread Raw |
In response to | Re: parallel.c is not marked as test covered (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
Responses |
Re: parallel.c is not marked as test covered
|
List | pgsql-hackers |
On Sun, May 15, 2016 at 12:53:13PM +0000, Clément Prévost wrote:
> After some experiments, I found out that, for my setup (9b7bfc3a88ef7b), a
> parallel seq scan is used given both parallel_setup_cost
> and parallel_tuple_cost are set to 0 and given that the table is at least 3
> times as large as the biggest test table tenk1.
That worked because it pushed the table over the create_plain_partial_paths()
1000-block (8 MiB) threshold; see
http://www.postgresql.org/message-id/26842.1462941248@sss.pgh.pa.us
While the way you tested this is not wrong, I recommend instead querying an
inheritance parent if that achieves no less test coverage. "SELECT count(*)
FROM a_star" gets a parallel plan under your cost parameters. That avoids
building and discarding a relatively-large table.
Nice, this works well and is faster than copying tenk1 aggressively.
> The attached patch is a regression test using this method that is
> reasonably small and fast to run. I also hid the workers count from the
> explain output when costs are disabled as suggested by Tom Lane and Robert
> Haas on this same thread (
> http://www.postgresql.org/message-id/CA+TgmobBQS4ss3+CwoZOKgbsBqKfRndwc=hLiALAep5aXQCTDA@mail.gmail.com
> ).
It seems wrong to me for COSTS OFF to disable output that is not a cost
estimate. file_fdw did that, but I don't want to proliferate that loose
interpretation of COSTS OFF.
Agreed, I reverted it.
Unlike the example in the linked-to message, this test won't experience
variable worker count. For the query you used and for the a_star query, low
input size makes the planner request exactly one worker. If later planner
enhancements give this query a configuration-specific worker count, the test
could capture EXPLAIN output with PL/pgSQL and compare it to a regex.
I also considered setting max_parallel_degree to 1 to make the test more futur-proof but there is a rather long discussion on the setting name (https://www.postgresql.org/message-id/20160424035859.GB29404@momjian.us) so I can't decide on my own if it's worth the explicit dependency.
> --- /dev/null
> +++ b/src/test/regress/sql/select_parallel.sql
> @@ -0,0 +1,22 @@
> +--
> +-- PARALLEL
> +--
> +
> +-- setup parallel test
> +create unlogged table tenk_parallel as table tenk1 with no data;
> +set parallel_setup_cost=0;
> +set parallel_tuple_cost=0;
> +
> +-- create a table with enough data to trigger parallel behavior
> +insert into tenk_parallel
> + select tenk1.* from tenk1, generate_series(0,2);
> +
> +explain (costs off)
> + select count(*) from tenk_parallel;
> +select count(*) from tenk_parallel;
As of today, "make installcheck" passes with "default_transaction_isolation =
serializable" in postgresql.conf. Let's preserve that property. You could
wrap the parallel queries in "begin isolation level repeatable read;"
... "commit;", or you could SET default_transaction_isolation itself.
I did add the transaction, but I don't get why this specific test should use this specific transaction isolation level.
On Tue, Jun 7, 2016 at 7:36 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
Please see also my message
<https://www.postgresql.org/message-id/92bec2a5-6d5b-6630-ce4d-2ed76f357973@2ndquadrant.com>
with a similar solution. That test also contains a query that raises an
error, which is another code path that we should cover.
Added.
Attachment
pgsql-hackers by date: