Re: Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c) - Mailing list pgsql-hackers

From Ranier Vilela
Subject Re: Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c)
Date
Msg-id CAEudQApWpo_zP73t2dG8BHPnwU=m=B2tZu+56jxzGeR+1uUovQ@mail.gmail.com
In response to Re: Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c)  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Mon, Aug 30, 2021 at 07:44, David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 30 Jun 2021 at 02:33, Ranier Vilela <ranier.vf@gmail.com> wrote:
> hash_choose_num_partitions function has issues.
> There are at least two path calls made with used_bits = 0.
> See at hashagg_spill_init.

> On Windows 64 bits (HEAD) fails with partition_prune:
> parallel group (11 tests):  reloptions hash_part partition_info explain compression resultcache indexing partition_join partition_aggregate partition_prune tuplesort
>      partition_join               ... ok         3495 ms
>      partition_prune              ... FAILED     4926 ms
>
> diff -w -U3 C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out
> --- C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out 2021-06-23 11:11:26.489575100 -0300
> +++ C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out 2021-06-29 10:54:43.103775700 -0300
> @@ -2660,7 +2660,7 @@
>  --------------------------------------------------------------------------
>   Nested Loop (actual rows=3 loops=1)
>     ->  Seq Scan on tbl1 (actual rows=5 loops=1)
> -   ->  Append (actual rows=1 loops=5)
> +   ->  Append (actual rows=0 loops=5)
>           ->  Index Scan using tprt1_idx on tprt_1 (never executed)
>                 Index Cond: (col1 = tbl1.col1)
>           ->  Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
>
> With patch attached:
> parallel group (11 tests):  partition_info hash_part resultcache reloptions explain compression indexing partition_aggregate partition_join tuplesort partition_prune
>      partition_join               ... ok         3013 ms
>      partition_prune              ... ok         3959 ms

This failure was reported to me along with this thread so I had a look at it.
Thanks.


Firstly, I'm a bit confused as to why you think making a change in
nodeAgg.c would have any effect on a plan that does not contain any
aggregate node.
Yeah, they are unrelated.
For some reason, when I ran the regression tests, partition_prune passed, and I mistakenly connected that with my changes, which was wrong.


As for the regression test failure, I can recreate it, but I did have
to install VS2019 version 16.9.3 from
https://docs.microsoft.com/en-us/visualstudio/releases/2019/history

This basically boils down to the 16.9.3 compiler outputting "0" for:

#include <stdio.h>

int main(void)
{
    printf("%.0f\n", 0.59999999999999998);
    return 0;
}

but we expect it to output "1".

We make use of the provided sprintf() function in snprintf.c line 1188 with:

vallen = sprintf(convert, fmt, prec, value);

I don't see the problem in more recent versions of VS2019, but I
didn't go to the trouble of figuring out exactly which version this
was fixed in.
Regarding this test, with the latest MSVC, compiled for Debug, the failure still occurs:
     partition_join               ... ok         4267 ms
     partition_prune              ... FAILED     5270 ms
     reloptions                   ... ok          755 ms
     hash_part                    ... ok          494 ms

I still believe it's a compiler problem.

regards,
Ranier Vilela
