Thread: partition table and stddev() /variance() behaviour

partition table and stddev() /variance() behaviour

From
Rajkumar Raghuwanshi
Date:
Hi,

I am getting different output for stddev/variance functions with partition tables.


CREATE TABLE part (c1 INT,c2 INT) PARTITION BY RANGE (c1);
CREATE TABLE part_p1 PARTITION OF part FOR VALUES FROM (1) TO (3);
CREATE TABLE part_p2 PARTITION OF part FOR VALUES FROM (3) TO (5);

INSERT INTO part VALUES (1,5),(2,15),(3,3),(4,17);

postgres=# SET parallel_setup_cost=0;
SET
postgres=# EXPLAIN SELECT COUNT(c1),STDDEV(c2),VARIANCE(c2) FROM part;
                                         QUERY PLAN                                        
--------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=70.36..70.37 rows=1 width=72)
   ->  Gather  (cost=70.12..70.33 rows=2 width=72)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=70.12..70.13 rows=1 width=72)
               ->  Parallel Append  (cost=0.00..56.00 rows=1882 width=8)
                     ->  Parallel Seq Scan on part_p1  (cost=0.00..23.29 rows=1329 width=8)
                     ->  Parallel Seq Scan on part_p2  (cost=0.00..23.29 rows=1329 width=8)
(7 rows)

postgres=# SELECT COUNT(c1),STDDEV(c2),VARIANCE(c2) FROM part;
 count | stddev | variance
-------+--------+----------
     4 |      0 |        0
(1 row)

postgres=#
postgres=# RESET parallel_setup_cost;
RESET
postgres=# EXPLAIN SELECT COUNT(c1),STDDEV(c2),VARIANCE(c2) FROM part;
                              QUERY PLAN                              
-----------------------------------------------------------------------
 Aggregate  (cost=121.71..121.72 rows=1 width=72)
   ->  Append  (cost=0.00..87.80 rows=4520 width=8)
         ->  Seq Scan on part_p1  (cost=0.00..32.60 rows=2260 width=8)
         ->  Seq Scan on part_p2  (cost=0.00..32.60 rows=2260 width=8)
(4 rows)

postgres=# SELECT COUNT(c1),STDDEV(c2),VARIANCE(c2) FROM part;
 count |       stddev       |      variance      
-------+--------------------+---------------------
     4 | 7.0237691685684926 | 49.3333333333333333
(1 row)

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

Re: partition table and stddev() /variance() behaviour

From
David Rowley
Date:
On 22 June 2018 at 00:18, Rajkumar Raghuwanshi
<rajkumar.raghuwanshi@enterprisedb.com> wrote:
> CREATE TABLE part (c1 INT,c2 INT) PARTITION BY RANGE (c1);
> CREATE TABLE part_p1 PARTITION OF part FOR VALUES FROM (1) TO (3);
> CREATE TABLE part_p2 PARTITION OF part FOR VALUES FROM (3) TO (5);
>
> INSERT INTO part VALUES (1,5),(2,15),(3,3),(4,17);
>
> postgres=# SET parallel_setup_cost=0;
> postgres=# SELECT COUNT(c1),STDDEV(c2),VARIANCE(c2) FROM part;
>  count | stddev | variance
> -------+--------+----------
>      4 |      0 |        0
> (1 row)

Well, that's quite surprising. It appears to be a bug in
numeric_poly_combine for machines without a working int128 type. The
parameters in accum_sum_copy are in the incorrect order.

The very minimal fix is attached, but I'll need to go look at where
the tests for this have gone.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: partition table and stddev() /variance() behaviour

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> Well, that's quite surprising. It appears to be a bug in
> numeric_poly_combine for machines without a working int128 type. The
> parameters in accum_sum_copy are in the incorrect order.

Ouch.

> The very minimal fix is attached, but I'll need to go look at where
> the tests for this have gone.

coverage.postgresql.org shows that numeric_poly_serialize/combine()
aren't exercised at all by the regression tests.  Which is embarrassing
for this case, but I'm a bit leery of trying to insist on 100% coverage.

It might be a plan to insist on buildfarm coverage for anything with
platform-varying code in it, in which case there's at least one
other undertested bit of HAVE_INT128 code in numeric.c.

            regards, tom lane


Re: partition table and stddev() /variance() behaviour

From
David Rowley
Date:
On 22 June 2018 at 02:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> Well, that's quite surprising. It appears to be a bug in
>> numeric_poly_combine for machines without a working int128 type. The
>> parameters in accum_sum_copy are in the incorrect order.
>
> Ouch.

Yeah. Looks like this function was correct when it went in. It was
9cca11c915e (v10) that caused the issue.

>> The very minimal fix is attached, but I'll need to go look at where
>> the tests for this have gone.
>
> coverage.postgresql.org shows that numeric_poly_serialize/combine()
> aren't exercised at all by the regression tests.  Which is embarrassing
> for this case, but I'm a bit leery of trying to insist on 100% coverage.
>
> It might be a plan to insist on buildfarm coverage for anything with
> platform-varying code in it, in which case there's at least one
> other undertested bit of HAVE_INT128 code in numeric.c.

I'm not familiar with what the coverage tool can do, so maybe there's
an easier way than running a script to check for 0 coverage between
#ifdef / #if and #endif inside all .c files.

This sounds like a longer-term project though, let's get this one fixed first.

I think some coverage of the numerical aggregates is a good idea, so
I've added some in the attached. I managed to get a parallel plan
going with a query to onek, which is pretty cheap to execute. I didn't
touch the bool aggregates. Maybe I should have done that too..?

I also did a quick validation of the other accum_sum_copy usages. All
others look correct.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: partition table and stddev() /variance() behaviour

From
David Rowley
Date:
On 22 June 2018 at 03:11, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I think some coverage of the numerical aggregates is a good idea, so
> I've added some in the attached. I managed to get a parallel plan
> going with a query to onek, which is pretty cheap to execute. I didn't
> touch the bool aggregates. Maybe I should have done that too..?

I should have mentioned; I tested this on a Windows VS2012 machine and
a Linux x86-64 machine and those results were fine with each. I think
the float4 and float8 aggregate results should be fairly stable since
the numbers being aggregated are all very far from the extreme range
of those types. If we do get some problems with some buildfarm members
producing slightly different results then we can probably cast the
result of the aggregate to numeric and round() them a bit.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: partition table and stddev() /variance() behaviour

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 22 June 2018 at 02:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> coverage.postgresql.org shows that numeric_poly_serialize/combine()
>> aren't exercised at all by the regression tests.  Which is embarrassing
>> for this case, but I'm a bit leery of trying to insist on 100% coverage.
>> It might be a plan to insist on buildfarm coverage for anything with
>> platform-varying code in it, in which case there's at least one
>> other undertested bit of HAVE_INT128 code in numeric.c.

> I think some coverage of the numerical aggregates is a good idea, so
> I've added some in the attached. I managed to get a parallel plan
> going with a query to onek, which is pretty cheap to execute. I didn't
> touch the bool aggregates. Maybe I should have done that too..?

This sort of blunderbuss testing was exactly what I *didn't* want to do.
Not only is this adding about 20x as many cycles as we need (at least for
this specific numeric_poly_combine issue), but I'm quite afraid that the
float4 and/or float8 cases will show low-order-digit irreproducibility
in the buildfarm.  I think we should look at the coverage report for the
files in question and add targeted tests to cover gaps where there's
platform-varying code, such that the buildfarm might expose problems
that were missed by the code's author.

            regards, tom lane


Re: partition table and stddev() /variance() behaviour

From
David Rowley
Date:
On 22 June 2018 at 03:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think some coverage of the numerical aggregates is a good idea, so
>> I've added some in the attached. I managed to get a parallel plan
>> going with a query to onek, which is pretty cheap to execute. I didn't
>> touch the bool aggregates. Maybe I should have done that too..?
>
> This sort of blunderbuss testing was exactly what I *didn't* want to do.
> Not only is this adding about 20x as many cycles as we need (at least for
> this specific numeric_poly_combine issue), but I'm quite afraid that the
> float4 and/or float8 cases will show low-order-digit irreproducibility
> in the buildfarm.

okay. My sniper rifle was locked away for the evening. I decided it
was best to sleep before any careful aiming was required.

I see you've done the deed already. Thanks.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: partition table and stddev() /variance() behaviour

From
Rajkumar Raghuwanshi
Date:
Thanks for commit. I have verified reported case. it is fixed now.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

On Fri, Jun 22, 2018 at 8:38 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:
On 22 June 2018 at 03:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think some coverage of the numerical aggregates is a good idea, so
>> I've added some in the attached. I managed to get a parallel plan
>> going with a query to onek, which is pretty cheap to execute. I didn't
>> touch the bool aggregates. Maybe I should have done that too..?
>
> This sort of blunderbuss testing was exactly what I *didn't* want to do.
> Not only is this adding about 20x as many cycles as we need (at least for
> this specific numeric_poly_combine issue), but I'm quite afraid that the
> float4 and/or float8 cases will show low-order-digit irreproducibility
> in the buildfarm.

okay. My sniper rifle was locked away for the evening. I decided it
was best to sleep before any careful aiming was required.

I see you've done the deed already. Thanks.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services