RE: BUG #16627: union all with partioned table yields random aggregate results - Mailing list pgsql-bugs

From Brian Kanaga
Subject RE: BUG #16627: union all with partioned table yields random aggregate results
Date
Msg-id 25ff6540eef8d1415f3f11e5e1e35d2c@mail.gmail.com
Whole thread Raw
In response to BUG #16627: union all with partioned table yields random aggregate results  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
If I may add (referencing the files I just sent): the parallel append and
parallel index scan is unique to the issue query.

-----Original Message-----
From: Brian Kanaga [mailto:kanaga@consumeracquisition.com]
Sent: Wednesday, September 23, 2020 8:48 AM
To: 'David Rowley'; 'PostgreSQL mailing lists'
Subject: RE: BUG #16627: union all with partioned table yields random
aggregate results

Thank you David for your input here!

Attached is a much-simplified version of the problem query along with screen
shots of plans and what tweaks to the query produce changes to the plan.

Both your hints fix it.  Also if I force the partitioned table subquery to
be evaluated differently by placing an "order by" on it, that fixes it.

I have tried to recreate this for you in a dump file but I could not get it
to happen without including gobs of data.  Even tweaking the plan I could
not get the filter part evaluating to match the offending plan.

Hopefully these files shed some light.

Oh - and float vs more precision wasn't it.   The results variance is
enormous:
correct results
----------------------------
27839.83example runs with the issue (each row here is a different run)
----------------------------
27839.8352267.54
63620.24
30139.4
27839.83 <- randomly the right answer will be returned
30139.4
46044.1
30139.4
34350.93
39673.62
32239.73
27839.83 <- again
43351.18

-----Original Message-----
From: David Rowley [mailto:dgrowleyml@gmail.com]
Sent: Tuesday, September 22, 2020 10:10 PM
To: kanaga@consumeracquisition.com; PostgreSQL mailing lists
Subject: Re: BUG #16627: union all with partioned table yields random
aggregate results

On Wed, 23 Sep 2020 at 03:30, PG Bug reporting form <noreply@postgresql.org>
wrote:
>
>                         -- these columns will be missing in some tbls
>                         avg(cast(relevance_score as float4)) as
> relevance_score,
>                         avg(cast(positive_feedback as float4)) as
> positive_feedback,
>                         avg(cast(negative_feedback as float4)) as
> negative_feedback,
>

In addition to Tom's request, it would also be good to understand more about
these "random" results.

I see you have a number of aggregate functions run on the results. It would
be good to understand if all of them are incorrect in the parallel version
of the query or if it's just some of them, e.g just the float4 ones I've
quoted above.  Essentially there is some sense of randomness to the results
of floating-point aggregation in all cases due to the lossy precision of
floating points. This is much more likely to be seen during parallel
aggregation as how each tuple is aggregated is highly likely to differ for
each run of the query. That order is much more likely to be stable in the
serial query.

If you are just seeing the floating-point aggregates vary then it might be
worth casting floating-point values to NUMERIC to see if that gets rid of
the randomness.

If that's not the issue then you may also like to help try to narrow the
scope of the problem. Does the problem still occur when you disable
enable_parallel_hash? How about enable_parallel_append?

There was also some mangling of the query plans you posted. I took the time
to format those again and attached the result.

Looking over the row counts between each version of the plan I didn't find
anything that would indicate additional rows were seen in the parallel
version of the plan. However, there's some precision loss which I couldn't
account for around the actual rows being divided by the parallel workers
with integer division, so that process is not 100% accurate.

David



pgsql-bugs by date:

Previous
From: Brian Kanaga
Date:
Subject: RE: BUG #16627: union all with partioned table yields random aggregate results
Next
From: Emmanuel OÑA
Date:
Subject: mysql-fdw fail