Re: Numeric multiplication overflow errors - Mailing list pgsql-hackers

From David Rowley
Subject Re: Numeric multiplication overflow errors
Date
Msg-id CAApHDvrqpg76FrAG1itcFU+ECq42nEZhtOAqJt34mg5V3HdeCA@mail.gmail.com
Whole thread Raw
In response to Numeric multiplication overflow errors  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: Numeric multiplication overflow errors
List pgsql-hackers
On Mon, 28 Jun 2021 at 21:16, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> So the second patch fixes this by adding new numericvar_send/recv()
> functions capable of sending the full precision NumericVar's from each
> worker, without rounding. The first test case in this patch is an
> example that would round the wrong way if the result from each worker
> were rounded before being sent.

Instead of adding a send/recv function, unless I'm mistaken, it should
be possible to go the whole hog and optimizing this further by instead
of having numericvar_send(), add:

static void numericvar_serialize(StringInfo buf, const NumericVar *var)
{
 /* guts of numericvar_send() here */
}

and then rename numericvar_recv to numericvar_deserialize.

That should allow the complexity to be reduced a bit further as it'll
allow you to just serialize the NumericVar into the existing buffer
rather than having the send function create a new one only to have the
caller copy it back out again into another buffer.  It also allows you
to get rid of the sumX and sumX2 vars from the serialize functions.

> An additional benefit to this approach is that it also addresses the
> issue noted in the old code about its use of numeric_send/recv() being
> wasteful:
>
>     /*
>      * This is a little wasteful since make_result converts the NumericVar
>      * into a Numeric and numeric_send converts it back again. Is it worth
>      * splitting the tasks in numeric_send into separate functions to stop
>      * this? Doing so would also remove the fmgr call overhead.
>      */
>
> So the patch converts all aggregate serialization/deserialization code
> to use the new numericvar_send/recv() functions. I doubt that that
> gives much in the way of a performance improvement, but it makes the
> code a little neater as well as preventing overflows.

I did mean to come back to that comment one day, so thanks for doing
this and for finding/fixing the overflow bugs.

It's unfortunate that we're up against Amdahl's law here.  The best
cases to parallelise have fewer groups, so we
serialise/combine/deserialise fewer groups in the best cases meaning
the optimisation done here are not being executed as much as the
not-so-good case.  So yeah, I agree that it might be difficult to
measure.

David



pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Fix uninitialized copy_data var (src/backend/commands/subscriptioncmds.c)
Next
From: David Rowley
Date:
Subject: Re: Remove redundant initializations