Re: Optimize numeric.c mul_var() using the Karatsuba algorithm - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Optimize numeric.c mul_var() using the Karatsuba algorithm
Date
Msg-id ae48619b-a5ac-4c8d-a8db-fa8555887e74@app.fastmail.com
Whole thread Raw
In response to Re: Optimize numeric.c mul_var() using the Karatsuba algorithm  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Jun 29, 2024, at 17:25, Tom Lane wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> There's another complication though (if the threshold is made
>> configurable): the various numeric functions that use mul_var() are
>> immutable, which means that the results from the Karatsuba algorithm
>> must match those from the schoolbook algorithm exactly, for all
>> inputs.
>
> That seems like an impossible standard to meet.  What we'd probably
> have to do is enable Karatsuba only when mul_var is being asked
> for an exact (full-precision) result.  This'd complicate the check
> condition further, possibly reaching the point where there is a
> visible drag on performance in the non-Karatsuba case.

OK, noted that we should only do Karatsuba when mul_var is
being asked for an exact (full-precision) result.

> Another possible source of drag: if mul_var is now recursive,
> does it need a stack depth check?  If we can prove that the
> number of recursion levels is no more than O(log(N)) in the
> number of input digits, it's probably safe to skip that ...
> but I see no such proof here.

> (In general I find this patch seriously undercommented.)

Yes, the #define's around KARATSUBA_CONDITION needs
to be documented, but I felt this region of the patch might evolve
and need some discussion first, if this level of complexity is
acceptable.

However, I think the comments above split_var_at(),
mul_var_karatsuba_full() and mul_var_karatsuba_half()
are quite good already, what do you think?

>> There's a wider question as to how many people use such big numeric
>> values -- i.e., how many people are actually going to benefit from
>> this? I don't have a good feel for that.
>
> I have heard of people doing calculations on bignum integers in
> Postgres, but they are very few and far between --- usually that
> sort of task is better done in another language.  (No matter how
> fast we make mul_var, the general overhead of SQL expressions in
> general and type numeric in particular means it's probably not
> the right tool for heavy-duty bignum arithmetic.)
>
> There is definitely an argument to be made that this proposal is
> not worth the development effort and ongoing maintenance effort
> we'd have to sink into it.

It's a good question, I'm not sure what I think, maybe status quo is the best,
but it feels like something could be done at least.

The patch basically consists of three parts:

- Threshold function KARATSUBA_CONDITION
This is a bit unorthodox, since it's more ambitious than many popular numeric
libraries. I've only found GMP to be even more complex.

- mul_var_karatsuba_half()
Since it's provably correct using simple school-grade substitutions,
I don't think this function is unorthodox, and shouldn't need to change,
unless we change the definition of NumericVar in the future.

- mul_var_karatsuba()
This follows the canonical pseudo-code at Wikipedia for the Karatsuba
algorithm precisely, so nothing unorthodox here either.

So, I think the KARATSUBA_CONDITION require more development and maintenance
effort than the rest of the patch, since it's based on measurements
on numerous architectures, which will be different in the future.

The rest of the patch, split_var_at(), mul_var_karatsuba_full(),
mul_var_karatsuba_half(), should require much less effort to maintain,
since they are should remain the same,
even when we need to support new architectures.

I'm eager to hear your thoughts after you've also read my other reply moments ago to Dean.

Regards,
Joel



pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: Question about maxTapes & selectnewtape & dumptuples
Next
From: Alexander Lakhin
Date:
Subject: Re: speed up a logical replica setup