Re: [HACKERS] regression bigtest needs very long time - Mailing list pgsql-hackers

From wieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] regression bigtest needs very long time
Msg-id m10yvQr-0003ktC@orion.SAPserv.Hamburg.dsh.de
In response to Re: [HACKERS] regression bigtest needs very long time  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] regression bigtest needs very long time
List pgsql-hackers
>
> > Bruce Momjian wrote:
> >
> > > Oh, I didn't realize this.  We certainly should think about reducing the
> > > time spent on it, though it is kind of lame to be testing numeric in a
> > > precision that is less than the standard int4 type.
> >
> >     We certainly should think about a general speedup of NUMERIC.
>
> How would we do that?  I assumed it was already pretty optimized.

    By reimplementing the entire internals from scratch again :-)

    For now the db storage format is something like packed
    decimal. Two digits fit into one byte. Sign, scale and
    precision are stored in a header. For computations, this gets
    unpacked so that every digit is stored in one byte, and all
    the computations are performed digit by digit in base 10.
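
    A minimal sketch of that scheme, to make the cost visible. The
    nibble layout (high nibble first) and the names unpack_digits
    and add_base10 are assumptions for illustration, not the actual
    on-disk format or the actual function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Unpack a packed-decimal buffer into one decimal digit per byte.
 * Assumed (hypothetical) layout: high nibble holds the first digit,
 * low nibble the second. digits must have room for 2 * nbytes entries.
 */
void unpack_digits(const uint8_t *packed, size_t nbytes, uint8_t *digits)
{
    for (size_t i = 0; i < nbytes; i++) {
        digits[2 * i]     = packed[i] >> 4;
        digits[2 * i + 1] = packed[i] & 0x0F;
    }
}

/*
 * Add two unpacked numbers of n decimal digits each, most significant
 * digit first. The loop runs once per decimal digit. Returns the carry
 * out of the most significant position.
 */
int add_base10(const uint8_t *a, const uint8_t *b, uint8_t *sum, size_t n)
{
    int carry = 0;

    for (size_t i = n; i-- > 0; ) {
        assert(a[i] <= 9 && b[i] <= 9);
        int s = a[i] + b[i] + carry;
        sum[i] = (uint8_t) (s % 10);
        carry = s / 10;
    }
    return carry;
}
```

    Adding two 8-digit values this way takes 8 trips through the
    inner loop, one per decimal digit.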

    Computers are good at performing computations in other bases
    (hex, octal etc.). And we can assume that any architecture
    where PostgreSQL can be installed supports 32 bit integers.
    Thus, a good choice for an internal base would be 10000, with
    the digits(10000) stored in small integers.

    1.  Converting between decimal (base 10) and base 10000 is
        relatively simple. One digit(10000) holds 4 digits(10).

    2.  Computations using a 32 bit integer for carry/borrow are
        safe because the biggest result of a one digit(10000)
        add/subtract/multiply (9999 * 9999 = 99980001) cannot
        exceed 32 bits.
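
    Both points can be sketched in a few lines. This is only an
    illustration under stated assumptions: the input is a plain
    unsigned decimal string with at most one '.', and the names
    NBASE, DEC_PER_DIGIT, to_base10000 and worst_case_mul_step are
    hypothetical, not taken from any existing code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBASE         10000
#define DEC_PER_DIGIT 4

/*
 * Convert a decimal string such as "12030.12345" into base-10000
 * digits, most significant first. The integer part is grouped from
 * the decimal point leftwards and the fraction is padded with zeros
 * on the right, so the decimal point always falls on a digit boundary.
 * Returns the number of digits written; *nint receives how many of
 * them lie before the decimal point.
 */
int to_base10000(const char *s, int16_t *out, int *nint)
{
    assert(s != NULL && out != NULL && nint != NULL);

    const char *dot  = strchr(s, '.');
    size_t      ilen = dot ? (size_t) (dot - s) : strlen(s);
    const char *frac = dot ? dot + 1 : s + ilen;
    size_t      flen = strlen(frac);
    size_t      pos  = 0;
    int         n    = 0;

    /* The leading group may hold fewer than 4 decimal digits. */
    size_t first = ilen % DEC_PER_DIGIT;
    if (first > 0) {
        int v = 0;
        for (; pos < first; pos++)
            v = v * 10 + (s[pos] - '0');
        out[n++] = (int16_t) v;
    }
    while (pos < ilen) {
        int v = 0;
        for (int k = 0; k < DEC_PER_DIGIT; k++, pos++)
            v = v * 10 + (s[pos] - '0');
        out[n++] = (int16_t) v;
    }
    *nint = n;

    /* Fraction: a short trailing group is padded with zeros. */
    for (size_t f = 0; f < flen; f += DEC_PER_DIGIT) {
        int v = 0;
        for (size_t k = 0; k < DEC_PER_DIGIT; k++)
            v = v * 10 + ((f + k < flen) ? frac[f + k] - '0' : 0);
        out[n++] = (int16_t) v;
    }
    return n;
}

/*
 * Largest intermediate value in one multiply step: a digit product
 * plus the digit already in the result plus an incoming carry.
 */
int32_t worst_case_mul_step(void)
{
    return (int32_t) (NBASE - 1) * (NBASE - 1) + 2 * (NBASE - 1);
}
```

    The worst case of a multiply step is NBASE * NBASE - 1, which
    is far below the limit of a signed 32 bit integer.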

    The speedup (I expect) results from the fact that the inner
    loops of add, subtract and multiply will then handle 4
    decimal digits per cycle instead of one! Doing a

        1234.5678 + 2345.6789

    then needs 2 internal cycles instead of 8. And

        100.123 + 12030.12345

    needs 4 cycles instead of 10 (because the decimal point has
    the same meaning in base 10000, the last value is stored
    internally as short ints 1, 2030, 1234, 5000). This is the
    worst case and it still saves 60% of the innermost cycles!
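
    The inner loop for addition could look roughly like this. It
    is a sketch only: it assumes both operands are already aligned
    at the decimal point and padded with zero digits to the same
    length, and NBASE and add_base10000 are made-up names:

```c
#include <assert.h>
#include <stdint.h>

#define NBASE 10000

/*
 * Add two numbers of n base-10000 digits each, most significant
 * digit first. One loop iteration handles 4 decimal digits.
 * Returns the carry out of the most significant position.
 */
int add_base10000(const int16_t *a, const int16_t *b, int16_t *sum, int n)
{
    int32_t carry = 0;

    for (int i = n - 1; i >= 0; i--) {
        assert(a[i] >= 0 && a[i] < NBASE);
        assert(b[i] >= 0 && b[i] < NBASE);

        int32_t s = (int32_t) a[i] + b[i] + carry;
        if (s >= NBASE) {
            sum[i] = (int16_t) (s - NBASE);
            carry = 1;
        } else {
            sum[i] = (int16_t) s;
            carry = 0;
        }
    }
    return (int) carry;
}
```

    With a = {1234, 5678} and b = {2345, 6789} the loop runs twice
    and leaves {3580, 2467}, i.e. 3580.2467. For the second example
    the shorter operand is padded to {0, 100, 1230, 0} and the loop
    runs 4 times.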

    Rounding and checking for overflow will get a little more
    difficult, but I think it's worth the effort.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
