Thread: Improve the comparison of NUMERIC data

Improve the comparison of NUMERIC data

From
Atsushi Ogawa
Date:
I think that NUMERIC datatype has a problem in the performance that
the format on Tuple(Numeric) and the format to calculate(NumericVar)
are different. I understood that to reduce I/O. However, when many
comparisons or calculations of NUMERIC are executed, the conversion
of Numeric and NumericVar becomes a bottleneck.

It is profile result when "create index on NUMERIC column" is executed:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 17.61     10.27    10.27 34542006     0.00     0.00  cmp_numerics
 11.90     17.21     6.94 34542006     0.00     0.00  comparetup_index
  7.42     21.54     4.33 71102587     0.00     0.00  AllocSetAlloc
  7.02     25.64     4.09 69084012     0.00     0.00  set_var_from_num
  4.87     28.48     2.84 69084012     0.00     0.00  alloc_var
  4.79     31.27     2.79 142205745     0.00     0.00  AllocSetFreeIndex
  4.55     33.92     2.65 34542004     0.00     0.00  cmp_abs
  4.07     36.30     2.38 71101189     0.00     0.00  AllocSetFree
  3.83     38.53     2.23 69084012     0.00     0.00  free_var

The create index command executes many comparisons of Numeric values.
Functions other than comparetup_index spent a lot of cycles for
conversion from Numeric to NumericVar.

An attached patch enables the comparison of Numeric values without
executing conversion to NumericVar. The execution time of that SQL
becomes half.

o Test SQL (index_test table has 1,000,000 tuples)
 create index index_test_idx on index_test(num_col);

o Test results (executed the test five times)
(1)PentiumIII
 original: 39.789s  36.823s  36.737s  37.752s  37.019s
 patched : 18.560s  19.103s  18.830s  18.408s  18.853s

(2)Pentium4
 original: 16.349s  14.997s  12.979s  13.169s  12.955s
 patched :  7.005s   6.594s   6.770s   6.740s   6.828s

(3)Itanium2
 original: 15.392s  15.447s  15.350s  15.370s  15.417s
 patched :  7.413s   7.330s   7.334s   7.339s   7.339s

(4)Ultra Sparc
 original: 64.435s  59.336s  59.332s  58.455s  59.781s
 patched : 28.630s  28.666s  28.983s  28.744s  28.595s

regards,

--- Atsushi Ogawa

Attachment

Re: Improve the comparison of NUMERIC data

From
Tom Lane
Date:
Atsushi Ogawa <atsushi.ogawa@gmail.com> writes:
> An attached patch enables the comparison of Numeric values without
> executing conversion to NumericVar.

This is likely to be broken entirely by the recent proposals to modify
the on-disk representation of Numeric.

Rather than trying to share code, it'd likely be better to make a
separate routine that can directly compare two values in the packed
(on-disk) representation, and modify the externally-callable entry
points to use that.  That way avoids imposing constraints on how
different the packed and unpacked representations can be.

            regards, tom lane

Re: Improve the comparison of NUMERIC data

From
Atsushi Ogawa
Date:
2005/11/9, Tom Lane <tgl@sss.pgh.pa.us>:
> Atsushi Ogawa <atsushi.ogawa@gmail.com> writes:
> > An attached patch enables the comparison of Numeric values without
> > executing conversion to NumericVar.
>
> Rather than trying to share code, it'd likely be better to make a
> separate routine that can directly compare two values in the packed
> (on-disk) representation, and modify the externally-callable entry
> points to use that.  That way avoids imposing constraints on how
> different the packed and unpacked representations can be.

OK. I will make a separate routine.

regards,

--- Atsushi Ogawa

Re: Improve the comparison of NUMERIC data

From
Tom Lane
Date:
Atsushi Ogawa <atsushi.ogawa@gmail.com> writes:
> 2005/11/9, Tom Lane <tgl@sss.pgh.pa.us>:
>> Rather than trying to share code, it'd likely be better to make a
>> separate routine that can directly compare two values in the packed
>> (on-disk) representation, and modify the externally-callable entry
>> points to use that.  That way avoids imposing constraints on how
>> different the packed and unpacked representations can be.

> OK. I will make a separate routine.

Also, you might want to wait to see where the discussions on changing
Numeric's representation end up, before you spend time on doing the
coding.  We should be able to resolve that long before 8.2 comes out,
so there's plenty of time.

            regards, tom lane

Re: Improve the comparison of NUMERIC data

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


pAtsushi Ogawa wrote:
> I think that NUMERIC datatype has a problem in the performance that
> the format on Tuple(Numeric) and the format to calculate(NumericVar)
> are different. I understood that to reduce I/O. However, when many
> comparisons or calculations of NUMERIC are executed, the conversion
> of Numeric and NumericVar becomes a bottleneck.
>
> It is profile result when "create index on NUMERIC column" is executed:
>
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  17.61     10.27    10.27 34542006     0.00     0.00  cmp_numerics
>  11.90     17.21     6.94 34542006     0.00     0.00  comparetup_index
>   7.42     21.54     4.33 71102587     0.00     0.00  AllocSetAlloc
>   7.02     25.64     4.09 69084012     0.00     0.00  set_var_from_num
>   4.87     28.48     2.84 69084012     0.00     0.00  alloc_var
>   4.79     31.27     2.79 142205745     0.00     0.00  AllocSetFreeIndex
>   4.55     33.92     2.65 34542004     0.00     0.00  cmp_abs
>   4.07     36.30     2.38 71101189     0.00     0.00  AllocSetFree
>   3.83     38.53     2.23 69084012     0.00     0.00  free_var
>
> The create index command executes many comparisons of Numeric values.
> Functions other than comparetup_index spent a lot of cycles for
> conversion from Numeric to NumericVar.
>
> An attached patch enables the comparison of Numeric values without
> executing conversion to NumericVar. The execution time of that SQL
> becomes half.
>
> o Test SQL (index_test table has 1,000,000 tuples)
>  create index index_test_idx on index_test(num_col);
>
> o Test results (executed the test five times)
> (1)PentiumIII
>  original: 39.789s  36.823s  36.737s  37.752s  37.019s
>  patched : 18.560s  19.103s  18.830s  18.408s  18.853s
>
> (2)Pentium4
>  original: 16.349s  14.997s  12.979s  13.169s  12.955s
>  patched :  7.005s   6.594s   6.770s   6.740s   6.828s
>
> (3)Itanium2
>  original: 15.392s  15.447s  15.350s  15.370s  15.417s
>  patched :  7.413s   7.330s   7.334s   7.339s   7.339s
>
> (4)Ultra Sparc
>  original: 64.435s  59.336s  59.332s  58.455s  59.781s
>  patched : 28.630s  28.666s  28.983s  28.744s  28.595s
>
> regards,
>
> --- Atsushi Ogawa

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073