Thread: NUMERIC type efficiency problem
I noticed the storage format for the numeric type is rather inefficient:

    typedef struct NumericData
    {
        int32         varlen;          /* Variable size */
        int16         n_weight;        /* Weight of 1st digit */
        uint16        n_rscale;        /* Result scale */
        uint16        n_sign_dscale;   /* Sign + display scale */
        unsigned char n_data[1];       /* Digit data (2 decimal digits/byte) */
    } NumericData;

    typedef NumericData *Numeric;

Oracle uses a similar variable-length format for all numeric types, and they document its storage requirement as 2 + (sig digits / 2) bytes: one byte for the column length, one byte for the exponent, and a variable number of bytes for the significant digits. A zero value uses two bytes total in Oracle, where in the current version of PostgreSQL it uses ten bytes.

Given the pending demise of the money type, the remaining alternative is rather wasteful for use in large financial applications. Is there a reason why varlen has to be an int32? A uint8 would be more than enough, and the other three fields could be int8 as well. I do not understand why we need four header fields - a much more efficient decimal type could be implemented as follows:

    typedef struct DecimalData
    {
        int8 varlen;           /* variable size */
        int8 d_sign_exponent;  /* 1 bit sign, 7 bit exponent */
        int8 d_mantissa[1];    /* variable precision binary integer mantissa */
    } DecimalData;

The value represented is (-1 ^ sign) * (mantissa) * (10 ^ exponent). This would be more space efficient than Oracle and would support precisions up to DECIMAL(63). Having a reasonable maximum precision would allow a fixed-length internal representation, which would make processing *much* faster by using binary arithmetic and eliminating the necessity to palloc() buffers for every temporary result.

(Aside: Doesn't the current numeric type use up memory in a hurry in a large sum(numeric_column) query? Or are all those digitbuf_free()'s actually being cleaned up? And shouldn't the type operator calling convention be changed to pass a result buffer so these palloc()'s could be mostly avoided?)

As an even faster, lower maximum precision alternative:

    typedef struct FastDecimalData
    {
        int64 fd_mantissa;
        int8  fd_sign;
        int8  fd_exponent;
    } FastDecimalData;

The value represented is (-1 ^ sign) * (mantissa) * (10 ^ exponent). This would support precisions up to DECIMAL(18). Intermediate results could be stored using a 128 bit format to avoid loss of precision.

Any comments?

- Mark Butler
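To make the proposed encoding concrete, here is a minimal sketch of how a value would round-trip through the FastDecimalData layout described above. The conversion helper and its names are purely illustrative assumptions, not part of the proposal:

    #include <stdint.h>
    #include <stdio.h>

    typedef struct FastDecimalData
    {
        int64_t fd_mantissa;   /* significant digits as a binary integer */
        int8_t  fd_sign;       /* 0 = positive, 1 = negative */
        int8_t  fd_exponent;   /* power-of-ten scale factor */
    } FastDecimalData;

    /* Reconstruct the value: (-1 ^ sign) * mantissa * (10 ^ exponent) */
    static double
    fast_decimal_to_double(FastDecimalData d)
    {
        double v = (double) d.fd_mantissa;
        int    e = d.fd_exponent;

        while (e > 0) { v *= 10.0; e--; }
        while (e < 0) { v /= 10.0; e++; }
        return d.fd_sign ? -v : v;
    }

    int
    main(void)
    {
        /* -123.45 stored exactly as mantissa 12345, sign 1, exponent -2 */
        FastDecimalData d = { 12345, 1, -2 };

        printf("%f\n", fast_decimal_to_double(d));  /* prints -123.450000 */
        return 0;
    }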
Mark Butler <butlerm@middle.net> writes:
> I noticed the storage format for the numeric type is rather inefficient:
> ...
> A zero value uses two bytes total in Oracle, where in the current version
> of PostgreSQL it uses ten bytes.

Yawn ... given row overhead, alignment padding, etc, this is not nearly as big a deal as you make it ...

> Is there a reason why varlen has to be an int32?

Yes. That's what all varlena types use.

> The other three fields could be int8 as well.

If we were willing to restrict numeric values to a much tighter range, perhaps so. I rather like a numeric type that can handle ranges wider than double precision, however.

> Having a reasonable maximum precision would allow a fixed-length internal
> representation, which would make processing *much* faster by using binary
> arithmetic and eliminating the necessity to palloc() buffers for every
> temporary result.

I don't have any objection in principle to an additional datatype "small numeric", or some such name, with a different representation. I do object to emasculating the type we have.

A more significant point is that you have presented no evidence to back up your claim that this would be materially faster than the existing type. I doubt that the extra pallocs are all that expensive. (I think it'd be far more helpful to reimplement numeric using base-10000 representation --- four decimal digits per int16 --- and then eliminate the distinction between storage format and computation format. See past discussions in the pghackers archives.)

			regards, tom lane
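For readers unfamiliar with the base-10000 idea, here is a hedged sketch of what digit-at-a-time addition looks like when each int16 "digit" holds four decimal digits, so one pass of the loop processes four decimal digits and the carry propagates at 10000 instead of 10. The array layout (least-significant digit first, equal lengths, same sign) is a simplifying assumption for illustration:

    #include <stdint.h>

    #define NBASE 10000            /* four decimal digits per int16 "digit" */

    typedef int16_t NumericDigit;

    /* Add two equal-length, same-sign digit arrays; returns the final carry. */
    static int
    add_digit_arrays(const NumericDigit *a, const NumericDigit *b,
                     NumericDigit *result, int ndigits)
    {
        int carry = 0;
        int i;

        for (i = 0; i < ndigits; i++)
        {
            int sum = a[i] + b[i] + carry;

            carry     = sum / NBASE;             /* 0 or 1 */
            result[i] = (NumericDigit) (sum % NBASE);
        }
        return carry;
    }

Because the same int16 arrays are cheap both to store and to compute with, this is also what lets the storage and computation formats be unified.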
Tom Lane wrote:
> Yawn ... given row overhead, alignment padding, etc, this is not nearly
> as big a deal as you make it ...

For a table with ten decimal columns holding an average of 5 significant digits apiece, each row could be reduced from ~170 bytes to about ~90 bytes, which could be rather significant, I would think. (In Oracle such a row takes ~55 bytes.)

By the way, is alignment padding really a good use of disk space? Surely storage efficiency trumps minor CPU overhead in any I/O bound database. Or are there other considerations, like processor portability perhaps?

> I don't have any objection in principle to an additional datatype "small
> numeric", or some such name, with a different representation. I do
> object to emasculating the type we have.

Surely we can't get rid of it, but it might be a good idea to make NUMERIC(p) map to multiple representations, given that it is an ANSI standard data type. I suggest using a system like the FLOAT(p) -> float4 / float8 mapping. Columns declared with precisions higher than 16 or so could map to the current unrestricted representation, and columns with more typical precisions could map to a more efficient fixed-length representation.

> A more significant point is that you have presented no evidence to back
> up your claim that this would be materially faster than the existing
> type. I doubt that the extra pallocs are all that expensive. (I think
> it'd be far more helpful to reimplement numeric using base-10000
> representation --- four decimal digits per int16 --- and then eliminate
> the distinction between storage format and computation format. See past
> discussions in the pghackers archives.)

That sounds like it would help a lot. I certainly don't have any hard evidence yet. Thanks for the pointer.

- Mark Butler
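A hypothetical sketch of the dispatch rule being suggested, by analogy with FLOAT(p): the enum values and the threshold of 18 (the largest precision an int64 mantissa can hold) are illustrative assumptions, not an actual proposal for catalog changes:

    typedef enum DecimalRep
    {
        REP_FIXED_DECIMAL,  /* fixed-length format, int64 mantissa */
        REP_NUMERIC         /* existing unrestricted varlena format */
    } DecimalRep;

    /* Pick a representation for a column declared NUMERIC(p). */
    static DecimalRep
    choose_decimal_rep(int precision)
    {
        if (precision <= 18)        /* 18 decimal digits fit in an int64 */
            return REP_FIXED_DECIMAL;
        return REP_NUMERIC;         /* fall back to the general format */
    }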
Mark Butler <butlerm@middle.net> writes:
> By the way, is alignment padding really a good use of disk space? Surely
> storage efficiency trumps minor CPU overhead in any I/O bound database.

Weren't you just complaining about excess palloc's ;-) ?

Seriously, I have no idea about the costs/benefits of aligning data on disk. That decision was made way back in the Berkeley days, and hasn't been questioned since then AFAIK. Feel free to experiment if you are interested.

> I suggest using a system like the FLOAT(p) -> float4 / float8 mapping.
> Columns declared with precisions higher than 16 or so could map to the
> current unrestricted representation, and columns with more typical
> precisions could map to a more efficient fixed-length representation.

Given that the "more efficient representation" would only be internal to calculation subroutines, it seems easier to exploit preallocation at runtime. This is already done in a number of places in Postgres. It'd look something like:

    {
        digit *tmp;
        digit  tmpbuf[MAX_FIXED_DIGITS];

        if (digits_needed > MAX_FIXED_DIGITS)
            tmp = palloc(...);
        else
            tmp = tmpbuf;

        /* use tmp here */

        if (tmp != tmpbuf)
            pfree(tmp);
    }

Ugly, but most of the ugliness could be hidden inside a couple of macros. Again, though, I wouldn't bother with this until I had some profiles proving that the palloc overhead is worth worrying about.

			regards, tom lane
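One guess at what such macros might look like; the DIGITBUF_* names, the MAX_FIXED_DIGITS constant, and the "digit" typedef are hypothetical, carried over from the snippet above:

    #define MAX_FIXED_DIGITS 32

    /* Declare both the pointer and its fallback stack buffer. */
    #define DIGITBUF_DECLARE(name) \
        digit  name##_fixed[MAX_FIXED_DIGITS]; \
        digit *name

    /* Point at the stack buffer when it is big enough, else palloc. */
    #define DIGITBUF_ALLOC(name, n) \
        ((name) = ((n) > MAX_FIXED_DIGITS) ? \
            (digit *) palloc((n) * sizeof(digit)) : name##_fixed)

    /* Free only what was actually palloc'd. */
    #define DIGITBUF_FREE(name) \
        do { \
            if ((name) != name##_fixed) \
                pfree(name); \
        } while (0)

    /* Usage, mirroring the snippet above:
     *
     *     DIGITBUF_DECLARE(tmp);
     *
     *     DIGITBUF_ALLOC(tmp, digits_needed);
     *     ... use tmp here ...
     *     DIGITBUF_FREE(tmp);
     */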
Tom Lane wrote:
> A more significant point is that you have presented no evidence to back
> up your claim that this would be materially faster than the existing
> type. I doubt that the extra pallocs are all that expensive. (I think
> it'd be far more helpful to reimplement numeric using base-10000
> representation --- four decimal digits per int16 --- and then eliminate
> the distinction between storage format and computation format. See past
> discussions in the pghackers archives.)

I did several tests with functions designed to sum the number 12345 a million times. The results are as follows (Pentium II 450, Redhat 6.2):

    Postgres PL/PGSQL original numeric:  14.8 seconds
    Postgres PL/PGSQL modified numeric:  11.0 seconds
    Postgres PL/PGSQL float8:            10.7 seconds
    GNU AWK:                              2.5 seconds
    Oracle PL/SQL number:                 2.0 seconds

The modified Postgres numeric type is the original source code modified to use a 32 digit NumericVar attribute digit buffer that eliminates palloc()/pfree() calls when ndigits < 32.

Surely those are performance differences worth considering...

- Mark Butler

Note: The functions are as follows, all called with 12345 as a parameter, except for the awk program, which has it hard coded:

PostgreSQL
==========

    create function test_f1(float8) returns float8 as '
    declare
        i   integer;
        val float8;
    begin
        val := 0;
        for i in 1 .. 1000000 loop
            val := val + $1;
        end loop;
        return val;
    end;' language 'plpgsql';

    create function test_f2(numeric) returns numeric as '
    declare
        i   integer;
        val numeric;
    begin
        val := 0;
        for i in 1 .. 1000000 loop
            val := val + $1;
        end loop;
        return val;
    end;' language 'plpgsql';

Awk
===

    BEGIN {
        val = 0;
        p = 12345;
        for (i = 1; i <= 1000000; i++) {
            val = val + p;
        }
        printf("%20f\n", val);
    }

Oracle
======

    create or replace function test_f2(p number) return number is
        i   number;
        val number;
    begin
        val := 0;
        for i in 1 .. 1000000 loop
            val := val + p;
        end loop;
        return val;
    end;
    /
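For context, a guess at the shape of the "modified numeric" being benchmarked: the NumericVar working structure gains an inline digit buffer, so values with fewer than 32 digits never touch palloc()/pfree(). The fixed_buf field is the hypothetical modification; the other fields follow the general shape of the 7.x-era NumericVar, whose computation format held one decimal digit per byte:

    #define NUMERIC_FIXED_DIGITS 32

    typedef unsigned char NumericDigit;    /* one decimal digit per byte */

    typedef struct NumericVar
    {
        int           ndigits;   /* number of digits in digits[] */
        int           weight;    /* weight of first digit */
        int           rscale;    /* result scale */
        int           dscale;    /* display scale */
        int           sign;      /* NUMERIC_POS, NUMERIC_NEG, or NUMERIC_NAN */
        NumericDigit *buf;       /* start of space: fixed_buf or palloc'd */
        NumericDigit *digits;    /* decimal digits */
        NumericDigit  fixed_buf[NUMERIC_FIXED_DIGITS];  /* inline buffer */
    } NumericVar;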
Mark Butler wrote:
> I did several tests with functions designed to sum the number 12345 a
> million times. The results are as follows (Pentium II 450, Redhat 6.2):
>
> Postgres PL/PGSQL original numeric:  14.8 seconds
> Postgres PL/PGSQL modified numeric:  11.0 seconds
> Postgres PL/PGSQL float8:            10.7 seconds
> GNU AWK:                              2.5 seconds
> Oracle PL/SQL number:                 2.0 seconds

I have a new result:

    Postgres PL/PGSQL integer: 7.5 seconds

I do not know what to attribute the large difference between float8 and integer to other than the palloc() overhead in the calling convention for float8. Comments?

- Mark Butler
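One plausible source of that gap, sketched below: under the version-1 function manager, int4 is a pass-by-value type whose result travels in the Datum itself, while float8 is pass-by-reference, so every float8 operator result must be palloc'd. These are simplified forms of the stock operators, shown only to illustrate the calling-convention difference:

    #include "postgres.h"
    #include "fmgr.h"

    Datum
    int4pl(PG_FUNCTION_ARGS)
    {
        int32 arg1 = PG_GETARG_INT32(0);
        int32 arg2 = PG_GETARG_INT32(1);

        PG_RETURN_INT32(arg1 + arg2);   /* result carried in the Datum */
    }

    Datum
    float8pl(PG_FUNCTION_ARGS)
    {
        float8 arg1 = PG_GETARG_FLOAT8(0);
        float8 arg2 = PG_GETARG_FLOAT8(1);

        PG_RETURN_FLOAT8(arg1 + arg2);  /* Float8GetDatum palloc's the result */
    }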
On Friday, 13 April 2001 at 23:16, you wrote:
> Tom Lane wrote:
> > A more significant point is that you have presented no evidence to back
> > up your claim that this would be materially faster than the existing
> > type. I doubt that the extra pallocs are all that expensive. (I think
> > it'd be far more helpful to reimplement numeric using base-10000
> > representation --- four decimal digits per int16 --- and then eliminate
> > the distinction between storage format and computation format. See past
> > discussions in the pghackers archives.)
>
> I did several tests with functions designed to sum the number 12345 a
> million times. The results are as follows (Pentium II 450, Redhat 6.2):
>
> Postgres PL/PGSQL original numeric:  14.8 seconds
> Postgres PL/PGSQL modified numeric:  11.0 seconds
> Postgres PL/PGSQL float8:            10.7 seconds
> GNU AWK:                              2.5 seconds
> Oracle PL/SQL number:                 2.0 seconds
>
> The modified Postgres numeric type is the original source code modified
> to use a 32 digit NumericVar attribute digit buffer that eliminates
> palloc()/pfree() calls when ndigits < 32.
>
> Surely those are performance differences worth considering...

I tested that on a similar configuration (P-III 450) and got the same results. When the addition is removed from the loop and replaced with a simple assignment, the total execution time goes down to ~6.5 seconds. Subtracting that loop overhead, the numeric addition itself costs 14.8 - 6.5 = 8.3 seconds for the original versus 11.0 - 6.5 = 4.5 seconds for the modified version, so the modified numeric is nearly twice as fast - certainly worth considering.

--
===================================================
Mario Weilguni                   KPNQwest Austria GmbH
Senior Engineer Web Solutions    Nikolaiplatz 4
tel: +43-316-813824              8020 graz, austria
fax: +43-316-813824-26           http://www.kpnqwest.at
e-mail: mario.weilguni@kpnqwest.com
===================================================
Mario Weilguni wrote:
> I tested that on a similar configuration (P-III 450) and got the same
> results. When the addition is removed from the loop and replaced with a
> simple assignment, the total execution time goes down to ~6.5 seconds.
> Subtracting that loop overhead ... the modified numeric is nearly twice
> as fast - certainly worth considering.

I am embarrassed to admit I had an undeleted overloaded function that caused me to time the wrong function. The correct numbers should be:

    Postgres PL/PGSQL original numeric:  14.8 seconds
    Postgres PL/PGSQL modified numeric:  14.0 seconds
    Postgres PL/PGSQL float8:            10.7 seconds
    GNU AWK:                              2.5 seconds
    Oracle PL/SQL number:                 2.0 seconds

This means that Tom Lane was absolutely right - for the current numeric type implementation, palloc() overhead is not a dominant concern. A serious solution needs to change the internal format to use a larger base, as Tom suggested.

- Mark Butler
Mark Butler <butlerm@middle.net> writes:
> ... The correct numbers should be:
>
> Postgres PL/PGSQL original numeric:  14.8 seconds
> Postgres PL/PGSQL modified numeric:  14.0 seconds
> Postgres PL/PGSQL float8:            10.7 seconds
> GNU AWK:                              2.5 seconds
> Oracle PL/SQL number:                 2.0 seconds
>
> This means that Tom Lane was absolutely right - for the current numeric
> type implementation, palloc() overhead is not a dominant concern. A
> serious solution needs to change the internal format to use a larger
> base, as Tom suggested.

What do you get if you use int4 in PL/PGSQL? The above numbers look to me like the real problem may be PL/PGSQL interpretation overhead, and not the datatype at all...

			regards, tom lane